18-348 Lab #5

Spring 2015

NOTE: Lab 5 consists of two components (Lab 5 Part A and Lab 5 Part B).

Relevant lectures:
- Part A: Lecture 8. Memory and Memory Bus
- Part B: Lecture 9. Economics and Code Optimization

Links to all files referenced in the lab and prelab can be found in the Files section at the end of this document.

Caution -- this lab has the most challenging hardware construction portion of the course -- start early on building your hardware!!

Pre-Lab 5 - Part A:

Goal: To familiarize you with the operation of a bidirectional memory bus.

Discussion:
A bidirectional memory bus provides an interface between a CPU and a block of memory. It is bidirectional because it allows the CPU to both write values to the memory and read values from the memory. The bus is operated using a pair of control lines - the Addr.H/Data.L line and the Write.H/Read.L line. Asserting Addr.H indicates that a memory address is being driven on the bus, while asserting Data.L indicates that data is being driven on the bus. Asserting Write.H indicates the CPU is driving data on the bus to write a memory address, while asserting Read.L indicates the CPU will read the data asserted on the bus by the memory or I/O device.

Consider the memory bus schematic and the timing diagram below to answer the questions. If you need to know how a specific part of the circuit works, refer to the corresponding data sheet for that part. Data sheets are listed in the Relevant Reading section below.
For the timing diagram, assume that all timing requirements (e.g., setup time) are met.
Our circuit emulates a memory bus and two memory-mapped I/O ports. We have an 8-bit bus. Our memory space has an 8-bit address space and stores 8 bits of data at each address. However, to keep the hardware simple for you, only one address is available for reading ($9C) and only one address is available for writing ($CC). (I/O ports are often read-only or write-only, so this shows you one way to set up such ports.) When we read from $9C, we read the value on the 8-bit switch. When we write to $CC, we store the value on the LED display, and the LEDs display the inverted bit value as done in previous labs.
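
As a rough illustration of the address decoding described above, the two ports can be modeled in plain C on a PC. This is only a sketch under stated assumptions: the type emu_ports_t and the functions model_read() and model_write() are hypothetical names invented for this example and are not part of the lab files. The model ignores bus timing and LED polarity.

```c
#include <stdint.h>

#define READ_PORT_ADDR  0x9Cu  /* the only readable address (8-bit switch) */
#define WRITE_PORT_ADDR 0xCCu  /* the only writable address (LED register) */

/* Hypothetical software model of the two ports; not part of the lab files. */
typedef struct {
    uint8_t switches; /* value currently set on the 8-bit switch */
    uint8_t led_reg;  /* last value stored for the LED display */
} emu_ports_t;

/* Returns 1 and stores the switch value in *out when addr decodes to the
   read port; returns 0 (no device responds) for any other address. */
int model_read(const emu_ports_t *ports, uint8_t addr, uint8_t *out)
{
    if (addr != READ_PORT_ADDR) {
        return 0;
    }
    *out = ports->switches;
    return 1;
}

/* Returns 1 and stores data when addr decodes to the write port;
   returns 0 and changes nothing for any other address. */
int model_write(emu_ports_t *ports, uint8_t addr, uint8_t data)
{
    if (addr != WRITE_PORT_ADDR) {
        return 0;
    }
    ports->led_reg = data;
    return 1;
}
```

In this model a read from $CC or a write to $9C simply reports that nothing responded, which mirrors the idea of a read-only port and a write-only port.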

Figure 1: Memory Bus Timing Diagram.

Procedure:
None. This section only involves answering questions.

Questions:
For each question below, assume the memory bus is configured as shown in the memory bus schematic.

1. For Figure 1, in 15 words or fewer per item, briefly describe what is happening at each of T1 through T6, including whether that time instant is part of a read cycle or part of a write cycle.
T1:
T2:
T3:
T4:
T5:
T6:

2. Suppose the current state of the memory is reflected by the values in the table below:

Address	Value
$9C	$A4
$CC	Undefined

We want to read the value at address $9C, then write the one's complement of that value to address $CC. Based on this sequence, indicate the value on the Address/Data bus at each time (use the values listed in the timing diagram in Figure 1 above, e.g. Address1, Address2, Data1, Data2). Additionally, indicate which element in the circuit is driving the value on the Address/Data bus. You should refer to the chips by their designation in the schematic (e.g. "U14" refers to the 74LS373 latch).

	Value	Driving Element
Address1
Data1
Address2
Data2

3. Using the values you obtained in question 2, identify the values of these internal lines of the memory bus circuit at the corresponding time instance. (Note: we are asking for the pin output values, and not the value "inside" the register if it isn't being driven onto the output pins.) Use a hexadecimal notation to indicate a range (0 is the LSB). For example, if Q7, Q6 and Q3 are 1 and the rest of the values are 0, then the value you would enter is "$C8". Use a "Z" to indicate high-impedance/disabled output. Use an "X" if the value is unknown, such as when it depends on operations that occurred before the timing diagram started.

Time	U14 Q7:0 (Address Latch)	U9 Q7:0 (Data Register)	U7 B7:0 (Read Buffer)
T1
T2
T3
T4
T5
T6
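
The hexadecimal notation requested in question 3 can be sketched in C as follows. The helper name pins_to_byte() is made up for this illustration; it simply packs eight pin levels into one byte with Q0 as the LSB.

```c
#include <stdint.h>

/* q[0] is Q0 (LSB) through q[7], which is Q7 (MSB).
   Each entry is 0 or 1. Hypothetical helper for illustration only. */
uint8_t pins_to_byte(const uint8_t q[8])
{
    uint8_t value = 0;
    int bit;

    for (bit = 0; bit < 8; bit++) {
        if (q[bit]) {
            value |= (uint8_t)(1u << bit); /* set the bit for this pin */
        }
    }
    return value;
}
```

With Q7, Q6 and Q3 set to 1 and all other pins 0, the bits are 1100 1000, which is the "$C8" from the example in the question.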

4. (Bonus) Submit a revised schematic that shows the setup for decoding the addresses $A3 for read and $B3 for write. Include only connections to U14, U10, U11, U16A and U15A -- you can omit the rest because they should be the same. If you have to add an inverter or two then do so. A NEAT hand-drawn and scanned diagram of just those chips and their connections is OK. It is probably better to edit or draw over the existing schematic (for example, white-out connections and draw new ones).

Pre-Lab 5 - Part B:

Goal:

To understand how to take advantage of the HC12 compiler optimizations and how to restructure C code to improve execution speed.

Discussion:

The function of a compiler is to translate high level code (in this case, C code) to assembly. In order to do this and do it efficiently, the compiler must manage resources such as stack memory, registers, and code allocation. But no compiler is perfect, and embedded system compilers are often very far from perfect. You can help the compiler by writing code that it is good at optimizing. In this lab, you will take a function which adds an array of numbers and implement a new C language version of the function so that it compiles more efficiently.
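
To make the idea of "code the compiler is good at optimizing" concrete, here is a small sketch on a different task than the one in this lab (finding the largest element of an array), written once with subscripts and once with pointers. The function names max_subscr() and max_ptr() are invented for this example, and both assume count is at least 1. Whether the pointer form is actually faster depends on the compiler and its settings, so treat this as something to measure rather than a rule.

```c
#include <stddef.h>

/* Subscript form: a simple compiler may recompute the element address
   from the index i on every access. Assumes count >= 1. */
int max_subscr(const int values[], size_t count)
{
    int best = values[0];
    size_t i;

    for (i = 1; i < count; i++) {
        if (values[i] > best) {
            best = values[i];
        }
    }
    return best;
}

/* Pointer form: one pointer walks the array and is compared against a
   precomputed end pointer. Assumes count >= 1. */
int max_ptr(const int *values, size_t count)
{
    const int *p = values;
    const int *end = values + count;
    int best = *p++;

    while (p < end) {
        if (*p > best) {
            best = *p;
        }
        p++;
    }
    return best;
}
```

Both versions return the same result for the same input; only the generated assembly and its cycle count may differ.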

Procedure:

Part 1:

1. Create a new C project. Be sure to include the Full Chip Simulator as a build target. Download prelab_5b_skeleton.c. Rename it to prelab_5b_andrewid.c (with your appropriate andrew ID) and replace the main.c file in your project with it.
2. Implement the ptr_add( ) function using pointer math instead of array references. This course assumes you know C; if you are rusty on pointers, a web search on the keywords (C pointer tutorial) will reveal many sources of information. (Note: you will have to comment out the function ptr_add2 for this part to get the code to compile cleanly.)
- You are NOT permitted to use loop unrolling to speed up the code. (Loop unrolling generally involves making copies of lines of code to perform repeated operations in-line to avoid loop overhead. If you aren't sure if you are doing loop unrolling, ask a TA.)
- Your code must work with any valid input and output parameters (e.g., with an array that is a different size). You are NOT permitted to modify the parameter inputs (the "signature") of the functions. You are NOT permitted to directly access global variables that might be passed to the routine, bypassing the parameter passing mechanism.
3. Verify the compiler options. Select Edit > Full-Chip Simulation Settings from the menu (you must have the full-chip simulator as the current target). Navigate to Target > Compiler for HC12. The value in "Command Line Arguments" SHALL be " -CpuHCS12 -D__NO_FLOAT__ -Ms "
4. Run your ptr_add( ) function using the full-chip simulator. Verify that it gives the same result as subscr_add( ).
5. Use the simulator to measure the number of cycles the subroutine takes to run. Your cycle count shall include all the compiler-generated setup and cleanup code. The simplest way to measure this is to place a breakpoint in the C-code at the function call, record the initial cycle count value, then Single Step Over (F10) the function call. For full credit, optimize the subroutine code in C so that it executes in fewer than 1560 cycles (this shouldn't be too tricky if you use pointer code).

Bonus (Optional):

For bonus credit, implement ptr_add2( ). You are NOT permitted to use loop unrolling to speed up the code. (Loop unrolling is a valid optimization in many contexts, but to keep this lab simple enough to fit within available time and still teach the relevant concepts, we have to impose this constraint on optimization.)
For this part, you may make any changes to code and enable additional compiler optimizations as you wish. Use the " Options" button in the Compiler for HC12 area of the simulation settings (see step 3 in part 1).
Run the code and verify that the results for ptr_add2( ) correspond to those for subscr_add( ).
Measure the cycle count to execute the function as described in step 5 of part 1. To receive bonus credit, your implementation must execute in fewer than 960 cycles.

Questions:

Note that all questions from this section MUST reflect the results you obtain with the default compiler settings, as described in Part 1, Step 3.

1. Record the values returned for each function and the number of cycles for execution. For reference, the values for subscr_add( ) have been included.

Function Name	Return value	# of cycles
subscr_add( )	15150	2236
ptr_add( )

2. Disassemble the C-code and measure the footprint (number of program instruction bytes consumed by the entire subroutine) of the code for each function (from the function label to the RTS). Record the values in the table below.

Function Name	Footprint (decimal number of bytes)
subscr_add( )
ptr_add( )

3. (Bonus) For this question, you may modify compiler settings. Fill in the table below for the ptr_add2( ) function.

Function Name	Return value	# of cycles	Memory Footprint (decimal bytes)
ptr_add2( )

Hand-in Checklist (65 + 13 points):

All non-code submissions shall be in a single PDF document.

Part A:

(20 points) Answers to the questions above.
(Bonus: 4 points) Answer the bonus question.

Part B:

(25 pts) Submit code for prelab_5b.c. Code must conform to course style sheet to obtain full credit.
(20 pts) Turn in the answers to questions 1 & 2 above
(Bonus: 9 pts) Include the implementation of ptr_add2( ) in your prelab_5b.c. Turn in the answers to the bonus question 3. The code must be present and the answers to the bonus questions must be included with your writeup to obtain any bonus credit.

The pre-lab for this assignment is intentionally simple because the lab itself will be moderately time consuming. We strongly suggest you also start wiring the circuit on Friday, and not wait until Monday to begin construction. This is the most complicated hardware you will construct for this course, and if you are not experienced it might take some time to get right.

Refer to the LAB FAQ for more information on lab hand-in procedures and file type requirements. You MUST follow these procedures or we will not accept your submissions.

Lab 5 - Part A

Goal: Demonstrate that you can integrate the MCU with an external bidirectional memory bus that you build.

Discussion:

You will implement a read/write interface for the MCU. We have provided you with a project shell. It contains three files of interest - main.c, EmuIO.h, and EmuIO.c. main.c contains a simple program that reads from memory location $9C and writes the value it reads to memory location $CC. EmuIO.h contains function prototypes to interface with the memory bus. EmuIO.c contains the function definitions for the functions prototyped in EmuIO.h.

Your job is to implement the emu_IO_init(), write_byte(), and read_byte() functions in EmuIO.c so that the main function will operate correctly. Use PTT as the 8-bit bidirectional bus (pin PT0 should correspond to the LSB of the data bus), PORTA0 as the Addr.H/Data.L control line, and PORTB4 as the Write.H/Read.L control line.

Hints:

Refer to the timing diagram from the pre-lab to infer what signals must be manipulated during the read and write cycles in order to read and write values.
Since the bus is bidirectional, you will have to change the direction of PTT during the execution of the program (not just during initialization). Each time you change the direction of a port, a few cycles of setup time need to pass before the port will work correctly. Whenever you change the direction of a port, use the in-line assembly operation asm("nop"); to allow enough time for setup requirements.
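
The direction-switching hint can be sketched as below. To keep the example buildable on a PC, the port registers are replaced by ordinary variables (sim_DDRT and sim_PTT are stand-ins invented for this sketch), and the settle step is an empty function where the target code would use asm("nop"); as described above. The sketch assumes a data direction bit of 1 means output and 0 means input; check the MCU documentation before relying on that. It does not show the control lines or a complete read or write cycle.

```c
#include <stdint.h>

/* Stand-ins for the port registers so this sketch builds on a PC.
   On the target, the real register names come from the device header. */
static volatile uint8_t sim_DDRT; /* assumed: 1 bits = output, 0 bits = input */
static volatile uint8_t sim_PTT;  /* data register for the 8-bit bus */

/* On the target this is where asm("nop"); would go, repeated as needed
   to cover the port's setup time. It is empty in this PC sketch. */
static void port_settle(void)
{
}

/* Make all eight bus pins outputs, then wait for the port to settle. */
void bus_set_output(void)
{
    sim_DDRT = 0xFF;
    port_settle();
}

/* Make all eight bus pins inputs, then wait for the port to settle. */
void bus_set_input(void)
{
    sim_DDRT = 0x00;
    port_settle();
}

/* Switch the bus to output and place a value on it. */
void bus_drive(uint8_t value)
{
    bus_set_output();
    sim_PTT = value;
}
```

Keeping the direction change and the settle delay together in one helper makes it harder to forget the delay when the direction is changed in the middle of a bus cycle.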

Procedure:

Part 1:

Make sure the project board is powered down and disconnected from the PC. Set all jumpers to default positions.
Disconnect the USER jumpers on the APS12C128 module.
Disconnect the UFEA jumpers on the project board.
Wire the board according to the memory bus schematic and the instructions above.
Follow the lab safety procedures to check your circuit before powering the project board.

Part 2:

Download the lab_5a_c.zip file. Extract the project and open it in Code Warrior.
Implement the functions in EmuIO.c with the following prototypes:
- void write_byte(unsigned char addr, unsigned char data);
- unsigned char read_byte(unsigned char addr);
Values read from the switch shall use the following convention. This may require software manipulation of data values to attain correct value polarity.
- A "1" bit shall be indicated by the corresponding switch moved to the "ON" position.
- A "0" bit shall be indicated by the corresponding switch moved to the position away from "ON".
Values written to the LED display shall use the following convention. This may require software manipulation of data values to attain correct value polarity.
- A "1" bit shall be indicated by the corresponding LED being lit.
- A "0" bit shall be indicated by the corresponding LED being unlit.
Write a program to read switch values and put the switch value out on the LEDs. Test to see that it works properly by changing one switch at a time and observing correct LED changes.
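
The "software manipulation" mentioned above usually amounts to complementing the byte. A minimal sketch, assuming the hardware presents a signal with the opposite polarity from the convention required here (the pre-lab notes that the LEDs display the inverted bit value; whether the switch also needs this depends on your wiring). The helper name flip_polarity() is hypothetical.

```c
#include <stdint.h>

/* Invert all eight bits. The cast keeps the result in 8 bits, because
   the ~ operator works on the promoted int value. */
uint8_t flip_polarity(uint8_t value)
{
    return (uint8_t)~value;
}
```

For example, flip_polarity(0x45) yields 0xBA, and applying it twice returns the original value.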

(Bonus) Part 3:

Set the switch value to $45. Check that the value $45 is placed on the LEDs. Using a logic probe, measure values listed in the following table.
Single step through the program until you reach the appropriate T1 through T6 point and measure signals at that point. (Hint: this is a measurement of the values you predicted would be present in the pre-lab 5 part A.)
If there are any differences between this table and your prelab predicted values, explain why (maximum 200 words).

	U14 Q7:0	U9 Q7:0	U7 B7:0
T1
T2
T3
T4
T5
T6

(Bonus) Part 4:

Write a program that has the LEDs "chase" each other (i.e., some sort of circular pattern). Vary the speed to 256 different speeds using the DIP switch settings. For a DIP switch input of 0 the pattern should be stopped, and for a setting of 255 the pattern should be so fast as to be blurred.
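
One possible building block for a circular pattern is an 8-bit rotate. The sketch below is only one way to approach it; rotate_left8() and chase_step() are hypothetical names, and the timing that turns the switch value into 256 distinct speeds is deliberately left out.

```c
#include <stdint.h>

/* Rotate an 8-bit pattern left by one position; bit 7 wraps to bit 0. */
uint8_t rotate_left8(uint8_t pattern)
{
    return (uint8_t)((pattern << 1) | (pattern >> 7));
}

/* Advance the pattern by one step unless the speed setting is 0,
   in which case the pattern is left unchanged (stopped). */
uint8_t chase_step(uint8_t pattern, uint8_t speed)
{
    if (speed == 0) {
        return pattern;
    }
    return rotate_left8(pattern);
}
```

Starting from 0x01, repeated calls to rotate_left8() walk a single lit bit through all eight positions and back to 0x01.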

Part A - Demo Checklist: (60 + 5 points)

(60 points) Demo the program to the TA.
(Bonus: 5 points) Demo the bonus program 2 (part 4) to the TA.
There are no demo points for Bonus 1 (part 3).

Lab 5 - Part B

There is no lab procedure for lab 5 part B. You are only required to demo the execution times of your code to the TA using the simulator in the lab.

Part B - Demo Checklist: (35 + 5 points)

To receive credit, the cycle times for your demo must be less than or equal to the values you included in your prelab writeup. If your demo takes more cycles than the values reflected in your prelab, you will not get credit for that part of the prelab or the demo. (To ensure fairness, trivial differences of fewer than 10 clock cycles will be ignored in enforcing this requirement so long as your prelab analysis was performed in good faith.)

(35 pts) Demo either partner's prelab_5b.c from your prelab with the default compiler options (see Part 1, Step 3 of the prelab). Demonstrate the cycle count for subscr_add( ) and ptr_add( ) to the TA. Verify that the return values match. The TA may ask you to change the values in the array.
(Bonus: 5 pts) Demo either partner's prelab_5b.c with any compiler optimizations you wish. Demonstrate the cycle count of ptr_add2( ) to the TA.

Hand-in Checklist: (90 + 13 points)

Part A:

(5 points) List any problems you encountered in the lab and pre-lab, and suggestions for future improvement of this lab. If none, then state so to get these points.
(75 points) Submit a listing of your code for Part 2. Submit only the EmuIO.c file. Code must follow the coding standard for the course to receive full credit.
(Bonus: 5 points) Complete the table in Part 3 and answer the accompanying question.
(Bonus: 5 points) Submit a listing of your bonus program from Part 4. (You must successfully demo to receive the lab writeup bonus.)

Part B:

(5 points) List any problems you encountered in the pre-lab, and suggestions for future improvement of this lab. If none, then state so to get these points.
(5 points) Submit code files containing subscr_add() and ptr_add() functions. Code must be fully commented to receive full credit.
(Bonus: 3 points) Submit code files containing ptr_add2() function. Code must be fully commented to receive full credit.

NOTE: Code listings for lab 5 part B may differ from the prelab, especially for ptr_add2(), if you discover further optimizations after prelab submission.

Refer to the LAB FAQ for more information on lab hand-in procedures and file type requirements. You MUST follow these procedures or we will not accept your submissions.

Hints and Suggestions:

Part A:

The convention for buses on schematics is that the lowest bit of the bus goes to the lowest bit of a multi-bit chip. So, for example, PTT[0] is the same wire as MCU ADDR[0], is also the same wire as DATA[0], is connected to U14 pin 3 (D0), and is connected to U8 pin 18 (B0). The orientation of the labels is irrelevant but potentially confusing (but, this is the convention so we follow it). For example, even though the label "MCU ADDR[0..7]" might appear to suggest that bit 0 goes to D7 and bit 7 goes to D0, this is incorrect -- bit 0 of the wires goes to D0 regardless of label orientation.
PB4 refers to Port B bit 4, and NOT push-button 4.
Keep in mind that when the outputs of a chip are high impedance, it is possible that some OTHER source is driving those pin values. The actual wires are high impedance (completely floating) only when ALL drivers associated with them are in high impedance! And even then, pull-up resistors will pull high impedance buses up to about +5V.
Be sure to wire each chip's power and ground directly to power and ground distribution bars on the proto-board to get cleaner power and less voltage drop for power supplies. Don't "daisy-chain" power and ground wires from chip to chip.
We strongly suggest you draw diagrams of each size chip showing the pin numbers. You might even want to write pin numbers on white sticky labels and stick them onto the chips (if you are good at writing tiny numbers). It is so easy to get the pin numbers wrong and mis-wire until you've had a lot of practice! And, of course, make sure you don't put DIPs in upside down -- put the pin 1 notch in the same direction for all your DIPs.

Part B:

The "footprint" of a program is how many bytes of memory it takes. A bigger "footprint" takes more memory.
Note that compilers might have entirely different optimization approaches for different types of loops (e.g., for vs. while loops).
We expect you'll have to play around with the optimizer settings for the bonus question. This is what people do in real life too, so this is meant to give you a taste of what optimization involves. If you're not sure what optimizations to try, look at the generated code and try to pick an optimization that sounds relevant. The Code Warrior documentation and on-line help explain different optimizations.
In a real system, loop unrolling is often too expensive because it increases the memory footprint too much. We just said "don't use it" for this lab to avoid having to put arbitrary restrictions on size for what is probably a very small program. But, when you scale up to bigger real-world programs, this becomes a significant issue. The prohibition on loop unrolling includes the optimizer flag (but we've found the compiler's loop unrolling isn't very good anyway).
While you can save a little space by exploiting the fact that some variables and constants are globally defined -- don't do it! We require that you pass all variables in and out as parameters rather than directly setting globals. Sure, you could get away with exploiting the situation in this toy program, but it would make your subroutine useless in a real program that had to be called with multiple different sets of parameters. (We've seen industry code that references globals in subroutines and then has to have many versions of the subroutine depending on which globals need to be processed. It wasn't a pretty sight and it all had to be rewritten.)
There was one report that the -Oc option generated a warning message under certain optimization settings. If this happens to you, either change the optimization settings or just delete the -Oc option for final compilation of the bonus section.
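
If you are unsure what counts as loop unrolling, the sketch below shows the same summation written as a plain loop and as a loop unrolled by four. It is here only so you can recognize the technique, which is NOT permitted in this lab. The function names are invented for this example and are unrelated to the lab's skeleton file.

```c
#include <stddef.h>

/* Plain loop: one add and one loop test per element. */
long sum_rolled(const int *values, size_t count)
{
    long total = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Unrolled by four: the loop test runs once per four elements, at the
   cost of more instruction bytes (a larger footprint). */
long sum_unrolled4(const int *values, size_t count)
{
    long total = 0;
    size_t i = 0;

    for (; i + 4 <= count; i += 4) {
        total += values[i];
        total += values[i + 1];
        total += values[i + 2];
        total += values[i + 3];
    }
    for (; i < count; i++) { /* handle the leftover 0 to 3 elements */
        total += values[i];
    }
    return total;
}
```

The repeated add statements inside the loop body are the telltale sign of unrolling: fewer loop tests per element, but a larger footprint.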

FILES for this lab:

Part A:

Part B:

Relevant reading:

Part A:

Part B:

HC12 Compiler Reference Manual

Also, see the course materials repository page.

Change notes for 2015:
