18-348 Lab #4
Spring 2015
NOTE: Lab 4 consists of two components (Lab 4 Part A and Lab 4 Part
B).
Relevant lectures:
- Part A: Lecture 6. Embedded Language Use
- Part B: Lecture 7. Coding Tricks; Multiprecision Math; Reviews
NOTE for Lab Part A: Recall from lecture that neither the one's complement nor
the two's complement checksum can detect all two-bit errors. Whether a
checksum detects a given error depends on BOTH the data values (string) and
which bits are flipped. For example, try the string "ECE348" to see
the difference in detection capability between one's and two's complement checksums.
NOTE: The HC12 compiler manual can be confusing with
regard to calling conventions. Functions with a fixed number of parameters use
the Pascal calling convention, in which parameters are pushed left to right and
the caller removes them from the stack. The C calling convention (which the
documentation notes pushes parameters right to left) is only used for functions with a
variable number of parameters, which we don't use in this lab. Our compiler
pushes parameters from left to right. When in doubt, trust what the compiler
does, not what the compiler manual says it does.
Links to all files referenced in the lab and prelab can be found in the
Files section at the end of this document. You might wish to read the lab
assignment before starting work on the Pre-lab to help with understanding how
to link C to assembly language.
Pre-Lab 4 - Part A:
Goal:
To learn programming patterns for embedded systems and interactions between
C and assembly programming.
Discussion:
Bitwise operations in C:
In this lab, you will practice using C language constructs and operators to
do bitwise operations. The most common language constructs used to
implement bitwise operations are:
Operator (in context)   Usage    Description
&                       a & b    bitwise AND two values
|                       a | b    bitwise OR two values
~                       ~a       bitwise invert a value
^                       a ^ b    bitwise XOR two values
<<                      b << n   shift the bits in b to the left by n bits (the upper n bits of b are lost).
>>                      b >> n   shift the bits in b to the right by n bits (the lower n bits of b are lost).
                                 The shift is an arithmetic shift for signed integers (standard C leaves
                                 right shifts of negative values implementation-defined), but it is a
                                 logical shift right for unsigned integers.
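As a minimal sketch of the two shift rows, the helpers below shift an 8-bit value and keep only 8 bits of the result. The names shl8 and shr8 are illustrative and not part of the lab files. Note that C promotes an 8-bit operand to int before shifting, so the cast back to 8 bits is what actually discards the upper bits.

```c
#include <stdint.h>

/* Hypothetical helper: shift an 8-bit value left by n (n assumed to be 0..7). */
uint8_t shl8(uint8_t b, unsigned n)
{
    /* b is promoted to int first, so the cast is what drops the upper n bits. */
    return (uint8_t)(b << n);
}

/* Hypothetical helper: shift an 8-bit value right by n (n assumed to be 0..7). */
uint8_t shr8(uint8_t b, unsigned n)
{
    /* Unsigned operand: zeros are shifted in from the left (logical shift). */
    return (uint8_t)(b >> n);
}
```

For example, shl8(0xA5, 2) loses the two upper bits of 0xA5, and shr8(0xA5, 2) loses the two lower bits.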
The bitwise AND, OR, and INVERT
(&, |, ~) should be distinguished from the
logical AND, OR, and NOT (&&,
||, !). The logical operations produce boolean results where any zero
value is treated as FALSE and any non-zero value is treated as TRUE. The
output of a logical operation in a standards-conforming C program is either
zero or one. The following examples illustrate this:
a = 0xA0;
b = 0x05;
c = a | b; //bitwise operation: c = 0xA5
d = a || b; //logical operation: d = 1
e = a & (~b); //bitwise operation: e = 0xA0
f = a && (!b); //logical operation: f = 0
In general, it is considered bad practice to mix logical and bitwise
operations in the same C statement or same line of code because of the
potential for confusion.
These operators can be combined to set and clear specific bits in a
value. For example:
Statement        Function
b = b & 0xF0;    clear the lower four bits of b (AND with zero is always zero)
b = b | 0x0F;    set the lower four bits of b (OR with one is always one)
b = b & (~a);    for each bit set to a 1 in a, clear that bit in b (INVERT ones to
                 zeros and then AND); leave other bits in b as they are.
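The three statements in the table can be wrapped in small functions to make the masking explicit. This is only a sketch: the function names are hypothetical and are not the setBit and clearBit functions required later in the procedure.

```c
#include <stdint.h>

/* Hypothetical helper: clear the lower four bits (AND with a mask of zeros). */
uint8_t clear_low_nibble(uint8_t b)
{
    return (uint8_t)(b & 0xF0);
}

/* Hypothetical helper: set the lower four bits (OR with a mask of ones). */
uint8_t set_low_nibble(uint8_t b)
{
    return (uint8_t)(b | 0x0F);
}

/* Hypothetical helper: clear in b every bit that is 1 in a; other bits are unchanged. */
uint8_t clear_bits(uint8_t b, uint8_t a)
{
    /* ~a is computed as an int, so cast the result back to 8 bits. */
    return (uint8_t)(b & ~a);
}
```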
Writing C call stack compatible assembly subroutines
To correctly write an assembly subroutine that interfaces with C code, you
must consider each of the following aspects of the call. The list below
refers to the example function "unsigned int foo(unsigned int bar,
unsigned char bar2, unsigned int bar3)". For more information, refer
to "Call Protocol and Calling Conventions" on page 526 of the
HC12 Compiler Reference
Manual.
- Location of parameters - The HC12 compiler pushes parameters onto the stack
from left to right, so "bar" is pushed first, followed by
"bar2". The final parameter is passed in a register. In
this case, the D register is used because the value is 16 bits. The
specific register(s) used depends on the size of the parameter.
- Return values - For return values of size four bytes or fewer, the value is
returned using registers. The specific registers used depends on the
return value size. In the example, the return value is expected in the D
register since it is a 16-bit value. For values larger than four bytes, a
memory address pointer is used.
- Register values - In general, save all register values and restore them so
that you don't disrupt computations in the calling routine, which might be
using the registers to hold intermediate computation values. The
exception to this rule is the register used to pass the final parameter and the
register used to pass the return value.
Procedure:
Part 1:
- Start a new C project using the HC(S)12 Project Wizard. Be sure to
include the Full-Chip Simulator as a Target. Replace the main.c file with
prelab_4a_prog1.skeleton.c.
- Replace the string value in data
with the appropriate string for your lab section and group.
- Implement the setBit and clearBit functions. Use them to invert the
case of the string (e.g. "DemO" becomes "dEMo").
- Implement the countBits function.
- Run your program in the simulator. Use the memory window to locate
data where it has been placed on the
stack. Observe that the program correctly inverts the case.
- Record the data required to answer question 1 below.
- Hand in program as prelab_4a_prog1_XX.c where "XX" is your Andrew
ID.
Part 2:
- Download the lab_4a_c_asm.zip file.
Extract the project and open it in CodeWarrior. Append your Andrew ID to the
three prelab_4a_prog2* filenames. Be sure to edit the #include in
prelab_4a_prog2.c to point to the new header filename.
- Add the following function prototypes to the prelab_4a_prog2_asm.h file.
- uint16 bitReverse(uint16 value);
- uint16 addALot(uint16 val1, uint16 val2, uint16
val3, uint16 val4, uint16 val5);
- Open prelab_4a_prog2.c. Modify the calls to bitReverse and addALot to
the appropriate value for your lab group.
- Open prelab_4a_prog2.asm.
- Implement the function bitReverse in assembly. This function takes
the passed parameter "value" and reverses the bits. The value
should be returned according to HC12 compiler convention. In order to
receive full credit, your implementation MUST use the carry bit to transfer bit
values between registers. (Hint: Look at section 5.14 in the
CPU12 reference
manual).
- Implement the function addALot in assembly. The function should add
the 5 unsigned integer parameters and return the result according to the HC12
compiler convention. For this function, overflow is allowed (i.e. ignore
the carry bit in your addition functions).
- Run the simulator on your code. Use the memory area to observe the
stack push operations and verify the functionality of your implementation of
bitReverse and addALot.
- Record the values for question 2 and 3 below.
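When checking the output of bitReverse in the simulator, a C model of the expected result can be handy. The sketch below is one possible model, not the required assembly implementation, and the function name is hypothetical. It moves one bit per iteration from the bottom of value into the bottom of reversed, which is the same one-bit hand-off that the carry bit performs between shift and rotate instructions.

```c
#include <stdint.h>

/* Hypothetical C model of a 16-bit bit reversal, for checking results only. */
uint16_t bit_reverse16_model(uint16_t value)
{
    uint16_t reversed = 0;
    int i;

    for (i = 0; i < 16; i++) {
        /* Make room in reversed, then copy in the current lowest bit of value. */
        reversed = (uint16_t)((reversed << 1) | (value & 1u));
        /* Discard the bit that was just copied. */
        value >>= 1;
    }
    return reversed;
}
```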
Part A - Questions:
- Enter the information observed from Part 1 (prelab_4a_prog1_XX.c) in the
table below:
Parameter | Value |
Original string in data | |
Bit count (number of "1" bits) | |
Address in memory where byte 0 of data is located | |
Address in memory where byte 4 of data is located | |
Hexadecimal representation of the original values in data (before case conversion) | |
Hexadecimal representation of data after case conversion | |
- Draw the stack frame for the call to addALot just after the subroutine call
is made (just after the BSR or JSR completes execution). Indicate what
each value represents (e.g. "val1 low byte, val1 high byte"
etc). Indicate the location of any function parameters not located on the
stack.
- Enter the information observed from Part 2 in the table below:
Parameter | Value |
ASCII characters for call to bit reverse | |
Hexadecimal value of result from reverse value | |
Hexadecimal value of the result from call to addALot | |
- Bonus: Write a bit count subroutine in assembly language that uses a loop
(not a lookup table) to count the bits in an 8-bit integer register. It must
use a loop to do the actual counting. In this loop, you are only allowed to use
the following three instructions: Branch if Not Equal to Zero (BNE), Add with
carry, and Logical Shift Right (but not necessarily in that order). You can use
whichever variants of these instructions make sense depending on the registers
you use (e.g., LSR, LSRA, LSRB, LSRD are all acceptable Logical Shift Right
variants). Other instructions can be used before and after the loop, but only
these three may be used within the loop that counts bits. When the loop
terminates, some register has the number of bits. Write a test program in C to
confirm that the subroutine works properly. You will only receive credit for
meeting all the requirements of this question -- no partial credit. If you use
an instruction other than the three specified within the loop -- then no
credit for you!
Pre-Lab 4 - Part B:
Goals:
- To learn about integer operations and multiprecision arithmetic
- To perform simple peer reviews
Discussion:
Refer to the lecture notes for information on integer division.
Procedure:
Part 1:
- Complete the table below. The second column is the binary
representation of the number in the first column. The third column is the
number in the first column divided by 2 using integer division. The
fourth column is the binary representation of the numbers in the third
column. Some values have been filled in as examples.
Signed 4-bit integer i (decimal representation) | Signed 4-bit integer i (binary representation) | i/2, integer division discarding the remainder (decimal representation) | i/2 (binary representation) |
7 | 0111 | 3 | 0011 |
6 | | | |
5 | | | |
4 | | | |
3 | | | |
2 | | | |
1 | | | |
0 | 0000 | | |
-1 | 1111 | 0 | 0000 |
-2 | | | |
-3 | | | |
-4 | | | |
-5 | | | |
-6 | | | |
-7 | | | |
-8 | 1000 | -4 | 1100 |
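The filled-in example rows can be reproduced with a short C sketch. It assumes the table follows C-style integer division, where the quotient is truncated toward zero (guaranteed from C99 onward), which is what the example row for -1 suggests. The function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: divide by two the way C integer division does,
   truncating the quotient toward zero (the remainder is discarded). */
int div2_truncate(int i)
{
    return i / 2;
}

/* Hypothetical helper: the 4-bit two's complement pattern of i, returned in
   the low four bits. Assumes i is in the range -8..7 and that int uses
   two's complement representation. */
uint8_t nibble_pattern(int i)
{
    return (uint8_t)(i & 0x0F);
}
```

For instance, div2_truncate(-1) gives 0 and nibble_pattern(-4) gives binary 1100, matching the example rows for -1 and -8.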
Part 2:
- Create a new assembly project using the
project stationery.
Download the prelab_4b_skeleton.asm file
and rename it prelab_4b_gXX_andrewid.asm. Replace the main file with
prelab_4b_gXX_andrewid.asm.
- The main section of the code is marked off with comments that tell you not
to modify the code. DO NOT MODIFY THIS CODE. Any
modification of the code in this section will result in no credit being given
for this part of the assignment. You are allowed to modify the code in
the divByTwo subroutine.
- Use the values in the table to develop a short description of the algorithm needed to implement a 16-bit signed divide by 2. (Note that you should use shift and bit set / test instructions; no variation of the DIV instruction is allowed.) Your description should be less than 100 words.
- Implement the divByTwo subroutine using the algorithm you described above.
- Set the target for the project to the full-chip simulator. Run the
simulator.
- Step through the code to get a feel for what it is doing. The code
runs the divByTwo subroutine, and does the same operation using the IDIVS
instruction, then compares the results.
- In the simulator, set a breakpoint at BP1 and BP2. Run the
simulation. If the simulation reaches the BP2 breakpoint without reaching
the BP1 breakpoint, then your divByTwo performs the expected function for all
values between 0x0000 and 0xFFFF.
Part 3:
Write, but DO NOT SIMULATE and DO NOT DEBUG code for this section. We
EXPECT that your hand-in will have bugs, and do NOT expect
that it will run properly! But it should be syntactically correct so that it
will "make" and so that all the parts have something that is
"close" to right. (We expect you will make a good faith attempt to
actually write a reasonable program, and not submit something laughable.) Just
to be clear -- your grade will be based on whether the code looks like an OK
first draft with bugs, is commented properly, and so on, but NOT on whether it
is bug-free! There might be at most one person in the class who can
write bug-free code without executing something, but probably not. So don't
worry about it.
Each partner will implement one distinct version of 32-bit x 32-bit ->
64-bit unsigned multiply. You should choose from the three methods described in
the lecture materials:
- Shift-and-add
- Partial sums using built-in MUL functions
- Partial sums using table lookups for multiply.
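As a rough reference for the first method, shift-and-add can be modeled in C as shown below. This is a sketch for checking expected products, not the assembly implementation: it relies on a 64-bit integer type that the assembly version has to build from 8-bit or 16-bit pieces, and the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical C model of 32-bit x 32-bit -> 64-bit shift-and-add multiply. */
uint64_t mul32x32_shift_add_model(uint32_t a, uint32_t b)
{
    uint64_t product = 0;
    uint64_t addend = a;      /* multiplicand, doubled on every step */

    while (b != 0) {
        if (b & 1u) {
            product += addend;   /* this multiplier bit is 1: add the shifted multiplicand */
        }
        addend <<= 1;            /* next bit of b is worth twice as much */
        b >>= 1;                 /* move on to the next multiplier bit */
    }
    return product;
}
```

The loop runs at most 32 times, once per multiplier bit, and stops early when the remaining multiplier bits are all zero.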
Note: You will end up doing all three implementations in the lab, but
you are only required to do two (one for each partner) for the pre-lab. This
means each partner must do a different implementation. Do this work
independently from your partner. You will use this code to conduct reviews
in the Lab section.
- DO NOT simulate, run, or use a
debugger on the code you have written before doing the review. We want
you to do the review first so you can try to find bugs in the implementation.
In most cases this saves a lot of time compared to debugging using the
simulator. So, put another way, we expect you will have bugs in your
code going into the review. Finding them more efficiently with help from your
partner is the point of the exercise. In previous years, it was common for
these reviews to turn up 3 to 5 defects with only a few minutes of effort. For
future projects it will be OK to do a preliminary simulate and debug session
before review -- but this project is simple enough that we want to make sure
you have bugs to find in the review, so do the review before debugging.
- Plan on your board being wired as described in Part A of Lab 4.
- Create a new assembly project using the
project stationery.
Download the lab_4b_skeleton.asm file and
rename it lab_4b_gXX_andrewid.asm. Replace the main file in the project
with this file.
- Implement one of the above mul64 subroutines to meet the following
requirements:
- The subroutine shall multiply the 32-bit data stored in arg1 by the 32-bit data stored in
arg2 and store the 64-bit product in result.
Additionally:
- The subroutine shall preserve all register values and restore them before
returning control to the main loop.
- The subroutine may use 8-bit or 16-bit operations to implement the 64-bit
operation.
- We recommend that you use the IDX2 addressing mode to read and write the
arguments and results, but any method you choose is acceptable so long as it
uses a loop.
- For the demo, the main program shall use the pushbuttons to display the
results, as described in the code comments. But you are NOT required to
implement this for the prelab.
Part B - Questions:
- Complete the table from part 1 and hand it in.
- Include a description of your signed divide-by-two algorithm (from part 2). Limit your description to 100 words.
- Bonus: Give the number of instruction cycles for the IDIVS
instruction. Give the number of instruction cycles for the longest path
through your divByTwo subroutine (include all cycles from just before the BSR
to just after the RTS). Which is faster and by how much?
Prelab Hand-in Checklist: (90 + 18 points)
All non-code submissions shall be in a single PDF document.
Part A
- (15 pts) Submit your code listing for Part 1 as prelab_4a_prog1_XX.c where
XX is your Andrew ID. Submit only the C file. Code must be fully commented to
receive full credit.
- (15 pts) Submit the entire project
for Part 2. Your project should be in a folder/subdirectory of your
hand-in directory. Name the folder "prelab_4a_asm_c_XX" where
XX is your Andrew ID. All files necessary to open the project in Code
Warrior and invoke the simulator must be present to receive full credit.
Code must be fully commented to receive full credit.
- (15 pts) Submit the answers to questions 1-3 above.
- (BONUS 9 points) Submit code for the bonus question. Submit the entire project
ready to open and build/execute with the Code Warrior simulator. Name the
folder "prelab_4a_bonus_XX" where XX is your Andrew ID.
Part B
- (20 pts) Answer questions 1 & 2 above.
- (BONUS 9 points) Answer the bonus question
- (15 pts) Submit the prelab_4b_gXX_andrewid.asm file. Code must be
fully commented for full credit. Code should work properly and be bug-free.
- (10 pts) Submit the lab_4b_gXX_andrewid.asm file. Code must be fully
commented for full credit. Code is EXPECTED to have bugs and you will not be
penalized for them.
Refer to the LAB FAQ for more information on lab
hand-in procedures and file type requirements. You MUST follow these
procedures or we will not accept your submissions.
Lab 4 - Part A
Goal:
To practice combining C code with assembly using the HC12 compiler.
Discussion:
Mixing C and Assembly with the HC12 Compiler
This section discusses the techniques for mixing C and assembly.
Remember that the stack frame for a subroutine call represents a contract
between the calling code and the subroutine code. In this case, the
compiler has a specific format for the stack frame. To write assembly code
that is compatible with C, you must make sure that it conforms to this format.
There are numerous compiler options and PRAGMA options that can be used to
modify the behavior of the compiler with respect to call stacks. A full
discussion is beyond the scope of this course. The discussion below and
the lab assignments refer to the compiler behavior using the default settings.
To have the CodeWarrior environment integrate C and assembly:
- Create a new project using the HC(S)12 New Project Wizard.
- Select both C and Assembly
- Follow normal procedures for selecting the rest of the wizard options
This procedure gives you 3 source files, which are described below.
- main.c - This file contains the main function that is called when the
program is executed. It is just like any other C file, except that it may
include references to functions defined in the main_asm.h file.
- main_asm.h - This is a C-style include file where you can define the
C-style functional prototypes for your assembly subroutines. In the
default project files, the main_asm() definition gives an example of this.
- main.asm - This file contains the implementations of the assembly
subroutines. They should be started with a label that is the same as the
function name and ended with the RTS instruction[Note 1]. In addition to
the function definition, the "XDEF fcn_name" directive must be
included. This exports the symbol for the function so that the linker can
combine the assembly and C code. In the default project files, the
main_asm function demonstrates these features.
Note 1: A function defined using the __far directive should
return with RTC, which restores a 3-byte return address (a page byte plus the
16-bit address). A full discussion of this is
beyond the scope of this course. For the labs, assume all functions are
called using __near, so they use RTS to return.
To correctly write an assembly subroutine that interfaces with C code, you
must consider each of the following aspects of the call. The list below
refers to the example function "unsigned int foo(unsigned int bar,
unsigned char bar2, unsigned int bar3)". For more information, refer
to "Call Protocol and Calling Conventions" on page 526 of the
HC12 Compiler Reference
Manual
- Location of parameters - The HC12 compiler pushes parameters onto the stack
from left to right, so "bar" is pushed first, followed by
"bar2". The final parameter is passed in a register. In
this case, the D register is used because the value is 16 bits. The
specific register(s) used depends on the size of the parameter.
- Return values - For return values of size four bytes or fewer, the value is
returned using registers. The specific registers used depends on the
return value size. In the example, the return value is expected in the D
register since it is a 16-bit value. For values larger than four bytes, a
memory address pointer is used.
- Register values - In general, save all register values and restore
them. The exception to this rule is the register used to pass the final
parameter and the register used to pass the return value.
Checksum Computation
A checksum is an error detection code used by many different embedded and
enterprise applications. It is commonly used to provide redundancy for
network messages and data storage. On networks, it allows the receiver to
check for transmission errors. In the case of storage, it allows a system
to verify that the stored data has not changed (e.g. due to file system
corruption or soft errors in memory).
To check the correctness of a message + checksum pair, the system
recomputes the checksum and compares it to the recorded one. If the two
checksums do not match, then the system knows that there is an error somewhere
in the message. If the two checksums do match, then the message is
presumed to be correct. Note that just because the message appears to be
consistent with the checksum does not guarantee that the message is the same as
the original one. With all checksums, it is possible to get errors that
modify the message or the stored checksum in such a way that they are still
consistent. This is called an undetected error. Note that whether a
checksum detects an error depends on the data values being checked, on the
number of bit errors, and on the location of those errors (e.g., some
2-bit errors may be caught while others are undetected due to their location).
This effect will be seen in the lab.
A two's complement checksum is computed by simply doing integer addition on
each "chunk" of data in a set of data. For our lab, this means doing
an integer addition of all the characters in a data string using 8-bit
addition. Overflows are ignored, and the 8-bit result of the addition is the
checksum. This checksum has the nice property of detecting all one-bit errors
in the data, and many other errors as well. But, some two-bit errors are
undetected.
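A minimal C sketch of this computation is shown below. The function name is hypothetical and is not the prototype required in the procedure. It assumes a null-terminated string and sums every byte up to, but not including, the terminating null.

```c
#include <stdint.h>

/* Hypothetical sketch of an 8-bit two's complement (plain addition) checksum. */
uint8_t twos_checksum_sketch(const char *s)
{
    uint8_t sum = 0;                          /* start from zero */

    while (*s != '\0') {
        /* 8-bit addition: the cast discards any carry out of bit 7. */
        sum = (uint8_t)(sum + (uint8_t)*s);
        s++;
    }
    return sum;
}
```

Run on the two reference strings listed later in this section, the sketch gives $A0 for "Bert Ernie" and $21 for "Ray Koopman".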
A one's complement checksum is computed similarly, but using one's
complement arithmetic (remember that from 18-240?). To refresh your memory, in
one's complement arithmetic, the value "$FF" is treated as equal to the
value "$00" -- they are both zero. So, when performing addition, you
need to check whether the sum will cross over the "$FF" to
"$00" boundary, and add one if it does so that both representations
of zero end up being equivalent in value. This can be done with a conditional
branch that checks whether either of the following two conditions holds true
for signed values and adds one to the resultant sum whenever either
condition is met:
- Exactly one input is negative and the resultant sum is greater than or equal to zero
- (Hint: using a bitwise OR followed by a bitwise XOR allows you to do this with
only one value comparison)
- Both inputs are negative, regardless of the resultant sum value
- (Hint: using a bitwise AND allows you to do this with only one value
comparison)
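The two conditions above describe exactly the cases in which an unsigned 8-bit addition produces a carry out of bit 7. The sketch below takes that equivalent route instead of the sign tests: it forms the sum in a wider variable and adds the carry-out back in (an "end-around carry"). It also reports $FF as $00, which is an assumption drawn from the reference table that follows. All names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: add two bytes in 8-bit one's complement arithmetic. */
uint8_t ones_add8_sketch(uint8_t a, uint8_t b)
{
    uint16_t wide = (uint16_t)(a + b);                       /* up to 9 significant bits */
    uint8_t sum = (uint8_t)((wide & 0xFFu) + (wide >> 8));   /* add the carry-out back in */

    /* Assumption taken from the reference table: report $FF (the second
       representation of zero) as $00. */
    return (sum == 0xFFu) ? 0x00u : sum;
}

/* Hypothetical helper: one's complement checksum of a null-terminated string. */
uint8_t ones_checksum_sketch(const char *s)
{
    uint8_t sum = 0;                          /* start from zero */

    while (*s != '\0') {
        sum = ones_add8_sketch(sum, (uint8_t)*s);
        s++;
    }
    return sum;
}
```

Adding the carry back in can never overflow a second time, because the largest possible 9-bit sum is $1FE, which folds to $FE + 1 = $FF.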
In general, Cyclic Redundancy Codes (CRCs) provide much stronger error
detection properties than arithmetic checksums. A full discussion of the
details of the CRC algorithm is beyond the scope of this course, and the code
is a little too complex for this lab. But they are similar to other checksums
in that they involve "summing" up values across the length of
multiple bytes or words of data. We put this note here simply so that you do
not think that a one's complement checksum is the best you can do!
In your lab, you should repeat the computation for each byte in the string,
starting with a value of zero and the first byte, ending with the last non-zero
byte of the string. (This means that you should initialize the checksum
value to 0 before processing the first byte of the message).
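The remark that some two-bit errors go undetected can be tried out on a PC before wiring anything up. The sketch below is a hedged illustration with hypothetical names: it corrupts the same bit in each of the first two characters of a string and reports whether a given checksum changes. The one's complement version again reports $FF as $00, an assumption based on the reference values.

```c
#include <stdint.h>
#include <string.h>

/* Compact versions of the two checksums described above (hypothetical names). */
uint8_t demo_twos(const char *s)
{
    uint8_t sum = 0;
    while (*s != '\0') {
        sum = (uint8_t)(sum + (uint8_t)*s++);             /* carry-out discarded */
    }
    return sum;
}

uint8_t demo_ones(const char *s)
{
    uint8_t sum = 0;
    while (*s != '\0') {
        uint16_t wide = (uint16_t)(sum + (uint8_t)*s++);
        sum = (uint8_t)((wide & 0xFFu) + (wide >> 8));    /* end-around carry */
    }
    return (sum == 0xFFu) ? 0x00u : sum;                  /* assumed: $FF shown as $00 */
}

/* Returns 1 if flipping the bits in mask in the first two characters of text
   changes the checksum (error detected), or 0 if the checksum is unchanged.
   Assumes text has at least two and fewer than 32 characters, and that the
   flip does not turn either character into a null byte. */
int demo_detects(uint8_t (*checksum)(const char *), const char *text, uint8_t mask)
{
    char corrupted[32];

    strncpy(corrupted, text, sizeof corrupted - 1);
    corrupted[sizeof corrupted - 1] = '\0';
    corrupted[0] = (char)((uint8_t)corrupted[0] ^ mask);
    corrupted[1] = (char)((uint8_t)corrupted[1] ^ mask);
    return checksum(corrupted) != checksum(text);
}
```

With the string "ECE348", flipping the top bit (mask 0x80) of the first two characters adds 256 to the total, which leaves the two's complement checksum unchanged but changes the one's complement checksum. Flipping the bottom bit (mask 0x01) changes both.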
Reference values to help you test your programs -- make sure you get these
results!
Input A | Input B | Two's complement A+B | One's complement A+B |
$FF | $FF | $FE | $00 |
$FE | $83 | $81 | $82 |
$75 | $A7 | $1C | $1D |
$B3 | $56 | $09 | $0A |
$36 | $42 | $78 | $78 |
$00 | $00 | $00 | $00 |
$FF | $00 | $FF | $00 |
String | Two's complement checksum | One's complement checksum |
Bert Ernie | $A0 | $A3 |
Ray Koopman | $21 | $25 |
Procedure:
Part 1:
- Wire your board with port T as output and port AD as input according to the
following table:
MCU Pin | Project board connection | Port Configuration |
AD0 | PB1 | input |
AD1 | PB2 | input |
AD2 | PB3 | input |
AD3 | PB4 | input |
AD4 | PB5 | input |
AD5 | PB6 | input |
AD6 | PB7 | input |
AD7 | PB8 | input |
PT0 | LED1 | output |
PT1 | LED2 | output |
PT2 | LED3 | output |
PT3 | LED4 | output |
PT4 | LED5 | output |
PT5 | LED6 | output |
PT6 | LED7 | output |
PT7 | LED8 | output |
- Create a project with a C main program called lab_4a_gXX that will contain
both C and assembly language files. Put your C code in the file
"main.c" and your assembly code in the file "main.asm". The
parts of the procedure below will guide you in creating a program that computes
checksums in multiple ways. In the end, all the programs must co-exist in a
single project (including the bonus if you choose to do it) with this single
hand-in directory.
- Take a look at the questions before working on the other parts of this
procedure so that you are sure to record the necessary data for the lab
writeup.
- Create four 8-bit integer variables: TwoSumC, OneSumC, OneSumAsm, and
OneSumOpt. Just initialize them to constants for now -- we'll tell you how to
compute them below.
- If no button is pressed, the LEDs shall be turned off.
- Pressing PB1 shall cause TwoSumC to be displayed on the LEDs.
- Pressing PB2 shall cause OneSumC to be displayed on the LEDs.
- Pressing PB3 shall cause OneSumAsm to be displayed on the LEDs.
- Pressing PB4 shall cause OneSumOpt to be displayed on the LEDs (optional;
this only applies to the bonus section).
- Pressing PB7 shall cause the bottom bit in each of the first two characters to be flipped
(see later descriptions).
- Pressing PB8 shall cause the top bit in each of the first two characters to be flipped
(see later descriptions).
Values shall be displayed uninverted (i.e., an "ON" LED is 1, and an
"OFF" LED is 0). These button definitions will let you demo all
capabilities of your program on a single string without recompiling.
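The display rules above can be sketched as a pure function that is independent of the port registers. Everything here is an assumption for illustration: the function and macro names are hypothetical, button bits are taken to follow the wiring table (PB1 on bit 0 through PB8 on bit 7), a 1 bit is taken to mean "pressed" (real hardware may read the opposite polarity), and the priority order when several display buttons are held is arbitrary.

```c
#include <stdint.h>

/* Assumed encoding: bit 0 = PB1 ... bit 7 = PB8, with 1 meaning "pressed". */
#define PB1_PRESSED 0x01u
#define PB2_PRESSED 0x02u
#define PB3_PRESSED 0x04u
#define PB4_PRESSED 0x08u

/* Hypothetical helper: choose the byte to show on the LEDs. */
uint8_t select_led_value(uint8_t buttons,
                         uint8_t two_sum_c, uint8_t one_sum_c,
                         uint8_t one_sum_asm, uint8_t one_sum_opt)
{
    if (buttons & PB1_PRESSED) return two_sum_c;
    if (buttons & PB2_PRESSED) return one_sum_c;
    if (buttons & PB3_PRESSED) return one_sum_asm;
    if (buttons & PB4_PRESSED) return one_sum_opt;
    return 0x00u;   /* no display button pressed: all LEDs off */
}
```

PB7 and PB8 are deliberately ignored here because they modify the string rather than select a value to display.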
Part 2:
- Comment out references to the main_asm() function (you'll use it later in
part 4 and the bonus).
- Add the following declaration to main() function in main.c. Replace
LN1 and LN2 with the last names of your group members.
- char myString[]="LN1 LN2";
- Implement an 8-bit two's complement checksum calculation using C. In the
main program, call this function and put the result in the variable
"TwoSumC". Compute the checksum over myString[] from the first
character until (but not including) the null byte at the end of the string. Use
the following prototype for your function:
unsigned char chk_two_c(char * string);
- Run this program and record the hexadecimal output as displayed on the
LEDs. Confirm that it is the correct value per hand computation. Also, use the
simulator to compute the number of clock cycles taken by the subroutine
chk_two_c from BSR/JSR to RTS.
- Add code to flip ("invert") the bottom bit in each of the first
two characters when PB7 is pressed so that the value is corrupted to put an
error in the value. (Re-iterating: this involves flipping bit 0 of the first
byte, and bit 0 of the second byte, resulting in two bytes, each with a
single-bit error in the lowest bit position.) Run the program and record the
output. Did the checksum detect this error?
- Modify the program so that the top bit in each of the first two characters
is flipped when PB8 is pressed, again putting an error in the value.
(Re-iterating: this involves flipping bit 7 of the first byte, and bit 7 of the
second byte, resulting in two bytes, each with a single-bit error in the
highest bit position.) Run the program and record the output. Did the checksum
detect this error? (It shouldn't detect the error -- the two flipped bits
cancel each other out in terms of effect on the checksum. This is a shortcoming
of two's complement addition checksums.)
Part 3:
- Implement an 8-bit one's complement checksum calculation using C. In the
main program, call this function and put the result in the variable
"OneSumC". Use the following prototype for your function:
unsigned char chk_one_c(char * string);
- Note: there are some very clever ways to speed up this computation -- but
you will receive full credit so long as it works correctly and is
understandable based on comments. You are not required to be
super-clever for this program!
- Run this program and record the hexadecimal output as displayed on the
LEDs. Confirm that it is the correct value per hand computation. Also, use the
simulator to compute the number of clock cycles taken by the subroutine
chk_one_c from BSR/JSR to RTS.
- Use PB7 to flip ("invert") the bottom bit in each of the first
two characters so that the value is corrupted to put an error in the value. Run
the program and record the output. Did the checksum detect this error?
- Use PB8 to flip the top bit in each of the first two characters, again
putting an error in the value. Run the program and record the output. Did the
checksum detect this error? (It should -- which is why one's complement
checksums are usually better.)
Part 4:
- Write a new, similar, program that computes an 8-bit one's complement
checksum using assembly language with the calling program in C. In the
main program, call this function and put the result in the variable
"OneSumAsm". Use the following prototype for your function:
unsigned char chk_one_asm(char * string);
- Note: there are some very clever ways to speed up this computation -- but
you will receive full credit so long as it works correctly and is
understandable based on comments. You are not required to be
super-clever for this program!
- Run this program with the specified test string and record the hexadecimal
output as displayed on the LEDs. Confirm that it is the correct value per hand
computation. Also, use the simulator to compute the number of clock cycles
taken by the subroutine chk_one_asm from BSR/JSR to RTS.
- Use PB7 to flip ("invert") the bottom bit in each of the first
two characters so that the value is corrupted to put an error in the value. Run
the program and record the output. Did the checksum detect this error? (If not,
fix the problem.)
- Use PB8 to flip the top bit in each of the first two characters, again
putting an error in the value. Run the program and record the output. Did the
checksum detect this error? (It should -- which is why one's complement
checksums are usually better.)
- Verify that the assembly and C subroutines produce identical outputs for at
least four more-or-less randomly chosen different additional strings.
Part 5: (Bonus)
- Optimize chk_one_asm (still compute the one's complement checksum). You
must use a loop (you may use a conditional branch instruction) and use only an
8-bit sum register (not a 16-bit sum). (Hint: this involves using the carry-out
of the addition.) Put the result in variable "OneSumOpt".
- Record the timing for this optimized chk_one_asm. How much faster is it
than the previous assembly language? (If it isn't faster, then you probably
aren't doing this part right unless you found a very cool optimization for Part
4. Our result takes the same time inside the loop as a two's complement
checksum and minimal overhead outside the loop as well.)
- Record the checksum values with bottom and top bits flipped of the first
two characters as you did in previous parts.
Part A - Questions
- Record the results of your experiments above in the table below:
Routine | Checksum value with no bits flipped | Checksum with two bottom bits flipped | Checksum with two top bits flipped |
Part 2: TwoSumC | | | |
Part 3: OneSumC | | | |
Part 4: OneSumAsm | | | |
Part 5: (bonus) OneSumOpt | | | |
- Use the simulator feature of the CW development environment to obtain
execution times for the various versions of your program to process the data
string containing your last names. Record the values you measure in the table
below. Enter the total time to execute chk_one_asm() (including call and return
overhead) or other similar function depending on the table row being filled in.
Routine | # of Cycles |
Part 2: C two's complement | |
Part 3: C one's complement | |
Part 4: ASM one's complement | |
Part 5: (Bonus) optimized ASM | |
Part A - Demo Checklist: (20 + 4)
- (20 points) Demo your Checksum project to the TA. The TA will ask you
to run the program with a different string, and show the resultant computation
values with various PB combinations pressed. The TA may also ask you to show a
timing calculation with the simulator.
- (Bonus: 4 points) Demo your optimized Checksum project to the TA.
Lab 4 - Part B
Goal:
- To implement multi-precision adds, subtracts, and multiplies in assembly.
- To perform simple peer reviews
Discussion:
Refer to the lecture notes for information on multiprecision
add/subtract/multiply. Refer to the lecture notes for information on reviews.
Procedure:
Part 1:
In Part one you will implement 64-bit add and subtract.
- Plan on your board being wired as described in Part A of Lab 4.
- Create a new assembly project using the
project stationery.
Download the lab_4b_addsub_skeleton.asm file and copy
it to both lab_4b_add_gXX.asm and lab_4b_sub_gXX.asm. Replace the
main.asm file with these files. Note: this means you will need to create a
separate project for add and subtract (or you may use the same project and
add/remove the files so that only one is in use at a time)
- Remove references to the function that you are NOT implementing from your
file.
Implement the add64 and sub64 subroutines (one in each file) to meet the
following requirements:
- add64
- The subroutine shall add the 64-bit data stored in arg1 to
arg2 and store the 64-bit sum in result.
- sub64
- The subroutine shall subtract the 64-bit data stored in arg1 from
arg2 and store the 64-bit difference in result.
Additionally, for both subroutines:
- The subroutine shall preserve all register values and restore them before
returning control to the main loop.
- The subroutine shall use a loop to traverse the argument data and store the
result.
- The subroutine may use 8-bit or 16-bit operations to implement the 64-bit
operation.
- We recommend that you use the IDX2 addressing mode to read and write the
arguments and results, but any method you choose is acceptable so long as it
uses a loop.
- For the demo, the main program shall use the pushbuttons to display the
results, as described in the code comments. But you are NOT required to
implement this for the prelab.
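The requirements above amount to a loop that walks from the least significant byte to the most significant byte while propagating a carry (or borrow). The following C model is a sketch for checking expected results, not the required assembly. It assumes arg1, arg2, and result are 8-byte big-endian buffers (index 0 is the most significant byte), and the names add64_model and sub64_model are hypothetical.

```c
#include <stdint.h>

#define NBYTES 8  /* 64 bits as eight bytes */

/* result = arg1 + arg2. Returns the final carry-out (0 or 1). */
unsigned add64_model(const uint8_t arg1[NBYTES],
                     const uint8_t arg2[NBYTES],
                     uint8_t result[NBYTES]) {
    unsigned carry = 0;
    for (int i = NBYTES - 1; i >= 0; i--) {
        unsigned sum = (unsigned)arg1[i] + arg2[i] + carry;
        result[i] = (uint8_t)sum;   /* keep the low 8 bits */
        carry = sum >> 8;           /* carry into the next byte */
    }
    return carry;
}

/* result = arg2 - arg1 (arg1 is subtracted FROM arg2).
 * Returns the final borrow (0 or 1). */
unsigned sub64_model(const uint8_t arg1[NBYTES],
                     const uint8_t arg2[NBYTES],
                     uint8_t result[NBYTES]) {
    unsigned borrow = 0;
    for (int i = NBYTES - 1; i >= 0; i--) {
        /* Unsigned arithmetic wraps, so bit 8 is set on a borrow. */
        unsigned diff = (unsigned)arg2[i] - arg1[i] - borrow;
        result[i] = (uint8_t)diff;
        borrow = (diff >> 8) & 1u;
    }
    return borrow;
}
```

In assembly, the carry variable corresponds to the C bit of the condition code register, which is why only the first add or subtract in the chain should ignore the incoming carry.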
Part 2:
- DO NOT simulate, run, or use a
debugger on the code you have written before doing the review. We want
you to do the review first so you can try to find bugs in the implementation.
In most cases this saves a lot of time compared to debugging using the
simulator. So, put another way, we expect you will have bugs in your
code going into the review. Finding them more efficiently with help from your
partner is the point of the exercise. In previous years, it was common for
these reviews to turn up 3 to 5 defects with only a few minutes of effort. For
future projects it will be OK to do a preliminary simulate and debug session
before review -- but this project is simple enough that we want to make sure
you have bugs to find in the review, so do the review before debugging.
In this part, you will do a review of each of the multiplication code files
(one generated by each team member). For the review, both team members
should be present. The person whose code is being reviewed is the
developer and the other person is the reviewer. When you do the second
review, these roles will be reversed. Complete the information
below. You must submit a complete writeup for BOTH reviews.
- Developer Name:
- Reviewer Name:
- File Name:
- Date of reviews:
- Length of review (in hours, with understanding that 2 people were involved
during that time):
- Number of (non-blank) lines of code reviewed:
- Lines reviewed per hour:
- Defects found per hour (scaled if review is less than an hour, which is
likely):
- List of defects found: (actually describe each defect in 2 or 3 lines of
text; can be as a text list if drawing boxes is too hard)
Defect # | Line # | Description of Defect |
... | | |
It is understood that line numbers might change as the code is fixed --
don't worry about it and don't go back to fix up line #s if they change.
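The per-hour metrics in the review form are scalings of a count to a one-hour basis. As a small illustration, the hypothetical helper below (not part of any lab file) does that scaling from a duration in minutes. For example, 60 lines reviewed in 20 minutes is 180 lines per hour, and 4 defects found in 15 minutes is 16 defects per hour.

```c
/* Hypothetical helper: scales a count observed over `minutes` to a
 * per-hour rate. Returns 0.0 if the duration is not positive. */
double per_hour(unsigned count, double minutes) {
    if (minutes <= 0.0) {
        return 0.0;
    }
    return count * 60.0 / minutes;
}
```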
Part 3:
For this part, work together to get the demos working for both programs. It is
fine to collaborate on this portion of the lab and help each other with
debugging, etc. Keep the following data as you do this on a per-program basis
(i.e., two sets of information -- one set per program):
- Number of person-hours spent after the review doing debugging (if both of
you work together that is 2 person-hours per hour of elapsed time; if you work
separately then just count each person's individual hours)
- Number of defects found after the review:
- Defects found per hour after the review:
- List the defects found by debugging (after the review is completed):
Defect # | Line # | Description of Defect |
... | | |
Part 4:
Implement the third version of 32-bit x 32-bit -> 64-bit multiply.
- Plan on your board being wired as described in Part A of Lab 4.
- Create a new assembly project using the
project stationery.
Download the lab_4b_skeleton.asm file and
rename it lab_4b_mul3_gXX.asm. Replace the main file in the project with
this file.
- Implement the mul64 version your group did not use for reviews to meet the
following requirements:
- The subroutine shall multiply the 32-bit data stored in arg1 by the 32-bit
data stored in arg2 and store the 64-bit product in result.
- Follow the same restrictions as in Prelab B part 3.
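A C reference model can be handy for checking the 64-bit product your subroutine displays. The sketch below builds the result from four 16 x 16 -> 32 partial products. It is not necessarily any of the three versions described in the pre-lab, and the name mul32x32_model is hypothetical.

```c
#include <stdint.h>

/* Reference model: 32-bit x 32-bit -> 64-bit unsigned multiply built
 * from 16-bit halves. Each partial product fits in 32 bits. */
uint64_t mul32x32_model(uint32_t a, uint32_t b) {
    uint32_t a_hi = a >> 16, a_lo = a & 0xFFFFu;
    uint32_t b_hi = b >> 16, b_lo = b & 0xFFFFu;

    uint32_t p_ll = a_lo * b_lo;  /* weight 2^0  */
    uint32_t p_lh = a_lo * b_hi;  /* weight 2^16 */
    uint32_t p_hl = a_hi * b_lo;  /* weight 2^16 */
    uint32_t p_hh = a_hi * b_hi;  /* weight 2^32 */

    /* Add each partial product at its weight; the 64-bit
     * accumulator absorbs the carries between columns. */
    uint64_t result = p_ll;
    result += (uint64_t)p_lh << 16;
    result += (uint64_t)p_hl << 16;
    result += (uint64_t)p_hh << 32;
    return result;
}
```

In an assembly version the accumulator is several bytes or words in memory, so the carries between columns must be propagated explicitly.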
Part 5 (Bonus - optional):
This section is optional and not that easy. You may do these
exercises to earn extra credit and get better understanding of multiprecision
math. If you are running over 12 hours per week on average for the course, you
should NOT be attempting this section!
Implement, in assembly language, a division of a 64-bit dividend by a 32-bit
divisor that produces a 32-bit quotient and a 32-bit remainder. Use either
restoring or non-restoring division. Save the implementation as lab_4b_div_GXX.asm.
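As a point of comparison for the assembly version, here is a minimal C sketch of restoring division. It assumes unsigned operands and reports an error when the divisor is zero or when the quotient would not fit in 32 bits (that is, when the upper 32 bits of the dividend are not smaller than the divisor). The lab text does not specify this error handling, and the name div64by32_restoring is hypothetical.

```c
#include <stdint.h>

/* Restoring division: 64-bit dividend / 32-bit divisor.
 * Returns 0 on success, -1 on divide-by-zero or quotient overflow
 * (the error convention is an assumption, not from the lab). */
int div64by32_restoring(uint64_t dividend, uint32_t divisor,
                        uint32_t *quotient, uint32_t *remainder) {
    int64_t rem = (int64_t)(dividend >> 32);  /* upper half */
    uint32_t quo = (uint32_t)dividend;        /* lower half */

    if (divisor == 0 || rem >= (int64_t)divisor) {
        return -1;
    }
    for (int i = 0; i < 32; i++) {
        /* Shift the rem:quo pair left by one bit. */
        rem = (rem << 1) | (quo >> 31);
        quo <<= 1;
        /* Trial subtraction. */
        rem -= divisor;
        if (rem < 0) {
            rem += divisor;  /* restore; quotient bit stays 0 */
        } else {
            quo |= 1u;       /* subtraction held; quotient bit is 1 */
        }
    }
    *quotient = quo;
    *remainder = (uint32_t)rem;
    return 0;
}
```

Non-restoring division skips the restore step and instead adds or subtracts the divisor on the next iteration depending on the sign of the partial remainder, with a final correction at the end.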
Part 6 (Bonus - optional):
Perform a review of your third multiplication implementation or your
division implementation. Follow the formats used in Parts 2 and 3.
Part B - Demo Checklist: (35 + (5 or 10) points)
- (15 points) Demo both the multiprecision add and subtract programs to
the TA.
- (20 points) Demo all three multiplication implementations.
- Bonus: (Either 5 points or 10 points) Demo one of the following: restoring
division worth 5 points; OR non-restoring division worth 10 points.
Lab - Hand-in Checklist: (150 + 19 + (5 or 10))
Part A
- (5 points) List any problems you encountered in the lab and pre-lab, and
suggestions for future improvement of this lab. If none, then state so to get
these points.
- (40 points) Submit the entire
project for all parts. Your project should be in a folder called
"lab_4a_gXX". All files necessary to open the project in CodeWarrior
and invoke the simulator must be present to receive full credit.
Code must be fully commented to receive full credit.
- (20 points) Answers to the questions above
- (13 points) Bonus -- provide code and fill in tables for questions for the
optimized assembly version of one's complement checksum.
Part B
- (5 points) List any problems you encountered in the lab and pre-lab, and
suggestions for future improvement of this lab. If none, then state so to get
these points.
- (10 points) Submit a listing of the code for lab_4b_add_gXX.asm and
lab_4b_sub_gXX.asm
- (30 points) Reviews for both lab partners' code.
- (30 points) Corrected and working code for both lab partners,
lab_4b_gXX_andrewID1.asm and lab_4b_gXX_andrewID2.asm. Code must conform
to the coding style sheet to receive
full credit.
- (10 points) Submit lab_4b_mul3_gXX.asm with your 64-bit multiply subroutine.
- (Either 5 or 10 points) Bonus -- Submit only one of the following:
restoring division worth 5 points; OR non-restoring division worth 10 points
- (6 points) Bonus -- Submit a review, including review metrics as well
as development metrics (using the formats for Parts 2 and 3), for the third
implementation of 64-bit multiply or the 64-bit divide you developed. 3
points for each DISTINCT implementation review, up to 6 total points.
Refer to the LAB FAQ for more information on lab
hand-in procedures and file type requirements. You MUST follow these
procedures or we will not accept your submissions.
Hints and Suggestions:
Part A
- Note that the LEDs on the board are active-high, as opposed to the bar
graphs you have been using that are active-low.
- If you have problems implementing loops or iterations, please see a TA for
guidance during office hours.
- Some students find the TFR instruction helpful (if you don't know what it
is, this is a good time to look it up).
- The DBNE instruction can be very helpful for implementing loops, although
for string processing you want to look for the terminating null character.
- Note that managing the bit flips can be a little tricky. We recommend you
copy the string to a temporary string variable, flip bits, then perform the
checksum computation so that the original string value is not corrupted for the
next path through the loop. Trying to flip bits in the original string and
un-flip them when a button is released is just asking for bugs. One neat way to
do this is to have main call a separate routine that does the computations, so
that the string is dynamically initialized each time the routine is called.
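The copy-then-flip approach from the last hint might look like the sketch below. It assumes "bottom bit" means bit 0 and "top bit" means bit 7 of each of the first two characters. The function name make_flipped_copy and its parameters are hypothetical; the checksum routine would then be called on the copy, never on the original string.

```c
#include <stddef.h>
#include <string.h>

#define BOTTOM_BIT 0x01u  /* assumed: bit 0 */
#define TOP_BIT    0x80u  /* assumed: bit 7 */

/* Copies src into dst (including the terminating null) and flips the
 * selected bits in the first two characters of the copy only.
 * src is never modified. Returns 0 on success, -1 if dst is too small. */
int make_flipped_copy(const char *src, char *dst, size_t dst_size,
                      int flip_bottom, int flip_top) {
    size_t len = strlen(src);
    unsigned char mask = 0;

    if (len + 1 > dst_size) {
        return -1;
    }
    memcpy(dst, src, len + 1);

    if (flip_bottom) {
        mask |= BOTTOM_BIT;
    }
    if (flip_top) {
        mask |= TOP_BIT;
    }
    for (size_t i = 0; i < 2 && i < len; i++) {
        dst[i] = (char)((unsigned char)dst[i] ^ mask);
    }
    return 0;
}
```

Because the copy is rebuilt from the untouched original on every pass through the main loop, releasing a button needs no un-flip logic.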
Part B
- Be careful to check if instructions in your loops affect the carry bit!
Especially watch out for using a compare instruction within your loop.
- If you are short on time, do other parts of your weekly work before
attempting the division exercise.
FILES for this lab:
Part A
Part B
Relevant reading:
Also, see the course materials
repository page.
Change notes for 2015:
- 2/3/2015: Changed prelab part A.2 bit reverse to use 16 bit input/output instead of 8 bit. --John
- 2/12/2015: Updated one's complement checksum hints to be more accurate (add one when result greater than or equal to zero, not just greater than). --John