CMU 18-100 S'15 L20-1 © 2015 J. C. Hoe

# 18-100 Lecture 20: AVR Programming, Continued

James C. Hoe
Dept of ECE, CMU
April 2, 2015

- Today's Goal: You will all be ace AVR hackers!
- Announcements:
  - Midterm 2 can be picked up in lab and at the HUB
  - Office Hours: Wed 12:30~2:30
- Handouts:
  - HW8 (on Blackboard, due next Thursday)
  - Lab10 (on Blackboard)

# **AVR Data Types**

- The 8-bit data word (in register or memory) can represent
  - a collection of 8 bits
  - an "unsigned" integer value between 0 and 255
  - a "signed" integer value between -128 and 127
- Questions
  - how do you represent integer values as bits?
  - what if I want to represent a value greater than 255?
  - why -128 but only 127 in option 3?
  - if I showed you an 8-bit data word in a register, how do you know which interpretation to apply?
  - what about fractions and real numbers?

# Representing Unsigned Integers

- How to represent 0~255 with 8 bits
  - 8 bits have 2<sup>8</sup> = 256 different patterns of eight 1s and 0s
  - any one-to-one assignment of integer values to patterns is "correct"
- A conventional mapping based on counting is convenient in most cases

# **Unsigned Integer Representation**

- In general, let b<sub>n-1</sub>b<sub>n-2</sub>...b<sub>2</sub>b<sub>1</sub>b<sub>0</sub> represent an n-bit unsigned integer
  - its value is $\sum_{i=0}^{n-1} 2^{i} b_{i}$, where $2^{i}$ is the weight of the i'th digit
  - a finite representation between 0 and 2<sup>n</sup>-1
  - e.g., $1011_2 = 8_{10} + 2_{10} + 1_{10} = 11_{10}$
- Often written in hexadecimal for compactness
  - to convert, starting from the right, map 4 binary digits at a time into a corresponding hex digit {0~9,a~f}; and vice versa
  - e.g., 1010\_1011<sub>two</sub> = ab<sub>hex</sub>
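The weighted-sum formula and the "4 binary digits per hex digit" rule above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the helper names `unsigned_value` and `hex_digit` are made up for this sketch and are not part of any AVR toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of an n-bit unsigned integer written as a string of '0'/'1'
 * characters, most significant bit first: the sum of 2^i * b_i.
 * (Hypothetical helper for illustration.) */
uint32_t unsigned_value(const char *bits)
{
    size_t n = strlen(bits);
    assert(n <= 32);                      /* result must fit in 32 bits */
    uint32_t value = 0;
    for (size_t k = 0; k < n; k++) {
        assert(bits[k] == '0' || bits[k] == '1');
        /* bits[k] is b_i with i = n-1-k, so its weight is 2^(n-1-k) */
        if (bits[k] == '1')
            value += (uint32_t)1 << (n - 1 - k);
    }
    return value;
}

/* Hex digit for one group of 4 binary digits (a value 0..15).
 * (Hypothetical helper for illustration.) */
char hex_digit(uint8_t nibble)
{
    assert(nibble < 16);
    return "0123456789abcdef"[nibble];
}
```

For instance, `unsigned_value("1011")` evaluates to 11, and applying `hex_digit` to the two 4-bit groups of 1010\_1011 (that is, to `unsigned_value("1010")` and `unsigned_value("1011")`) yields `'a'` and `'b'`.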
# **Representing Signed Integers**

- How to represent -128~127 with 8 bits
  - any one-to-one assignment of integer values to the 256 patterns is "correct"
  - probably makes sense that $0_{10}$ to $127_{10}$ should be $00000000_2$ to $01111111_2$
- What about -1 to -128?
  - The logical extension is to count down from 0,

$$
\begin{array}{lll}
-1_{10} = 11111111_2; & -2_{10} = 11111110_2; & -3_{10} = 11111101_2; \\
-4_{10} = 11111100_2; & -5_{10} = 11111011_2; & -6_{10} = 11111010_2; \\
\vdots & & \\
-126_{10} = 10000010_2; & -127_{10} = 10000001_2; & -128_{10} = 10000000_2
\end{array}
$$

- This scheme is called "2's complement"

# 2's-Complement Number Representation

- Let b<sub>n-1</sub>b<sub>n-2</sub>...b<sub>2</sub>b<sub>1</sub>b<sub>0</sub> represent an n-bit signed integer
  - its value is

    $$-2^{n-1}b_{n-1} + \sum_{i=0}^{n-2} 2^i b_i$$

  - a finite representation between $-2^{n-1}$ and $2^{n-1}-1$
  - e.g., assuming 4-bit 2's-complement, $1011_2 = -8_{10} + 2_{10} + 1_{10} = -5_{10}$
- To negate a 2's-complement number
  - add 1 to the bit-wise complement
  - assume 4-bit 2's-complement

```
(-b'1011) = b'0100 + 1 = b'0101 = 5
(-b'0101) = b'1010 + 1 = b'1011 = -5
(-b'1111) = b'0000 + 1 = b'0001 = 1
(-b'0000) = b'1111 + 1 = b'0000 = 0
```

- 'V', 'N', 'Z', 'C' are arithmetic flags automatically updated after each ALU-class instruction
  - Z: set if the last result was zero
  - N: set if the last result was negative (2's complement)
  - V: set if the last op caused an overflow (2's complement)
  - C: set if the last op caused a carry (unsigned)

(Page 11, Atmel 8-bit AVR ATmega8 Databook)
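To make the 2's-complement value formula, the negation rule, and the four flags concrete, here is a minimal C sketch for the 8-bit case. It is a model under stated assumptions, not the AVR hardware definition: the function names and the `FLAG_*` bit masks are hypothetical choices made for this illustration, and only addition is modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Value of an 8-bit word read as 2's complement:
 * -2^7 * b_7 + sum over i = 0..6 of 2^i * b_i. */
int twos_complement_value(uint8_t word)
{
    int value = word & 0x7f;          /* b_6..b_0 carry positive weights */
    if (word & 0x80)
        value -= 128;                 /* b_7 carries weight -2^7 */
    assert(value >= -128 && value <= 127);
    return value;
}

/* Negate: add 1 to the bit-wise complement (wraps modulo 2^8). */
uint8_t negate8(uint8_t word)
{
    return (uint8_t)(~word + 1u);
}

/* Illustrative bit masks for packing the four flags into one value. */
enum { FLAG_C = 1, FLAG_Z = 2, FLAG_N = 4, FLAG_V = 8 };

/* Flags produced by the 8-bit addition a + b, following the bullet
 * descriptions above (a sketch, not a cycle-accurate model). */
unsigned add8_flags(uint8_t a, uint8_t b)
{
    unsigned wide = (unsigned)a + b;      /* 9-bit sum */
    uint8_t result = (uint8_t)wide;       /* what the register would hold */
    unsigned flags = 0;
    if (result == 0)   flags |= FLAG_Z;   /* result was zero */
    if (result & 0x80) flags |= FLAG_N;   /* negative as 2's complement */
    if (wide > 0xff)   flags |= FLAG_C;   /* carry out of bit 7 (unsigned) */
    /* overflow: a and b share a sign bit but the result's sign differs */
    if (~(a ^ b) & (a ^ result) & 0x80) flags |= FLAG_V;
    return flags;
}
```

For example, `add8_flags(0x7f, 0x01)` reports N and V (127 + 1 does not fit in a signed byte), while `add8_flags(0xff, 0x01)` reports Z and C (255 + 1 wraps to 0 as an unsigned byte). The same bit pattern produces both answers; the programmer picks which flags to consult depending on the interpretation.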
# What about larger integers

- A 16-bit integer can be held using two 8-bit registers
- Add/sub need to be emulated using multiple native 8-bit operations
- Suppose 16-bit A's upper/lower bytes are in r17/r16 and 16-bit B's upper/lower bytes are in r19/r18

```
add r16, r18 ; lower bytes; sets C flag if carry
adc r17, r19 ; upper bytes; incorporates C flag in sum
```

- Extensible to larger integers but commensurately more expensive

## **General Instruction Classes**

- Arithmetic and logical operations
  - fetch operands from specified locations
  - compute a result as a function of the operands
  - store result to a specified location
  - update PC to the next sequential instruction
- Data movement operations
  - fetch operands from specified locations
  - store operand values to specified locations
  - update PC to the next sequential instruction
- Control flow operations
  - fetch operands from specified locations
  - compute a branch condition and a target address
  - if "branch condition is true" then PC ← target address else PC ← next seq. instruction

# Store Instruction (Absolute Addr)

## STS - Store Direct to Data Space

### Description:

Stores one byte from a Register to the data space. A 16-bit address must be supplied. Memory access is limited to the current data segment of 64K bytes.

| | Operation | Syntax | Operands | Program Counter |
|-----|---------------------|-------------|------------------------------------------|------------------------|
| (i) | $(k) \leftarrow Rr$ | `STS k, Rr` | $0 \leq r \leq 31,\ 0 \leq k \leq 65535$ | $PC \leftarrow PC + 2$ |

### 32-bit Opcode:

| 1001 | 001d | dddd | 0000 |
|------|------|------|------|
| kkkk | kkkk | kkkk | kkkk |

### Note:

- 32-bit instruction with 16-bit immediate specifying an "absolute address"
- Recall memory SRAM is 1K words

(Page 150, Atmel 8-bit AVR Instruction Set Manual)

# **Assembly Programming 301**

- E.g., High-level Code

  $$A[8] = h + A[0]$$

  where **A** is an array of integers (1-byte each here)
- Assembly Code
  - suppose A is at location 0x100; h is r<sub>h</sub>
  - suppose r<sub>temp</sub> is a free register

```
lds r_temp, 0x100   ; r_temp = A[0]
add r_temp, r_h     ; r_temp = h + A[0]
sts 0x108, r_temp   ; A[8] = r_temp
```

# **Assembly Programming 302**

- High-level Code, where A is an array of integers (1-byte each)

```
sum = A[0]+A[1]+A[2]+A[3];      /* case 1 */

/* vs. */

sum = 0;                        /* case 2 */
for (i=0; i < 100; i=i+1)
    sum = sum + A[i];
```

- Assembly Code
  - suppose A is at location 0x100; sum and i are in r<sub>sum</sub>, r<sub>i</sub>
  - suppose r<sub>temp</sub> is a free register
- Case 1 needs only fixed addresses, but in case 2 the address of A[i] changes on every iteration, while `lds` encodes its address in the instruction itself. Two ways out:
  - "self-modifying code": load the instruction word; increment by 1; store it back
  - or, allow addresses to come from a data register where they can be manipulated like data

# Load Instruction (Indirect Addr)

## LD – Load Indirect from Data Space to Register using Index X

### Description:

Loads one byte indirect from the data space to a register. The data location is pointed to by the X (16 bits) Pointer Register in the Register File. Memory access is limited to the current data segment of 64K bytes.
| | Operation | Comment | Syntax | Operands | Program Counter |
|-----|---------------------|--------------|------------|--------------------|------------------------|
| (i) | $Rd \leftarrow (X)$ | X: Unchanged | `LD Rd, X` | $0 \leq d \leq 31$ | $PC \leftarrow PC + 1$ |

### 16-bit Opcode:

`1001 000d dddd 1100`

### Note:

- X is R26 (low byte) and R27 (high byte) viewed together as a 16-bit address
- LD supports two variants that perform arithmetic on X: post-increment `LD Rd, X+` and pre-decrement `LD Rd, -X`
- Also works with Y (R28, R29) and Z (R30, R31)

(Page 87, Atmel 8-bit AVR Instruction Set Manual)

# Store Instruction (Indirect Addr)

## ST - Store Indirect From Register to Data Space using Index X

### Description:

Stores one byte indirect from a register to data space. The data location is pointed to by the X (16 bits) Pointer Register in the Register File. Memory access is limited to the current data segment of 64K bytes.

| | Operation | Syntax | Operands |
|-----|---------------------|------------|--------------------|
| (i) | $(X) \leftarrow Rr$ | `ST X, Rr` | $0 \leq r \leq 31$ |

### 16-bit Opcode:

`1001 001r rrrr 1100`

### Note:

- Like LD, supports two variants that perform arithmetic on X: post-increment `ST X+, Rr` and pre-decrement `ST -X, Rr`
- Also works with Y (R28, R29) and Z (R30, R31)

(Page 144, Atmel 8-bit AVR Instruction Set Manual)

# **Assembly Programming 303**

- E.g., High-level Code: `sum=0; for(i=0; i < 100; i=i+1) sum = sum + A[i];` where A is an array of integers (1-byte each)
- Assembly Code
  - suppose A is at location 0x100; sum and i are in r<sub>sum</sub>, r<sub>i</sub>

```
      ldi r26, 0x00      ; X (r27:r26) = 0x0100, the address of A[0]
      ldi r27, 0x01
      ldi r_sum, 0x0     ; sum = 0
      ldi r_i, 0x0       ; i = 0
test: cpi r_i, 100       ; compare i against 100
      brpl done          ; exit once i - 100 is no longer negative
      ld r_temp, X       ; r_temp = A[i]
      add r_sum, r_temp  ; sum = sum + A[i]
      inc r_i            ; i = i + 1
      adiw r26, 1        ; advance X to the next element
      rjmp test
done:
```

# **Assembly Programming 304**

- High-level Code: `for(i=0; i < 100; i=i+1) sum = sum + A[i];`
- Optimized Assembly Code

```
      ldi r26, lo8(0x0100)
      ldi r27, hi8(0x0100)
      ldi r_sum, 0x0
      ldi r_upper, hi8(0x0164) ; 0x0164 = addr of A[100]
loop: ld r_temp, X+            ; auto-increment X
      add r_sum, r_temp
      cpi r26, lo8(0x0164)     ; i never calculated
      cpc r27, r_upper         ;   explicitly
      brne loop                ; transformed while loop
```

## Is it better? By how much?

"Literal Version"

| Instruction | Cycles |
|----------------------|--------|
| `ldi r26, 0x0` | 1 |
| `ldi r27, 0x1` | 1 |
| `ldi r_sum, 0x0` | 1 |
| `ldi r_i, 0x0` | 1 |
| `test: cpi r_i, 100` | 1 |
| `brpl done` | 1 |
| `ld r_temp, X` | 1 |
| `add r_sum, r_temp` | 1 |
| `inc r_i` | 1 |
| `adiw r26, 1` | 2 |
| `rjmp test` | 2 |
| `done:` | |

- Basic metrics of goodness
  - static inst count = 11 (how many you see)
  - dynamic inst count = 4+100x7+2 = 706 (how many executed)
  - cycles = 4+100x9+3 = 907
- Atmel ISA manual specifies effective delay in cycles for the instructions (simulator tells you too)
  - note: brpl=2 cyc if taken

## Is it better? By how much?

"Hacker Version"

| Instruction | Cycles |
|---------------------------|--------|
| `ldi r26, lo8(0x100)` | 1 |
| `ldi r27, hi8(0x100)` | 1 |
| `ldi r_sum, 0x0` | 1 |
| `ldi r_upper, hi8(0x164)` | 1 |
| `loop: ld r_temp, X+` | 2 |
| `add r_sum, r_temp` | 1 |
| `cpi r26, lo8(0x164)` | 1 |
| `cpc r27, r_upper` | 1 |
| `brne loop` | 2 |

- static inst count = 9: ~20% reduction
- dynamic inst count = 4+100x5 = 504: ~30% reduction
- cycles = 4+100x7-1 = 703: ~22% reduction
  - note: brne=1 cyc if not taken
- FYI, a good compiler will be closer to Hacker than Literal

## To Wrap up

- To be a hacker, you also need to know
  - subroutine calls, in particular recursive calls
  - exception handling
  - I/O
  - how to hand optimize code

  Feel free to read the AVR documents on Blackboard
- Big picture to keep in mind
  - you will see assembly programming in much greater detail in 213/240/34x/447 etc.
  - most of you will not code in assembly for a living; this is more about understanding the underlying concepts
  - once you learn one ISA you can learn the rest
  - some ISAs are uglier than others; AVR is not a pretty one.

# **Terminologies**

- Instruction Set Architecture
  - the machine behavior as observable and controllable by the programmer
- Instruction Set
  - the set of commands understood by the computer
- Machine Code
  - instructions encoded in binary format
  - directly consumable by the hardware
- Assembly Code
  - instructions expressed in "textual" format, e.g., `add r1, r2`
  - converted to machine code by an assembler
  - one-to-one correspondence with machine code (mostly true: compound instructions, address labels ...)