Due Wednesday October 14, 1998
Multilevel Caches
Problem 1:
You have a computer with two levels of cache memory and the following specifications:
CMDLINE: dinero -b32 -i8K -d8K -a1 -ww -An -W8 -B8
CACHE (bytes): blocksize=32, sub-blocksize=0, wordsize=8, Usize=0, Dsize=8192, Isize=8192, bus-width=8.
POLICIES: assoc=1-way, replacement=l, fetch=d(1,0), write=w, allocate=n.
CTRL: debug=0, output=0, skipcount=0, maxcount=10000000, Q=0.

Metrics              Access Type: (totals,fraction)
                     Total     Instrn   Data     Read     Write    Misc
-----------------    ------    ------   ------   ------   ------   ------
Demand Fetches       10000000  7362210  2637790  1870945  766845   0
                     1.0000    0.7362   0.2638   0.1871   0.0767   0.0000
Demand Misses        52206     8466     43740    36764    6976     0
                     0.0052    0.0011   0.0166   0.0196   0.0091   0.0000
Words From Memory    180920
( / Demand Fetches)  0.0181
Words Copied-Back    766845
( / Demand Writes)   1.0000
Total Traffic (words) 947765
( / Demand Fetches)  0.0948
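The summary figures in the dinero output can be cross-checked from the raw counts. The following is a minimal sketch, not part of the original assignment; the variable names are illustrative, and the traffic formulas assume that only instruction and read misses fetch a block (write-through, no write allocation) and that each write sends one word to the next level.

```python
# Counts copied from the dinero output above.
fetches = {"instrn": 7362210, "read": 1870945, "write": 766845}
misses = {"instrn": 8466, "read": 36764, "write": 6976}

block_bytes = 32   # from -b32
word_bytes = 8     # from -W8
words_per_block = block_bytes // word_bytes

total_fetches = sum(fetches.values())
total_misses = sum(misses.values())

# Per-type miss ratio = misses of that type / fetches of that type.
miss_ratio = {kind: misses[kind] / fetches[kind] for kind in fetches}
overall_miss_ratio = total_misses / total_fetches

# Assumption: with no write allocation, write misses do not fetch a block,
# so only instruction and read misses bring words in from memory.
words_from_memory = (misses["instrn"] + misses["read"]) * words_per_block

# Assumption: write-through sends one word to the next level per write.
words_written = fetches["write"]

total_traffic = words_from_memory + words_written

print(total_fetches, total_misses, words_from_memory, total_traffic)
print({kind: round(r, 4) for kind, r in miss_ratio.items()})
```

Under these assumptions the computed totals agree with the "Words From Memory" and "Total Traffic" lines reported by dinero.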
1) What is the available (as opposed to used) sustained bandwidth:
2) How long does an average instruction take to execute (in ns), assuming 1 clock cycle per instruction in the absence of memory hierarchy stalls, no write buffering at the L1 cache level, and 0% L2 miss rate?
3) A design study is performed to examine replacing the L2 cache with a victim cache. Compute a measure of speed for each alternative and indicate which is the faster solution. Assume the performance statistics are:
System Level Effects
Problem 2:
1) A Ph.D. student has snuck onto the course machines to run a long simulation. That task is suspended while a '548 student runs a cache-wiping homework problem, causing all data from the simulation to be expelled from cache. What is the approximate time penalty, in clocks, associated with refilling the caches when the simulation resumes execution? Restated: assuming that the simulation runs to completion after it is restarted, how much longer (in clocks charged to that particular task) will it take to run than if it had not been interrupted?
Parameter | L1 Cache | L2 Cache | L3 Cache |
Organization | split | unified | unified |
Size | 8 KB data + 8 KB instr. | 96 KB | 8 MB |
Associativity | direct mapped | 3-way set | direct mapped |
Blocks per sector | 2 | 2 | 2 |
Words per block | 4 | 4 | 4 |
Write policy | write through | write back | write back |
Write allocation | no | yes | yes |
Hit time | 1 clock | 4 clocks | 12 clocks |
Total miss time | 4 clocks | 12 clocks | 90 clocks |
Local miss ratio | 0.13 (same for D & I) | 0.04 | 0.02 |
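As a starting point for working with the table above, the sketch below converts local miss ratios into global miss ratios and counts the blocks each cache holds. It is an illustration under stated assumptions, not a worked solution: the 8-byte word size is borrowed from the dinero `wordsize=8` line and is not given in the table, and the helper name `blocks_in_cache` is hypothetical.

```python
# Local miss ratios copied from the table above.
local_miss_ratio = {"L1": 0.13, "L2": 0.04, "L3": 0.02}

# Global miss ratio of a level = product of the local miss ratios
# of that level and every level above it.
global_miss_ratio = {}
running = 1.0
for level in ("L1", "L2", "L3"):
    running *= local_miss_ratio[level]
    global_miss_ratio[level] = running


def blocks_in_cache(size_bytes, words_per_block=4, word_bytes=8):
    """Number of blocks a cache holds.

    word_bytes=8 is an assumption; the table gives words per block
    but not the word size.
    """
    return size_bytes // (words_per_block * word_bytes)


KB = 1024
MB = 1024 * KB
cache_blocks = {
    "L1 data": blocks_in_cache(8 * KB),
    "L1 instr": blocks_in_cache(8 * KB),
    "L2": blocks_in_cache(96 * KB),
    "L3": blocks_in_cache(8 * MB),
}

for level, ratio in global_miss_ratio.items():
    print(level, "global miss ratio:", ratio)
print(cache_blocks)
```

The block counts bound how many misses a fully flushed cache can incur while refilling; how many of those blocks the resumed task actually touches again is a separate modeling choice.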