Due November 11, 1998
Problem 1: SCSI Bus Performance
The SCSI protocol consists of several phases for each data request and reply. The table below gives a breakdown of bus activity by phase (measured using a SCSI bus analyser) on a system with a number of fast disks transferring sequential blocks to a single host. Each individual request is for a 64K block of data and, in the multiple-disk cases, requests are issued in round-robin order (disk 1, disk 2, disk 3, disk 1, disk 2, disk 3, disk 1, etc.).
phase | 1 disk | 2 disks | 3 disks |
ARBITRATE | 1% | 1% | 1% |
SELECT | 1% | 1% | 5% |
MESSAGE | 3% | 10% | 12% |
COMMAND | 1% | 1% | 1% |
DATA | 27% | 55% | 79% |
STATUS | 1% | 1% | 1% |
BUS FREE | 66% | 31% | 1% |
a) Given a maximum bus throughput of 20 MB/s with 64K requests, what is the data transfer rate into the host in each case (with 1 disk, with 2 disks, with 3 disks)?
b) What would the transfer rate be if we added a fourth disk (remember that a single SCSI bus can hold up to 7 devices)?
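As a starting point for part a), the arithmetic can be sketched in a few lines of Python. This is only a sketch under one stated assumption that is not given in the problem: that the rate into the host equals the fraction of bus time spent in the DATA phase multiplied by the 20 MB/s peak. The names `PEAK_MB_PER_S`, `DATA_PERCENT`, and `data_rate_mb_per_s` are hypothetical, chosen for illustration.

```python
# Assumption (not stated in the problem): host transfer rate is the
# DATA-phase share of bus time multiplied by the peak bus throughput.
PEAK_MB_PER_S = 20.0

# DATA-phase percentages copied from the 64K table, keyed by disk count.
DATA_PERCENT = {1: 27, 2: 55, 3: 79}


def data_rate_mb_per_s(data_percent, peak=PEAK_MB_PER_S):
    """Scale the peak throughput by the share of time spent moving data."""
    return peak * data_percent / 100.0


if __name__ == "__main__":
    for disks, percent in DATA_PERCENT.items():
        rate = data_rate_mb_per_s(percent)
        print(f"{disks} disk(s): {rate:.1f} MB/s")
```

Part b) is not covered by this sketch, since it depends on how much bus time remains free once three disks are active.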
If we reduce the request size from 64K to 8K, we measure the following bus utilization:
phase | 1 disk (8K) |
ARBITRATE | 1% |
SELECT | 15% |
MESSAGE | 14% |
COMMAND | 1% |
DATA | 27% |
STATUS | 1% |
BUS FREE | 41% |
c) Assuming the same sequential workload, but with 8K requests, does it make sense for me to add a second disk to this system? A third disk?
d) What if I changed my workload to random requests (where prefetching would no longer work, and seek time becomes an issue)? Would I benefit from a second disk? A third disk?
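When reasoning about parts c) and d), it may help to tally how much of the bus is busy and how much of that busy time is protocol overhead rather than data. The sketch below does only that bookkeeping for the 8K table. It does not answer either question, and the helper name `summarize` and the dictionary `PHASES_8K` are hypothetical, introduced for illustration.

```python
# Phase percentages copied from the 1 disk (8K) table.
PHASES_8K = {
    "ARBITRATE": 1,
    "SELECT": 15,
    "MESSAGE": 14,
    "COMMAND": 1,
    "DATA": 27,
    "STATUS": 1,
    "BUS FREE": 41,
}


def summarize(phases):
    """Return (busy %, overhead %) where overhead is busy time minus DATA."""
    busy = 100 - phases["BUS FREE"]
    overhead = busy - phases["DATA"]
    return busy, overhead


if __name__ == "__main__":
    busy, overhead = summarize(PHASES_8K)
    print(f"bus busy: {busy}%  non-data overhead: {overhead}%  "
          f"free: {PHASES_8K['BUS FREE']}%")
```

Comparing the busy share of one disk against the remaining free share is one way to judge whether a second disk's traffic could fit, under the assumption that each added disk costs about as much bus time as the first.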
Problem 2: Vector Architecture
A particular vector computer design has the characteristics and assumptions given below. Some of the assumptions, such as ignoring bank conflicts, are made to simplify the problem and are obviously not realistic.
Operation | Latency (clock ticks) |
Vector instruction dispatch | 1 |
VAG setup | 1 |
Address reaches memory bank | 3 |
DRAM read latency (ignore time to complete cycle) | 4 |
Data returns from memory bank after access via bus | 3 |
VDS delay | 1 |
Adder delay (starting when both operands available) | 4 |
VDS delay | 1 |
Result sent to memory bank via bus (address & data) | 3 |
Data written into DRAM (ignore time to complete cycle) | 4 |
A 4-element vector addition takes 4 clock cycles to issue ("vload", "vload", "vadd", "vstore"). What is the elapsed time for a 4-element vector addition in clock ticks? Provide a spreadsheet printout or other table diagram illustrating how you got this solution (similar to the spreadsheets in lecture 16, but using columns and latencies appropriate to the table above).
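A spreadsheet-style timeline can also be built programmatically. The sketch below is a minimal illustration, not a solution: it lays out only the first element of a single "vload", and it assumes the listed stages run strictly one after another with no overlap. The full problem additionally requires overlapping four instructions and four elements, which this sketch does not model. The names `LOAD_PATH` and `timeline` are hypothetical.

```python
# Stage latencies for one vload element, copied from the table above.
LOAD_PATH = [
    ("Vector instruction dispatch", 1),
    ("VAG setup", 1),
    ("Address reaches memory bank", 3),
    ("DRAM read latency", 4),
    ("Data returns from memory bank", 3),
    ("VDS delay", 1),
]


def timeline(stages, start=0):
    """Return (stage, start tick, end tick) rows for back-to-back stages."""
    rows = []
    tick = start
    for name, latency in stages:
        rows.append((name, tick, tick + latency))
        tick += latency
    return rows


if __name__ == "__main__":
    for name, begin, end in timeline(LOAD_PATH):
        print(f"{name:<32} {begin:>3} -> {end:>3}")
```

Passing a different `start` tick shifts the whole row set, which is one way to place later instructions or elements in additional columns of the table.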
18-548/15-548 home page.