### 18-447 Lecture 26: Interconnects

James C. Hoe Department of ECE Carnegie Mellon University

18-447-S20-L26-S1, James C. Hoe, CMU/ECE/CALCM, ©2020

### Housekeeping

- Your goal today
  - get an overview of parallel processing interconnect topics—whether it is on-a-chip or around-the-world
- Notices
  - HW 5 past due, Lab 4 due Friday 5/1
  - Midterm 3, Thursday, 5/7, 5:30pm~6:25pm
- Readings
  - P&H Ch 6
  - The CONNECT Network-on-Chip Generator, 2015 (optional)

### **Connecting Things "Systematically"**

#### **Broadcast Bus**

(Figure: nodes 0 to 7 attached to a shared bus.)

- Simple and cheap
- Everyone sees everyone else's transactions (good for ordering and cache coherence)
- But
  - bandwidth cannot scale with system size, N
  - latency suffers terribly under load
  - electrically challenging as speed and N grow

Physical extent by itself is not necessarily an issue, e.g., IEEE 802.3 CSMA/CD and ALOHAnet

#### **Other Extreme: All-to-All Point-to-Point**







- Concurrent sends to non-conflicting destinations
- Still expensive to scale: O(N) wires but O(N<sup>2</sup>) crosspoints (Xs)

#### **Multistage Circuit Switched**



- More restrictions on concurrent Tx-Rx pairs
- More scalable, e.g., O(N logN) cost for Butterfly
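The scaling gap between a fully connected switch and a butterfly can be made concrete with a rough count. This is only a sketch under stated assumptions: the butterfly is built from 2x2 switch elements, N is a power of two, and each 2x2 element is counted as 4 crosspoints. The function names are illustrative, not from the lecture.

```python
def crossbar_crosspoints(n: int) -> int:
    """Crosspoints in an n-by-n fully connected switch: O(N^2)."""
    return n * n


def butterfly_switches(n: int) -> int:
    """2x2 switch elements in an n-endpoint butterfly: (N/2) * log2(N)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    stages = n.bit_length() - 1  # exact log2(n) for a power of two
    return (n // 2) * stages


if __name__ == "__main__":
    for n in (8, 64, 1024):
        # Assumption: one 2x2 switch element costs 4 crosspoints.
        print(n, crossbar_crosspoints(n), 4 * butterfly_switches(n))
```

The price of the lower cost is the restriction noted above: not every set of Tx-Rx pairs can be connected at the same time.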

#### **Packet Switched**



- Packetized send and forget operation
- Packets "hop" from router to router, pending availability of the next-required switch and buffer

### From a Distance: Performance Characteristics







### **Test Traffic Patterns**

- Ideally, know the traffic and perf. requirement
- If not, resort to "test traffic patterns"
  - capture average, best, worst case scenarios
  - stress and highlight hotspots and weaknesses
  - like "benchmarks" for CPUs
- Random: non/uniform, {all-to-all, 1-to-all, all-to-1}
- Bit permutations
  - each source has 1 destination
  - dest ID is a bit permutation of source ID
  - e.g. transpose, shuffle, complement, reverse, …
- Other synthetic: tornado, nearest neighbor, ...
- Playback of real/synthetic workload traces
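The bit-permutation patterns can be sketched as small functions on a b-bit source ID. The definitions below follow one common textbook convention (shuffle as a one-bit left rotate, transpose as a swap of the upper and lower halves of the ID); other sources may define them slightly differently, so treat the exact bit ordering as an assumption.

```python
def complement(src: int, b: int) -> int:
    """Flip every bit of the b-bit source ID."""
    return ~src & ((1 << b) - 1)


def reverse(src: int, b: int) -> int:
    """Reverse the order of the b bits."""
    return int(format(src, f"0{b}b")[::-1], 2)


def shuffle(src: int, b: int) -> int:
    """Rotate the b-bit ID left by one position."""
    mask = (1 << b) - 1
    return ((src << 1) | (src >> (b - 1))) & mask


def transpose(src: int, b: int) -> int:
    """Swap the upper and lower halves of the ID (b assumed even)."""
    half = b // 2
    mask = (1 << b) - 1
    return ((src << half) | (src >> half)) & mask


if __name__ == "__main__":
    b = 4
    for name, fn in [("complement", complement), ("reverse", reverse),
                     ("shuffle", shuffle), ("transpose", transpose)]:
        print(name, [fn(s, b) for s in range(1 << b)])
```

Because each function is a permutation of the ID space, every source has exactly one destination and every destination receives from exactly one source.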

#### **Load-Delay Curve**



### A Little Closer Now: Different Topologies to Meet Different Requirements

#### **Unidirectional Ring**



- Simplest topology and implementation
  - O(N) cost
  - O(1) worst-case bisection BW (left-right halves), but O(N) best-case bisection BW (odd-even halves)
  - N/2 average hops; latency depends on utilization
- Simplicity allows very high-freq router and link



- Bi-directional links; travel left or right to go from src to dest; N/3 average hops
- "Torus" wraps around nodes 0 and (N-1) for N/4 avg hops; physically interleaved to avoid long links
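The average hop counts quoted for the ring, the bidirectional links without wrap-around, and the wrapped-around version can be checked by brute force. The sketch below assumes uniform traffic over all ordered source/destination pairs with src != dst; the helper names are illustrative.

```python
from itertools import permutations


def avg_hops(n, hop_fn):
    """Average hop count over all ordered (src, dst) pairs with src != dst."""
    pairs = list(permutations(range(n), 2))
    return sum(hop_fn(s, d, n) for s, d in pairs) / len(pairs)


def uni_ring(s, d, n):
    """Unidirectional ring: can only travel in one direction."""
    return (d - s) % n


def linear_array(s, d, n):
    """Bidirectional links, no wrap-around between nodes 0 and n-1."""
    return abs(d - s)


def bi_ring(s, d, n):
    """Bidirectional links with wrap-around: take the shorter way."""
    k = (d - s) % n
    return min(k, n - k)


if __name__ == "__main__":
    n = 64
    print("unidirectional ring:", avg_hops(n, uni_ring), "vs N/2 =", n / 2)
    print("no wrap-around:     ", avg_hops(n, linear_array), "vs N/3 =", n / 3)
    print("with wrap-around:   ", avg_hops(n, bi_ring), "vs N/4 =", n / 4)
```

The N/3 and N/4 figures are approximations that become tight as N grows; the exact averages differ by a small constant factor for finite N.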





- 2D layout scales easily as system-area network or network-on-chip; O(N<sup>0.5</sup>) bisection bandwidth
- Dimension-order routing: first route to the destination column in the fewest hops, then route along the 2<sup>nd</sup> dimension
- Generalizable to higher dimensional mesh networks
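Routing one dimension at a time on a 2D mesh can be sketched as follows. Nodes are assumed to be addressed as hypothetical (col, row) tuples, and the function returns every node visited, including the source.

```python
def xy_route(src, dst):
    """Route along the first dimension (column), then the second (row)."""
    x, y = src
    dx, dy = dst
    path = [(x, y)]
    while x != dx:               # first dimension: reach the destination column
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:               # second dimension: reach the destination row
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

The hop count is the Manhattan distance between source and destination, and the fixed dimension order makes the route deterministic.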

#### Higher Dimensional Topologies: e.g., Butterfly & Hypercube



**5D Hypercube** 



- Fewer hops; higher bisection bandwidth
- Hard to physically place wires in high dimensions
- Hypercube switch complexity grows as log(N)
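In a hypercube, two nodes are neighbors when their IDs differ in exactly one bit, so the minimum hop count is the Hamming distance between source and destination. The sketch below uses e-cube (lowest-bit-first) routing, a common scheme that is assumed here for illustration rather than taken from the slides.

```python
def ecube_route(src: int, dst: int):
    """Correct differing address bits one dimension at a time, lowest first."""
    path = [src]
    cur = src
    diff = src ^ dst             # set bits mark dimensions still to traverse
    bit = 0
    while diff:
        if diff & 1:
            cur ^= 1 << bit      # hop across dimension `bit`
            path.append(cur)
        diff >>= 1
        bit += 1
    return path


if __name__ == "__main__":
    # 5D hypercube: 32 nodes, 5 links per node, at most 5 hops.
    print(ecube_route(0b00000, 0b10110))
```

Each node needs one link per address bit, which is where the log(N) switch complexity comes from.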



- Like a tree, 2log(n) hops for a neighborhood of n nodes; 2log(N) worst-case hops across a system
- Unlike a simple tree, fat-tree adds an alternate uproute at each router at each level: O(N) bisection BW
- Random-up, deterministic-down routing
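Random-up, deterministic-down routing can be sketched for a simplified model: a binary fat-tree whose leaves carry IDs 0 to N-1, with 2 up-links per router. Both the binary radix and the 2 up-links are assumptions made for illustration. The packet climbs until it reaches a level that covers both source and destination, choosing any up-link on the way, then descends along the path dictated by the destination ID bits.

```python
import random


def fat_tree_route(src: int, dst: int, rng=random):
    """Return (up_choices, down_ports) for a packet from leaf src to leaf dst."""
    # Levels to climb: height of the smallest subtree holding both leaves.
    levels = (src ^ dst).bit_length()
    # Going up, any up-link works; pick one at random at each level.
    up_choices = [rng.randrange(2) for _ in range(levels)]
    # Going down, the destination ID bits select the child, high bit first.
    down_ports = [(dst >> i) & 1 for i in reversed(range(levels))]
    return up_choices, down_ports


if __name__ == "__main__":
    up, down = fat_tree_route(0b000, 0b101)
    print("up-link choices:", up)
    print("down ports:     ", down)
    print("total hops:     ", len(up) + len(down))
```

In this model the total hop count is twice the number of levels climbed, matching the 2log(n) figure for a neighborhood of n nodes.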

### Of all things, why a lowly ring?



[https://software.intel.com/en-us/articles/intel-xeon-processorscalable-family-technical-overview]

#### **Traffic, Scale & Cost Dictates**



### Up Close and Personal: Packets and Routers

#### **Network Packets**

#### **CM-5** Packets



#### **Ethernet Packets**



- Header
  - dest ID or route bits
  - src ID, priority, packet type, etc.
- Data payload
  - large vs. small
  - fixed vs. variable



- Checksum
  - redundancy coding (e.g., CRC)
  - in most cases used only for error detection, not correction
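A checksum trailer used for detection only can be sketched with Python's standard `zlib.crc32`. The packet layout here (payload followed by a 4-byte big-endian CRC) is a made-up format for illustration; it is not the exact CM-5 or Ethernet framing.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload as a trailer."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_packet(packet: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the trailer."""
    payload, trailer = packet[:-4], packet[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer


if __name__ == "__main__":
    pkt = make_packet(b"hello, network")
    corrupted = bytes([pkt[0] ^ 0x01]) + pkt[1:]   # flip one payload bit
    print(check_packet(pkt), check_packet(corrupted))
```

The receiver learns that the packet is damaged but not which bit flipped, so recovery has to come from somewhere else, for example a retransmission.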

#### **A Basic Router**



- Packet enters on an Rx-link and chooses a Tx-link to exit
  - route table maps dest-ID to Tx-link; OR
  - a fixed fxn of dest-ID or route-bits; OR
  - adaptive for congestion or fault
- Packets wait in buffer until
  - next router has buffer space; AND
  - Tx-link/crossbar is free
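The table-lookup variant of the router can be sketched as below. The `Router` class, the dictionary packet format, and the two callback parameters are hypothetical names chosen for illustration; a real router makes this decision in hardware every cycle.

```python
from collections import deque


class Router:
    def __init__(self, route_table):
        self.route_table = route_table   # maps dest ID -> Tx-link name
        self.buffer = deque()            # packets waiting to leave

    def receive(self, packet):
        """Packet arrives on an Rx-link and is buffered."""
        self.buffer.append(packet)

    def try_forward(self, downstream_has_space, tx_link_free):
        """Forward the head packet only if both conditions hold."""
        if not self.buffer:
            return None
        packet = self.buffer[0]
        link = self.route_table[packet["dest"]]
        if downstream_has_space(link) and tx_link_free(link):
            self.buffer.popleft()
            return link, packet
        return None                      # head packet keeps waiting


if __name__ == "__main__":
    r = Router({3: "east", 5: "north"})
    r.receive({"dest": 3, "data": "payload"})
    print(r.try_forward(lambda link: False, lambda link: True))  # blocked
    print(r.try_forward(lambda link: True, lambda link: True))   # forwarded
```

This sketch uses a single FIFO, so a blocked head packet also stalls the packets behind it even if their Tx-links are free.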

#### **Packets vs. Flits**



- A "packet" is made up of 1 or more fixed-size "flits"
  - routing is done per packet
  - flow control is done per flit
- Credit-based flow control
  - Tx logic holds credits for the downstream Rx buffer
  - Tx logic deducts 1 credit when sending 1 flit; stops when out of credits
  - Rx logic returns a credit token when a flit advances out of its buffer
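The credit mechanism can be sketched as a small model of one link. It assumes credits return instantly (a real link has a round-trip delay on the credit wire), and the `CreditLink` class is a hypothetical name for illustration.

```python
from collections import deque


class CreditLink:
    def __init__(self, rx_buffer_slots: int):
        self.credits = rx_buffer_slots   # Tx side: one credit per free Rx slot
        self.rx_buffer = deque()         # Rx side: flits waiting downstream

    def send(self, flit) -> bool:
        """Tx sends one flit if it has a credit; otherwise it must stall."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.rx_buffer.append(flit)
        return True

    def rx_advance(self):
        """A flit leaves the Rx buffer, and one credit returns to Tx."""
        flit = self.rx_buffer.popleft()
        self.credits += 1
        return flit


if __name__ == "__main__":
    link = CreditLink(rx_buffer_slots=2)
    print([link.send(f) for f in ("f0", "f1", "f2")])  # third send stalls
    link.rx_advance()                                  # frees one slot
    print(link.send("f2"))                             # now succeeds
```

Since Tx never sends without a credit, the Rx buffer cannot overflow and no flit is ever dropped for lack of space.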

#### **Virtual Networks**



- Time-multiplex same physical links over multiple sets of packet buffers
- Effectively multiple independent networks
  - to provide different priority packet classes
  - to get around blockage
  - to avoid deadlocks



| Parameter           | Value                     |
|---------------------|---------------------------|
| Topology            | Double Ring               |
| Number of Endpoints | 8                         |
| Router Type         | Virtual Channel (VC)      |
| Number of VCs       | 2                         |
| Flow Control Type   | Credit-Based Flow Control |
| Flit Data Width     | 64                        |