A Term Project for Teaching Superscalar Out-of-Order

Revival since 2019

I am reviving this project as an extracurricular activity. You can follow my (very slow) work in progress.

So far, I have developed an RTL-exact solution to the 4-inst OOO design problem. The datapath follows the Yeager R10K paper exactly in both structure and timing, but the “RTL” solution takes the form of C++ objects. I could have coded the RTL in Verilog or SystemC, but, as a learning aid, I hope many more people can read this model in C++ and easily trace its “synchronous register-transfer” operation in a C++ debugger.

I am working on a set of slides to present this topic to the serious-minded.

  • meeting 0 – general consumption overview (covered in the regular 18-447 lecture sequence for all students)
  • meeting 1 – high-level concepts (superscalar, speculative, out-of-order) behind register micro-dataflow (Tomasulo with ROB from H&P)
  • meeting 2 – logic and timing of register micro-dataflow using centralized bookkeeping (Metaflow DRIS, simplifying but impractical)
  • meeting 3 – logic and timing of register micro-dataflow using distributed bookkeeping (MIPS R10K)
  • meeting 4 – additional discussions of memory dataflow

The purpose of meetings 1 and 2 is to prepare for meeting 3. Since 2019, I have run meetings 1 through 4 (2.5 hours each) as an extracurricular workshop series (aka The Superscalar Club) in the late spring, building on the background from the 18-447 Introduction to Computer Architecture lectures.

Original Materials from 2003

I developed this project for CMU 18-744 Hardware Systems Engineering. Students have to support very few instructions but get to work out many of the intricate and subtle details found in real superscalar designs. It is quite a bit of work for the students, but the course seems to be popular.

  • Overview Slides presented at the 2003 Workshop for Computer Architecture Education