Milestone Report


Benoit Hudson (bhudson@cs.cmu.edu)
Stephen Somogyi (ssomogyi@ece.cmu.edu)
Shobha Venkataraman (shobha@cs.cmu.edu)

November 19, 2003
Web Location: http://www.ece.cmu.edu/~ssomogyi/15745

Major changes

We had originally wanted a PTAS to solve the instruction scheduling problem for Itanium processors.  However, we found no PTAS for the problem.  In fact, once even a very high level of bundle constraints was introduced, we found no approximation algorithm for the problem. Therefore we decided to formulate a branch and bound any-time algorithm for the problem, based on list scheduling.

In the basic list scheduling algorithm, there are a few places where we currently choose the first valid option we come to. To get an optimal schedule, we instead need to loop over all choices (modulo pruning) at these points:
Here is now the branch-and-bound anytime algorithm:

Notation:
triple - set of three operations, all ready, possibly including NOPs, that are scheduled into one bundle (we take account of functional unit and bundle constraints here). 
R - list of instructions that are ready
F - list of instructions that are in-flight
S - list of instructions that are yet to finish (i.e., all instructions in flight, on the ready list, and not yet ready)
T - table of scheduled instructions, indexed by functional unit and time.

schedule(R, F, S, T, time)
    if(S is empty)
        if time < best-time,
                best-schedule = T
                best-time     = time
    else
        time++
        S' <- S - finished(F, time)
        F' <- F - finished(F, time)
        R' <- R + ready(F, time)
        for each triple of ready operations we can schedule,
            T' <- T + <time,triple>
            schedule(R'-triple, F'+triple, S', T', time)


We note that the machine and bundle constraints only matter within a timestep, so if two choices end up scheduling the same jobs that timestep, we can treat them as the same. We expect to memoize some of the recursive invocations and add heuristics to speed the search up as we progress.

Accomplished

We have a base list scheduler which outputs a legal schedule, integrated with the ORC.  A more detailed summary along with our revised schedule follows:
 
Week
Activities
Status
1-4
  1. find the Itanium research compiler and documentation
  2. formulate scheduling on the Itanium as a particular kind of job-shop problem
  3. find a PTAS or formulate a good anytime search algorithm for that problem
  4. design an interface for describing the job-shop problem (one that deals only in terms of jobs, machines, and prerequisites)
  5. write code in the compiler architecture to transfer from their IR to a job-shop problem
  6. write code to implement the job-shop algorithm, conforming to the interface we designed
  7. run the compiler on some toy test cases
  1. Done.  We chose the Open Research Compiler by Intel and Chinese Academy of Sciences.
  2. Done: it doesn't quite fit into standard frameworks, so we needed to formulate it on our own.
  3. 80%: no PTAS exists; we formulated our own algorithm. Heuristics and search pruning still need to be added.
  4. 80%: bundle interface is a simplification of the true complexity.  See 'surprises'
  5. Done.  If we change the bundle interface we will need to revisit this. 
  6. 40% Done. We have the base list scheduling algorithm implemented and working.
  7. Done.  Hello, world!  (And we've done some other, more exciting small programs).
5


  1. write a method to read a file of function names and epsilon values so the compiler will use those values to determine how much to optimize
  2. implement a set of scripts to quickly run and check test cases
  3. implement gathering profile information
  1. Shobha, Benoit
  1. Shobha

  2. Benoit
  3. Stephen
6
  1. implement automatically using profile information to drive the choice of epsilon
  2. run experiments: SPEC or some other relatively standard benchmark.
  3. Compare with ORC's default instruction scheduler
  1. Stephen, Benoit
  2. All.
  3. All.



Milestone

A little behind, because of Shobha's and Stephen's paper deadlines.  Note that (f) is slightly more than we'd promised for the milestone: we produce legal code, as promised.

Surprises

Revised Schedule

See above.

Resources

We have been able to use Intel's ORC and received accounts on the Itanium machine, which we have been using for testing.