Stack Computers: 7.1 THE IMPORTANCE OF FAST SUBROUTINE CALLS

Chapter 7. Software Issues

7.1 THE IMPORTANCE OF FAST SUBROUTINE CALLS

"Programmers have learnt to avoid procedure calls and parameter passing for reasons of efficiency. These are, however, an important tool in the design of well organized programs and stack architectures carry the potential for very efficient invoking of procedures." (Schulthess & Mumprecht 1977, p. 25)

The sentiment that expensive procedure calls lead to poorly structured programs by inhibiting programmers with efficiency considerations is echoed by software stylists as well as both RISC and CISC advocates (Atkinson & McCreight 1987, Ditzel & McLellan 1982, Parnas 1972, Sequin & Patterson 1982, Wilkes 1982). In fact, Lampson (1982) goes so far as to say that procedure calls should be made to run as fast as unconditional jumps.

7.1.1 The importance of small procedures

The use of a large number of small procedures when writing a program reduces the complexity of each piece that must be written, tested, debugged, and understood by the programmer. Lower software complexity implies lower development and maintenance costs, as well as better reliability. Why then, would programmers not make extensive use of small procedures?

Most application programs are written in general purpose languages such as FORTRAN, COBOL, PL/1, Pascal, and C. The early high level programing languages such as FORTRAN were direct extensions of the philosophy of the machines they were run on: sequential von Neumann machines with registers. Consequently, these languages and their general usage have developed to emphasize long sequences of assignment statements with only occasional conditional branches and procedure calls.

In recent years, however, the complexion of software has begun to change. The currently accepted best practice in software design involves structured programing using modular designs. On a large scale, the use of modules is essential for partitioning tasks among members of programming teams. On a smaller scale, modules control complexity by limiting the amount of information that a programmer must deal with at any given time.

More advanced languages such as Modula-2 and Ada are designed specifically to promote modular design. The one hardware innovation that has resulted from the increasing popularity of modular, structured languages has been a register used as a stack pointer into main memory. With the exception of this stack pointer and a few complex instructions (which are not always usable by compilers), CISC hardware has not added much support for subroutine calls over the years. Because of this, the machine code output of optimizing compilers for modern languages still tends to look a lot like output from earlier, non-structured languages. Herein lies the problem. Conventional computers are still optimized for executing programs made up of streams of serial instructions. Execution traces for most programs show that procedure calls make up a rather small proportion of all instructions, which, of course, is partially attributable to the fact that programmers avoid using them. Conversely, modern programing practices stress the importance of non-sequential control flow and small procedures. The clash between these two realities leads to a suboptimal, and therefore costly, hardware/software environment on today's general purpose computers.

This does not mean that programs have failed to become more organized and maintainable using structured languages, but rather that efficiency considerations and the use of hardware that encourages writing sequential programs has prevented modular languages from achieving all that they might. Although the current philosophy is to break programs up into very small procedures, most programs still contain fewer, larger, and more complicated procedures than they should.

7.1.2 The proper size for a procedure

How many functions should a typical procedure have? Miller gives evidence that the number seven, plus or minus two, applies to many aspects of thinking (Miller 1967). The way the human mind copes with complicated information is by chunking groups of similar objects into fewer, more abstract objects. In a computer program, this means that each procedure should contain approximately seven fundamental operations, such as assignment statements or other procedure calls, in order to be easily grasped. If a procedure contains more than seven distinct operations, it should be broken apart by chunking related portions into subordinate procedures to reduce the complexity of each portion of the program. In another part of his book, Miller shows that the human mind can only grasp two or three levels of nesting of ideas within a single context. This strongly suggests that deeply nested loops and conditional structures should be arranged as nested procedure calls, not as convoluted, indented structures within a procedure.

7.1.3 Why programmers don't use small procedures

The only question now is, why don't most programmers follow these guidelines?

The most obvious reason that programmers avoid small, deeply nested procedures is the cost in speed of execution. Subroutine parameter setup and the actual procedure calling instructions can swamp the execution time of a program if used too frequently. All but the most sophisticated optimizing compiler cannot help if procedures are deeply nested, and even those optimizations are limited. As a result, efficient programs tend to have a relatively shallow depth of procedure nesting.

Another reason that procedures are not used more is that they are difficult to program. Often times the effort to write the pro-forma code required to define a procedure makes the definition of a small procedure too burdensome. When this awkwardness is added to the considerable documentation and project management obstacles associated with creating a new procedure in a big project (engendered by rules such as: each procedure must have a separate management control document), it is no wonder that average procedure sizes of one or two pages instead of one or two lines are considered appropriate.

There is an even deeper cause why procedures are difficult to create in modern programing languages, and why they are used less frequently than the reader of a book on structured programing might expect: conventional programing languages and the people who use them are steeped in the traditions of batch processing. Batch processing gives little reward in testability or convenience for working with small procedures. Truly interactive processing (which does not mean doing batch-oriented edit-compile-link-execute-crash-debug cycles from a terminal) is only available in a few environments, and is not taught in most undergraduate computer courses.

As a result of all these factors, today's programing languages provide only moderately useful capabilities for efficient modular programing. Today's hardware and programing environments unnecessarily restrict the usage of modularity, and therefore unnecessarily increase the cost of providing computer-based solutions to problems.

7.1.4 Architectural support for procedures

The problems that arise from poor performance on subroutine calls have been dealt with by computer architects in a variety of ways. RISC designers have taken two different approaches. The Stanford MIPS team uses compiler technology to expand procedures as in-line code wherever possible. The MIPS compiler then does very clever register allocation to avoid saving and restoring registers for procedure calls. The statistics that support the choice for this strategy were taken from programs that follow the traditional software design methods, with fairly large and not deeply nested procedures. While the MIPS approach appears to work well on existing software, it may create the same stifling effect on better software development strategies that we saw in the CISC machines.

The second RISC approach, one originally advocated by the Berkeley RISC I team, uses register windows to form a register frame stack. A pointer into the register stack can be moved to push or pop a group of registers quickly, supporting quick subroutine calls. This approach has many of the same advantages as the stack machine approach. The detailed implementation questions, which in real life may determine the success or failure of a product, become ones of single vs. multiple stacks, fixed- vs. variable-sized register frames, spill management strategies, and overall machine complexity.

NEXT NEXT SECTION

Phil Koopman -- koopman@cmu.edu