<! -- Keywords (to help out non-meta searches): middleware pSOS HP-UX HPUX Mach; fault tolerance; fault management; operating systems; software reliability; -->
Commercial computer systems have escaped the scrutiny for fault tolerance
typically reserved for mission-critical systems. As computer systems become an
integral part of daily activities, people are beginning to depend on and expect
fault-free behavior. Adding a fault-management middleware layer to an existing
operating system can be an effective way to quickly add fault-management
features to commercial computer systems.
This paper evaluates, and defines a taxonomy of, the implementations of four fault-management middleware layers in three commercial off-the-shelf operating systems: pSOS (embedded), Mach 3.0 (microkernel), and HP-UX (monolithic kernel).
The middleware development process for HP-UX is described, and the resulting middleware is analyzed for performance and system overhead. Adding assertions to the HP-UX middleware shows how easily fault-management features can be implemented. As a demonstration, assertions are used to protect an application from incorrect kernel behavior that was exposed in the unmodified operating system by running the Robustness Benchmarks [Dingman96].
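
To make the idea of a middleware assertion concrete, the following is a minimal sketch in C and not the implementation evaluated in this paper. It assumes the middleware interposes on a POSIX-style `read` call and checks one postcondition on the value the kernel returns. The names `fm_read`, `read_fn`, `read_result_is_sane`, and `faulty_read` are hypothetical, as is the choice of reporting a violation to the application as `EIO`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Signature shared by read(2) and any stand-in used for testing. */
typedef ssize_t (*read_fn)(int fd, void *buf, size_t count);

/* Postcondition for read(2): the result is either -1 (error) or a
 * byte count no larger than the number of bytes requested. */
static int read_result_is_sane(ssize_t ret, size_t count)
{
    return ret == -1 || (ret >= 0 && (size_t)ret <= count);
}

/* Hypothetical middleware wrapper: forwards the call to the underlying
 * read routine, then checks the postcondition before the application
 * sees the result. A violation is reported as an ordinary I/O error
 * (an assumption made for this sketch). */
static ssize_t fm_read(read_fn raw_read, int fd, void *buf, size_t count)
{
    assert(raw_read != NULL); /* wrapper misconfiguration, not a kernel fault */

    ssize_t ret = raw_read(fd, buf, count);
    if (!read_result_is_sane(ret, count)) {
        errno = EIO;
        return -1;
    }
    return ret;
}

/* Stand-in for a misbehaving kernel: claims to have read more bytes
 * than were requested. */
static ssize_t faulty_read(int fd, void *buf, size_t count)
{
    (void)fd;
    (void)buf;
    return (ssize_t)count + 1;
}
```

Under these assumptions, an application would call `fm_read(read, fd, buf, n)` in place of `read(fd, buf, n)`. Passing `faulty_read` instead of `read` exercises the assertion without requiring a faulty kernel.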