Safe and Reliable
- Systems must be safe to protect people & property
  - "Mission-critical" systems -- if electronics fail, someone
       could die or lose lots of money
  - Software & hardware must anticipate both electronic &
       non-electronic failure modes so the system at least fails "safe"
- Traditional fault-tolerant techniques work, but are expensive
  - Replicated hardware (e.g., triple modular redundancy, TMR)
  - Distributed consensus
  - High availability ("up-time") may come at the cost of
       lower reliability (more parts to break over the long term)
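
The replicated-hardware idea above can be sketched as a majority voter over three replica outputs. This is a minimal illustration, not a production voter: the function name `tmr_vote` and the `safe_value` parameter are hypothetical, and it assumes replica outputs are hashable values that can be compared for exact equality.

```python
from collections import Counter


def tmr_vote(a, b, c, safe_value=None):
    """Majority-vote three replica outputs (hypothetical helper).

    Returns the value that at least two replicas agree on. If all three
    disagree there is no majority, so the caller-supplied safe_value is
    returned to let the system fail "safe" instead of guessing.
    """
    value, count = Counter([a, b, c]).most_common(1)[0]
    return value if count >= 2 else safe_value
```

For example, `tmr_vote(7, 7, 9)` masks the single faulty replica and returns 7, while `tmr_vote(1, 2, 3, safe_value="SHUTDOWN")` returns "SHUTDOWN". The voter itself is a single point of failure in this sketch, which is one reason replication alone is not a complete answer.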
- Design challenges:
  - Realistic reliability predictions with commercial components
  - Low-cost reliability -- without brute force redundancy
       (probably requires a system-level approach)
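
A rough sketch of how such reliability predictions are often approximated. The assumptions here are not from the slide: constant failure rates (exponential model), independent failures, and a perfect voter for the TMR case. All function names are hypothetical.

```python
import math


def reliability(failure_rate_per_hour, hours):
    """Probability one component survives `hours`, assuming a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)


def series_reliability(failure_rates_per_hour, hours):
    """System needs every component to work: failure rates add up."""
    return math.exp(-sum(failure_rates_per_hour) * hours)


def tmr_reliability(r):
    """At least 2 of 3 identical modules work, assuming a perfect voter.

    r^3 (all three work) + 3 * r^2 * (1 - r) (exactly two work)
    simplifies to 3r^2 - 2r^3.
    """
    return 3 * r**2 - 2 * r**3
```

Under these assumptions TMR helps only while each module's reliability stays above 0.5: `tmr_reliability(0.9)` is higher than 0.9, but `tmr_reliability(0.3)` is lower than 0.3. Over a long enough mission, the extra hardware becomes more things to break, which is the trade-off noted above.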