18-749 Reading List Spring 2003
Required:
Note: Read Wallace & Kuhn before reading Sullivan &
Chillarege.
- A. Avizienis, J.-C. Laprie and B. Randell, Fundamental Concepts of
Dependability, Research Report N01145, LAAS-CNRS, April 2001. (Citeseer |
local) / 21 pages.
This paper is a review and was required reading for 18-549; read it to brush
up on terminology and re-orient yourself to the big picture.
- Updated paper:
Avizienis, Laprie, Randell & Landwehr, "Basic Concepts and Taxonomy of
Dependable and Secure Computing," TDSC, Jan 04.
- D. Wallace and D. R. Kuhn, (NIST), "Lessons from 342 Medical Device
Failures," HASE99, pp. 123-31. (Citeseer |
local) / 9 pages.
Analysis of FDA data for non-lethal software recalls.
- M. Sullivan, R. Chillarege, (IBM Watson), "Software Defects and their
Impact on System Availability A Study of Field Failures in Operating
Systems," FTCS-21, 1991. (Citeseer |
local) / 8 pages.
The seminal paper for Orthogonal Defect Classification (ODC).
Supplemental:
Required:
- ESA, "Ariane 501 - Presentation of Inquiry Board report," press
release N° 33-1996, (WWW |
local) / 2 pages.
Summary of Ariane 501 board of inquiry report. For full report see
supplemental reading below.
- Weinstock, C.B., "SIFT: System Design and Implementation,"
Fault-Tolerant Computing 1995, Highlights from Twenty-Five Years.,
Twenty-Fifth International Symposium on (originally FTCS 1980), (IEEE |
local) / 3 pages
- F. Cristian, "Understanding fault-tolerant distributed systems,"
Communications of the ACM, Vol. 34 No. 2, February 1991, pp. 56 - 78 (ACM |
local) / 23 pages
- J. Gray, "A census of Tandem system availability," IEEE Trans.
Reliability, 39(4), 409-418, Oct 1990. (IEEE |
local) / 10 pages.
Supplemental:
- Bartlett & Spainhower, "Commercial Fault Tolerance: a tale of two
systems," TDSC, Jan 04.
- Garman, "The bug heard 'round the world," ACM SIGSOFT Software
Engineering Notes 6(5), pp. 3-10, Oct 81 (local)
- J. Gray, Why do computers stop and what can be done about it?,
in Proc. 5th Symp. on Reliability in Distributed Software and Database Systems,
(Los Angeles, CA, USA), pp.3-12, IEEE Computer Society Press, January 1986. (Tech. report |
local TR |
local) / 9 pages.
- L. Hatton, "Software failures-follies and fallacies," IEE
Review, Volume: 43 Issue: 2, 20 March 1997, pp. 49-52, (IEEE |
local) / 4 pages.
Thoughts about Ariane 5 and other failures -- why can't we get this stuff
right?
- Hoyme, K.; Driscoll, K.; "Safebus," Digital Avionics Systems
Conference, 1992. Proceedings., IEEE/AIAA 11th , 5-8 Oct 1992 Page(s): 68 -73
(IEEE |
local)
- I. Lee and R. K. Iyer, "Faults, Symptoms, and Software Fault Tolerance
in the Tandem GUARDIAN90 Operating System", IEEE 1993, pp. 20-29. (IEEE
| local) / 10 pages.
- N. Leveson and C. Turner, (U. Washington; U.C. Irvine) "An
Investigation of the Therac-25 Accidents," IEEE Computer, Vol. 26,
No. 7, July 1993, pp.18-41. (IEEE |
local) / 24 pages.
- Powell, D.; Bonn, G.; Seaton, D.; Verissimo, P.; Waeselynck, F., "The
Delta-4 Approach to Dependability in Open Distributed Computing Systems,"
Fault-Tolerant Computing, 1995, Highlights from Twenty-Five Years.,
Twenty-Fifth International Symposium (originally FTCS 1988) (IEEE |
local) / 6 pages.
- D. Powell (LAAS-CNRS), "Distributed Fault Tolerance Lessons Learnt
from Delta-4", Workshop on Fault-Tolerant Architectures, 1994. (Citeseer |
local) / 16 pages.
Case study of distributed fault tolerance implemented in software with
mostly-off-the-shelf hardware.
- B. Nuseibeh, "Ariane 5: Who Dunnit?" IEEE Software, Vol.
14 No. 3, May-June 1997, pp. 15 -16 (IEEE |
local) / 2 pages.
- S. Shrivastava, "Lessons Learned from Building and Using the Arjuna
Distributed Programming System," 1995. (Citeseer |
local) / 15 pages.
- J. Lions, Ariane 501 Inquiry Board Report, July 1996. (WWW |
local) / 60 pages.
- The space shuttle primary computer system, Alfred Spector, David Gifford,
Communications of the ACM, September 1984, Volume 27, Issue 9 (ACM |
local)
- Architecture of the space shuttle primary avionics software system, Gene D.
Carlow, Communications of the ACM, September 1984, Volume 27, Issue 9 (ACM | local)
- Hennebert, C. & Guiho, G., "SACEM: a fault tolerant system for
train speed control", 1993. (local)
- S. Webber, J. Beirne, "The Stratus Architecture," FTCS 21, 1991.
(IEEE
| local) / 7 pages.
Required:
- W. Bouricius, W. Carter, D. Jessep, P. Schneider, & A. Wadia, (IBM)
"Reliability modeling for fault tolerant computers,"
Fault-Tolerant Computing, 1995, Highlights from Twenty-Five Years.,
Twenty-Fifth International Symposium, (originally FTCS 1971) (IEEE |
local) / 4 pages.
- D. Bossen & M. Hsiao, (IBM) "ED/FI: A Technique for Improving
Computer System RAS," Fault-Tolerant Computing, 1995, Highlights from
Twenty-Five Years., Twenty-Fifth International Symposium, (originally FTCS
1981). (IEEE |
local) / 6 pages.
- A. Reibman & M. Veeraraghavan, (Bell Labs) "Reliability modeling:
an overview for system designers," Computer, Vol. 24, No. 4, April
1991, pp. 49-57. (IEEE |
local) / 9 pages.
- Dugan, "Dependability modeling for fault-tolerant software" (Ch
5) In: Lyu, Ed., Software Fault Tolerance, Wiley & Sons, 1995. (local) / 15 pages.
Supplemental:
- Abraham, J., & Siewiorek, D., "An algorithm for the accurate
reliability evaluation of triple modular redundancy networks," IEEE Trans.
Computers, July 1974 (local)
- P. Agrawal, "Fault-Tolerance in Microprocessor Systems without
Dedicated Redundancy," IEEE Transactions on Computers, Vol. 37, no. 3,
March 1988. (IEEE |
local) / 5
pages.
- Balkovich et al., "VAXcluster availability modeling", Digital
Technical Journal, 1987. (local)
- D. Barbara, H. Garcia-Molina, "The Reliability of Voting
Mechanisms," IEEE Transactions on Computers, Vol. C-36, No. 10, October
1987. (local)
- Cullyer, W.J.; "Implementing high integrity systems: the VIPER
microprocessor" Computer Assurance, 1988. COMPASS '88 , 27 Jun-1 Jul 1988
Page(s): 56 -66 (IEEE |
local)
- Geist, Reliability estimation of fault-tolerant systems: tools and
techniques, IEEE Computer, 23(7), July 1990. (IEEE |
local) / 10 pages.
- Malhis, L.M.; Sanders, W.H.; Schlichting, R.D., "Numerical evaluation
of a group-oriented multicast protocol using stochastic activity
networks," Petri Nets and Performance Models, 1995, pp. 63 -72.
(IEEE |
local) / 10
pages.
- Nelson, Fault tolerant computing: fundamental concepts, IEEE Computer,
23(7), July 1990. (IEEE |
local) / 7 pages.
- Rai, S. et al., "Two recursive algorithms for computing the
reliability of k-out-of-n systems," IEEE Trans. Reliability, June 1987.
(local)
- Rennels, D., "Fault-tolerant computing -- concepts and examples",
IEEE Trans. Computers, Dec. 1984 (local)
- Schlichting & Schneider, "Fail-stop processors: an approach to
designing fault-tolerant computing systems," ACM Trans. Comp. Sys., v 1,
pp 222-238, Aug. 1983 (citeseer |
local) / 21 pages.
- Siewiorek, Fault tolerance in commercial computers, IEEE Computer, 23(7),
July 1990. (IEEE |
local) / 12 pages.
- Singh, Fault tolerant system intro, IEEE Computer, 23(7), July 1990. (IEEE |
local) / 3 pages.
- Sahner, R. & Trivedi, K., "Reliability modeling using
SHARPE," IEEE Trans. Reliability, June 1987. (local)
Pending:
- Derr, prediction of wiring harness reliability, SAE 870055, (in SP-696,
Feb. 1987). ( | local) / pages.
- Davis & Johri, reliability analysis of mechanical components, SAE
870052, (in SP-696, Feb. 1987) ( | local) / pages.
- [SAE84] Binroth, Coit, Desnon and Hammer. "Development Of Reliability
Prediction Models For Electronic Components In Automotive Applications",
SAE Paper 840486. ( | local) / pages.
- J. von Neumann, (1956) "Probabilistic Logic and the Synthesis of
Reliable Organisms from Unreliable Components." In: A. H. Taub, editor.
John von Neumann: Collected Works, volume V: Design of Computers, Theory of
Automata and Numerical Analysis. Pergamon Press, 1963. (local)
Required:
- Randell, The evolution of the recovery block concept (ch 1) In: Lyu, Ed.,
Software Fault Tolerance, Wiley & Sons, 1995. (local) / 21 book
pages.
- Xu, J., Randell, B., Roll-forward error recovery in embedded real-time
systems, Proceedings. 1996 International Conference on Parallel and Distributed
Systems (Citeseer |
local) / 8 pages.
- Chiu, J.-F.; Ge-Ming Chiu; "Placing forced checkpoints in distributed
real-time embedded systems," Computing & Control Engineering Journal ,
Volume: 13 Issue: 4 , Aug 2002 Page(s): 197 -205, (IEEE |
local) / 9 pages.
Supplemental:
- B. Randell. System structures for software fault-tolerance. IEEE Trans.
Software Eng., 1, 2(June 1975), 220-232. (local)
- E. N. Elnozahy, L. Alvisi, Y. M. Wang, and D. B. Johnson, "A survey of
rollback-recovery protocols in message-passing systems," Tech. Rep. No.
CMU-CS-99-148, Dept. of Computer Science, Carnegie Mellon University, 1999. (Citeseer |
local)
- Chung-Chi Jim Li; Fuchs, W.K., "CATCH - Compiler-Assisted Techniques
for Checkpointing," Fault-Tolerant Computing, 1995, Highlights from
Twenty-Five Years., Twenty-Fifth International Symposium on (From FTCS
1990), (IEEE |
local)
- Krishna, C.M.; Singh, A.D.; Reliability of checkpointed real-time systems
using time redundancy Reliability, IEEE Transactions on , Volume: 42 Issue: 3 ,
Sep 1993 Page(s): 427 -435 (IEEE |
local). / 8 pages.
- D. K. Pradhan, N. H. Vaidya, "Roll-Forward and Rollback Recovery:
Performance-Reliability Trade-Off", FTCS 24, 1994. (Citeseer |
local)
- Pradhan, D.K.; Vaidya, N.H.; Roll-forward checkpointing scheme: a novel
fault-tolerant architecture, Computers, IEEE Transactions on , Volume: 43
Issue: 10 , Oct 1994 Page(s): 1163 -1174 (IEEE |
local)
- Koo, R. and Toueg, S., Checkpointing and rollback-recovery for distributed
systems, Trans. Software Engineering, SE-13(1):23-31, IEEE, 1987. (local) / 9 pages.
- Leu, P. and Bhargava, B., A model for concurrent checkpointing and recovery
using transactions, Proc. 9th Intl. Conf. Distr. Comp. Sys, 423-430, IEEE, 1989
. (IEEE |
local) / 8 pages.
- Strom, R.E. and Yemini, S., Optimistic recovery in distributed systems,
Trans. Computer Systems, 3(3):204-226, ACM, August 1985. (ACM |
local) / 23 pages.
- Campbell & Randell, 1986, Error recovery in asynchronous systems, IEEE
Trans SW Eng. SE-12, 8, pp. 811-826. (local) / 16 pages.
- Kim, The distributed recovery block scheme (ch 8) In: Lyu, Ed., Software
Fault Tolerance, Wiley & Sons, 1995. (local)
Required:
- Anderson, T.; Barrett, P.A.; Halliwell, D.N.; Moulding, M.R.B., "An
Evaluation of Software Fault Tolerance in a Practical System,"
Fault-Tolerant Computing, 1995, Highlights from Twenty-Five Years.,
Twenty-Fifth International Symposium, p. 130 (Originally FTCS 1985) (IEEE |
local) / 6 pages.
- Levendel, Y., "The cost effectiveness of telecommunication service
dependability" (ch 12) In: Lyu, Ed., Software Fault Tolerance, Wiley &
Sons, 1995 (local) / 36 book
pages.
- Select one of below:
- Shen, J.P.; Wilken, K.; "Continuous signature monitoring: efficient
concurrent-detection of processor control errors;" Test Conference, 1988.
Proceedings. 'New Frontiers in Testing'., International , 12-14 Sep 1988
Page(s): 914 -925 (IEEE |
local) / 13 pages.
- S. Garg, A. van Moorsel, K. Vaidyanathan and K. S. Trivedi., "A
Methodology for Detection and Estimation of Software Aging," Int'l. Symp.
on Software Reliability Engineering, ISSRE 1998, November 1998. (IEEE |
local) / 10 pages.
Other High-Level Discussions
- Littlewood, B. & Strigini, L., "Software Reliability &
Dependability: a roadmap," Proceedings of the conference on the future
of software engineering, May 2000. (ACM |
local)
Supplemental:
- Arlat, J.; Kanoun, K.; Laprie, J., "Dependability evaluation of
software fault-tolerance," Fault-Tolerant Computing, 1995, Highlights from
Twenty-Five Years., Twenty-Fifth International Symposium on Page(s): 194 (IEEE |
local)
- J. R. Horgan and A. P. Mathur, "Perils of software reliability
modeling," Technical Report, SERC-TR-160-P, 1995, Software Engineering
Research Center, Purdue University, W. Lafayette, IN. (Citeseer |
local)
- G. F. Sullivan, D. S. Wilson, G. M. Masson, "Certification of
Computational Results," IEEE Trans. on Computers, Vol. 44, No. 7, July
1995. (IEEE
| local)
- Y. M. Wang, Y. Huang, and W. K. Fuchs, "Progressive retry for software
error recovery in distributed systems," in Proc. IEEE Fault-Tolerant Computing
Symposium (FTCS-23), pp. 138--144, June 1993. (IEEE |
local)
- Huang, software fault tolerance in the application layer (ch 10) In: Lyu,
Ed., Software Fault Tolerance, Wiley & Sons, 1995 (local)
- Iyer, software fault tolerance in computer operating systems (ch 11) In:
Lyu, Ed., Software Fault Tolerance, Wiley & Sons, 1995 (local)
- D. J. Taylor, J. P. Black, "Principles of Data Structure Error
Correction," IEEE Trans. on Computers, Vol. C-31, No. 7, July 1982. (local)
- D. J. Taylor, D. E. Morgan, J. P. Black, "Redundancy in Data
Structures: Improving Software Fault-Tolerance," IEEE Trans. on Software
Engineering, V. SE-6, No. 6, November 1980. (local)
See also: Exception handling; Fault
Injection
Required:
- Leslie Lamport. Time, Clocks, and the Ordering of Events in a Distributed
System, Communications of the ACM, Vol. 21, No. 7 (July 1978), pp. 558-565. (ACM |
local) / 8 pages.
- Lamport, Leslie, and Melliar-Smith, P.M. "Synchronizing Clocks in the
Presence of Faults." Journal of the ACM, vol 32, no 1, January 1985, p.
53-78. (ACM |
local) / 27 pages.
- Select one of:
- Temporal composability, Kopetz, H.; Obermaisser, R.; Computing &
Control Engineering Journal , Volume: 13 Issue: 4 , Aug 2002 Page(s): 156 -162
(IEEE |
local) / 7 pages.
- Raynal, M., Singhal, M., Logical time: capturing causality in distributed
systems, Computer 29(2):49-56, IEEE, February 1996. (IEEE |
local) / 8 pages.
Supplemental:
- Kenneth P. Birman. A Response to Cheriton and Skeen's Criticism of Causal
and Totally Ordered Communication. Technical report, Cornell University,
October 1993. (Citeseer |
local)
- Fault-tolerant clock synchronization in distributed systems Butler, R.W.;
Ramanathan, P.; Shin, K.G.; Computer , Volume: 23 Issue: 10 , Oct 1990 Page(s):
33 -42 (IEEE
| local)
- David Cheriton and Dale Skeen, Understanding the Limitations of Causally
and Totally Ordered Communication, Proc. of the Symposium on Operating System
Principles (SOSP), December 1993. (ACM |
local)
- Cristian, F., "Probabilistic Clock Synchronization,"
Distributed Computing, No. 3, 1989, pp. 146-158. (local)
- D. A. Jefferson. "Virtual Time". ACM Transactions on Programming
Languages and Systems, Vol. 7, No. 3, pp. 404--425, July 1985. (ACM |
local)
- Kopetz, H., & Ochsenreiter, W., "Clock synchronization in
distributed real time systems," IEEE Trans. Computers, August 1987. (local)
- Robert H. B. Netzer and Jian Xu, "Necessary and Sufficient Conditions
for Consistent Global Snapshots," IEEE Trans. on PADS., Vol. 6, No. 2,
February 1995. (IEEE |
local)
- D. L. Palumbo, "The Derivation and Experimental Verification of Clock
Synchronization Theory," IEEE Transactions on Computers, Vol. 43, No. 6,
June 1994. (IEEE |
local)
- Raynal, M.; Singhal, M.; Mastering agreement problems in distributed
systems, IEEE Software , Volume: 18 Issue: 4 , Jul/Aug 2001 Page(s): 40 -47 (IEEE |
local)
- Shin, J. & Ramanathan, P., "Clock synchronization of a large
multiprocessor system in the presence of malicious faults," IEEE Trans.
Computers, Jan. 1987. (local)
- Synchronization of fault-tolerant clocks in the presence of malicious
failures Vasanthavada, N.; Marinos, P.N., IEEE Trans. Computers, April 1988.
Page(s): 440-448 (IEEE |
local)
Required:
- J. Goodenough, "Exception Handling: Issues and Proposed
Notation," Communications of the ACM, vol. 18(12), pp. 683-696, 1975. (ACM |
local). / 14 pages
- Romanovsky, Alexander; Xu, Jie; Randell, Brian, "Exception Handling
in Object-Oriented Real-Time Distributed Systems." First International
Symposium on Object-Oriented Real-Time Distributed Computing (ISORC '98), April
1998, p. 32-42. (Citeseer |
IEEE |
local) / 12 pages
- Vo, Kiem-Pheng; Wang, Yi-Min; Chung, P.Emerald; Huang, Yennun, "Xept:
A Software Instrumentation Method For Exception Handling." Eighth
International Symposium on Software Reliability Engineering, November 1997, p.
60-69. (Citeseer |
IEEE |
local) / 10 pages
- Still looking for a good reference for real-time and exceptions
Supplemental:
- Cristian, "Exception Handling" (Citeseer |
local)
- Cristian, F., Exception Handling and Software Fault Tolerance,
Fault-Tolerant Computing, 1995, Highlights from Twenty-Five Years.,
Twenty-Fifth International Symposium on Page(s): 120 (IEEE |
local)
- P. M. Melliar-Smith, B. Randell, "Software reliability: The role of
programmed exception handling", Proceedings of an ACM conference on
Language design for reliable software, 1977, Raleigh, North Carolina,
ACM Press, New York, NY, USA, Pages: 95 - 100 (ACM |
local)
- P.A. Lee, "Exception Handling in C Programs," Software Practice
and Experience, Vol. 13, 1983. (local)
Other sources:
- Goodenough, J., Exception handling: issues and a proposed
notation, Communications of the ACM, 18(12): 683-696, December 1975
- I. Hill, "Faults in functions, in ALGOL and FORTRAN," The
Computer Journal, 14(3): 315-316, August 1971.
- Garcia, A.F., Beder, D.M., Rubira, C.M.F., An exception handling
software architecture for developing fault-tolerant software, 5th
International Symposium on High Assurance System Engineering, 2000
- Hagen, C., Alonso, G., Flexible Exception Handling in the OPERA
Process Support System, 18th International Conference on Distributed
Computing Systems, 1998
- [Lee83] Lee, P.A., Exception Handling in C Programs, Software
Practice and Experience. Vol 13, 1983
- [Romanovsky00] Romanovsky, A., An exception handling framework for
N-version programming in object-oriented systems, Proceedings Third IEEE
International Symposium on Object-Oriented Real-Time Distributed Computing,
2000
Required:
- Lamport, L., Shostak, R., and Pease, M., The Byzantine Generals Problem,
Trans. Prog. Lang. and Sys. 4(3):382-401, ACM, July 1982. (ACM |
local) / 20 pages.
- R. Kieckhafer, C. J. Walter, A. M. Finn, P. M. Thambidurai, "The MAFT
Architecture for Distributed Fault Tolerance," IEEE Trans. on Computers,
Vol. 37, No. 4, April 1988. (IEEE |
local) / 8 pages.
Supplemental:
- Frison, S.G.; Wensley, J.H., "Interactive consistency and its impact
on the design in TMR systems," Fault-Tolerant Computing, 1995, Highlights
from Twenty-Five Years., Twenty-Fifth International Symposium on Page(s): 425
(Originally FTCS 1982). (IEEE |
local) / 6 pages.
- M. Barborak, M. Malek, A. Dahbura, "The Consensus Problem in Fault
Tolerant Computing," ACM Computing Surveys, vol. 25, No. 2, June 1993. (ACM |
local)
- Degradable Byzantine agreement Pradhan, D.K.; Vaidya, N.H.; Computers, IEEE
Transactions on , Volume: 44 Issue: 1 , Jan 1995 Page(s): 146 -150 (IEEE |
local)
- K. Birman and T. Joseph. Reliable communication in the presence of
failures. ACM Trans. Computer Systems, 5(1):47--76, 1987. (ACM |
local)
- James Kistler and M. Satyanarayanan. Disconnected Operation in the Coda
File System, ACM Trans. on Computer Systems 10(1), February 1992, pp. 3-25. (Citeseer |
local)
- K. G. Shin, J. W. Dolter, "Alternative Majority-Voting Methods for
Real-Time Computing Systems," IEEE Transactions on Reliability, V. 38,
No. 1, April 1989. (IEEE |
local)
- P. R. Lorczak, A. K. Caglayan, D. E. Eckhardt, "A Theoretical
Investigation of Generalized Voters for Redundant Systems," FTCS 19, 1989.
(IEEE |
local)
- Chandra T.D. and Toueg S., Unreliable failure detectors for reliable
distributed systems. Journal of the ACM , 43(2), pp:225--267, (March 1996). (ACM |
local)
- Cristian, F.; Aghili, H.; Strong, R.; Dolev, D.; "Atomic broadcast:
from simple message diffusion to Byzantine agreement," Fault-Tolerant
Computing, 1995, Highlights from Twenty-Five Years., Twenty-Fifth International
Symposium on Page(s): 431 (Originally FTCS 1985). (IEEE |
local)
Required:
- Maffeis, S., "Adding Group Communication and Fault-Tolerance to
CORBA," Proc. USENIX Conf. on Object-Oriented Technologies, June 1995. (Citeseer |
local) / 12 pages
- Pascal A. Felber, Benoit Garbinato & Rachid Guerraoui, "The Design
of a CORBA Group Communication Service," (long version of paper in:
Proceedings of the 15th Symposium on Reliable Distributed Systems (SRDS-15)),
1996 (Citeseer |
local) / 12 pages
- Narasimhan, P.; Moser, L.E.; Melliar-Smith, P.M.; "Lessons Learned in
Building a Fault-Tolerant CORBA system," Dependable Systems and Networks,
2002, pp. 39-44. (IEEE |
local) / 6 pages.
Supplemental:
- This one is recommended, but not required reading:
- P. Narasimhan, L.E. Moser, P.M. Melliar-Smith, "Exploiting the
Internet Inter-ORB Protocol Interface to Provide CORBA with Fault
Tolerance," Proceedings of the 3rd USENIX Conference on Object-Oriented
Technologies and Systems (COOTS), 1997. (Citeseer |
local)
- OMG, FT-CORBA standard, version 3, July 2002 (Web)
- Merlin, P.M.; Randell, B., State restoration in distributed systems,
Fault-Tolerant Computing, 1995, Highlights from Twenty-Five Years.,
Twenty-Fifth International Symposium on Page(s): 207 (Originally FTCS 1978) (IEEE |
local)
- Chandy and Lamport, Distributed Snapshots: Determining the Global States of
a Distributed System, ACM TOCS, pp. 63-75, Feb. 1985. (ACM |
local) / 13 pages.
Required:
- F. Cristian, "Reaching Agreement on Processor Group Membership in
Synchronous Distributed Systems" (1991) (Citeseer |
local)
- Poledna, S., "Fault tolerance in safety critical automotive
applications: cost of agreement as a limiting factor," Fault-Tolerant
Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International
Symposium on , 27-30 Jun 1995 Page(s): 73 -82 (IEEE |
local)
This is actually more of an agreement paper, but it fits better here in the
class discussions
Supplemental:
- Cristian, F., Agreeing on who is present and who is absent in a synchronous
distributed system ; Fault-Tolerant Computing, 1988. FTCS-18, Digest of
Papers., Eighteenth International Symposium on , 27-30 Jun 1988 Page(s): 206
-211 (IEEE |
local)
- Garcia-Molina, H., Elections in a distributed computer system, Trans.
Computers C-31(2):48-59, IEEE, 1982. (local) / 7 pages + appendix.
- Pease, M., R. Shostak, L. Lamport. Reaching Agreement in the Presence of
Faults. JACM 27, 2 (April 1980). (ACM |
local) Frames the
Byzantine Generals question
- Consensus with dual failure modes Meyer, F.J.; Pradhan, D.K.; Parallel and
Distributed Systems, IEEE Transactions on , Volume: 2 Issue: 2 , Apr 1991
Page(s): 214 -222 (IEEE |
local)
- Butler Lampson. How to Build a Highly Available System Using Consensus,
1996, pp 1-17. (Citeseer |
local) / 17 pages.
- Alessandro Galleni, David Powell, Consensus and Membership in Synchronous
and Asynchronous Distributed Systems (1996) (Citeseer |
local)
- L. Lamport, "The part-time parliament," ACM Transactions on
Computer Systems, Vol. 16, No. 2, May 1998, pp. 133-169. (Citeseer |
local) / 33 pages.
- Michael Fischer, Nancy Lynch, and Michael Paterson, Impossibility of
Distributed Consensus with One Faulty Process, Journal of the ACM, vol 32, no
2, 1985. (ACM |
local)
- Kenneth P. Birman, The Process Group Approach to Reliable Distributed
Computing, Communications of the ACM 36(12), December 1993, pp. 37-53. (ACM
| local)
- Davidson, S.B., Garcia-Molina, H., Skeen, D., Consistency in Partitioned
Networks: a survey, Computing Surveys 17(3):341-370, ACM, September 1985. (ACM |
local)
- Parker, D.S., et al, Detection of mutual inconsistency in distributed
systems, Trans. Software Engineering 9(3):240-246, IEEE, 1983 (local)
- Poledna, S.; Tolerating sensor timing faults in highly responsive hard
real-time systems ; Computers, IEEE Transactions on , Volume: 44 Issue: 2 , Feb
1995 Page(s): 181 -191 (IEEE |
local)
- Chockler, G., Keidar, I., & Vitenberg, R., "Group communication
specifications: a comprehensive study," ACM Computing Surveys, V. 33 N. 4,
Dec. 2001, pp. 427-469. (ACM |
local)
Required:
- Maxion, R.A.; Olszewski, R.T.; "Eliminating exception handling errors
with dependability cases: a comparative, empirical study", IEEE
Transactions on Software Engineering, Volume: 26 Issue: 9 , Sep 2000 Page(s):
888 -906 (IEEE
| local)
Supplemental:
- Robust software - no more excuses De Vale, J.; Koopman, P. Dependable
Systems and Networks, 2002. Proceedings. International Conference on , 2002
Page(s): 145 -154 (IEEE |
local)
- [Maxion98] Maxion, Roy A.; Olszewski, Robert T., "Improving Software
Robustness With Dependability Cases." Twenty-Eighth Annual International
Symposium on Fault-Tolerant Computing, June 1998, p. 346-355. (IEEE |
Citeseer |
local) (conference
version of the 2000 journal paper)
Required:
- Nelson, J. "Incremental avionics upgrades for legacy aircraft";
Digital Avionics Systems Conference, 1997. 16th DASC., AIAA/IEEE , Volume: 1 ,
26-30 Oct 1997 Page(s): 3.2 -15-23 vol.1, (IEEE |
local) / 9 pages.
- Lui Sha; Rajkumar, R.; Gagliardi, M.; "Evolving dependable real-time
systems," Aerospace Applications Conference, 1996. Proceedings., 1996 IEEE
, Volume: 1 , 3-10 Feb 1996 Page(s): 335 -346 vol.1, (IEEE |
Citeseer |
local) / 12 pages
- Arlat, J.; Jarboui, T.; Kanoun, K.; Powell, D.; "Dependability
assessment of GUARDS instances," Computer Performance and Dependability
Symposium, 2000. IPDS 2000. Proceedings. IEEE International , 2000 Page(s): 147
-156 (IEEE |
local) / 10 pages
(Note: GUARDS is mostly about dependable upgrade, but I haven't found any
good short papers that concentrate on that aspect.)
Supplemental:
- Cook, J.E.; Dage, J.A.; Highly reliable upgrading of components ; Software
Engineering, 1999. Proceedings of the 1999 International Conference on , 1999
Page(s): 203 -212 (IEEE |
local)
- Lyu, J.; Youngjin Kim; Yongsub Kim; Inhwan Lee; "A procedure-based
dynamic software update"; Dependable Systems and Networks, 2001.
Proceedings. The International Conference on , 2001 Page(s): 271 -280 (IEEE |
local)
- Powell, D.; Arlat, J.; Beus-Dukic, L.; Bondavalli, A.; Coppola, P.;
Fantechi, A.; Jenn, E.; Rabejac, C.; Wellings, A.; GUARDS: a generic upgradable
architecture for real-time dependable systems ; Parallel and Distributed
Systems, IEEE Transactions on , Volume: 10 Issue: 6 , Jun 1999 Page(s): 580
-599 (IEEE
| local)
- Romanovsky, A.; Smith, I.; "Dependable on-line upgrading of
distributed systems," Computer Software and Applications Conference, 2002.
Proceedings. 26th Annual International , 2002 Page(s): 975 -976 (IEEE |
local)
- Sha, L., Ragunathan Rajkumar, Michael Gagliardi, A Software Architecture
for Dependable and Evolvable Industrial Computing Systems,
CMU/SEI-95-TR-005, 1995. (Web
| local)
- Sha, L., "Dependable system upgrade," Real-Time Systems
Symposium, 1998. Proceedings., The 19th IEEE , 2-4 Dec 1998 Page(s): 440 -448
(IEEE |
local) / 9 pages.
- Tai, A.T.; Alkalai, L.; Chau, S.N.; Sanders, W.H.; Tso, K.S.;
"Low-cost error containment and recovery for onboard guarded software
upgrading and beyond"; Computers, IEEE Transactions on , Volume: 51 Issue:
2 , Feb 2002 Page(s): 121 -137 (IEEE |
local)
Required:
- Liming Chen; Avizienis, A., N-version programming: a fault-tolerance
approach to reliability of software operation, Fault-Tolerant Computing, 1995,
Highlights from Twenty-Five Years., Twenty-Fifth International Symposium on
Page(s): 113 (originally FTCS 1978) (IEEE |
local) / 7 pages.
- Knight, Leveson & St. Jean "A large scale experiment in N-version
programming", FTCS15, 1985, 135-139 (local) / 5 pages.
- Avizienis, A.; Lyu, M.R.; Schutz, W., In search of effective diversity: a
six-language study of fault-tolerant flight control software, FTCS 1988. (IEEE |
local) / 8 pages.
- Knight, J. & Leveson, N., "A reply to the criticisms of the Knight
& Leveson experiment," ACM SIGSOFT Software Engineering Notes, vol.
15, no. 1, pg. 24, Jan 1990. (ACM |
Web |
local) / 13 pages
Supplemental:
- Ammann, P.E.; Knight, J.C., Data diversity: an approach to software fault
tolerance Page(s): 418-425 (IEEE |
local)
- Avizienis, A., "The N-version approach to fault tolerant
software," IEEE Trans. Software Engineering, SE-11(12), December 1985, pp.
1491-1501. (local)
- Avizienis, the methodology of n-version programming (ch 2) In: Lyu, Ed.,
Software Fault Tolerance, Wiley & Sons, 1995 (local)
- Bishop, software fault tolerance by design diversity (ch 9) In: Lyu, Ed.,
Software Fault Tolerance, Wiley & Sons, 1995 (local)
- Brilliant, S.S., Knight, J.C., Leveson, N.G., "Analysis of Faults in
an N-Version Software Experiment", IEEE Transactions on Software
Engineering, 16(2): 238-47, Feb. 1990. (IEEE |
local)
- Brilliant, Knight, Leveson, "The consistent comparison problem in
N-version software", IEEE Trans. SW Eng, 15(11) 1481-1485, Nov 89. (IEEE |
local)
- J. DeVale and P. Koopman, "Comparing the Robustness of POSIX Operating
Systems," FTCS-29, 1999. (Web |
local)
- Eckhardt, D.E.; Caglayan, A.K.; Knight, J.C.; Lee, L.D.; McAllister, D.F.;
Vouk, M.A.; Kelly, J.P.J.; "An experimental evaluation of software
redundancy as a strategy for improving reliability," Software Engineering,
IEEE Transactions on , Volume: 17 Issue: 7 , Jul 1991 Page(s): 692 -702. (IEEE |
local)
- Kelly, J.P.J.; Eckhardt, D.E., Jr.; Vouk, M.A.; McAllister, D.F.;
Caglayan, A.; "A large scale second generation experiment in multi-version
software: description and early results," FTCS-18, 27-30 Jun 1988 Page(s):
9 -14, (IEEE
| local)
- J. C. Knight and N. G. Leveson, "An Experimental Evaluation of the
Assumption of Independence in Multi-version Programming", IEEE
Transactions on Software Engineering, Vol. SE-12, No. 1 (January 1986), pp.
96-109. (Citeseer |
local)
- Nancy Leveson, Stephen Cha, John Knight, and Timothy Shimeall, "The
Use of Self Checks and Voting in Software Error Detections: An Empirical
Study," IEEE Trans. on Software Engineering, Vol. SE-16, No. 4, April,
1990. (IEEE
| local)
- Littlewood, B.; Miller, D.R., A conceptual model of multi-version software,
Fault-Tolerant Computing, 1995, Highlights from Twenty-Five Years.,
Twenty-Fifth International Symposium on Page(s): 188 (originally FTCS 1987) (IEEE |
local)
- Bev Littlewood, Peter Popov and Lorenzo Strigini, "N-version design
Versus one Good," Fastabs at DSN 2000. (Citeseer |
local)
- Timothy Shimeall and Nancy Leveson, "An Empirical Comparison of Software
Fault Tolerance and Fault Elimination," IEEE Trans. on Software
Engineering, Vol. SE-17, No. 2, February 1991, pp. 173-183 (IEEE |
local)
Other sources:
- Chillarege, 1995, challenges facing software fault-tolerance, IBMC 20281
- D. E. Eckhardt & L.D. Lee, "Fundamental differences in the
reliability of N-modular redundancy and N-version programming," Journal of
Systems and Software, 8(4): 313-318, Sept. 1988.
Required:
- Segall, Z.; Vrsalovic, D.; Siewiorek, D.; Yaskin, D.; Kownacki, J.; Barton,
J.; Dancey, R.; Robinson, A.; Lin, T.; "FIAT-fault injection based
automated testing environment," FTCS, 1988. (IEEE |
local) / 6 pages.
- Mei-Chen Hsueh, Timothy K. Tsai, Ravishankar K. Iyer, "Fault
Injection Techniques and Tools," IEEE Computer, April 1997. (Citeseer |
local) / 8 pages
- Madeira, H.; Some, R.R.; Moreira, F.; Costa, D.; Rennels, D.;
"Experimental evaluation of a COTS system for space applications,"
DSN 2002 (IEEE |
local) / 6 pages
- Aidemark, J.; Vinter, J.; Folkesson, P.; Karlsson, J.; "Experimental
evaluation of time-redundant execution for a brake-by-wire application,"
DSN 2002, (IEEE |
local) / 6 pages
Supplemental:
- Arlat, J.; Crouzet, Y.; Laprie, J., "Fault injection for dependability
validation of fault-tolerant computing systems," FTCS 1989. (IEEE |
local)
- Arlat, J., Yves Crouzet, Johan Karlsson, Peter Folkesson, Günther
Leber, "Evaluation of the MARS Architecture by means of Three Physical
Fault Injection Techniques," ETDS 1995, Extended Abstract (Citeseer |
local)
- Barton, J., Czeck, E., Segall, Z., Siewiorek, D., Fault injection
experiments using FIAT, IEEE Transactions on Computers, 39(4):
575-582 (IEEE |
local)
- Carreira, J.; Madeira, H.; Silva, J.G., Xception: a technique for the
experimental evaluation of dependability in modern computers, IEEE
Transactions on Software Engineering, vol.24, no.2, Feb 1998, p. 125-36 (IEEE |
Citeseer |
local)
- Chillarege, R.; Bowen, N.S.; "Understanding large system failures -- a
fault injection experiment," FTCS 1989, pp. 356-363 (IEEE |
local)
- Christmansson, J.; Chillarege, R.; "Generation of an error set that
emulates software faults based on field data", FTCS 1996. (IEEE |
local)
- Han, S., Shin, & Rosenberg, "DOCTOR: An IntegrateD SOftware Fault
InjeCTiOn EnviRonment for Distributed Real-time Systems," ICPDS, 1995
(Citeseer |
local)
- Jenn, E.; Arlat, J.; Rimen, M.; Ohlsson, J.; Karlsson, J.; "Fault
injection into VHDL models: the MEFISTO tool," Fault-Tolerant Computing,
1994. FTCS-24. Digest of Papers., Twenty-Fourth International Symposium on ,
15-17 Jun 1994 Page(s): 66 -75 (IEEE |
local)
- G. A. Kanawati, N. A. Kanawati and J. A. Abraham, "FERRARI: A Flexible
Software-Based Fault and Error Injection System," IEEE Transactions on
Computers, vol. 44, no. 2, February 1995, pp. 248-260. (IEEE |
local)
- Karlsson, J.; Liden, P.; Dahlgren, P.; Johansson, R.; Gunneflo, U., Using
heavy-ion radiation to validate fault-handling mechanisms; Micro, IEEE ,
Volume: 14 Issue: 1 , Feb 1994 Page(s): 8 -23 (IEEE |
local)
- Koopman, P., What's Wrong With Fault Injection As A Benchmarking
Tool? DSN Workshop on Dependability Benchmarking, 2002. (Web |
local)
- Madeira, H.; Costa, D.; Vieira, M.; "On the emulation of software
faults by software fault injection," DSN 2000 (IEEE |
local)
- Rodriguez, M.; Albinet, A.; Arlat, J.; "MAFALDA-RT: a tool for
dependability assessment of real-time systems", DSN 2002 (IEEE |
local)
- Frédéric Salles, Jean Arlat, Jean-Charles Fabre, "Can
We Rely on COTS Microkernels for Building Fault-Tolerant Systems?", 1997
(Citeseer |
local)
- Stott, D., Neil A. Speirs, Jun Xu, Saurabh Bagchi, Keith Whisnant, Zbigniew
Kalbarczyk, Ravishankar K. Iyer, "Fault Injection Based Assessment of
Fail-Silence Provided by Process Duplication versus Internal Error
Detection", FTCS, 2000. (Citeseer |
local)
- Voas, J., "Fault injection for the masses," IEEE Computer, Dec.
1997 (IEEE |
local)
Required:
- Yeh, Y.C.; "Design considerations in Boeing 777 fly-by-wire
computers", HASE 1998. (IEEE |
local)
NOTE: this is a "how we did it" case study paper, without too much
on "why". So don't waste breath on saying they didn't talk about
"why". Just read the paper to get a feel for all the stuff that has
to go into a real x-by-wire system.
Supplemental:
- Norris, G.; "Boeing's seventh wonder"; IEEE Spectrum , Volume: 32
Issue: 10 , Oct 1995 Page(s): 20 -23 (IEEE |
local)
- Buus, H.; McLees, R.; Orgun, M.; Pasztor, E.; Schultz, L.; "777 flight
controls validation process," Aerospace and Electronic Systems, IEEE
Transactions on , Volume: 33 Issue: 2 , Apr 1997 Page(s): 656 -666 (IEEE |
local)
- Driscoll, K.; Hoyme, K.; "The Airplane Information Management System:
an integrated real-time flight-deck control system"; Real-Time Systems
Symposium, 1992 , 2-4 Dec 1992 Page(s): 267 -270 (IEEE |
local)
- Gries, M.J.; "Systems engineering for the 777 Autopilot Flight
Director System," Digital Avionics Systems Conference, 1995., 14th DASC ,
5-9 Nov 1995 Page(s): 403 -409 (IEEE |
local)
- Hess, R.; "Computing platform architectures for robust operation in
the presence of lightning and other electromagnetic threats"; Digital
Avionics Systems Conference, 1997. 16th DASC., AIAA/IEEE , Volume: 1 , 26-30
Oct 1997 Page(s): 4.3 -9-16 vol.1 (IEEE |
local)
- Hoyme, K.; Driscoll, K.; "SAFEbus"; Digital Avionics Systems
Conference, 1992. Proceedings., IEEE/AIAA 11th , 5-8 Oct 1992 Page(s): 68 -73
(IEEE |
local)
- Ramohalli, G.; "The Honeywell on-board diagnostic and maintenance
system for the Boeing 777"; Digital Avionics Systems Conference, 1992.
Proceedings., IEEE/AIAA 11th , 5-8 Oct 1992 Page(s): 485 -490 (IEEE |
local)
- Yeh, Y.C.; "Triple-triple redundant 777 primary flight computer,"
Aerospace Applications Conference, 1996. Proceedings., 1996 IEEE , Volume: 1 ,
3-10 Feb 1996 Page(s): 293 -307 vol.1 (IEEE |
local)
Required:
- Meyer, J.F., "On evaluating the performability of degradable computing
systems," FTCS 1978, (IEEE |
local)
- Control reconfiguration in the presence of software failures Bodson, M.;
Lehoczky, J.; Rajkumar, R.; Sha, L.; Smith, M.; Soh, D.; Stephan, J.; Decision
and Control, 1993., Proceedings of the 32nd IEEE Conference on , 15-17 Dec 1993
Page(s): 2284 -2289 vol.3 (IEEE |
local)
- Shelton, C. & Koopman, P., "Using Architectural Properties to
Model and Measure Graceful Degradation," (to be published), 2003 (local)
Supplemental:
- Adlemo, A.; Andreasson, S.-A.; "Improved availability in manufacturing
systems through graceful degradation: case study of a machining cell,"
Robotics and Automation, 1995. Proceedings., 1995 IEEE International Conference
on , Volume: 2 , 21-27 May 1995 Page(s): 1744 -1750 vol.2 (IEEE |
local)
- Burns, A.; Punnekkat, S.; Strigini, L.; Wright, D.R. ; Probabilistic
scheduling guarantees for fault-tolerant real-time systems Dependable Computing
for Critical Applications 7, 1999 , 1999 Page(s): 361 -378 (Citeseer |
local) / 18 pages
- Herlihy & Wing, 1991, "specifying graceful degradation", IEEE
Trans. Parallel & Distr. Sys. 2(1), Jan 1991 (IEEE | local)
- Knight, J. & Sullivan, K., "On the definition of
survivability", 2000. (Citeseer |
local)
- Losq, J., "Effects of failures on gracefully degradable systems,"
7th Annual International Conference on Fault-Tolerant Computing, Los Angeles,
CA, USA; 28-30 June 1977, p. 29-34. (local)
- Ying-Wah Ng, Avizienis, A., "A reliability model for gracefully
degrading and repairable fault-tolerant systems," 7th Annual International
Conference on Fault-Tolerant Computing, Los Angeles, CA, USA; 28-30 June 1977,
p. 22-8 (local)
- S. Poledna, "Tolerating Sensor Timing Faults in Highly Responsive Hard
Real-Time Systems," IEEE Trans. on Computers, Vol. 44, No. 2, February
1995. (IEEE
| local)
- Ramanathan, P.; Graceful degradation in real-time control applications
using (m, k)-firm guarantee; Fault-Tolerant Computing, 1997. FTCS-27. Digest of
Papers., Twenty-Seventh Annual International Symposium on , 24-27 Jun 1997
Page(s): 132 -141 (IEEE |
local)
Required:
- Dawson, S.; Jahanian, F.; Mitton, T.; Teck-Lee Tung; "Testing of
fault-tolerant and real-time distributed systems via protocol fault
injection", FTCS 1996 (IEEE |
local)
- Dingman, C.P.; Marshall, J.; Siewiorek, D.P.; Measuring robustness of
a fault tolerant aerospace system, 25th International Symposium on
Fault-Tolerant Computing, June 1995. pp. 522-7 (IEEE |
local) / 6 pages.
- DeVale, J. & Koopman, P., "Robust software - no more
excuses," International Conference on Dependable Systems and Networks
(DSN), Washington DC, July 2002. (Web
| local) / 10 pages.
Supplemental:
- Carrette, G., CRASHME: Random input testing, (no formal
publication available)
http://people.delphiforums.com/gjc/crashme.html
accessed February 28, 2003.
- DeVale, J., Koopman, P., Guttendorf, D., The Ballista Software
Robustness Testing Service, 16th International Conference on Testing
Computer Software, 1999. pp. 33-42. (Web |
local)
- Koopman, P.; DeVale, J., The exception handling effectiveness of
POSIX operating systems, IEEE Transactions on Software Engineering, Sept.
2000, vol.26, no.9 p. 837-48 (IEEE |
local)
- Madeira, Henrique; Diamantino Costa; Marco Vieira; "On the Emulation
of Software Faults by Software Fault Injection," 2000 (IEEE |
Citeseer |
local)
- Miller, B., Fredriksen, L., So, B., An empirical study of the
reliability of operating system utilities, Communications of the ACM,
33(12):32-44, December 1990 (ACM |
Citeseer |
local)
- Miller, B., Koski, D., Lee, C., Maganty, V., Murthy, R., Natarajan, A.
& Steidl, J., Fuzz Revisited: A Re-examination of the Reliability of
UNIX Utilities and Services, Computer Science Technical Report 1268,
Univ. of Wisconsin-Madison, May 1998. (Citeseer |
local)
- Mukherjee, A., Siewiorek, D.P., Measuring software dependability by
robustness benchmarking, IEEE Transactions on Software Engineering,
Volume: 23 Issue: 6 , Jun 1997 Page(s): 366 -378 (IEEE |
local)
- Siewiorek, D., Hudak, J., Suh, B. & Segall, Z., Development of a
benchmark to measure system robustness, 23rd International Symposium on
Fault-Tolerant Computing, June 1993. pp. 88-97 (IEEE |
local)
- Vo, K-P., Wang, Y-M., Chung, P. & Huang, Y., Xept: a software
instrumentation method for exception handling, The Eighth International
Symposium on Software Reliability Engineering, Albuquerque, NM, USA; 2-5 Nov.
1997, pp. 60-69 (IEEE |
local) / 10 pages.
Other sources:
- Hastings, R.; Joyce, B., Purify: fast detection of memory leaks and
access errors, Proceedings of the Winter 1992 USENIX Conference.
Required:
- Stankovic, 1988, misconceptions about real-time computing, IEEE Computer,
21(10) oct 88 pp. 10-19 (IEEE |
local)
- Sunondo Ghosh, Rami Melhem and Daniel Mosse, "Fault-Tolerant
Scheduling on a Hard Real-Time Multiprocessor System", IPPS, 1994. (Citeseer |
local)
- Kaiser, J.; Livani, M.A.; "Invocation of real-time objects in a CAN
bus-system," Object-Oriented Real-Time Distributed Computing, 1998. (ISORC
98) Proceedings. 1998 First International Symposium on , 20-22 Apr 1998
Page(s): 298 -307 (IEEE |
local)
Other High-Level Summaries:
- Kopetz, H., "Software Engineering for Real-Time: a roadmap,"
Proceedings of the conference on the future of software
engineering, May 2000. (ACM |
local)
Supplemental:
- Cheng, Stankovic & Ramamritham, "Scheduling algorithms for hard
real-time systems: a brief survey," 1988. (local)
- Ghosh, Melhem, Mossé; Enhancing Real-Time Schedules to Tolerate
Transient Faults, Real-Time Systems Symposium, 1995. Proceedings., 16th IEEE ,
5-7 Dec 1995 Page(s): 120 -129. (IEEE |
local)
- Sunondo Ghosh, Rami Melhem, Daniel Mossé, Joydeep Sen Sarma,
"Fault-Tolerant Rate-Monotonic Scheduling", Journal of Real-Time
systems. vol 15, no. 2 September 1998 (1998). (Citeseer |
local)
- Nagarajan Kandasamy, John P. Hayes, and Brian T. Murray; "Tolerating
Transient Faults in Statically Scheduled Safety-Critical Embedded
Systems"; Reliable Distributed Systems, 1999. Proceedings of the 18th IEEE
Symposium on , 1999 Page(s): 212 -221. (Citeseer |
IEEE |
local)
- Krishna, C. & Shin, K., "On scheduling tasks with a quick recovery
from failure," IEEE Trans. Computers, May 1986. (local)
- Lehoczky, J.P.; Rajkumar, R.; Sha, L.; Priority inheritance protocols: an
approach to real-time synchronization; Computers, IEEE Transactions on ,
Volume: 39 Issue: 9 , Sep 1990 Page(s): 1175 -1185 (IEEE |
local)
- Lonn, H.; Axelsson, J.; "A comparison of fixed-priority and static
cyclic scheduling for distributed automotive control applications,"
Real-Time Systems, 1999. Proceedings of the 11th Euromicro Conference on , 1999
Page(s): 142 -149 (IEEE |
local)
- Lu, Chenyang; Gang Tao; Son, S.H.; Stankovic, J.A.; The case for feedback
control real-time scheduling; Real-Time Systems, 1999. Proceedings of the 11th
Euromicro Conference on , 1999 Page(s): 11 -20 (IEEE |
local)
- Muppala, J.K.; Trivedi, K.S.; Woolet, S.P.; Real-time systems performance
in the presence of failures; Computer , Volume: 24 Issue: 5 , May 1991 Page(s):
37 -47 (IEEE
| local)
- Ramamritham, K.; Stankovic, J.A.; The Spring kernel: a new paradigm for
real-time systems; IEEE Software , Volume: 8 Issue: 3 , May 1991 Page(s): 62
-72 (IEEE |
local)
- Ramanathan, P.; Shin, K.G.; Real-time computing: a new discipline of
computer science and engineering; Proceedings of the IEEE , Volume: 82 Issue: 1
, Jan 1994 Page(s): 6 -24 (IEEE |
local)
- Minsoo Ryu, Seongsoo Hong, End-To-End Design Of Distributed Real-Time
Systems (1997) (Citeseer |
local)
- Salkind, L., Unix for Real-Time Control: problems and solutions, TR
400, NYU, September 1988. (local)
- Sha, Lui; Rajkumar Ragunathan & Shrish Sathaye (1994). Generalized
Rate-Monotonic Scheduling Theory: A Framework for Developing Real-Time Systems.
In Proceedings of the IEEE, Vol. 82, No. 1, Jan 1994, pp. 68-82. (IEEE |
local)
- Shin, K.G.; HARTS: a distributed real-time architecture; Computer ,
Volume: 24 Issue: 5 , May 1991 Page(s): 25 -35 (IEEE |
local)
- Shin, K.G.; Zuberi, K.M.; EMERALDS: a microkernel for embedded real-time
systems; Real-Time Technology and Applications Symposium, 1996. Proceedings.,
1996 IEEE , 10-12 Jun 1996 Page(s): 241 -249 (IEEE |
local)
- John A. Stankovic; "Real-Time and Embedded Systems"; ACM
Computing Surveys (CSUR) March 1996. (Citeseer |
local)
- H. Tokuda, T. Nakajima & P. Rao, "Real-time Mach: towards a
predictable real-time system", Proc. Usenix Mach Workshop, October
1990, pp. 1-10. (Citeseer |
local)
- Lei Zhou; Rundensteiner, E.A.; Shin, K.G.; "Rate-monotonic scheduling
in the presence of timing unpredictability" Real-Time Technology and
Applications Symposium, 1998. Proceedings. Fourth IEEE , 3-5 Jun 1998 Page(s):
22 -27 (IEEE |
local)
Other Sources:
- Locke, 1992, "software architecture for hard real-time applications:
cyclic executives vs. priority executives," real-time systems 4(1):37-53,
March 1992
Required:
- Fagan, M., "Advances in software inspections," IEEE Trans.
Software Engineering, SE-12, July 1986, pp. 744-751. (local) / 8 pages.
- Umansky, Studs and Duds, The Washington Monthly, December 2001 (Web
| local) / 6 easy pages.
- Buus, H.; McLees, R.; Orgun, M.; Pasztor, E.; Schultz, L.; "777 flight
controls validation process," Aerospace and Electronic Systems, IEEE
Transactions on , Volume: 33 Issue: 2 , Apr 1997 Page(s): 656 -666 (IEEE |
local) / 11 pages.
Supplemental:
- A. F. Ackerman, "Software inspections and the cost effective
production of reliable software," in M. Dorfman & R. Thayer (Eds.),
Software Engineering, IEEE Computer Society, 1997, pp. 116-130. (local)
- Crary, K.; Harper, R.; Lee, P.; Pfenning, F.; "Automated techniques
for provably safe mobile code"; DARPA Information Survivability Conference
and Exposition, 2000. DISCEX '00. Proceedings , Volume: 1 , 2000 Page(s): 406
-419 vol.1 (IEEE
| local)
- Fagan, M., "Design and code inspections to reduce errors in program
development," IBM Systems Journal, 15(3), 1976, pp. 182-211. (local)
- R. Fujii & D. Wallace, "Software verification and validation"
in M. Dorfman & R. Thayer (Eds.), Software Engineering, IEEE
Computer Society, 1997, pp. 116-130. (local)
- Goddard, "Validating the safety of embedded real-time control systems
using FMEA," Proc. annual reliability and maintainability symp., 1993, pp.
227-230 (IEEE |
local)
(Talks about software FMEA)
- Musa, J. "Operational Profiles in Software-Reliability
Engineering." IEEE Software, March 1993. (IEEE |
local)
- Myers, G., "A controlled experiment in program testing and code
walkthroughs/experiments," CACM, September 1978. (local)
- J. Palmer, "Traceability," in: M. Dorfman & R. Thayer (Eds.),
Software Engineering, 1997, pp. 266-276. (local)
- Weinberg, G. & Freedman, D., "Reviews, walkthroughs, and
Inspections," IEEE Trans. on Software Engineering, Vol. SE-10(1), January
1984, pp. 68-72. (local)
- Whittaker, J., "What is software testing? And Why is it so
hard?", IEEE Software, Jan/Feb 2000. (IEEE |
local)
- Hong Zhu, Patrick A.V. Hall, and John H.R. May, "Software Unit Test
Coverage and Adequacy", ACM Computing Surveys (CSUR) December 1997, pages
366-427. (ACM |
local)
Supplemental Formal Methods papers:
- Anthony Hall, Seven Myths of Formal Methods, IEEE Software, September 1990
pp. 11-19 (IEEE |
local)
- Bowen, J.P.; Hinchey, M.G.; "Seven more myths of formal methods,"
IEEE Software , Volume: 12 Issue: 4 , Jul 1995 Page(s): 34 -41 (IEEE |
local)
- Edmund M. Clarke, Jeannette M. Wing "Formal Methods: State of the Art
and Future Directions," ACM Computing Surveys, 1996, (Citeseer |
local)
- Gerhart, Craigen & Ralston, Experience with formal methods in critical
systems, IEEE Software, Jan 1994, pp. 21-39 (IEEE |
local)
- Ostroff, J., "Formal methods for the specification and design of
real-time safety critical systems," Journal of Systems and Software, April
1992, pp. 33-60. (local)
- John Rushby, "Formal Methods for Dependable Real-Time Systems,"
International Symposium on Real-Time Embedded Processing for Space
Applications, 1992 (Citeseer |
local)
- Wai Wong, "Formal Verification Of VIPER's ALU", 1993. (Citeseer |
local)
- Xu, J.; Randell, B.; Romanovsky, A.; Stroud, R.J.; Zorzo, A.F.; Canver, E.;
von Henke, F.; Rigorous development of an embedded fault-tolerant system based
on coordinated atomic actions; Computers, IEEE Transactions on , Volume: 51
Issue: 2 , Feb 2002 Page(s): 164 -179 (IEEE |
local)
Other sources:
- P. M. Melliar-Smith, R. L. Schwartz, "Formal Specification and
Mechanical Verification of SIFT: A Fault-Tolerant Flight Control System,"
IEEE Trans. on Computers, Vol. C-31, No. 7, July 1982.
- H.D. Mills, M. Dyer, and R.C. Linger, "Cleanroom Software
Engineering," IEEE Software, Sept. 1987, pp. 19-24
- Deck, M.D, and J. A. Whittaker, "Lessons Learned from Fifteen Years of
Cleanroom Testing," Software Testing, Analysis, and Review (STAR) '97, San
Jose, CA, May 5-9, 1997
Required:
- Rubenstein, E. & Mason, J., "An analysis of Three Mile
Island", IEEE Spectrum, November 1979, pp. 32-43 (local) / 14 pages.
- Sugarman, R., "Analysis and Assessment: Nuclear Power and the Public
Risk", IEEE Spectrum, November 1979, pp. 58-79 (local) / 22 pages.
- Lombardo, T., "Institutional constraints: the decision-makers: a
cacophony of voices", IEEE Spectrum, November 1979, pp. 81-95 (local) / 15 pages. (Includes:
Christiansen, D., "TMI and the Press" sidebar and other material.)
Required:
- Rasmussen, J., "The definition of human error and a taxonomy for
technical system design," In: Rasmussen, J., Duncan, K., Leplat, J. (eds)
New Technology and Human Error, John Wiley & Sons, 1987. (local) / 8 pages.
- Rasmussen, J.; "Human factors in the high-risk systems"; Human
Factors and Power Plants, 1988., Conference Record for 1988 IEEE Fourth
Conference on , 5-9 Jun 1988 Page(s): 43 -48 (IEEE |
local) / 7 pages.
- Nancy G. Leveson, L. Denise Pinnel, Sean David Sandys, Shuichi Koga, Jon
Damon Reese, "Analyzing Software Specifications for Mode Confusion
Potential," Workshop on Human Error and System Development, Glasgow, March
1997. (Web |
local) / 16 pages
Supplemental:
- Nancy G. Leveson and Clark S. Turner. An Investigation of the Therac-25
Accidents. IEEE Computer, Vol. 26, No. 7, July 1993, pp.18-41. (Web |
IEEE |
local)
- Brown, M.L.; "Software systems safety and human errors", Computer
Assurance, 1988. COMPASS '88 , 27 Jun-1 Jul 1988 Page(s): 19 -28 (IEEE |
local) / 10 pages.
- Burns, A., "The HCI component of dependable real-time systems."
Software Engineering Journal, July 1991, vol. 6, no. 4, p. 168-174. (local)
- Peter G. Neumann; "The human element"; Communications of the ACM
November 1991 Volume 34 Issue 11 (ACM |
local)
- Rasmussen, J., "Human error mechanisms in complex work
environments" Reliability Engineering & System Safety 22, no. 1-4,
(1988) : 155-67 (local)
- William B. Rouse, "Human-Computer Interaction in the Control of
Dynamic Systems", ACM Computing Surveys (CSUR) Volume 13 , Issue 1 (March
1981), (ACM |
local)
- Boweler, Y.; Cullen, I.; Hutchinson, E.; "Enhancing the safety of
future systems"; Human Interfaces in Control Rooms, Cockpits and Command
Centres, 1999. International Conference on , 21-23 Jun 1999 Page(s): 179 -183
(IEEE |
local)
Other Reading:
- Reason & Maddox, Human factors guide for aviation maintenance
http://www.galazyatl.com/hfg/c14s00.htm
- Nagel, D.C. (1988). Human error in aviation operations. In Wiener, E.L.,
and Nagel, D.C. (Eds.) Human factors in aviation (Chapter 9). San Diego, CA:
Academic Press.
Required:
- Leveson, N.G., Software safety: why, what, and how ACM Computing Surveys
(CSUR) June 1986 Volume 18 Issue 2 (ACM |
local) / 39 book pages.
- Leveson, N.G. "High-pressure steam engines and computer
software," Computer , Volume: 27 Issue: 10 , Oct. 1994 Page(s): 65 -73 (Web |
IEEE
| local)
Other High-Level Summaries:
- Lutz, R., "Software Engineering for Safety: a roadmap,"
Proceedings of the conference on the future of software
engineering, May 2000. (ACM |
local)
Supplemental:
- Addy, E.A.; A case study on isolation of safety-critical software, proc.
6th conf. computer assurance, 1991, NIST/IEEE, pp. 75-83 (IEEE |
local)
- Dalcher, D.; "Lessons for the future: safety critical systems;"
Engineering of Computer-Based Systems, 1999. Proceedings. ECBS '99. IEEE
Conference and Workshop on , 7-12 Mar 1999 Page(s): 281 -293 (IEEE |
local)
- de Lemos, R.; Saeed, A.; Anderson, T.; "Analyzing safety requirements
for process-control systems;" IEEE Software , Volume: 12 Issue: 3 , May
1995 Page(s): 42 -53 (IEEE |
local)
- Hansen, Kirsten M., Anders, P. Ravn, Stavridou, Victoria, From Safety
Analysis to Software Requirements, IEEE Transactions on Software Engineering,
Vol .24, No. 7, July 1998 (IEEE |
local)
- Herrmann, D.S.; "A methodology for evaluating, comparing, and
selecting software safety and reliability standards," COMPASS '95.
'Systems Integrity, Software Safety and Process Security', 25-29 Jun 1995
Page(s): 223 -232 (IEEE |
local)
- Knight, J.C.; Safety critical systems: challenges and directions; Software
Engineering, 2002. ICSE 2002. Proceedings of the 24th International Conference
on , 2002 Page(s): 547 -550 (IEEE |
local)
- Leveson, N., "System safety in computer-controlled automotive
systems", SAE Congress, March 2000. (Web |
local)
- Leveson, Software safety in embedded computer systems, CACM 34(2), 1991, p.
34-46 (ACM |
local)
- Wallace, D.R.; Kuhn, D.R.; Ippolito, L.M.; "An analysis of selected
software safety standards", IEEE Aerospace and Electronics Systems
Magazine , Volume: 7 Issue: 8 , Aug 1992 Page(s): 3 -14 (IEEE |
local)
- Weiss, K.A.; Leveson, N.; Lundqvist, K.; Farid, N.; Stringfellow, M. An
analysis of causation in aerospace accidents Digital Avionics Systems, 2001.
DASC. 20th Conference , Volume: 1 , 2001 Page(s): 4A3/1 -4A3/12 vol. (IEEE |
local)
- Evaluation of safety-critical software David L. Parnas , A. John van
Schouwen , Shu Po Kwan Communications of the ACM June 1990 Volume 33 Issue 6
(ACM |
local)
- McDermid, "Education and training for safety-critical systems
practitioners." In Wichmann (ed), Software in safety-related systems, pp.
177-207, Chichester: Wiley, 1992
Required:
- Anderson, R.J., "Why Cryptosystems Fail," CACM, 37(11), ACM,
November 1994. (ACM |
local)
- Peter Bergstrom, Kevin Driscoll, John Kimball, "Making Home Automation
Communications Secure," Computer, Oct 2001, pp. 50-56 / 7 pages. (IEEE |
local)
- Wargo, C. & Dhas, C., "Security Considerations for the e-Enabled
Aircraft", Aerospace Conference 2003. (local)
Supplemental:
- Devanbu, P. & Stubblebine, S., "Software engineering for security:
a roadmap," Proceedings of the conference on the future of software
engineering, May 2000. (ACM |
local)
- Dobson & Randell, "building reliable secure computing systems out
of unreliable insecure components," Proc. 1986 symp. security &
privacy, IEEE, 1986, pp. 187-193 (2001 IEEE
reprint | local) Also,
the introduction to the reprint with more context (IEEE |
local).
- Lampson, B., et al, Authentication in Distributed Systems: Theory and
Practice, Proc. 13th SOSP, ACM, October 1991 (ACM |
local)
- Landwehr, A taxonomy of computer security flaws, with examples; ACM
computing surveys 26(3), Sept. 1994 (ACM |
local)
- Stankovic, J.A.; Wood, A.D.; "Denial of service in sensor
networks"; Computer , Volume: 35 Issue: 10 , Oct 2002 Page(s): 54 -62 (IEEE |
local)
- R. C. Summers, "An overview of computer security," IBM systems
journal, vol. 23, no. 4, 1984. (local)
Required:
- Pilkington, S.D.J.; Lee, A.R.; "The development of safety cases for
mass transit signalling and control projects-Jubilee Line case study,"
Developments in Mass Transit Systems, 20-23 Apr 1998 Page(s): 254 -259 (IEEE
| local) / 6 pages
- Peter H Jesty and Keith M Hobley (University of Leeds), Richard Evans
(Rover Group Ltd), Ian Kendall (Jaguar Cars Ltd), "Safety Analysis of
Vehicle-Based Systems", Proceedings of the 8th Safety-critical Systems
Symposium, 2000. (Web |
local) / 21 pages
- Czerny, B.J.; D'Ambrosio, J.G.; Murray, B.T.; "Providing convincing
evidence of safety in X-by-wire automotive systems;" High Assurance
Systems Engineering, 2000, Fifth IEEE International Symposium on. HASE 2000 ,
2000 Page(s): 189 -192 (IEEE |
local) / 4 pages.
Supplemental:
- Bell & Reinert, "Risk and system integrity concepts for
safety-related control systems," Microprocessors & microsystems, 1993,
17(1), 3-15 (IEEE |
local)
- Betts, A.E.; Welbourne, D.; "Software safety assessment and the
Sizewell B applications", Electrical and Control Aspects of the Sizewell B
PWR, 1992., International Conference on , 14-15 Sep 1992 Page(s): 204 -207 (IEEE |
local)
- Blackwell, N.; Leinster-Evans, S.; Dawkins, S.K.; "Developing safety
cases for integrated flight systems," Aerospace Conference, 1999.
Proceedings. 1999 IEEE , Volume: 5 , 6-13 March 1999 Pages:225 - 240 vol.5 (IEEE |
local)
- Cooper, L., "Assessing risk from the stakeholder perspective,"
IEEE Aerospace Conference, March 2003. (local)
- Jones, J.A.; Marshall, J.; Newman, B.; "The reliability case in the
REMM methodology", Reliability and Maintainability, 2004 Annual
Symposium - RAMS, 26-29 Jan. 2004 Pages:25 - 30 (IEEE |
local)
- Herrmann, D.S.; Peercy, D.E.; "Software reliability cases: the bridge
between hardware, software and system safety and reliability," Reliability
and Maintainability Symposium, 1999. Proceedings. Annual , 18-21 Jan. 1999
Pages:396 - 402. (IEEE |
local)
- Lane, M.; "Predicting the reliability and safety of commercial
software in advanced avionic systems"; Digital Avionics Systems
Conferences, 2000. Proceedings. DASC. The 19th , Volume: 1 , 2000 Page(s):
4E4/1 -4E4/8 vol.1 (IEEE |
local)
- Perera, J.S., "Risk management for the international space
station," IEEE Aerospace Conference, March 2003. (local)
- Rivett, R. "Is there a Role for Third Party Software Assessment in the
Automotive Industry?", Proceedings of the 5th Safety-critical Systems
Symposium, 1997. (Web |
local)
- S P Wilson, T P Kelly, J A McDermid, "Safety Case Development:
Current Practice, Future Prospects", Proceedings of 1st ENCRESS/12th CSR
Workshop, September 1995, Springer-Verlag. (Citeseer |
local) / 22 pages
Risk Management Tools
- Cornford, S.L.; Feather, M.S.; Hicks, K.A.; DDP-a tool for life-cycle risk
management; Aerospace Conference, 2001, IEEE Proceedings. , Volume: 1 , 2001
Page(s): 1/441 -1/451 vol.1 (IEEE |
local)
- Probabilistic Risk Assessment Procedures Guide for NASA Managers and
Practitioners, New Version 1.1 of November 12, 2002 (Web |
local)
- NPG 8715.3
NASA Safety Manual (local)
- NASA risk management web site
Web
Required:
- Schinzinger. Technology hazards and the engineer. IEEE Technology and
Society Magazine, June 1986, pp. 12-16. (local) / 5 pages
- Redmill, F.; "Some dimensions of risk not often considered by
engineers" Computing & Control Engineering Journal , Volume: 13 Issue:
6 , Dec 2002 Page(s): 268 -272 (IEEE |
local) / 5 pages
- Davis, "Safety critical systems - legal liabilities," Computing
& control, 1994, 5(1), 13-17 (IEEE |
local) / 5 pages
- John C. Knight , Nancy G. Leveson; "Licensing software engineers:
Should software engineers be licensed?" Communications of the ACM November
2002 Volume 45 Issue 11 (ACM |
local) / 4 pages.
Supplemental:
- Jonathan Bowen, "The ethics of safety-critical systems", Comm.
ACM, Volume 43, No. 4 (Apr. 2000), Pages 91 - 97. (ACM |
local)
- Gibbs, W., "Software's chronic crisis," Scientific American,
Sept. 1994, pp. 86-95. (local)
- Gotterbarn, "How the new software engineering code of ethics affects
you," IEEE Software, Nov/Dec 1999. (IEEE | local)
- Herkert, J.R.; "Ethical risk assessment: valuing public
perceptions"; IEEE Technology and Society Magazine , Volume: 13 Issue: 1 ,
Spring 1994 Page(s): 4 -10 (IEEE |
local)
- Kahn, Shulamit. Economic Estimates of the Value of Life. IEEE Technology
and Society Magazine, June 1986, pp. 24-31. (local)
- McFarland, "Ethics and the safety of computer systems," IEEE
Computer, February 1991. (IEEE |
local).
- Rodriguez-Dapena, P.; "Software safety certification: a multidomain
problem" IEEE Software, Volume: 16 Issue: 4 , Jul/Aug 1999 Page(s): 31 -38
(IEEE |
local)
Other References:
- Perrow, C., Normal Accidents, Princeton University Press, 1999.
- Wiener, Lauren. Digital Woes: why we should not depend on software.
Reading, Mass.: Addison-Wesley Pub. Co., 1993. ISBN 0201626098.
- Birsch, D. and J.H. Fielder. The Ford Pinto case: A study in applied
ethics, business, and technology. Albany, NY: State University of New York
Press. 1994.
- Royal Society, Risk: analysis, perception, and management, London:
Royal Society, 1992
- Barnett, "Doctrine of manifest danger"; ASME DE-55, reliability,
stress analysis, failure prevention, 1993
- D. Okrent, "Risk Perception Versus Risk Analysis," Reliability
Engineering & System Safety, Volume 59, Number 1
- Wichmann, "Legal liability for software in safety-related
systems," in: Wichmann (ed) Software in safety-related systems,
Chichester: Wiley, 1992.
- Wilde, G. J. S. "The theory of risk homeostasis: Implications for
safety and health." Risk Analysis, 2:209-225, 1982.
http://www.badsoftware.com/ has
several papers that talk about UCITA, which is an attempt to regulate software
that will have an effect on embedded system software.
Required:
- Kopetz, H.; Merker, W., "The Architecture of MARS", FTCS 1985, p.
50. (IEEE |
local)
- Kopetz, H.; Grunsteidl, G.; "TTP-a protocol for fault-tolerant
real-time systems"; Computer , Volume: 27 Issue: 1 , Jan 1994 Page(s): 14
-23 (IEEE |
local)
- Hermann Kopetz, Günther Bauer "The Time-Triggered
Architecture," Proceedings of the IEEE, Jan 2003 (IEEE |
local)
Supplemental:
- Damm, A.; Kopetz, H.; Koza, C.; Mulazzani, M.; Schwabl, W.; Senft, C.;
Zainlinger, R.; "Distributed fault-tolerant real-time systems: the Mars
approach;" Micro, IEEE , Volume: 9 Issue: 1 , Feb 1989 Page(s): 25 -40 (IEEE |
local)
- Maier, Bauer, Stoger & Poledna, "Time-triggered architecture: a
consistent computing platform," IEEE Micro, July-August 2002. (IEEE |
local)
- Poledna, S.; Burns, A.; Wellings, A.; Barrett, P.; "Replica
determinism and flexible scheduling in hard real-time dependable systems;"
Computers, IEEE Transactions on , Volume: 49 Issue: 2 , Feb 2000 Page(s): 100
-111 (IEEE
| local)
Required:
- BART (San Francisco Bay Area Rapid Transit District), System Safety
specification, 1981. (local)
- Littlewood & Strigini, "Validation of ultrahigh dependability for
software-based systems." CACM, pp. 69-80, Nov. 1993 (ACM |
local)
- Myers, W. "Can software for the Strategic Defense Initiative ever be
error-free? ", Computer 19, no. 11, (Nov. 1986) : 61-7 (local)
Supplemental:
- Alger, L.S.; Harper, R.E.; Lala, J.H.; "A design approach for
ultrareliable real-time systems"; Computer , Volume: 24 Issue: 5 , May 1991
Page(s): 12 -22 (IEEE |
local)
- Brooks, F., "No Silver Bullet: essence and accidents of software
engineering," IEEE Computer, 20(4): 10-19. (local)
- Butler & Finelli, "The infeasibility of experimental
quantification of life-critical software reliability,", IEEE Trans. SW
Engr. 19(1):3-12, Jan 1993. (IEEE |
local)
- Roger S. Rivett; "Emerging Software Best Practice and how to be
Compliant", Proceedings of the 6th International EAEC Congress July 97.
(Web |
local)
- Rushby, John, "Formal Methods and the Certification of Critical
Systems," SRI-CSL Technical Report, November 1993. (Citeseer |
local)
- Rushby, "Critical system properties: survey and taxonomy" (web
version), 1994 (Citeseer |
local)
- Saltzer, J.H., Reed, D.P., Clark, D.D, End-to-End Arguments in System
Design, Transactions on Computer Systems 2(4):277-288, ACM, November 1984. (ACM |
local)
- Suri, N., Walter, C. & Hugue, M., "Introduction", Advances
in ultra-dependable distributed systems, IEEE Press, 1995 (local)
- Siewiorek, Daniel P., Hsiao, M. Y., Rennels, David, Gray, James, Williams,
Thomas, Ultradependable Architectures, Annual Review of Computer Science, 1990