Carnegie Mellon University
18-849b Dependable Embedded Systems
Spring 1999
Author: John P DeVale
Proper care and maintenance of a system is an integral part of ensuring its reliable operation. When specifying maintenance procedures and replacement parts, the system designers are implicitly stating that failure to comply may result in unreliable behavior. For instance, an owner who neglects to change the oil in his or her car, or uses cheap, non-conforming oil, should not be surprised when the engine fails prematurely. Complex computing systems designed for reliability are no different. The improper maintenance of systems has been correlatively linked to aircraft failures and crashes, including the crash of Northwest Flight 255. Similarly, non-conforming (cheap and uncertified) parts have been linked with other failures, most notably in the aviation industry. Further, the automotive industry is plagued with end-user circumvention of fuel economy and exhaust systems, which reduces fuel efficiency, increases pollution, and may result in regulatory difficulties for the manufacturer. As more computing systems are built into critical systems, making such systems tolerant of counterfeit parts, improper maintenance, and end-user tampering becomes an ever more important issue.
Many safety critical systems have a large single point source of failure injection: the end user. Safety features can be an irritant to the end user, especially if they are thought to limit performance or if they are deemed uncool (for example, seat belts). In such instances the end user may circumvent the system for some perceived benefit. While this may not be directly attributable to a failure to provide a safe product on the manufacturer's part, there is an alarming trend toward large civil awards to plaintiffs hurt by products that were used improperly, or in an unsafe, unprescribed manner.
In addition to user circumvention of safety features, the growing cost of maintenance has spurred the growth of low-cost, low-quality replacement parts (shoddy spares). This is especially true for industries where maintenance costs dominate over the lifetime of equipment, most notably the aviation industry. One study indicates that the market for such spare parts reached $500 million in 1986, with a strong upward trend [cohen88]. In addition to compromising safety, use of such non-conforming replacement parts may invalidate. The problem is not limited to commercial concerns; in 1986 the president of Execuair Corp. was convicted of selling counterfeit aviation parts and equipment to the US Air Force [fortune87].
The result of such intervention and the use of non-conforming spare parts goes beyond safety and reliability, and impacts other areas. Systems which rely on the proper functioning of components for security or environmental performance are easily compromised when an end user, either maliciously or unknowingly, bypasses such systems or uses shoddy replacement parts. Such activity is especially prevalent in the auto industry, where performance-minded owners bypass emissions control hardware and use fuel-wasting "Speed Chips" to enhance the power of their vehicle.
Although such events are arguably difficult to guard against, any safety
critical system designer should be aware of such problems, and work to make
their design as resistant to them as possible.
Perhaps the most obvious potential problem stemming from the use of shoddy spare parts or user circumvention of safety features is the possible compromise of overall system safety. The complex nature of today's systems by and large requires the reliable, deterministic operation of all systems and components included in the design. Even small changes or defects can lead to large problems with the system as a whole.
In 1986, an Enstrom F-28 helicopter crashed, killing reporter Jane Dornaker and the entire flight crew. The NTSB investigation into the incident discovered that the use of counterfeit parts during routine maintenance may have contributed to the incident [fortune87]. Similarly, Northwest Flight 255 crashed during takeoff. One factor listed is that the warning system telling the pilot the flaps were incorrectly positioned was disabled [NTSB87].
Such incidents may never be causally linked to shoddy spare parts. Even so, it is clear that a system designer can only adequately model and predict reliability and safe operation when he or she knows the exact characteristics of the components in the system.
The environmental safety (or "green"-ness) of a product can also be adversely impacted by customer circumvention and the use of non-conforming replacement parts. This is particularly true in the automotive industry, where systems are finely tuned for environmental performance and fuel efficiency. Many end users feel restricted by such efforts and look for ways to bypass or circumvent such systems. Third-party vendors such as Superchips Corp. provide custom software to trade fuel efficiency and environmental performance for additional power output [superchips99]. This is typically done by reverse engineering the software provided by the manufacturer and altering it to provide a high-performance solution. Unfortunately, as with every other complex engineered system, auto manufacturers are moving more and more complexity into system software. As the software complexity increases, so does the likelihood that changes will be made without fully understanding their impact. This may in turn lead to a decrease in reliable operation, or even system safety.
Beyond reliability, systems which rely on conforming parts and properly working subsystems to provide security or authentication are susceptible to circumvention or degraded capability due to shoddy spare parts. Consider the Sony Playstation(tm) as a typical consumer device which provides some level of security/authentication. In this case, it is copyright protection of software and regional control over which software titles may be played in the machine (via country codes). Despite the efforts of the manufacturer, several companies offer solutions for consumers wishing to bypass such restrictions.
A newer technology which faces a similar challenge is DIVX, a digital movie format which purports to be secure and consumer friendly. Created by Digital Video Express LP, the DIVX format allows users to purchase DIVX movies at a low cost, and then pay for additional viewings beyond the first 48 hours after the initial viewing. The system relies on a modem to connect to the Digital Video Express computer system, which handles billing and authentication [DIVX99]. The system is not yet widely used and is currently the source of much controversy, with strong opponents and proponents. History tells us that if the system begins to enjoy widespread use, companies and individuals will attempt to reverse engineer and bypass the security/authentication mechanisms. Unless the DIVX designers were extremely careful, it seems likely that the system will be compromised with little effort.
The problem of user circumvention is a difficult one to solve. After all, if the physical security of an operational system cannot be maintained, then it becomes difficult to prevent user circumvention of safety or security systems. Use of strong encryption technologies to validate system components and parts can help, but in the end, any such system can be bypassed if the reward is great enough.
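As one illustration of validating a component cryptographically, the sketch below uses a challenge-response exchange built on a keyed hash (HMAC-SHA-256). This is a minimal sketch under stated assumptions, not a description of any fielded system: the shared key, the function names, and the idea that the key is provisioned into the part at the factory are all hypothetical. It also shows why the caveat above holds, since anyone who extracts the key from a genuine part can answer the challenge just as well.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret provisioned into genuine parts at the factory (assumption).
GENUINE_KEY = b"hypothetical-key-provisioned-at-factory"


def component_response(key: bytes, challenge: bytes) -> bytes:
    """Reply a component computes from the host's challenge using its key."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def host_validates(key: bytes, respond) -> bool:
    """Host side: issue a fresh random challenge and check the component's reply."""
    challenge = secrets.token_bytes(16)
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, respond(challenge))


def genuine_part(challenge: bytes) -> bytes:
    return component_response(GENUINE_KEY, challenge)


def counterfeit_part(challenge: bytes) -> bytes:
    # A counterfeit part does not hold the factory key, so it can only guess.
    return component_response(b"guessed-key", challenge)


print(host_validates(GENUINE_KEY, genuine_part))      # True
print(host_validates(GENUINE_KEY, counterfeit_part))  # False
```

A fresh challenge on every check keeps a counterfeit part from simply replaying a recorded reply, but the scheme is only as strong as the physical protection of the key inside the part.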
One potential method of preventing the use of shoddy spare parts is to
validate part authenticity with technologies like taggant identification
particles [Strassberg96]. The technology embeds particles with unique
magnetic signatures in a product. A secure hash is applied to the ID's digital
value. The part's authenticity can then be verified by checking the hashed
signature [microtrace99]. This method seems most appropriate in situations
where the system in question is subject to periodic re-examination by
a regulatory body, such as those found in the medical, aviation, and nuclear
industries.
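The hash check described above might look roughly like the following. This is only a sketch under assumptions: the source does not say which hash function is used or how the reference values are stored, so SHA-256, the 16-byte ID encoding, the registry of digests, and the sample ID values are all hypothetical choices made for illustration.

```python
import hashlib


def digest_of(taggant_id: int) -> str:
    """SHA-256 digest (hex) of the ID's digital value, encoded as 16 bytes."""
    raw = taggant_id.to_bytes(16, "big")
    return hashlib.sha256(raw).hexdigest()


# Hypothetical IDs issued by the manufacturer; only their digests are published.
ISSUED_IDS = [0x1A2B3C4D5E6F, 0x0F1E2D3C4B5A]
REGISTRY = {digest_of(i) for i in ISSUED_IDS}


def is_authentic(read_id: int) -> bool:
    """Inspector side: hash the ID read from the part and look it up."""
    return digest_of(read_id) in REGISTRY


print(is_authentic(0x1A2B3C4D5E6F))  # True: ID was issued
print(is_authentic(0x1A2B3C4D5E70))  # False: ID was never issued
```

Publishing digests rather than raw IDs means an inspector at a periodic recertification can confirm a part without the registry itself revealing valid IDs to a counterfeiter.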
The use of shoddy spares will most likely compromise the reliability of the system because they will not meet the stringent quality requirements of the specified part.
Circumvention of performance-limiting systems (like auto fuel controllers) will allow the machine to operate outside its expected range, accelerating wear-out of components engineered to operate at a different set point.
Systems which depend on regularly scheduled maintenance for proper operation will most likely experience more failures if shoddy spares are used during maintenance. Many cheap parts from "Alternative Vendors" do not meet wear or output specifications and can adversely affect not only performance but also other parts in the system. For example, a cheaper capacitor used as a replacement part may experience dielectric breakdown more rapidly and closer to its operating condition than a more expensive part. This may cause systems depending on the filtered output of the capacitor to experience fluctuations outside of their rated operating conditions, causing a cascade failure.
Regulatory agencies which require periodic recertification of equipment can help mitigate the problem of counterfeit spare parts.
There are no easy answers to the problems posed here. The corporate quest
toward profit can too easily justify taking shortcuts to further the bottom
line. For safety critical systems, however, this type of cost cutting can have
severe repercussions. A design philosophy which makes components highly
interdependent, and as difficult as possible to reverse engineer or replace
with other functionally equivalent components, will help to increase the cost
of doing so. In the final analysis, your efforts should be proportional to the
expected impact of allowing the intervention or use of counterfeit parts,
because if the potential profit is large enough, end users and/or
counterfeiters will work diligently to circumvent such safeguards.
[cohen88] Cohen, Monica, Bogus Buys: The Counterfeit Crisis, Equipment Management, Lincolnwood, May 1988. This article discusses part counterfeiting, and its prevalence and impact on industry.
[DIVX99] Digital Video Express Corp., http://www.divx.com/index_about_divx.htm, accessed 5/10/99. Corporate page describing DIVX. Provided as background to a technology discussed which has potential end-user circumvention troubles.
[fortune87] Isgro, A.C., The Hidden Threat to Air Safety, Fortune, Apr 13, 1987, p. 81, Chicago. Discussion of part counterfeiters in the aviation industry.
[microtrace99] Microtrace Corp., http://www.microtaggant.com, accessed 5/10/99. Corporate page describing their taggant technologies.
[NTSB87] National Transportation Safety Board, Incident Report, NTSB Identification: DCA87MA046, August 16, 1987. NTSB incident report on the crash of Northwest Flight 255 out of Detroit.
[Strassberg96] Strassberg, Dan, Sensors and coded particles foil counterfeiters, EDN, Aug 15, 1996, Boston. Article discussing part authentication through the use of taggant identification materials.
[superchips99] Superchips Corp., http://www.superchips.com/, accessed 5/10/99. Corporate page describing their replacement parts for engine controllers to enhance engine performance.