Human Interface/Human Error

Carnegie Mellon University
18-849b Dependable Embedded Systems
Spring 1999

Author: Charles P. Shelton


Abstract:

Human operators are one of the biggest sources of error in any complex system. Many operator errors are attributed to a poorly designed human-computer interface (HCI). However, human beings are often needed as the fail-safe in an otherwise automated system. Even the most highly trained and alert operators are prone to boredom when they are not needed for normal operation, and to panic when an unusual situation occurs, stress levels rise, and lives are at stake. The HCI must give the operator appropriate feedback so that he or she can make well-informed decisions based on the most up-to-date information on the state of the system. A high false alarm rate will condition the operator to ignore a real alarm. Methods for determining the effectiveness of an HCI, such as heuristic evaluation, cognitive walkthroughs, and empirical techniques like protocol analysis, exist, but they are often cumbersome and do not provide conclusive data on the safety and usability of an HCI. System designers must ensure that the HCI is easy and intuitive for human operators to use, but not so simple that it lulls the operator into complacency and lowers his or her responsiveness to emergencies.


Contents:

- Introduction
- Key Concepts
- Available tools, techniques, and metrics
- Relationship to other topics
- Conclusions
- Annotated Reference List
- Further Reading

Introduction

In any complex system, most errors and failures can be traced to a human source. Incomplete specifications, design defects, and implementation errors such as software bugs and manufacturing defects are all caused by human beings making mistakes. However, when looking at human errors in the context of embedded systems, we tend to focus on operator errors and errors caused by a poor human-computer interface (HCI).

Human beings have common failure modes, and certain conditions make it more likely for a human operator to make a mistake. A good HCI design can encourage the operator to perform correctly and can protect the system from common operator errors. However, there is no well-defined procedure for constructing an HCI for safety critical systems.

In an embedded system, cost, size, power, and complexity are especially limited, so the interface must be relatively simple and easy to use without sacrificing system safety. Also, a distinction must be made between highly domain-specific interfaces, like nuclear power controls or airplane pilot controls, and more general "walk up and use" interfaces, like automated teller machines or VCR onscreen menus [Maxion95]. This is not a hard and fast distinction, though: interfaces such as the one in the common automobile specifically require some amount of training and certification (most places in the world require a driver's license test) but are designed to be relatively simple and universal. Even so, not all cars have the same interface, and even small differences may cause an experienced driver to make a mistake when operating an unfamiliar car.

In safety critical systems, the main goal of the user interface is to prevent the operator from making a mistake and causing a hazard. In most cases usability is a complementary goal, in that a highly usable interface will make the operator more comfortable and reduce anxiety. However, there are some tradeoffs between characteristics that make an interface usable and characteristics that make it safe. For example, allowing the user to commit a procedure by simply pressing the enter key a series of times may make a system extremely usable, but it also allows the operator to bypass important safety checks or confirm an action without assessing the consequences. This was one of the problems with the Therac-25 medical radiation device: operators could easily dismiss error messages on the terminal and continue to apply treatment, not realizing they were administering lethal doses of radiation to the patient. The error messages themselves were not descriptive, which points to another recurring problem, user interfaces that fail to provide appropriate feedback. It is also important to recognize that not all systems are safety critical; in those cases, usability is the main goal of the HCI. If the user must operate the system to perform a task, the interface should guide the user to take the appropriate actions and provide feedback when operations succeed or fail.
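To make the tradeoff concrete, the sketch below shows one way an interface can deliberately give up a little usability for safety. This is a hypothetical design, not the Therac-25's actual interface, and the error code and message are invented: overriding a safety error requires the operator to retype the error code, so a habitual run of Enter presses cannot silently acknowledge it.

```python
# A minimal sketch (hypothetical design; error code and message invented):
# a confirmation step that refuses to treat a bare Enter press as assent,
# trading a little usability for safety.

def confirm_override(error_code: str, description: str) -> bool:
    """Return True only if the operator deliberately confirms the override."""
    print(f"ERROR {error_code}: {description}")
    reply = input(f"Type '{error_code}' to override, or press Enter to abort: ")
    return reply.strip() == error_code

if confirm_override("DOSE-RATE-HIGH", "delivered dose rate exceeds prescription"):
    print("Override confirmed by operator; logging and proceeding.")
else:
    print("Treatment aborted; no dose delivered.")
```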


Key Concepts

Human operators are often the weak link in any embedded system. Failure rates for humans as system components are several orders of magnitude higher than for other parts of the system. Most system hardware components are considered safe if they have failure rates of 10^-6 or lower. The performance limit for a single human operator working in ideal conditions is a failure rate of 10^-4. If a team of operators is employed, the failure rate can be improved to 10^-5. See the table below for common human error probability data taken from [Kirwan94]. This makes improving the HCI and correcting for human errors a key part of designing a safety critical system. We may be able to improve HCI design by observing that certain situations degrade human performance, and designing the HCI to avoid putting the operator in those situations.
 
Description                                                                           Error Probability
------------------------------------------------------------------------------------  -----------------
General rate for errors involving high stress levels                                  0.3
Operator fails to act correctly in the first 30 minutes of an emergency situation     0.1
Operator fails to act correctly after the first few hours in a high stress situation  0.03
Error in a routine operation where care is required                                   0.01
Error in a simple routine operation                                                   0.001
Selection of the wrong switch (dissimilar in shape)                                   0.001
Human-performance limit: single operator                                              0.0001
Human-performance limit: team of operators performing a well designed task            0.00001

Table: General human-error probability data in various operating conditions [Kirwan94]
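The table's entries are per-task probabilities; as a rough worked example (a minimal sketch with assumed numbers), treating a rate as a per-action probability shows how quickly even small rates compound: the probability of at least one error in n independent actions is 1 - (1 - p)^n.

```python
# Minimal sketch with assumed numbers: how per-action error rates compound.
# Independence between actions is assumed, which is optimistic; in practice
# stress and fatigue correlate errors.

def p_at_least_one_error(p_per_action: float, n_actions: int) -> float:
    """Probability of at least one error in n independent actions."""
    return 1.0 - (1.0 - p_per_action) ** n_actions

for p, label in [(1e-4, "human-performance limit, single operator"),
                 (1e-2, "routine operation where care is required")]:
    # Over a 500-action shift: roughly 5% for p = 1e-4, over 99% for p = 1e-2.
    print(f"{label}: P(>=1 error in 500 actions) = "
          f"{p_at_least_one_error(p, 500):.3f}")
```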

Sources of Human Error

Automated systems are extremely good at repetitive tasks. However, if an unusual situation occurs and corrective action must be taken, the system usually cannot react well, and a human operator is needed to handle the emergency. Humans are much better than machines at handling novel occurrences, but much worse at performing repetitive tasks reliably. Thus the operator is left to passively monitor the system when there is no problem, serving only as a fail-safe in an emergency. This is a major problem in HCI design, because a user who is not routinely involved in the control of the system will tend to become bored and be lulled into complacency. This is known as operator drop-out. Because the user's responsiveness is dulled, in a real emergency he or she may not recover as quickly and will tend to make more mistakes.

However, if the human operator must routinely be involved in the control of the system, he or she will tend to make mistakes and adapt to the common mode of operation. Also, if the operator holds a persistent mental model of the system in its normal mode of operation, he or she will tend to ignore data indicating an error unless it is displayed prominently. The HCI must be designed to provide enough novelty to keep the user alert and interested in his or her job, but not be so complicated that the user finds it difficult to operate.

Stress is also a major contributing factor to human error. Stressful situations include unfamiliar or exceptional occurrences, incidents that may cause great loss of money, data, or life, and time-critical tasks. Human performance tends to degrade when stress levels rise. Intensive training can reduce this effect by making unusual situations familiar through drills. However, the cases where human beings must perform at their best to avoid hazards are often the cases of most extreme stress and worst error rates; the failure rate can be as high as thirty percent in extreme situations (see the table above). Unfortunately, the human operator is our only option, since a computer system usually cannot correct for truly unique situations and emergencies. The best that can be done is to design the user interface so that the operator makes as few mistakes as possible.

HCI Problems

The HCI must provide intuitive controls and appropriate feedback to the user. Many HCIs can cause information overload. For example, if an operator must watch several displays to observe the state of a system, he or she may be overwhelmed and unable to process the data into an accurate view of the system. This may also cause the operator to ignore displays that are perceived as having very low information content, which is dangerous if one of those displays reports a critical sensor. Another way to overwhelm the operator is to set alarm sensitivity too high. If an operator gets an alarm for nearly every action, most of them false, he or she will ignore the alarm when there is a real emergency condition [Murphy98].
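A small illustration of why over-sensitive alarms get ignored: applying Bayes' rule with assumed rates (these numbers are illustrative, not from [Murphy98]) shows that when emergencies are rare, even a modest false alarm rate means almost every alarm the operator hears is false.

```python
# Illustrative Bayes'-rule calculation with assumed numbers: the evidential
# value of an alarm collapses when false alarms are common and emergencies rare.

def p_real_given_alarm(p_emergency: float,
                       p_alarm_given_emergency: float,
                       p_alarm_given_normal: float) -> float:
    """P(real emergency | alarm sounded), by Bayes' rule."""
    p_alarm = (p_alarm_given_emergency * p_emergency
               + p_alarm_given_normal * (1.0 - p_emergency))
    return p_alarm_given_emergency * p_emergency / p_alarm

# Assumed: emergencies occur in 0.1% of intervals; the alarm catches 99% of
# them but also fires in 5% of normal intervals.
print(p_real_given_alarm(0.001, 0.99, 0.05))   # ~0.019: only ~2% of alarms are real
```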

The HCI must also convey a confidence level that allows the operator to assess the validity of its information. The operator should not have to rely on one display for several sensors; the system should have some redundancy built into it. Conversely, several different displays should not relay information from the same sensor, as this would give the user the unsubstantiated notion that he or she has more information than is actually available, or that several independent sources are in agreement. The operator should not trust the information from the HCI to the exclusion of the rest of his or her environment.

There are several heuristics for judging a well-designed user interface, but there is no systematic method for designing safe, usable HCIs. It is also difficult to quantitatively measure the safety and usability of an interface, and to find and correct its defects.


Available tools, techniques, and metrics

Several techniques exist for evaluating user interface designs, but they are not mature and do not provide conclusive data about an HCI's safety or usability. Inspection methods like heuristic evaluation and cognitive walkthrough have the advantage that they can be applied at the design phase, before the system is built; however, the fact that a real interface is not being tested limits what they can determine about the HCI design. Empirical methods like protocol analysis have real users test the interface, followed by lengthy analysis of all the data collected during the session, from keystrokes and mouse clicks to the user's verbal account during interaction.

HCI Design

There are no structured methods for user interface design.  There are several guidelines and qualities that are desirable for a usable, safe HCI, but the method of achieving these qualities is not well understood.  Currently, the best method available is iterative design, evaluation, and redesign.  This is why evaluation methods are important.  If we can perform efficient evaluations and correctly identify as many defects as possible, the interface will be greatly improved.  Also, accurate evaluations earlier in the design phase can save money and time.  However, it is easier to find HCI defects when you have a physical interface to work with.  It is also important to separate design of the HCI from other components in the system, so defects in the interface do not propagate faults through the system.  [Burns91] outlines an architecture for decoupling the HCI from the application in hard real-time systems so that complexity is reduced and timing constraints can be dealt with.
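The sketch below illustrates the decoupling idea in miniature (an assumption-laden illustration, not the actual architecture from [Burns91]): the control loop publishes state snapshots through a bounded, non-blocking queue, so a slow or stalled HCI can never make the control loop miss its period.

```python
# Minimal sketch of decoupling the HCI from the control application
# (illustrative only, not [Burns91]'s design): the control loop never
# blocks on the display, so HCI faults cannot violate control timing.

import queue
import threading
import time

state_q: "queue.Queue[dict]" = queue.Queue(maxsize=1)

def control_loop() -> None:
    for tick in range(50):
        snapshot = {"tick": tick, "pressure": 100 + tick % 7}
        try:
            state_q.put_nowait(snapshot)   # drop the update if the HCI is behind
        except queue.Full:
            pass                           # control keeps its deadline regardless
        time.sleep(0.01)                   # fixed control period (assumed 10 ms)

def hci_loop() -> None:
    while True:
        s = state_q.get()                  # display consumes at its own pace
        print(f"display: tick={s['tick']} pressure={s['pressure']}")

threading.Thread(target=hci_loop, daemon=True).start()
control_loop()
```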

Heuristic Evaluation

Heuristic evaluation involves having a set of people (the evaluators) inspect a user interface design and judge it based on a set of usability guidelines. These guidelines are qualitative and cannot be measured concretely, but the evaluators can make relative judgments about how well the user interface adheres to them. A sample set of usability heuristics from [Nielsen94]:

- Visibility of system status
- Match between the system and the real world
- User control and freedom
- Consistency and standards
- Error prevention
- Recognition rather than recall
- Flexibility and efficiency of use
- Aesthetic and minimalist design
- Help users recognize, diagnose, and recover from errors
- Help and documentation

This technique is usually applied early in the life cycle of a system, since a working user interface is not necessary to carry it out. Each evaluator inspects the user interface on his or her own, judging it against the heuristics without actually having to operate the interface; it is purely an inspection method. It has been found that about five or more independent evaluators are needed to achieve good coverage of interface problems. However, if this is cost prohibitive, heuristic evaluation can achieve good results with as few as three evaluators, as the sketch below illustrates.
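Nielsen and Landauer modeled problem discovery as proportion found = 1 - (1 - L)^i for i evaluators, where L is the average per-evaluator detection rate; the sketch below uses L = 0.31, a typical value reported in that work (treat the exact constant as an assumption), and shows the diminishing returns behind the "five or more, three if necessary" guidance.

```python
# Diminishing returns from adding evaluators, using the Nielsen-Landauer
# model: proportion of problems found = 1 - (1 - L)**i.
# L = 0.31 is a typical per-evaluator detection rate; an assumption here.

L = 0.31

for i in range(1, 8):
    found = 1.0 - (1.0 - L) ** i
    print(f"{i} evaluators: ~{found:.0%} of usability problems found")
# Roughly: 1 evaluator ~31%, 3 evaluators ~67%, 5 evaluators ~84%.
```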

Heuristic evaluation is good at uncovering errors and explaining why there are usability problems in the interface. Once the causes are known, it is usually straightforward to implement a solution. This can save considerable time and cost, since defects are corrected before the user interface is actually built. However, the merits of heuristic evaluation depend heavily on the merits of the evaluators. For very domain-specific applications, skilled evaluators who are trained in the system's domain and can recognize its interface problems are necessary.

Cognitive Walkthrough

Another usability inspection method is the cognitive walkthrough. Like heuristic evaluation, the cognitive walkthrough can be applied to a user interface design without actually operating a constructed interface. However, the cognitive walkthrough evaluates the system by focusing on how a theoretical user would go about performing a task or goal using the interface. Each step the user would take is examined, and the interface is judged on how well it guides the user to perform the correct action at each stage [Wharton94]. The interface should also provide an appropriate level of feedback to assure the user that progress is being made on his or her goal.
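A minimal sketch of how a walkthrough record might be kept: each task step is checked against the four questions posed in [Wharton94] (paraphrased here), and any "no" answer is logged as a potential problem. The task and step names are hypothetical.

```python
# Sketch of recording a cognitive walkthrough; the four questions are
# paraphrased from [Wharton94], and the task step below is hypothetical.

WALKTHROUGH_QUESTIONS = (
    "Will the user try to achieve the right effect?",
    "Will the user notice that the correct action is available?",
    "Will the user associate the correct action with the desired effect?",
    "If the correct action is performed, will the user see progress?",
)

def walk_step(step: str, answers: tuple) -> list:
    """Return (step, question) pairs for questions answered 'no' at this step."""
    return [(step, q) for q, ok in zip(WALKTHROUGH_QUESTIONS, answers) if not ok]

# Hypothetical step in an operator-console task: the third answer is "no".
for step, q in walk_step("confirm dose entry", (True, True, False, True)):
    print(f"Potential problem at '{step}': {q}")
```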

Since the cognitive walkthrough focuses on the steps necessary to complete a specific task, it can uncover disparities between how the system's users and designers view those tasks. It can also uncover poor labeling and inadequate feedback for certain actions. However, the method's tight focus misses some other important usability aspects: it cannot evaluate global consistency or the extensiveness of features, and it may penalize an interface designed to be comprehensive simply because it presents the user with many choices.

For a user interface to be well designed, and for as many flaws as possible to be caught, several inspection methods should be applied. There is a trade-off between how thoroughly the interface is inspected and how many resources can be committed at this early stage in the system life cycle. Empirical methods can then be applied at the prototype stage to observe the performance of the user interface in action.

Protocol Analysis

Protocol analysis is an empirical method of user interface evaluation that focuses on the test user's vocal responses. The user operates the interface and is encouraged to "think out loud" while going through the steps to perform a task with the system. Video and audio are recorded, as well as keystrokes and mouse clicks. Analyzing the data from even one test session is extremely time consuming, since conclusions must be drawn from the subject's subjective vocal responses and inferences made from his or her facial expressions. The volume of data is very high, and each second of it requires lengthy analysis.

MetriStation

MetriStation is a tool being developed at Carnegie Mellon University to automate the normally tedious task of gathering and analyzing the data produced by empirical user interface evaluations. The system consists of software that synchronizes and processes data drawn from several sources while a test user operates the interface being evaluated: keystrokes, mouse clicks, the user's eye movements, and the user's speech during the session are all recorded and analyzed. The system is based on the premise that if the interface has good usability characteristics, the user will not pause during the test session, but will proceed logically from one step to the next as he or she completes an assigned task. Statistically unusual pauses between actions indicate a flaw in the interface and can be detected automatically [Maxion97].
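A minimal sketch of that stated premise (not MetriStation's actual implementation): compute the gaps between successive user actions and flag any gap that is statistically unusual relative to the session's own timing. The event timestamps below are invented.

```python
# Sketch of pause detection per the stated premise (not MetriStation's
# actual algorithm): flag inter-action gaps beyond mean + k * stdev.

import statistics

def unusual_pauses(timestamps: list, k: float = 2.0) -> list:
    """Return (gap_index, gap_seconds) pairs exceeding mean + k * stdev."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mu = statistics.mean(gaps)
    sigma = statistics.stdev(gaps)
    return [(i, g) for i, g in enumerate(gaps) if g > mu + k * sigma]

# Hypothetical action times (seconds) from a test session's event log.
events = [0.0, 1.1, 2.0, 3.2, 4.1, 9.8, 10.7, 11.5]
print(unusual_pauses(events))   # flags the ~5.7 s hesitation before event 5
```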

MetriStation seems to be a promising aid to empirical analysis. It can give more quantitative results, and it can greatly reduce the time spent collecting and processing data from test sessions. However, the tool only flags problems that cause the user to hesitate in a task; it can do nothing about interface problems that do not slow the user down. For instance, an interface may make it extremely easy to crash the system quickly, without a single hesitation, yet that is clearly not a desired outcome. The premise that most usability problems will cause the user to hesitate has limited scope and applicability.


Relationship to other topics

Since human error is the largest source of system failures, it must be a large factor in safety critical system analysis.

Conclusions

The following ideas are the important ones to take away from reading about this topic:

- Human operators are often the least reliable components of a complex system; even under ideal conditions their error rates are orders of magnitude higher than those of typical hardware components.
- Usability and safety are complementary but sometimes conflicting goals; an interface that makes actions too easy to confirm can let the operator bypass important safety checks, as in the Therac-25 accidents.
- Operators left to passively monitor a system suffer operator drop-out, while stress sharply raises error rates; the HCI should keep the operator engaged without overwhelming him or her.
- Feedback must be informative but measured; information overload and high false alarm rates teach the operator to ignore displays and alarms.
- Evaluation methods such as heuristic evaluation, cognitive walkthrough, and protocol analysis can uncover interface defects, but none provides conclusive data on safety or usability; iterative design, evaluation, and redesign remains the best available approach.

Annotated Reference List

Further Reading

