Distributed embedded systems are difficult to design correctly. Three types of models should be used when designing such systems: analytic models, executable simulations, and prototypes. Each of the three model types has both strengths and weaknesses; using all three greatly increases the likelihood of producing a correct design.
When designing embedded systems, a distributed approach offers many potential advantages over a centralized approach. Distributed designs can be more scalable, offer cleaner separation of tasks for design teams, and facilitate use of commercial off-the-shelf hardware and software building blocks. (In this context, distributed systems encompass both logically distributed systems, such as multiple tasks running on a single CPU, and physically distributed systems, such as multiple computers on a communication network.)
However, it can be difficult to design distributed systems correctly. This is because the distributed system must manifest a correct emergent behavior involving a collection of loosely coupled components. The correctness of this emergent behavior, or even what the emergent behavior is, may not be obvious from the point of view of any single component in the system. Furthermore, many distributed systems are too complex for a human designer to understand without considerable study.
One solution to designing a system correctly is to create models that help the designers understand and evaluate both the system requirements and implementation. We think that three modelling techniques are required in order to successfully design distributed systems: analysis, simulation, and prototyping.
Analysis involves the use of mathematical approaches to create high-level abstractions of system properties, most notably performance. Analytic models are typically succinct mathematical equations that may be evaluated for any set of conditions to predict system properties. Different analytic models are typically required to express different categories of system properties.
Analytic techniques include:
Analytic models typically have the advantage of being readily grasped. In many cases, analytic models can be constructed in a few hours. They are often relatively cheap to evaluate, and can give a reasonable estimate of system characteristics quickly. Also, analytic models can provide insight into the dynamics of the system to help guide design activities.
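As an illustration of how succinct such a model can be, the following minimal sketch evaluates two textbook closed-form expressions: the utilization of a shared bus carrying periodic messages, and the mean response time of an M/M/1 queue. The function names, the message set, and the choice of an M/M/1 model are assumptions made for this example; they are not taken from the text above.

```python
def bus_utilization(messages, bit_rate):
    """Fraction of bus capacity used by periodic messages.

    messages: iterable of (bits_per_message, period_seconds) pairs (hypothetical workload).
    bit_rate: bus capacity in bits per second.
    """
    offered_load = sum(bits / period for bits, period in messages)  # bits per second
    return offered_load / bit_rate


def mm1_mean_response_time(arrival_rate, service_rate):
    """Mean time in system for an M/M/1 queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    # Two periodic messages on a 125 kbit/s bus (illustrative numbers only).
    print(bus_utilization([(100, 0.01), (200, 0.1)], 125_000))
    # 8 messages/s arriving at a server that handles 10 messages/s.
    print(mm1_mean_response_time(8.0, 10.0))
```

Both functions can be re-evaluated for any set of conditions in microseconds, which is what makes this kind of model attractive for early tradeoff studies. The M/M/1 formula also shows the cost: it assumes Poisson arrivals and exponential service times, which may or may not describe the real system.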
On the other hand, analytic models can only be created by people having keen insight into the system being designed. There is always a risk that the system properties being analyzed are not the ones that will ultimately dominate the system's characteristics. Also, in some cases too many simplifying assumptions must be made in order to create tractable models.
So, while analysis can provide quick answers during the exploration phase of a design, those answers may not be good approximations to reality. It is also possible for analysis to answer questions correctly without providing any insight into whether the right questions have been asked.
Simulation involves the use of executable computer programs to demonstrate emergent system behavior. Building an executable model at even a high level of abstraction forces the designer to think through issues that otherwise might be swept under the rug with a non-executable specification technique. More than one simulation technique and corresponding model are often desirable for any particular system, depending on the aspects that must be studied.
Simulation techniques include:
Simulations require a "workload", or stimulus at an appropriately high level of abstraction. Simulations may be fed by:
Simulations provide an important intermediate capability between analytic models and actual prototypes. By building a model of the system and executing it, designers can see what behavior emerges. With appropriate instrumentation and attention, a simulation can reveal unexpected interactions and performance bottlenecks that are missed by analysis. In particular, simulations are valuable for studying "fine-grain", detailed interactions that deal with specific sequences of events rather than the broad-brush steady-state approach typical of analytic methods.
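The kind of fine-grain interaction described above can be sketched with a very small discrete-event simulation. The example below assumes a hypothetical shared bus served in arrival order, with periodic senders and a fixed transmission time; all names and numbers are illustrative. Both configurations have the same steady-state utilization (40%), yet the sequence of events differs.

```python
import heapq


def simulate_bus(senders, tx_time, horizon):
    """Simulate periodic senders sharing one bus, served in arrival order.

    senders: list of (name, period, offset) in integer time units (hypothetical).
    tx_time: time the bus is busy for each message.
    horizon: messages released before this time are simulated.
    Returns the worst queueing delay seen by each sender.
    """
    events = []
    for name, period, offset in senders:
        for release in range(offset, horizon, period):
            heapq.heappush(events, (release, name))  # ties are broken by name

    bus_free_at = 0
    worst_delay = {name: 0 for name, _, _ in senders}
    while events:
        release, name = heapq.heappop(events)
        start = max(release, bus_free_at)        # wait if the bus is still busy
        worst_delay[name] = max(worst_delay[name], start - release)
        bus_free_at = start + tx_time
    return worst_delay


if __name__ == "__main__":
    # Same period, same phase: B always queues behind A.
    print(simulate_bus([("A", 10, 0), ("B", 10, 0)], tx_time=2, horizon=100))
    # prints {'A': 0, 'B': 2}
    # Same load, B shifted by half a period: nobody waits.
    print(simulate_bus([("A", 10, 0), ("B", 10, 5)], tx_time=2, horizon=100))
    # prints {'A': 0, 'B': 0}
```

A utilization formula alone would report identical results for the two cases; executing the model exposes the phase-dependent delay.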
Simulations can also be superior to prototypes in many cases. It is relatively simple to create arbitrary initial conditions (controllability) and detailed monitoring devices (observability) in a simulation. Controllability is important to investigate conditions that are unlikely to happen in practice, or are too expensive to create in the laboratory more than once. With the complete controllability offered by digital computer simulations, it is generally easy to repeat experiments in order to evaluate potential design changes.
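One common way to obtain this controllability is to pass both the initial conditions and the random seed into the simulation explicitly. The sketch below assumes a toy single-server queue model; the function name, parameters, and numbers are hypothetical. It starts from an arbitrary backlog that would be awkward to stage in a laboratory, then replays the identical arrival sequence against a proposed design change.

```python
import random


def run_experiment(seed, initial_backlog, service_time,
                   n_messages=1000, mean_gap=1.0):
    """Mean queueing delay for a single server starting with a backlog.

    seed: fixes the arrival sequence so the experiment can be repeated exactly.
    initial_backlog: messages already queued at time zero (arbitrary initial condition).
    service_time: fixed time to serve one message (the design parameter under study).
    """
    rng = random.Random(seed)                    # private generator, no global state
    pending_work = initial_backlog * service_time
    total_wait = 0.0
    for _ in range(n_messages):
        gap = rng.expovariate(1.0 / mean_gap)    # time until the next arrival
        pending_work = max(0.0, pending_work - gap)  # server drains work meanwhile
        total_wait += pending_work               # new message waits for queued work
        pending_work += service_time
    return total_wait / n_messages


if __name__ == "__main__":
    baseline = run_experiment(seed=42, initial_backlog=50, service_time=0.5)
    repeat = run_experiment(seed=42, initial_backlog=50, service_time=0.5)
    faster = run_experiment(seed=42, initial_backlog=50, service_time=0.4)
    print(baseline == repeat)   # same seed and conditions give the same result
    print(baseline, faster)     # identical arrivals, different design
```

Because the arrival sequence is identical in both runs, any difference between `baseline` and `faster` is attributable to the design change rather than to chance.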
Simulations also offer superior observability, since any state within the model is available as a value in some memory location. An important implication of complete observability is that it is usually straightforward to freeze operation of a system and capture the complete state when an infrequent bug occurs.
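A minimal sketch of this freeze-and-capture idea follows, assuming a toy queue model whose entire state lives in one dictionary. The model, the overflow condition, and all names are hypothetical; the point is only that a simulation can copy its complete state at the instant a rare condition is detected.

```python
import copy
import random


def run_with_snapshot(seed, queue_limit, steps=10_000):
    """Run a toy queue model and freeze it when the queue overflows.

    Returns a deep copy of the full model state at the first step where the
    queue length exceeds queue_limit, or None if that never happens.
    """
    rng = random.Random(seed)
    state = {"step": 0, "queue": [], "delivered": 0}
    for step in range(steps):
        state["step"] = step
        for _ in range(rng.choice([0, 0, 1, 2])):   # 0 to 2 arrivals per step
            state["queue"].append(("msg", step))
        if len(state["queue"]) > queue_limit:       # the infrequent condition
            return copy.deepcopy(state)             # capture everything at that instant
        if state["queue"]:
            state["queue"].pop(0)                   # serve one message per step
            state["delivered"] += 1
    return None


if __name__ == "__main__":
    snapshot = run_with_snapshot(seed=7, queue_limit=3)
    if snapshot is not None:
        print(snapshot["step"], snapshot["delivered"], snapshot["queue"])
```

On a prototype, collecting the equivalent information would typically require dedicated instrumentation that was in place before the bug occurred.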
So, simulation provides an intermediate step between quick tradeoff studies performed by analysis and detailed validation provided by prototyping.
Prototyping involves the creation of actual or approximated system hardware and software for evaluation. Prototypes can usually be created much more quickly than production units because of relaxed manufacturability, tooling, material cost and life-cycle requirements. Typically, prototypes are expensive on a per-unit basis, and so can be built only in limited quantities.
Prototypes potentially offer an exact model of the final system in all important aspects.
On the other hand, prototypes may be difficult and expensive to change. Setting initial conditions and providing appropriate instrumentation may be difficult, time-consuming, and expensive. In addition, prototypes may be too few in number to obtain meaningful predictions about performance scalability.
Analysis, simulation, and prototyping are all required for successful system development. Having only one or two of the three models is not sufficient to be sure that a distributed system will be designed correctly (not to mention on time and on budget).
Analytic models should be employed first in order to get the broad brush strokes of the system's characteristics. To the extent that similar systems have been built before, analytic models will in general be helpful in providing guidance. However, for areas in which the new system is novel, it may be impossible to accurately predict which system characteristics must be monitored for potential problems. Analysis should always be attempted as the first step of a design, but its limitations should be well understood.
Even if the system design is familiar, it is likely that analytic techniques are unavailable for some important facets of the design. There is always the temptation to make simplifications that help the system fit into known analytic solutions, whether such simplifications are warranted or not. As a result, analytic results must be treated with caution and attention to limitations in their applicability.
After an initial analytic modelling attempt, it is vital to build a simulation of the system. In the presence of good analysis, the simulation will validate the analytic models. In the absence of thorough analysis (because, for example, the system is so novel that it is not apparent what should be analyzed), execution of a simulation can provide a way to gain enough insight to attempt analytic model creation.
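The cross-check between the two model types can be sketched as follows, assuming an M/M/1 queue as the system under study. The analytic side is the standard closed form 1 / (mu - lambda); the simulation side uses the Lindley recursion for successive waiting times. The function names, rates, and sample size are assumptions chosen for illustration.

```python
import random


def analytic_mean_response(arrival_rate, service_rate):
    """Closed-form M/M/1 mean time in system."""
    return 1.0 / (service_rate - arrival_rate)


def simulated_mean_response(arrival_rate, service_rate, n_messages, seed):
    """Estimate the same quantity by executing the queue message by message."""
    rng = random.Random(seed)
    wait = 0.0            # queueing delay of the current message (empty system at start)
    total = 0.0
    for _ in range(n_messages):
        service = rng.expovariate(service_rate)
        total += wait + service                   # time in system for this message
        gap = rng.expovariate(arrival_rate)       # time until the next arrival
        wait = max(0.0, wait + service - gap)     # Lindley recursion
    return total / n_messages


if __name__ == "__main__":
    predicted = analytic_mean_response(1.0, 2.0)
    observed = simulated_mean_response(1.0, 2.0, n_messages=200_000, seed=1)
    print(predicted, observed)
```

Close agreement here only shows that the simulation and the formula share the same assumptions. Replacing the exponential draws with a measured or bursty workload is where the two would be expected to diverge, and where the simulation starts to add information beyond the analysis.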
Tightly coupled iteration between analysis and simulation is common. In fact, it is often desirable to have multiple analysis approaches and multiple simulation approaches in order to converge on answers that are understood, explainable, and reproducible by more than one technique.
Finally, when analysis and simulation both suggest that the system is well designed, prototypes should be constructed and instrumented to verify results.
Phil Koopman -- firstname.lastname@example.org