Ballista OS Robustness Test Suite - Catastrophic Detection
While running the Ballista OS Robustness Test Suite, catastrophic robustness failures may be uncovered. Catastrophic failures manifest themselves as computer crashes or panics and require a reboot to recover. Due to their nature, catastrophic robustness failures are not automatically counted or recorded by the Ballista OS Robustness Test Suite. The manual steps needed to verify and reproduce the catastrophic failure, and to restart the test suite, follow.
> cd outfiles
> ls -lrt outfile.*
The last outfile. file listed corresponds to the last function the system recorded testing. The text between the first and second period ('.') is the function name. After the second period is a list of parameters.
Note: occasionally with Linux the order of outfile. files can get slightly jumbled. We suggest comparing the last 4 or 5 outfile. files listed, and determining which file is associated with the latest entry in callTable.all or callTable<system>.all. Use this function specification as the last recorded function and continue processing.
For example:
.../ballista>cd outfiles
.../ballista/outfiles>ls -lrt outfile.*
<...>
-rw------- 1 kdevale system 214 Feb 15 19:29 outfile.fgetpos.b_ptr_file.b_ptr_fpos_t
-rw------- 1 kdevale system 279 Feb 15 19:29 outfile.fsetpos.b_ptr_file.b_ptr_fpos_t
The last recorded function is fsetpos. The parameters associated with this test were b_ptr_file and b_ptr_fpos_t.
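The naming rule above can be applied mechanically. The following is a minimal sketch, not part of the test suite: the helper name parse_outfile_name is hypothetical, and it assumes (from the example listing) that parameters are separated from each other by periods and that no function or parameter name itself contains a period.

```shell
# Hypothetical helper: split an outfile name of the form
#   outfile.<function>.<param>.<param>...
# into "<function> <param> <param>...".
# Assumption: periods separate the parameters, as in the example listing.
parse_outfile_name() {
    name=${1##*/}            # drop any leading directory
    rest=${name#outfile.}    # drop the "outfile." prefix
    func=${rest%%.*}         # text up to the next period is the function name
    case $rest in
        *.*) params=${rest#*.} ;;   # everything after that period
        *)   params= ;;             # no parameters in the name
    esac
    if [ -n "$params" ]; then
        # print the parameters space-separated, as they appear in callTable.all
        printf '%s %s\n' "$func" "$(printf '%s' "$params" | tr '.' ' ')"
    else
        printf '%s\n' "$func"
    fi
}

# Example:
#   parse_outfile_name outfile.fsetpos.b_ptr_file.b_ptr_fpos_t
```

For the listing above, this prints the function name followed by its parameters in the same order they appear in the callTable.all entry.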
> cd ..
> cp callTable.all callTable.backup
callTable.all
stdio.h function int fsetpos b_ptr_file b_ptr_fpos_t
In our example, the only entry in callTable.all should be the line corresponding to fsetpos with parameters b_ptr_file and b_ptr_fpos_t.
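Instead of editing callTable.all by hand, the single entry can be extracted from the backup copy. This is only a sketch under stated assumptions: select_entry is a hypothetical helper, and it assumes each entry ends with the function name followed by its parameters, separated by whitespace, as in the example line.

```shell
# Hypothetical helper: print the uncommented entries of a call table whose
# last fields are the given function name and parameters.
# Usage: select_entry <call table> <function> [<param>...]
select_entry() {
    table=$1; shift
    spec="$*"      # e.g. "fsetpos b_ptr_file b_ptr_fpos_t"
    awk -v spec="$spec" '
        BEGIN { k = split(spec, want, " ") }
        /^[ \t]*#/ { next }          # skip commented entries
        NF > k {
            ok = 1
            # compare the last k fields with the requested specification
            for (i = 1; i <= k; i++) if ($(NF - k + i) != want[i]) ok = 0
            if (ok) print
        }
    ' "$table"
}

# Example (read from the backup, never from the file being written):
#   select_entry callTable.backup fsetpos b_ptr_file b_ptr_fpos_t > callTable.all
```

Check the resulting callTable.all before running the suite; if it is empty or holds more than one line, the match did not behave as assumed.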
.../ballista>perl ostest.pl
If the system crashes, continue with step 3. Otherwise, we will need to check one additional function for the source of the catastrophic error. It is possible that the Ballista OS Robustness Test Suite was unable to record the function under test before the system crashed. Therefore, we will need to determine the function that immediately follows the one we just checked.
In the callTable backup file, find the function that we just tested. The "following" function specification is the first subsequent entry that is neither blank nor commented out.
callTable.all <...>
stdio.h function int fgetpos b_ptr_file b_ptr_fpos_t
stdio.h function int fsetpos b_ptr_file b_ptr_fpos_t
# getchar intentionally omitted since it requires stdin
stdlib.h function long labs b_long
In our example, the blank line and commented line following fsetpos are ignored, and labs with parameter b_long is the "following" function specification.
Now repeat steps 2b and 2c, substituting the "following" function for the last recorded function. If this function reproduces the robustness failure, continue with step 3.
If running the test suite on the "following" function does not reproduce the crash, it is unlikely that the crash you encountered is associated with the operation of the test suite. Copy the callTable backup file to its original location and try restarting the test suite.
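The lookup of the "following" function specification can also be scripted. The sketch below is an illustration only: following_entry is a hypothetical helper, and it assumes that commented entries start with '#' and that each entry ends with the function name and its parameters.

```shell
# Hypothetical helper: print the first entry after the given function
# specification that is neither blank nor commented out.
# Usage: following_entry <call table> <function> [<param>...]
following_entry() {
    table=$1; shift
    spec="$*"
    awk -v spec="$spec" '
        BEGIN { k = split(spec, want, " ") }
        # once the tested entry has been seen, print the next real entry
        found && NF > 0 && $0 !~ /^[ \t]*#/ { print; exit }
        # otherwise look for the uncommented entry ending with the specification
        !found && NF > k && $0 !~ /^[ \t]*#/ {
            ok = 1
            for (i = 1; i <= k; i++) if ($(NF - k + i) != want[i]) ok = 0
            found = ok
        }
    ' "$table"
}

# Example:
#   following_entry callTable.backup fsetpos b_ptr_file b_ptr_fpos_t
```

If the tested function is the last entry in the table, the helper prints nothing.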
> cp callTable.backup callTable.all
> perl ostest.pl
> cp callTable.backup callTable.all
In our example, let's say that the catastrophic failure was associated with fsetpos. In callTable.all, this function entry should therefore now be preceded by a comment character ('#'):
callTable.all <...>
stdio.h function int fgetpos b_ptr_file b_ptr_fpos_t
#stdio.h function int fsetpos b_ptr_file b_ptr_fpos_t
# getchar intentionally omitted since it requires stdin
stdlib.h function long labs b_long
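Commenting out the offending entry can be done with a small filter rather than a text editor. This is a sketch, not part of the test suite: comment_out_entry is a hypothetical helper, and it assumes each entry ends with the function name followed by its parameters.

```shell
# Hypothetical helper: copy a call table to standard output, prefixing '#'
# to the uncommented entry that ends with the given function specification.
# Usage: comment_out_entry <call table> <function> [<param>...]
comment_out_entry() {
    table=$1; shift
    spec="$*"
    awk -v spec="$spec" '
        BEGIN { k = split(spec, want, " ") }
        NF > k && $0 !~ /^[ \t]*#/ {
            ok = 1
            for (i = 1; i <= k; i++) if ($(NF - k + i) != want[i]) ok = 0
            if (ok) { print "#" $0; next }   # disable the matching entry
        }
        { print }                            # pass every other line through
    ' "$table"
}

# Example (read from the backup, write the restored and edited table):
#   comment_out_entry callTable.backup fsetpos b_ptr_file b_ptr_fpos_t > callTable.all
```

Reading from callTable.backup and writing to callTable.all combines the restore and the edit, and leaves the backup untouched.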
Make note of the function associated with the catastrophic failure. You will need this information later. If relevant, you may want to copy the outfile. file associated with the function to another location for further processing. (The outfiles subdirectory will be deleted as part of rerunning the test suite.)
> perl ostest.pl