Section 6.2 Failures
Let us consider a program
\(p\) and a specification
\((P,Q)\text{.}\) If
\(p\) is not
partially correct with respect to
\((P,Q)\) the program has an
error. Then, due to
Definition 6.1.7, there exists a state
\(\sigma\models P\) such that
\(\config p\sigma\) has one of the following outcomes:
\(\config p\sigma\) gets stuck.
\(\config p\sigma\) aborts.
\(\config p\sigma\) terminates properly in a state \(\sigma'\) but \(\sigma'\not\models Q\text{.}\)
Definition 6.2.1. Failure.
A state \(\sigma\models P\) is called a failure with respect to \(p\) and \((P,Q)\) if one of the three cases is true.
A failure is a witness for an error. In general however, failures do not inform us about the cause of the error, i.e. the part of the program that is erroneous. A failure is therefore a symptom not a diagnosis. Let us consider the following example:
unsigned min(unsigned x, unsigned y) {
if (x < y)
return y;
else
return x;
}
unsigned sum_first_n(unsigned *arr, unsigned sz, unsigned n) {
unsigned sum = 0;
n = min(n, sz);
for (unsigned i = 0; i < n; i++)
sum += arr[i];
return sum;
}
Assume that the specification is given, as usual in practice, in plain English:
“The function sum_first_n
shall sum up the first n
, however at most sz
, elements of the array arr
.”
Let us consider the following calls to sum_first_n
:
int main() {
unsigned arr[] = { 1, 2, 3 };
unsigned r1 = sum_first_n(arr, 3, 2); // (1)
unsigned r2 = sum_first_n(arr, 3, 3); // (2)
unsigned r3 = sum_first_n(arr, 3, 4); // (3)
The first call is a failure because our specification demands that the result r1
must be 3, it is however 6.
is not a failure.
causes the program to get stuck. The last iteration of the loop in
sum_first_n
intends to access the array element
arr + 3
which does not exist and therefore the address
arr + 3
is not valid. We discussed
before 6.2.2 that we cannot make any assumptions on the program's behavior in this case.
The cause for the failure in (1) and the getting stuck in (3) is an error (also called defect or bug) in the function min
: min
does not compute the minimum but the maximum.
This examples shows that a failure may tell us little about the cause of the error that failure exposes. In practice, it can be especially hard to localize the error in program code. The first input merely results in a wrong result whereas the third causes undefined behavior which can be hard to observe (see
Remark 6.2.2).
Finding errors in programs can be significantly simplified if programs fail as early as possible when they enter an erroneous state. Then, the program won't continue executing with “wrong” values and thereby dislocating the cause from the symptom. A very common way to achieve this is to use assertions in the code to assert invariants, i.e. conditions on the program state at a particular program point that have to hold for every input. These assertions can then be checked at runtime and if they are violated, the program can be aborted immediately.
Definition 6.2.4. assert().
We define the function \(\CAssert e\) as \(\CIf{e}{\CBlock{}}{\CAbort}\text{.}\)
In C, assertions are typically implemented as macros that are expanded to an if-then-else construct as in
Definition 6.2.4. When defining the macro
NDEBUG
(typically by given the command line switch
-DNDEBUG
) asserts can be deactivated by defining the macro
assert
to the empty text. This is typically done for production code that has been “sufficiently” tested.
Example 6.2.5. Assertions in the Minimum Program.
#include <assert.h>
unsigned min(unsigned x, unsigned y) {
unsigned m = x < y ? x : y;
assert(m <= x && m <= y && (m == x || m == y));
return m;
}
assert
evaluates the argument expression. If the resulting value is 0, then the execution aborts immediately. Hence, the program aborts very close to the error location.
In general, it is advisable to document as much invariants as possible using assertions. They come for free in terms of runtime in the production code because they can just be disabled. They can help to expose bugs and can make it easier to fix them. And they are also helpful for documentation purposes because the denote (at least a part) of the invariant that the programmer had in mind for the particular program point. This holds especially for pre- and postconditions. Note however that the C expression language is often not strong enough to formulate powerful invariants because it does not contain quantifiers like the assertion language we have defined in
Definition 6.1.2 (they would of course be very hard to check at runtime). Therefore, often only a part of the invariant can be documented in an assertion.
Let us consider the following example where we can use assertions productively to specify the precondition. Which in this case is that the array is non empty:
Example 6.2.6.
Consider a function that is supposed find the minimum of all elements of an array. This operation is not well defined on empty arrays. So it should not be possible to call the function with an empty array. We document this by the assertion assert(n > 0);
. In debug builds, this assertion is checked during runtime and readers of the code will see that the precondition is that we don't pass empty arrays.
unsigned min_arr(unsigned* arr, unsigned n) {
assert(n > 0);
unsigned res = arr[0];
for (unsigned i = 1; i < n; i++)
res = min(res, arr[i]);
return res;
}
Note that checking the assertion in debug builds has a concrete benefit: If we accessed the first element of an empty array, the behavior of the program would be undefined with all the consequences outlined in
Remark 6.2.2. Checking the assertion at runtime however directly aborts the program in this situation and the failure we get from that brings us closer to the error location.