Lecture Notes On Software Testing As A Supplement To The Lecture "Dependable Systems"
Katinka Wolter
January 5, 2006
• User-oriented testing is similar to functional testing, except that it exercises the software as a
whole. While functional testing executes single functions, user-oriented testing evaluates
the software as seen by a user. (This is a form of black-box testing.)
Code coverage analysis is a structural testing technique. Structural testing compares test
program behavior against the apparent intention of the source code. Structural testing
examines how the program works, taking into account possible pitfalls in the structure and
logic. Functional testing examines what the program accomplishes, without regard to how it
works internally.
Structural testing is also called path testing since you choose test cases that cause paths to
be taken through the structure of the program. Do not confuse path testing with the path
coverage measure, explained later.
At first glance, structural testing seems unsafe because it cannot find errors of omission.
However, requirements specifications sometimes do not exist, and are rarely complete.
This is especially true near the end of the product development timeline, when the requirements
specification is updated less frequently and the product itself begins to take over the
role of the specification. The difference between functional and structural testing blurs near
release time.
We can distinguish kinds of testing by the software phase in which they are used:
• product (the whole system)
• regression (re-release)
The commonly used white-box testing is used to achieve structural coverage. It consists of
• mutation testing.
In particular we distinguish
1. Statement coverage. Every statement is executed at least once. Does statement
coverage = 1.0 guarantee that a program P is fault-free?
The chief disadvantage of statement coverage is that it is insensitive to some control
structures. For example, consider the following C/C++ code fragment:
    int* p = NULL;
    if (condition)
        p = &variable;
    *p = 123;
Without a test case that causes condition to evaluate false, statement coverage rates
this code fully covered. In fact, if condition ever evaluates false, this code fails. This is
the most serious shortcoming of statement coverage. If-statements are very common.
Statement coverage does not report whether loops reach their termination condition
- only whether the loop body was executed. With C, C++, and Java, this limitation
affects loops that contain break statements.
Since do-while loops always execute at least once, statement coverage considers them
the same rank as non-branching statements.
Statement coverage is completely insensitive to the logical operators (|| and &&).
Statement coverage cannot distinguish consecutive switch labels.
Test cases generally correlate more to decisions than to statements. You probably would
not have 10 separate test cases for a sequence of 10 non-branching statements; you
would have only one test case. For example, consider an if-else statement containing
one statement in the then-clause and 99 statements in the else-clause. After exercising
one of the two possible paths, statement coverage gives extreme results: either 1% or
99% coverage. Basic block coverage eliminates this problem.
Block coverage is the same as statement coverage, except that the unit of measurement is a
sequence of non-branching statements instead of a single statement.
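The pointer fragment above can be turned into a small experiment. The following is a minimal sketch, assuming hand-written instrumentation: the counters `hits` and `outcome`, the function names, and the error return of -1 (in place of the actual null-pointer dereference) are all hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hit counters, one per statement of the fragment. */
enum { S_INIT, S_ASSIGN, S_STORE, S_COUNT };
static int hits[S_COUNT];
/* Hypothetical outcome counters for the decision: [0] = false, [1] = true. */
static int outcome[2];

static int variable;

/* Instrumented variant of the fragment. Returns 0 on success and -1 in the
   situation where the original code would dereference a null pointer. */
int store(int condition)
{
    int *p = NULL;
    hits[S_INIT]++;
    outcome[condition != 0]++;
    if (condition) {
        p = &variable;
        hits[S_ASSIGN]++;
    }
    hits[S_STORE]++;          /* the statement "*p = 123;" is reached */
    if (p == NULL)
        return -1;            /* the uninstrumented code fails here */
    *p = 123;
    return 0;
}

/* Fraction of statements executed at least once. */
double statement_coverage(void)
{
    int covered = 0;
    for (int i = 0; i < S_COUNT; i++)
        covered += hits[i] > 0;
    return (double)covered / S_COUNT;
}

/* Fraction of decision outcomes (true, false) observed at least once. */
double decision_coverage(void)
{
    return ((outcome[0] > 0) + (outcome[1] > 0)) / 2.0;
}

/* A single test case with condition true: every statement is executed,
   yet the failing false outcome has never been tried. */
void single_test_demo(void)
{
    assert(store(1) == 0);
    assert(statement_coverage() == 1.0);
    assert(decision_coverage() == 0.5);
}
```

Calling `store(0)` afterwards raises the decision measure to 1.0 and returns -1, which is the failure that the statement measure did not point to.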
2. Decision coverage is a measure indicating whether every decision in the code evaluated
to true and false.
This measure has the advantage of simplicity without the problems of statement coverage.
A disadvantage is that this measure ignores branches within boolean expressions which
occur due to short-circuit operators. For example, consider the following C/C++/Java
code fragment:
    if (condition1 && (condition2 || function1()))
        statement1;
    else
        statement2;
This measure could consider the control structure completely exercised without a call to
function1. The test expression is true when condition1 is true and condition2 is true, and
the test expression is false when condition1 is false. In this instance, the short-circuit
operators preclude a call to function1.
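The short-circuit effect described above can be observed with a small instrumented sketch. The names condition1, condition2 and function1 follow the text; the call counter, the outcome counters and the helper functions are hypothetical additions for the illustration.

```c
#include <assert.h>

static int function1_calls;   /* how often function1 was actually invoked */
static int outcome[2];        /* decision outcomes seen: [0] = false, [1] = true */

int function1(void)
{
    function1_calls++;
    return 1;
}

/* Returns 1 if the then-branch was taken and 0 if the else-branch was taken. */
int decide(int condition1, int condition2)
{
    int result;
    if (condition1 && (condition2 || function1()))
        result = 1;
    else
        result = 0;
    outcome[result]++;
    return result;
}

/* Nonzero once the decision has evaluated to both true and false. */
int decision_fully_covered(void)
{
    return outcome[0] > 0 && outcome[1] > 0;
}

/* Two test cases that exercise both outcomes of the decision. */
void demo_short_circuit(void)
{
    assert(decide(1, 1) == 1);   /* condition2 is true: || skips function1 */
    assert(decide(0, 0) == 0);   /* condition1 is false: && skips the rest */
    assert(decision_fully_covered());
    assert(function1_calls == 0);
}
```

Only a third test case such as `decide(1, 0)` forces the call to `function1`, and decision coverage gives no hint that this case is missing.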
3. Data flow coverage. This variation of path coverage considers only the subpaths from
variable assignments to subsequent references of the variables. It indicates whether all
def-use pairs are covered. Example:
    S1: x = f()
    S2: p = g(x, .)
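To see how def-use pairs can be counted, consider the following sketch. It does not continue the S1/S2 example; it uses a separate, hypothetical function `compute` with two definitions and two uses of x, and a hypothetical table `pair_hit` that records which definition reached which use.

```c
#include <assert.h>

/* pair_hit[d][u] is nonzero once definition d of x has reached use u. */
static int pair_hit[2][2];

int compute(int a, int b)
{
    int x, y, def;
    if (a) { x = 1; def = 0; }                     /* definition D0 of x */
    else   { x = 2; def = 1; }                     /* definition D1 of x */
    if (b) { y = x + 10; pair_hit[def][0] = 1; }   /* use U0 of x */
    else   { y = x * 10; pair_hit[def][1] = 1; }   /* use U1 of x */
    return y;
}

/* Number of distinct def-use pairs exercised so far (at most 4). */
int covered_pairs(void)
{
    return pair_hit[0][0] + pair_hit[0][1] + pair_hit[1][0] + pair_hit[1][1];
}

void demo_def_use(void)
{
    /* These two tests make both decisions evaluate to true and to false ... */
    assert(compute(1, 1) == 11);
    assert(compute(0, 0) == 20);
    /* ... but they exercise only two of the four def-use pairs. */
    assert(covered_pairs() == 2);
    /* Two further tests cover the remaining pairs. */
    assert(compute(1, 0) == 10);
    assert(compute(0, 1) == 12);
    assert(covered_pairs() == 4);
}
```

In this sketch a test set that is adequate for decision coverage covers only half of the def-use pairs, so the data flow criterion asks for additional test cases.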
Mutation testing may be used to judge the effectiveness of a test set: the test set should
kill all the mutants. Similarly, test generation may be based on mutation testing: tests
are generated to kill the mutants. Interestingly, many test criteria may be represented
using mutation testing by simply choosing appropriate mutation operators.
Functional testing uses operational profiles. An operational profile consists of test inputs
together with their relative frequencies of use.
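One way to draw test inputs according to such a profile is to walk the cumulative frequencies. The sketch below assumes a hypothetical structure `profile_entry` and a caller-supplied number u in [0, 1); neither the structure nor the function name comes from the lecture.

```c
#include <assert.h>
#include <stddef.h>

/* One entry of a hypothetical operational profile. */
struct profile_entry {
    const char *input_class;   /* label of the input class */
    double frequency;          /* relative frequency of use; entries sum to 1 */
};

/* Maps u in [0, 1) to the index of an entry by walking the cumulative
   frequencies, so that entry i is chosen with probability frequency[i]
   when u is uniformly distributed. */
size_t select_input(const struct profile_entry *profile, size_t n, double u)
{
    assert(n > 0 && u >= 0.0 && u < 1.0);
    double cumulative = 0.0;
    for (size_t i = 0; i < n; i++) {
        cumulative += profile[i].frequency;
        if (u < cumulative)
            return i;
    }
    return n - 1;   /* guards against rounding in the frequency sum */
}
```

With a profile of, say, 70% queries, 25% updates and 5% administrative commands, a value u = 0.5 selects the first class and u = 0.8 the second. In a test driver u could be obtained from `rand() / (RAND_MAX + 1.0)`.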
1) A test set T is adequate with respect to (wrt) decision coverage, if all decisions in a
software system are covered when executed against all t ∈ T.
2) A test set T is adequate wrt p-use (or c-use), if all p-uses (c-uses) are covered by T .
• For several types of errors (errors of omission), structural testing is not sufficient, but
functional testing is.
Let $e_i$ be the effort in execution $i$ of $P$; then
$$E_k = \sum_{i=l_1}^{l_2} e_i$$
where $e_{l_1}$ and $e_{l_2}$ are the efforts of the first and last execution of $P$ during the
$k$-th failure time interval.
Another view on reliability: The reliability R of P is the probability of no failure over the
entire input domain.
R = Pr{P(d) is correct for any d ∈ D}
Let t be the cumulative effort over k inter-failure epochs. Let x be the exposure period. The probability
that the software will not fail during the next x time units is written R(x|t).
Convergence of R(x|t):
R(x|t) → R as x → ∞
if the test inputs are operationally significant.
Example: Studies and semi-formal proofs have shown that structural testing is not
able to reveal all faults. For functional testing not even a saturation effect can be proven.
In a study, TeX by Knuth and AWK by Kernighan were tested using the tools TRIPTEST
(TeX) and ATAC.
The coverage statistics are
A possible scenario would employ a test sequence such as the one shown in the following graph.
The dashed fields indicate the saturation region, where the test method employed does not
reveal any more faults.
References
• http://www.bullseye.com/coverage
[Figure: faults revealed (with F and the residual faults marked) plotted against testing effort (t). Region labels: Functional, Decision, Data flow, Mutation, Saturation region.]