CHAPTER 10
CODING AND TESTING
CHAPTER OUTLINE
- Coding
- Code Review
- Software Documentation
- Introduction to Testing
- Unit Testing
- Black-box Testing
- White-box Testing
- Debugging
- Program Analysis Tools
- Integration Testing
- Testing Object-Oriented Programs
- System Testing
- Some General Issues Associated with Testing
LEARNING OBJECTIVES
- Coding standards and guidelines
- Code review
- Software documentation
- Basic concepts in software testing
- Unit testing
- Integration testing
- Testing object-oriented programs
- System testing

In this chapter, we will discuss the coding and testing phases of the software life cycle.
In the coding phase, every module specified in the design document is coded and unit tested. Coding is undertaken once the design phase is complete and the design documents have been successfully reviewed. During unit testing, each module is tested in isolation from other modules. That is, a module is tested independently as and when its coding is complete.
After all the modules of a system have been coded and unit tested, the integration and system testing phase is undertaken. Integration and testing of modules is carried out according to an integration plan. The integration plan, according to which different modules are integrated together, usually envisages integration of modules through a number of steps. During each integration step, a number of modules are added to the partially integrated system and the resultant system is tested. The full product takes shape only after all the modules have been integrated together. System testing is conducted on the full product. During system testing, the product is tested against its requirements as recorded in the SRS document.
We had already pointed out in Chapter 2 that testing is an important phase in software development and typically requires the maximum effort among all the development phases. Usually, testing of any commercial software is carried out using a large number of test cases. It is usually the case that many of the different test cases can be executed in parallel by different team members. Therefore, to reduce the testing time, the largest manpower (compared to all other life cycle phases) is deployed during the testing phase. In a typical development organisation, at any time, the maximum number of software engineers can be found to be engaged in testing activities. It is not very surprising then that in the software industry there is always a large demand for software test engineers. However, many novice engineers bear the wrong impression that testing is a secondary activity and that it is intellectually not as stimulating as the activities associated with the other development phases. Over the years, the general perception of testing as monkeys typing in random data and trying to crash the system has changed; testers are now looked upon as masters of specialised concepts, techniques, and tools.
Coding and Testing 425
10.1 CODING
The input to the coding phase is the design document produced at the end of the design phase. Please recollect that the design documents contain not only the high-level design of the software in the form of a structure chart (representing the module call relationships), but also the detailed design. The detailed design is usually documented in the form of module specifications where the data structures and algorithms for each module are specified. During the coding phase, different modules identified in the design document are coded according to their respective module specifications. The objective of the coding phase is to transform the design of a system into code in a high-level language, and then to unit test this code.
Normally, good software development organisations require their programmers to adhere to some well-defined and standard style of coding, which is called their coding standard. Software development organisations usually formulate their own coding standards that suit them the most, and require their developers to follow the standards rigorously because of the significant business advantages these offer. The main advantages of adhering to a standard style of coding are the following:
- A coding standard gives a uniform appearance to the code written by different engineers.
- It facilitates code understanding and code reuse.
- It promotes good programming practices.
A coding standard lists several rules to be followed during coding, such as the way variables are to be named, the way the code is to be laid out, the error return conventions, etc. Besides coding standards, several coding guidelines are also prescribed by software companies. But what is the difference between a coding guideline and a coding standard? It is mandatory for programmers to follow the coding standards. Compliance of their code with the coding standards is verified during code inspection; any code that does not conform to the coding standards is rejected during code review and is reworked by the concerned programmer. In contrast, coding guidelines provide some general suggestions regarding the coding style to be followed, but leave the actual implementation of these guidelines to the discretion of the individual developers.
After a module has been coded, a code review is usually carried out to ensure that the coding standards are followed and also to detect as many errors as possible before testing. It is important to detect as many errors as possible during code reviews, because reviews are a more efficient way of removing errors from code than defect elimination through testing. We first discuss a few representative coding standards and guidelines. Subsequently, we discuss code review techniques. We then discuss software documentation in Section 10.3.
426 Fundamentals of Software Engineering
Avoid obscure side effects: The side effects of a function call include modifications to the
parameters passed by reference, modification of global variables, and I/O operations that
are not obvious behaviour of the function. An obscure side effect is hard to understand from
a casual examination of the code. For example, suppose a function changes the value of a global variable or performs some file I/O; such behaviour may be difficult to infer from the function's name and header information.
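The guideline can be illustrated with a small C sketch; the function and variable names here are invented for illustration. The first function looks like a pure computation but obscurely updates a global variable and performs I/O, while the second keeps all its effects visible at the call site:

```c
#include <stdio.h>

int totalCalls = 0;   /* global state, not evident at the call site */

/* Looks like a pure computation, but obscurely increments a global
   counter and performs I/O -- behaviour a caller cannot infer from
   the function's name or signature. */
int squareWithSideEffect(int x)
{
    totalCalls++;                     /* hidden modification of a global */
    fprintf(stderr, "called\n");      /* hidden I/O */
    return x * x;
}

/* The side-effect-free alternative: all effects are visible where
   the function is called. */
int square(int x)
{
    return x * x;
}
```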
Do not use an identifier for multiple purposes: Programmers often use the same
identifier to denote several temporary entities. For example, some programmers make use
of a temporary loop variable for also computing and storing the final result. The rationale
that they give for such multiple use of variables is memory efficiency, e.g., three variables
use up three memory locations, whereas when the same variable is used for three different
purposes, only one memory location is used. However, there are several things wrong with this approach, and hence it should be avoided. Some of the problems caused by the use of a variable for multiple purposes are as follows:
- Each variable should be given a descriptive name indicating its purpose. This is not possible if an identifier is used for multiple purposes. Use of a variable for multiple purposes can lead to confusion and make it difficult for somebody trying to read and understand the code.
- Use of variables for multiple purposes usually makes future enhancements more difficult. For example, while changing the final computed result from integer to float type, the programmer might subsequently notice that it has also been used as a temporary loop variable that cannot be of float type.
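The problem can be seen in a small C sketch (the function names are illustrative). In the first version, the loop variable is reused to hold the final result, so neither use can carry a descriptive name; in the second, each variable serves exactly one purpose:

```c
/* Anti-pattern: after the loop, 'i' is reused to hold the final
   result. Neither use of 'i' can be given a descriptive name, and
   changing the result to float later would clash with its use as an
   integer loop index. */
int factorialObscure(int n)
{
    int i, f = 1;
    for (i = 1; i <= n; i++)
        f *= i;
    i = f;      /* the loop variable now stores the result */
    return i;
}

/* Preferred: one descriptively named variable per purpose. */
int factorial(int n)
{
    int loopIndex;
    int result = 1;
    for (loopIndex = 1; loopIndex <= n; loopIndex++)
        result *= loopIndex;
    return result;
}
```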
Code should be well-documented: As a rule of thumb, there should be at least one
comment line on the average for every three source lines of code.
Length of any function should not exceed 10 source lines: A lengthy function is usually
very difficult to understand as it probably has a large number of variables and carries out
many different types of computations. For the same reason, lengthy functions are likely to have a disproportionately larger number of bugs.
Do not use GO TO statements: Use of GO TO statements makes a program unstructured.
This makes the program very difficult to understand, debug, and maintain.
This technique reportedly produces documentation and code that is more reliable
and maintainable than other development methods relying heavily on code execution-
based testing. The main problem with this approach is that testing effort is increased as
walkthroughs, inspection, and verification are time consuming for detecting simple errors.
Also testing-based error detection is efficient for detecting certain errors that escape manual
inspection.
1. Manpower turnover is the software industry jargon for denoting the unusually high rate at which personnel attrition occurs (i.e., personnel leave an organisation).
Observe that the fog index is computed as the sum of two different factors. The first factor computes the average number of words per sentence (total number of words in the document divided by the total number of sentences). This factor therefore accounts for the common observation that long sentences are difficult to understand. The second factor measures the percentage of complex words in the document. The complex words are considered to be those with three or more syllables. Note that a syllable is a part of a word that can be independently pronounced. For example, the word "computer" has three syllables ("com", "pu", and "ter"). Words having three or more syllables are complex words, and the presence of many such words hampers the readability of a document.
PROBLEM 10.1 Consider the following sentence: “The Gunning’s fog index is based
on the premise that use of short sentences and simple words makes a document easy to
understand.” Calculate its fog index.
Solution: The given sentence has 23 words. Four of the words have three or more syllables. The fog index of the problem sentence is therefore
0.4 × (23/1) + (4/23) × 100 ≈ 26.6
If a users’ manual is to be designed for use by factory workers whose educational
qualification is class 8, then the document should be written such that the Gunning’s fog
index of the document does not exceed 8.
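The computation in the worked example above can be sketched as a small C function. Note that this follows the text's formulation, in which the 0.4 factor applies only to the average-sentence-length term; the function and parameter names are our own:

```c
/* Fog index as formulated in the text: 0.4 times the average number
   of words per sentence, plus the percentage of complex words (words
   with three or more syllables). */
double fogIndex(int totalWords, int totalSentences, int complexWords)
{
    double avgSentenceLength = (double)totalWords / totalSentences;
    double percentComplex    = (double)complexWords / totalWords * 100.0;
    return 0.4 * avgSentenceLength + percentComplex;
}
```

For the sentence of Problem 10.1, fogIndex(23, 1, 4) evaluates to about 26.6, matching the hand computation.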
10.4 TESTING
The aim of program testing is to help in identifying all defects in a program. However,
in practice, even after satisfactory completion of the testing phase, it is not possible to
guarantee that a program is error free. This is because the input data domain of most
programs is very large, and it is not practical to test the program exhaustively with respect
to each value that the input can assume. Consider a function taking a floating point number
as an argument. If a tester takes 1 second to type in a value, then even a million testers would not be able to exhaustively test it after trying for a million years. Even with
this obvious limitation of the testing process, we should not underestimate the importance
of testing. We must remember that careful testing can expose a large percentage of the
defects existing in a program, and therefore, testing provides a practical way of reducing
defects in a system.
which the failure occurs. However, unless the conditions under which a software fails are
noted down, it becomes difficult for the developers to reproduce a failure observed by the
testers. For example, a software might fail for a test case only when a network connection
is enabled. Unless this condition is documented in the failure report, it becomes difficult
to reproduce the failure.
Terminologies
As is true for any specialised domain, the area of software testing has come to be associated
with its own set of terminologies. In the following, we discuss a few important terminologies
that have been standardised by the IEEE Standard Glossary of Software Engineering
Terminology [IEEE, 1990]:
A mistake is essentially any programmer action that later shows up as an incorrect result during program execution. A programmer may commit a mistake in almost any of the development activities. For example, during coding a programmer might commit the mistake of not initializing a certain variable, or might overlook the errors that might arise in some exceptional situations such as division by zero in an arithmetic operation. Both these mistakes can lead to an incorrect result during program execution.
An error is the result of a mistake committed by a developer in any of the development activities. Mistakes can give rise to an extremely large variety of errors. One example of an error is a call made to a wrong function.
The terms error, fault, bug, and defect are used interchangeably (that is, as synonyms) by the program testing community. Please note that in the domain of hardware testing, the term fault is used with a slightly different connotation [IEEE, 1990] as compared to the terms error and bug.
PROBLEM 10.2 Can a designer’s mistake give rise to a program error? Give an example
of a designer’s mistake and the corresponding program error.
Solution: Yes, a designer’s mistake can give rise to a program error. For example, a
requirement might be overlooked by the designer, which can lead to it being overlooked
in the code as well.
A failure of a program essentially denotes an incorrect behaviour exhibited by
the program during its execution. An incorrect behaviour is observed either as
production of an incorrect result or as an inappropriate activity carried out by the
program. Every failure is caused by one or more bugs present in the program. In
other words, we can say that every software failure can be traced to one or more bugs
present in the code. The number of possible bugs that can cause a program failure is
extremely large. Out of the large number of the bugs that can cause program failure,
in the following we give three randomly selected examples:
– The result computed by a program is 0, when the correct result is 10.
– A program crashes on an input.
– A robot fails to avoid an obstacle and collides with it.
It may be noted that mere presence of an error in a program code may not necessarily
lead to a failure during its execution.
PROBLEM 10.3 Give an example of a program error that may not cause any failure.
Solution: Consider the following C program segment:
int markList[10]; /* marks of students with roll numbers 1 to 10 */
int roll;         /* student roll number */
...
markList[roll - 1] = mark;
In the above code, if the variable roll assumes zero or some negative value under some circumstances, then an array index out of bound type of error would result. However, it may be the case that for all allowed input values the variable roll is always assigned positive values. Then, no failure would occur. Thus, even if an error is present in the code, it does not show up as a failure for normal input values.
Explanation: An array index out of bound type of error is said to occur when the array index variable assumes a value beyond the array bounds.
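One way to make such a latent error surface as an immediate failure during testing is to guard the array index explicitly. The following minimal sketch follows the mark list fragment above; the guarded accessor function is our own addition for illustration:

```c
#include <assert.h>

#define NUM_STUDENTS 10

int markList[NUM_STUDENTS];   /* marks of students with roll numbers 1 to 10 */

/* Storing a mark through a guarded accessor turns an out-of-bounds
   roll number into a detected failure during testing, instead of a
   silent memory corruption that may never show up as a failure. */
void storeMark(int roll, int mark)
{
    assert(roll >= 1 && roll <= NUM_STUDENTS);  /* fail fast on a bad index */
    markList[roll - 1] = mark;
}
```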
A test case is a triplet [I, S, R], where I is the data input to the program under test,
S is the state of the program at which the data is to be input, and R is the result
expected to be produced by the program. The state of a program is also called its
execution mode. As an example, consider the different execution modes of a certain
text editor software. The text editor can at any time during its execution assume
any of the following execution modes—edit, view, create, and display. In simple
words, we can say that a test case is a set of certain test inputs, the mode in which
the input is to be applied, and the results that are expected during and after the
execution of the test case.
An example of a test case is—[input: “abc”, state: edit, result: abc is displayed], which
essentially means that the input abc needs to be applied in the edit mode, and the
expected result is that the string abc would be displayed.
A test scenario is an abstract test case in the sense that it only identifies the aspects
of the program that are to be tested without identifying the input, state, or output. A
test case can be said to be an implementation of a test scenario. For example, a test
scenario can be the traversal of a path in the control flow graph of the program. In
the test case, the input, the output, and the state at which the input would be applied are designed such that the scenario can be executed. An important automatic test case
design strategy is to first design test scenarios through an analysis of some program
abstraction (model) and then implement the test scenarios as test cases.
A test script is an encoding of a test case as a short program. Test scripts are
developed for automated execution of the test cases.
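The [I, S, R] structure of a test case, and its encoding as a test script, can be sketched in C as follows. The program under test here is a hypothetical stand-in for the text editor example, purely to make the idea concrete:

```c
#include <string.h>

/* A test case as an [I, S, R] triplet: input, program state
   (execution mode), and expected result. */
enum Mode { EDIT, VIEW, CREATE, DISPLAY };

struct TestCase {
    const char *input;      /* I: data input to the program under test  */
    enum Mode   state;      /* S: execution mode at which input applies */
    const char *expected;   /* R: result expected from the program      */
};

/* A stand-in for the program under test: in edit mode it echoes the
   input; this is purely for demonstrating the test-script idea. */
const char *programUnderTest(const char *input, enum Mode state)
{
    return (state == EDIT) ? input : "";
}

/* A test script: a short program that executes a test case and
   reports whether the observed result matches the expected one. */
int runTestCase(const struct TestCase *tc)
{
    const char *observed = programUnderTest(tc->input, tc->state);
    return strcmp(observed, tc->expected) == 0;   /* 1 = pass, 0 = fail */
}
```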
A test case is said to be a positive test case if it is designed to test whether the software correctly performs a required functionality. A test case is said to be a negative test case if it is designed to test whether the software refrains from carrying out something that is not required of the system. As one example each of a positive and a negative test case, consider a function to manage user logins. A positive test case can be designed to check if the function correctly validates a legitimate user entering a correct user name and password. A negative test case here can be one that checks whether the login functionality admits a user entering a wrong or bogus user name or password.
A test suite is the set of all test cases that have been designed by a tester to test a
given program.
Testability of a program indicates the effort needed to validate the program. In other
words, the testability of a requirement is the degree of difficulty to adequately test
an implementation to determine its conformance to its requirements.
PROBLEM 10.4 Suppose two programs have been written to implement essentially the
same functionality. How can you determine which one of these is more testable?
Solution: A program is more testable if it can be adequately tested with a smaller number of test cases. Obviously, a less complex program is more testable. The complexity of a program
can be measured using several types of metrics such as number of decision statements used
in the program. Thus, a more testable program should have a lower structural complexity
metric.
A failure mode of a software denotes an observable way in which it can fail. In other
words, all failures that have similar observable symptoms, constitute a failure mode.
As an example of the failure modes of a software, consider a railway ticket booking
software that has three failure modes—failing to book an available seat, incorrect
seat booking (e.g., booking an already booked seat), and system crash.
Equivalent faults denote two or more bugs that result in the system failing in the
same failure mode. As an example of equivalent faults, consider the following two
faults in C language—division by zero and illegal memory access errors. These two
are equivalent faults, since each of these leads to a program crash.
PROBLEM 10.5 Is it at all possible to develop a highly reliable software, using validation
techniques alone? If so, can we say that all verification techniques are redundant?
Solution: It is possible to develop a highly reliable software using validation techniques
alone. However, this would cause the development cost to increase drastically. Verification
techniques help achieve phase containment of errors and provide a means to cost-effectively
remove bugs.
code. These two approaches to test case design are complementary. That is, a program has to be tested using the test cases designed by both approaches, and testing using one approach does not substitute for testing using the other.
PROBLEM 10.6 Is it a good idea to thoroughly perform any one of black-box or white-
box testing and leave out the other?
Solution: No. Both white-box and black-box tests have to be performed, since some bugs
detected by white-box test cases cannot be detected by black-box test cases and vice-versa.
For example, a requirement that has not been implemented cannot be detected by any
white-box test. On the other hand, some extra functionality that has been implemented by
the code cannot be detected by any black-box test case.
We can think of the test suite designed by using a test case design strategy as a bug filter. When a number of test suite design strategies are used successively to test a program, we can think of the number of bugs in the program successively getting reduced after the application of each bug filter. Execution of a well-designed test suite can detect many errors in a program, but cannot guarantee the complete absence of errors. However, it must be remembered that if a program has been adequately tested using some test suite design approach, then further testing of the program using additional test cases designed using the same approach would yield rapidly diminishing returns. In this context, we discuss an analogy reported in [Beizer, 1990].
Suppose a cotton crop is infested with insects. The farmer may use a pesticide such
as DDT, which kills most of the bugs. However, some bugs survive. The surviving bugs
develop resistance to DDT. When the farmer grows cotton crop in the next season, the bugs
again appear. But, application of DDT does not kill many bugs as the bugs have become
resistant to DDT. The farmer would have to use a different insecticide such as Malathion.
However, the surviving bugs would have become resistant to both DDT and Malathion,
and so on.
PROBLEM 10.7 Suppose 10 different test case design strategies are successively used to test a program. Each test case design strategy is capable of detecting 30% of the bugs existing at the time of application of the strategy. If 1000 bugs existed in the program, determine the number of bugs existing at the end of testing.
Solution: Testing using each test case design strategy detects 30% of the bugs. So, after testing using a strategy, 70% of the bugs existing before application of the strategy survive. At the end of application of the 10 strategies, the number of surviving bugs = 1000 × (0.7)^10 ≈ 1000 × 0.028 = 28.
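The computation in the above solution can be expressed as a short C function (the function and parameter names are illustrative):

```c
/* Number of bugs surviving after applying the given number of test
   strategies, each of which detects the given fraction of the bugs
   present at the time it is applied. */
int survivingBugs(int initialBugs, double detectionRate, int numStrategies)
{
    double surviving = initialBugs;
    for (int i = 0; i < numStrategies; i++)
        surviving *= (1.0 - detectionRate);   /* e.g., 70% survive each round */
    return (int)(surviving + 0.5);            /* round to the nearest bug    */
}
```

For 1000 initial bugs, a 30% detection rate, and 10 strategies, the function returns 28, matching the hand computation.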
they too have been unit tested. In this context, stubs and drivers are designed to provide
the complete environment for a module so that testing can be carried out. The role of stub
and driver modules is pictorially shown in Figure 10.3. The stub and driver modules that are required to provide the necessary environment for carrying out unit testing are briefly discussed in the following.
Stub: A stub module consists of several stub procedures that are called by the module
under test. A stub procedure is a dummy procedure that takes the same parameters as the
function called by the unit under test, but has a highly simplified behaviour. For example,
a stub procedure may produce the expected behaviour using a simple table look up
mechanism, rather than performing actual computations.
FIGURE 10.3 Unit testing with the help of driver and stub modules.
Driver: A driver module contains the non-local data structures that are accessed by the
module under test. Additionally, it should also have the code to call the different functions
of the unit under test with appropriate parameter values for testing.
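The following C sketch illustrates the arrangement of Figure 10.3 for a hypothetical unit under test; the grading function, the stubbed getMark() procedure, the canned mark table, and the grade cut-off are all invented for illustration:

```c
#include <assert.h>

/* --- Unit under test: converts a student's mark to a grade. It calls
       getMark(), which in the real system would query a database
       module that is not yet available for integration. --- */
int getMark(int roll);   /* provided by the stub below */

char gradeOf(int roll)
{
    int mark = getMark(roll);
    return (mark >= 60) ? 'A' : 'B';
}

/* --- Stub: a dummy getMark() replacing the unavailable module. It
       produces the expected behaviour via a simple table look up
       instead of performing the actual database access. --- */
int getMark(int roll)
{
    static const int markTable[] = { 72, 45, 88 };   /* canned data */
    return markTable[roll];
}

/* --- Driver: calls the unit under test with appropriate parameter
       values and checks the results. --- */
void driver(void)
{
    assert(gradeOf(0) == 'A');   /* mark 72 -> grade A */
    assert(gradeOf(1) == 'B');   /* mark 45 -> grade B */
    assert(gradeOf(2) == 'A');   /* mark 88 -> grade A */
}
```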
1. If the input data values to a system can be specified by a range of values, then one valid and two invalid equivalence classes can be defined. For example, if the valid equivalence class is the set of integers in the range 1 to 10 (i.e., [1,10]), then the two invalid equivalence classes are [−∞,0] and [11,+∞].
2. If the input data assumes values from a set of discrete members of some domain, then one equivalence class for the valid input values and another equivalence class for the invalid input values should be defined. For example, if the valid equivalence class is {A,B,C}, then the invalid equivalence class is U−{A,B,C}, where U is the universe of all possible input values.
The main idea behind defining equivalence classes of input data is that testing the code with any one value belonging to an equivalence class is as good as testing the code with any other value belonging to the same equivalence class.
In the following, we illustrate equivalence class partitioning-based test case generation
through three examples.
PROBLEM 10.8 Consider a program unit that takes an input integer that can assume values in the range 0 to 5000 and computes its square root. Determine the equivalence classes and the black-box test suite for the program unit.
Solution: There are three equivalence classes—the set of negative integers, the set of integers in the range 0 to 5000, and the set of integers larger than 5000. Therefore, the test cases must include representatives for each of the three equivalence classes. A possible test suite is: {−5, 500, 6000}.
PROBLEM 10.9 Design the equivalence class test cases for a function that reads two integer pairs (m1, c1) and (m2, c2) defining two straight lines of the form y = mx + c. The function computes the intersection point of the two straight lines and displays the point of intersection.
Solution: First, there are two broad equivalence classes: valid and invalid inputs. The valid class can be divided into the following equivalence classes:
No point of intersection: Parallel lines (m1 = m2, c1 ≠ c2)
One point of intersection: Intersecting lines (m1 ≠ m2)
Infinite points of intersection: Coincident lines (m1 = m2, c1 = c2)
Now, selecting one representative value from each equivalence class, we get the required equivalence class test suite {{(a,a)(a,a)}, {(2,2)(2,5)}, {(5,5)(7,7)}, {(10,10)(10,10)}}. The pair {(a,a)(a,a)} represents the invalid class.
PROBLEM 10.10 Design equivalence class partitioning test suite for a function that reads
a character string of size less than five characters and displays whether it is a palindrome.
Solution: The domain of all input values can be partitioned into two broad equivalence
classes: valid values and invalid values. The valid values can be partitioned into palindromes
and non-palindromes. The equivalence classes are the leaf level classes shown in
Figure 10.4. The equivalence classes are palindromes, non-palindromes, and invalid inputs.
Now, selecting one representative value from each equivalence class, we have the required
test suite: {abc,aba,abcdef}.
PROBLEM 10.11 For a function that computes the square root of the integer values in the range 0 to 5000, determine the boundary value test suite.
Solution: There are three equivalence classes—the set of negative integers, the set of integers in the range 0 to 5000, and the set of integers larger than 5000. The boundary value-based test suite is: {0, −1, 5000, 5001}.
PROBLEM 10.12 Design a boundary value test suite for the palindrome function described in Problem 10.10.
Solution: The equivalence classes have been shown in Figure 10.5. There is a boundary between the valid and invalid equivalence classes at the string length of five characters. Thus, the boundary value test suite is {abcd, abcde}.
FIGURE 10.5 CFG for (a) sequence, (b) selection, and (c) iteration type of constructs.
some aspect of source code and is based on some heuristic. We first discuss some basic
concepts associated with white-box testing, and follow it up with a discussion on specific
testing strategies.
Fault-based testing
A fault-based testing strategy targets to detect certain types of faults. An example of a
fault-based strategy is mutation testing, which is discussed later in this section.
Coverage-based testing
A coverage-based testing strategy attempts to execute (or cover) certain elements of a
program. Popular examples of coverage-based testing strategies are statement coverage,
branch coverage, multiple condition coverage, and path coverage-based testing.
PROBLEM 10.13 Design a statement coverage-based test suite for the following Euclid's GCD computation function:
int computeGCD(int x, int y)
{
    while (x != y) {
        if (x > y)
            x = x - y;
        else
            y = y - x;
    }
    return x;
}
Solution: To design the test cases for achieving statement coverage, the conditional
expression of the while statement needs to be made true and the conditional expression
of the if statement needs to be made both true and false. By choosing the test set {(x =
3, y = 3), (x = 4, y = 3), (x = 3, y = 4)}, all statements of the program would be executed
at least once.
PROBLEM 10.14 For the program of Problem 10.13, determine a test suite to achieve
branch coverage.
Solution: The test suite {(x = 3, y = 3), (x = 3, y = 2), (x = 4, y = 3), (x = 3, y = 4)} achieves
branch coverage.
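The branch coverage test suite designed above can be executed against the function of Problem 10.13 as follows; the expected GCD values are easily checked by hand:

```c
#include <assert.h>

int computeGCD(int x, int y)
{
    while (x != y) {
        if (x > y)
            x = x - y;
        else
            y = y - x;
    }
    return x;
}

/* Executing the branch coverage suite: together these calls make the
   while condition and the if condition each evaluate both true and
   false, thereby also executing every statement at least once. */
void runBranchCoverageSuite(void)
{
    assert(computeGCD(3, 3) == 3);   /* while condition false at entry */
    assert(computeGCD(3, 2) == 1);   /* if branch (x > y) taken        */
    assert(computeGCD(4, 3) == 1);   /* both branches exercised        */
    assert(computeGCD(3, 4) == 1);   /* else branch taken              */
}
```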
It is easy to show that branch coverage-based testing is stronger than statement coverage-based testing. We can prove this by showing that branch coverage ensures statement coverage, but not vice versa.
Theorem 10.1 Branch coverage-based testing is stronger than statement coverage-based
testing.
Proof: We need to show that (a) branch coverage ensures statement coverage, and (b)
statement coverage does not ensure branch coverage.
(a) Branch testing would guarantee statement coverage since every statement must
belong to some branch (assuming that there is no unreachable code).
(b) To show that statement coverage does not ensure branch coverage, it is sufficient
to give an example of a test suite that achieves statement coverage, but does not
cover at least one branch. Consider the following code, and the test suite {5}.
if(x>2) x+=1;
The test suite would achieve statement coverage. However, it does not achieve
branch coverage, since the condition (x > 2) is not made false by any test case in
the suite.
2. Of course, the number of test cases required to achieve MCC is usually lower, due to the short-circuit expression evaluation deployed by compilers.
would be required to test a function that contains a dozen such statements. Modified
Condition/Decision Coverage (MC/DC) testing was proposed to ensure achievement of
almost as much thorough testing as is achieved by MCC, but to keep the number of test
cases linear in the number of basic conditions. Due to this reason, MC/DC has become
very popular and is mandated by several safety-critical system standards such as the US
Federal Aviation Administration (FAA) DO-178C safety-critical standard, and international
standards such as IEC 61508, ISO 26262, EN50128.
The name (MC/DC) implies that it ensures decision coverage and modifies (that is, relaxes) the MCC requirement. The requirement for MC/DC is usually expressed as follows: a test suite achieves MC/DC if, during execution of the test suite, each condition in a decision expression independently affects the outcome of the decision. An atomic condition independently affects the outcome of the decision if the decision outcome changes as a result of changing the truth value of that single atomic condition, while the other conditions maintain their truth values. We can express the requirement for MC/DC as three basic requirements. We now state these requirements and for each give an illustrative example.
Requirement 1: Every decision expression in a program must take both true as well as false values.
Recollect that this requirement is the same as decision coverage (DC). As an example, consider the following decision statement: if (( a>10 ) && (( b<50 ) || ( c==0 ))). In this decision statement, the decision contains three atomic conditions. The decision expression can be made to take true and false values for the following two sets of values: {(a = 20, b = 10, c = 1), (a = 5, b = 10, c = 0)}.
Requirement 2: Every condition in a decision must assume both true and false values.
Recollect that this requirement is the same as basic condition coverage (BCC). As an
example, consider the following decision statement: if (( a>10 ) && (( b<50 ) || ( c==0 ))).
For this decision expression, the test suite {(a = 10, b = 10, c = 5), (a = 20, b = 60, c = 0)}
achieves BCC and meets Requirement 2.
Requirement 3: Each condition in a decision should independently affect the decision's
outcome.
As an example, consider the following decision statement: if (( a>10 ) && (( b<50 ) || ( c==0
))). Consider the test cases {(a = 5, b = 30, c = 1), (a = 15, b = 30, c = 1)}: these achieve toggling
of the outcome of the decision with the toggling of the truth value of the first condition,
while b and c are held constant (b = 30 and c = 1). Now consider the test values
{(a = 15, b = 10, c = 1), (a = 15, b = 50, c = 1)}: these make the second condition in
the decision expression independently influence the outcome of the decision, while
a and c are held at the values a = 15 and c = 1. Finally, consider the pair of
test cases {(60, 60, 0), (60, 60, 1)}. It can be observed that these two test cases let the condition
(c==0) independently determine the decision outcome.
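These three independence checks can be verified mechanically. The following sketch (hypothetical code, not from the text) encodes the decision as a Python function and asserts that toggling each atomic condition alone flips the outcome:

```python
# Sketch: verifying that each atomic condition in the decision
# ((a > 10) && ((b < 50) || (c == 0))) independently affects the outcome.
# The test-case values are the pairs discussed in the text.

def decision(a, b, c):
    return (a > 10) and ((b < 50) or (c == 0))

# Toggling only 'a' flips the decision (b = 30, c = 1 held constant).
assert decision(5, 30, 1) != decision(15, 30, 1)

# Toggling only 'b' flips the decision (a = 15, c = 1 held constant).
assert decision(15, 10, 1) != decision(15, 50, 1)

# Toggling only 'c' flips the decision (a = 60, b = 60 held constant).
assert decision(60, 60, 0) != decision(60, 60, 1)

print("each condition independently affects the decision outcome")
```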
It is not hard to observe that a set of test cases that satisfies Requirement 3 also satisfies
Requirements 1 and 2. We have now explained the requirements for achieving MC/DC.
The question then arises: given a decision expression, how do we determine a set of
test cases that achieves MC/DC?
450 Fundamentals of Software Engineering
PROBLEM 10.16 Design an MC/DC test suite for the following decision statement: if ( A
and B ) then
Solution: We first draw the truth table (Table 10.1).
TABLE 10.1 Truth table for the decision expression (A and B)

Test case   A   B   Decision   Pair for A   Pair for B
    1       T   T      T           3            2
    2       T   F      F           –            1
    3       F   T      F           1            –
    4       F   F      F           –            –
From the truth table in Table 10.1, we can observe that the test cases 1, 2 and 3 together
achieve MC/DC for the given expression.
PROBLEM 10.17 Design a test suite that would achieve MC/DC for the following decision
statement: if( (A && B) || C)
Solution: We first draw the truth table (Table 10.2).
TABLE 10.2 Truth table for the decision expression ((A && B) || C)

Test case   A   B   C   Decision   Pair for A   Pair for B   Pair for C
    1       T   T   T      T           –            –            –
    2       T   T   F      T           6            4            –
    3       T   F   T      T           –            –            4
    4       T   F   F      F           –            2            3
    5       F   T   T      T           –            –            6
    6       F   T   F      F           2            –            5
    7       F   F   T      T           –            –            8
    8       F   F   F      F           –            –            7

From the table, we can observe that the only independence pair for A is (2, 6) and the
only independence pair for B is (2, 4), while C has three independence pairs: (3, 4),
(5, 6) and (7, 8). Therefore, the minimal test suites achieving MC/DC are {2, 3, 4, 6}
and {2, 4, 5, 6}.
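The independence pairs for a decision like this can also be found mechanically. The following sketch (a hypothetical helper, not from the text) enumerates the truth table of ((A && B) || C) and collects, for each condition, the pairs of test cases that differ only in that condition and produce different decision outcomes:

```python
from itertools import product

# Sketch: mechanically finding MC/DC independence pairs for the
# decision ((A && B) || C) by enumerating its truth table.

def decision(A, B, C):
    return (A and B) or C

# Rows enumerate TTT, TTF, TFT, ... matching test cases 1..8.
rows = list(product([True, False], repeat=3))

# For each condition (0=A, 1=B, 2=C), collect pairs of rows that differ
# only in that condition and yield different decision outcomes.
pairs = {0: [], 1: [], 2: []}
for i, r1 in enumerate(rows):
    for j, r2 in enumerate(rows):
        if i < j:
            diff = [k for k in range(3) if r1[k] != r2[k]]
            if len(diff) == 1 and decision(*r1) != decision(*r2):
                pairs[diff[0]].append((i + 1, j + 1))  # 1-based numbering

print("A pairs:", pairs[0])  # [(2, 6)]
print("B pairs:", pairs[1])  # [(2, 4)]
print("C pairs:", pairs[2])  # [(3, 4), (5, 6), (7, 8)]
```

Picking one pair per condition and taking the union of the chosen test cases yields an MC/DC test suite.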
Subsumption hierarchy
We have already pointed out that MCC subsumes the other condition-based test coverage
metrics. In fact, it can be proved that the subsumption hierarchy shown in Figure 10.7
holds for the various coverage criteria discussed. We leave the proof to be worked out
by the reader. Observe that MCC is the strongest, while statement and condition coverage
are the weakest. However, statement coverage and condition coverage are not strictly
comparable and are complementary.
the CFG for these three types of constructs can be drawn. The CFG representation of the
sequence and decision types of statements is straightforward. Please note carefully how
the CFG for the loop (iteration) construct is drawn. For iteration constructs
such as the while construct, the loop condition is tested only at the beginning of the loop,
and therefore control always flows from the last statement of the loop back to the top of the
loop. That is, the loop construct terminates from the first statement (once the loop condition
is found to be false) and never exits the loop at the last statement of the loop. Using
these basic ideas, the CFG of the program given in Figure 10.8(a) can be drawn as shown
in Figure 10.8(b).
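As a sketch of these ideas, the CFG of a small while-loop program (the statement numbering below is hypothetical, not the program of Figure 10.8) can be represented as an adjacency list; note the back edge from the last loop statement to the condition node:

```python
# Sketch: CFG of a hypothetical five-statement while-loop program,
# represented as an adjacency list.
#
#   1: sum = 0
#   2: while (i > 0):   # loop condition, tested at the top
#   3:     sum = sum + i
#   4:     i = i - 1
#   5: print(sum)

cfg = {
    1: [2],
    2: [3, 5],   # condition true -> node 3; condition false -> exit to 5
    3: [4],
    4: [2],      # back edge from the last loop statement to the condition
    5: [],
}

num_nodes = len(cfg)
num_edges = sum(len(succ) for succ in cfg.values())
print("nodes:", num_nodes, "edges:", num_edges)  # nodes: 5 edges: 5
```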
Path
A path through a program is any node and edge sequence from the start node to a terminal
node of the control flow graph of the program. Please note that a program can have more
than one terminal node when it contains multiple exit or return type statements.
Writing test cases to cover all paths of a typical program is impractical, since there can
be an infinite number of paths through a program in the presence of loops. For example, in
Figure 10.5(c), there can be an infinite number of paths such as 12314, 12312314, 12312312314,
etc. If coverage of all paths is attempted, then the number of test cases required would
become infinitely large. Therefore, we can say that all-path testing is impractical. For this
reason, path coverage testing does not try to cover all paths, but only a subset of paths
called linearly independent paths (or basis paths). Let us now discuss what linearly
independent paths are and how to determine them in a program.
How is path testing carried out using the computed McCabe's cyclomatic metric
value?
Knowing the number of basis paths in a program does not make it any easier to design
test cases for path coverage; it only gives an indication of the minimum number of test
cases required for path coverage. For the CFG of a moderately complex program segment
of, say, 20 nodes and 25 edges, you may need several days of effort to identify all the
linearly independent paths in it and to design the test cases. It is therefore impractical to
require the test designers to identify all the linearly independent paths in a code, and then
design the test cases to force execution along each of the identified paths. In practice, for
path testing, the tester usually keeps forming test cases with random data and executing
them until the required coverage is achieved. A testing tool such as a dynamic program
analyser (see Section 10.8.2) is used to determine the percentage of linearly independent
paths covered by the test cases that have been executed so far. If the percentage of linearly
independent paths covered is below 90 per cent, more test cases (with random inputs) are
added to increase the path coverage. Normally, it is not practical to target achievement of
100 per cent path coverage. The first reason is the presence of infeasible paths. Though
the percentage of infeasible paths varies across programs, it is often in the range of
1–10%. An example of an infeasible path is the following:
if(x==1) {…}
if (x==2) {…}
In the above code segment, it is not possible for both conditional expressions to be
true at the same time. Also, McCabe's metric is only an upper bound and does not give
the exact number of basis paths.
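Given a CFG as an adjacency list, McCabe's metric V(G) = E - N + 2 is easy to compute. The following sketch (a hypothetical helper and example graph, not from the text) computes it for a CFG containing one decision and one loop:

```python
# Sketch: computing McCabe's cyclomatic complexity V(G) = E - N + 2
# from a CFG given as an adjacency list.

def cyclomatic_complexity(cfg):
    n = len(cfg)                                  # number of nodes
    e = sum(len(succ) for succ in cfg.values())   # number of edges
    return e - n + 2

# Hypothetical CFG with one if-else and one loop: 3 basis paths.
cfg = {
    1: [2],       # entry
    2: [3, 4],    # if condition
    3: [5],
    4: [5],
    5: [6, 7],    # loop condition
    6: [5],       # loop body, back edge
    7: [],        # exit
}
print(cyclomatic_complexity(cfg))  # 3, an upper bound on basis paths
```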
a subtraction operator. If a mutant does not introduce any error in the program, then the
original program and the mutated program are called equivalent programs.
It is clear that a large number of mutants can be generated for a program. Each time
a mutated program is created by application of a mutation operator, it is tested by using
the original test suite of the program. If at least one test case in the test suite yields an
incorrect result, then the mutant is said to be dead, since the error introduced by the
mutation operator has been successfully detected by the test suite. If a mutant remains alive
even after all the test cases have been exhausted, this indicates an inadequacy of the test
suite, and the test suite is enhanced to kill the mutant. However, it is not always this
straightforward. Remember that a mutated program may be an equivalent program. When
this is the case, it is futile to try to design a test case that would kill the mutant, since it
computes the same results as the original program. An equivalent mutant has to be
recognised and ignored, rather than trying to add a test case to kill it.
An important advantage of mutation testing is that it can be automated to a great
extent. The process of generating mutants can be automated by predefining a set of
primitive changes that can be applied to the program. These primitive changes can be
simple program alterations such as: deleting a statement, deleting a variable definition,
changing the type of an arithmetic operator (e.g., + to -), changing a logical operator
(e.g., and to or), changing the value of a constant, changing the data type of a variable, etc.
A major pitfall of the mutation-based testing approach is that it is computationally very
expensive, since a large number of possible mutants can be generated.
Mutation testing involves generating a large number of mutants, and each mutant
needs to be tested with the full test suite. Mutation testing is therefore obviously not
suitable for manual testing. It is best used in conjunction with a testing tool that
automatically generates the mutants and runs the test suite on each mutant. At present,
several testing tools are available that automatically generate mutants for a given program.
10.8 DEBUGGING
After a failure has been detected, it is necessary to first identify the program statement(s)
that are in error and are responsible for the failure; the error can then be fixed. In this
section, we shall summarise the important approaches available to identify the
error locations. Each of these approaches has its own advantages and disadvantages, and
each is therefore useful in appropriate circumstances. We also provide some guidelines
for effective debugging.
error. This approach becomes more systematic with the use of a symbolic debugger (also
called a source code debugger), because the values of different variables can be easily
checked, and breakpoints and watchpoints can be set to examine the values of variables.
Single stepping using a symbolic debugger is another form of this approach, in which the
developer mentally computes the expected result after every source instruction and checks
whether the same is actually computed, by single stepping through the program.
Backtracking
This is also a fairly common approach. In this approach, starting from the statement at
which an error symptom has been observed, the source code is traced backwards until the
error is discovered. Unfortunately, in the presence of decision statements and loops, this
approach becomes cumbersome: as the number of source lines to be traced back increases,
the number of potential backward paths grows and may become unmanageably large
for complex programs, limiting the use of this approach.
Program slicing
This technique is similar to backtracking. In the backtracking approach, one often has to
examine a large number of statements; the search space can, however, be reduced by defining
slices. A slice of a program for a particular variable at a particular statement is the
set of source lines preceding this statement that can influence the value of that variable
[Mund, Mall and Sarkar, 2002]. Program slicing makes use of the fact that an error in the
value of a variable can be caused by the statements on which it is data dependent.
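A minimal sketch of the idea, assuming the statement-level data dependences have already been extracted (the statement numbering and dependences below are hypothetical):

```python
# Sketch: a simplified backward slice computed from statement-level
# data dependences. deps[s] lists the statements whose values
# statement s directly uses (hypothetical program):
#
#   1: a = input()
#   2: b = input()
#   3: x = a * 2
#   4: y = b + 1
#   5: z = x + y

deps = {
    1: [],
    2: [],
    3: [1],
    4: [2],
    5: [3, 4],
}

def backward_slice(stmt, deps):
    """All statements that can influence the value computed at stmt."""
    result, work = set(), [stmt]
    while work:
        s = work.pop()
        if s not in result:
            result.add(s)
            work.extend(deps[s])
    return sorted(result)

print(backward_slice(5, deps))  # [1, 2, 3, 4, 5]
print(backward_slice(3, deps))  # [1, 3]
```

When debugging a wrong value of z at statement 5, only the statements in its slice need to be examined.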
the basic units of testing. Since methods in an object-oriented program are analogous to
procedures in a procedural program, can we then consider the methods of object-oriented
programs as the basic unit of testing? Weyuker studied this issue and postulated her
anticomposition axiom as follows:
The main intuitive justification for the anticomposition axiom is the following. A
method operates in the scope of the data and other methods of its object. That is, all the
methods share the data of the class. Therefore, it is necessary to test a method in the
context of these. Moreover, objects can have a significant number of states, and the
behaviour of a method can be different based on the state of the corresponding object.
Therefore, it is not enough to test all the methods and check whether they can be
integrated satisfactorily. A method has to be tested with all the other methods and data
of the corresponding object. Moreover, a method needs to be tested in all the states that
the object can assume. As a result, it is improper to consider a method as the basic unit
of testing an object-oriented program.

Adequate testing of individual methods does not ensure that a class has been
satisfactorily tested.
Thus, in an object-oriented program, unit testing means testing each object in isolation.
During integration testing (called cluster testing in the object-oriented testing literature),
the various unit-tested objects are integrated and tested. Finally, system-level testing is
carried out.

An object is the basic unit of testing of object-oriented programs.
method call have to be identified and tested. This is not easy since the bindings take place
at run-time.
Object states: In contrast to the procedures in a procedural program, objects store data
permanently. As a result, objects do have significant states. The behaviour of an object is
usually different in different states. That is, some methods may not be active in some of
its states. Also, a method may act differently in different states. For example, when a book
has been issued out in a library information system, the book reaches the issuedOut state.
In this state, if the issue method is invoked, then it may not exhibit its normal behaviour.
In view of the above discussion, testing an object in only one of its states is not
enough; the object has to be tested in all its possible states. Also, it should be tested
whether all the transitions between states (as specified in the object model) function
properly. Additionally, it needs to be checked that no extra (sneak) transitions exist, nor
any extra states beyond those defined in the state model. For state-based testing, it
is therefore beneficial to have the state model of the objects, so that the conformance of
the object to its state model can be tested.
State transition coverage: It is tested whether all transitions depicted in the state model
work satisfactorily.
State transition path coverage: All transition paths in the state model are tested.
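These state-based testing ideas can be sketched for the library-book example from the text; the Book class and its two-state model (available and issuedOut) below are hypothetical illustrations:

```python
# Sketch: state-based testing of a hypothetical Book class whose
# state model has two states: 'available' and 'issuedOut'.

class Book:
    def __init__(self):
        self.state = "available"

    def issue(self):
        if self.state == "available":
            self.state = "issuedOut"
            return True
        return False          # issue is not active in the issuedOut state

    def return_book(self):
        if self.state == "issuedOut":
            self.state = "available"
            return True
        return False

# State transition coverage: exercise every transition in the model.
b = Book()
assert b.issue() and b.state == "issuedOut"        # available -> issuedOut
assert b.return_book() and b.state == "available"  # issuedOut -> available

# Sneak-transition check: issue must not behave normally in issuedOut.
b.issue()
assert not b.issue() and b.state == "issuedOut"
print("state model conformance tests passed")
```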
already integrated classes are integrated and tested. This is continued till all the classes
have been integrated and tested.
Stress testing
Stress testing is also known as endurance testing. Stress testing evaluates system performance
when it is stressed for short periods of time. Stress tests are black-box tests which are
designed to impose a range of abnormal and even illegal input conditions so as to stress
the capabilities of the software. Input data volume, input data rate, processing time,
utilisation of memory, etc., are tested beyond the designed capacity. For example, if
an operating system is designed to support fifteen concurrent transactions, then the
system is stressed by attempting to initiate more than fifteen transactions simultaneously. A
real-time system might be tested to determine the effect of the simultaneous arrival of several
high-priority interrupts.
Stress testing is especially important for systems that under normal circumstances
operate below their maximum capacity but may be severely stressed at some peak demand
hours. For example, if the corresponding non-functional requirement states that the response
time should not be more than twenty seconds per transaction when sixty concurrent users are
working, then during stress testing the response time is checked with exactly sixty users
working simultaneously.
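A minimal sketch of such a stress test, assuming a hypothetical transaction manager with a designed capacity of fifteen concurrent transactions; twenty transactions are initiated, and the excess ones should be rejected gracefully rather than crashing the system:

```python
import threading

# Sketch: stressing a hypothetical transaction manager beyond its
# designed capacity of fifteen concurrent transactions.

MAX_CONCURRENT = 15
active = 0
rejected = 0
lock = threading.Lock()

def begin_transaction():
    """Admit a transaction only while below the designed capacity."""
    global active, rejected
    with lock:
        if active >= MAX_CONCURRENT:
            rejected += 1      # graceful rejection, not a crash
        else:
            active += 1

# Initiate twenty concurrent transactions against a capacity of fifteen.
threads = [threading.Thread(target=begin_transaction) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("admitted:", active, "rejected:", rejected)  # admitted: 15 rejected: 5
```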
Volume testing
Volume testing checks whether the data structures (buffers, arrays, queues, stacks, etc.) have
been designed to successfully handle extraordinary situations. For example, the volume
testing for a compiler might be to check whether the symbol table overflows when a very
large program is compiled.
Configuration testing
Configuration testing is used to test system behaviour in various hardware and software
configurations specified in the requirements. Sometimes systems are built to work in
different configurations for different users. For instance, a minimal system might be
required to serve a single user, and other extended configurations may be required to
serve additional users. During configuration testing, the system is configured in each of
the required configurations and it is checked if the system behaves correctly in all required
configurations.
Compatibility testing
This type of testing is required when the system interfaces with external systems (e.g.,
databases, servers, etc.). Compatibility testing aims to check whether the interfaces with
the external systems are performing as required. For instance, if the system needs to
communicate with a large database system to retrieve information, compatibility testing
is required to test the speed and accuracy of data retrieval.
Regression testing
This type of testing is required when a software is maintained to fix some bugs or
enhance functionality, performance, etc. Regression testing is discussed in some detail in
Section 10.13.
Recovery testing
Recovery testing tests the response of the system to the presence of faults, or loss of power,
devices, services, data, etc. The system is subjected to the loss of the mentioned resources
(as discussed in the SRS document) and it is checked if the system recovers satisfactorily.
For example, the printer can be disconnected to check if the system hangs. Or, the power
may be shut down to check the extent of data loss and corruption.
Maintenance testing
This addresses testing the diagnostic programs, and other procedures that are required
to help maintenance of the system. It is verified that the artifacts exist and they perform
properly.
Documentation testing
It is checked whether the required user manuals, maintenance manuals, and technical
manuals exist and are consistent. If the requirements specify the types of audience for
which a specific manual should be designed, then the manual is checked for compliance
with this requirement.
Usability testing
Usability testing concerns checking the user interface to see if it meets all user requirements
concerning the user interface. During usability testing, the display screens, messages, report
formats, and other aspects relating to the user interface requirements are tested. A GUI
being just functionally correct is not enough. Therefore, the GUI has to be checked
against the checklist we discussed in Section 9.5.6.
Security testing
Security testing is essential for software that handles or processes confidential data that
must be guarded against pilfering. It needs to be tested whether the system is foolproof
against security attacks such as intrusion by hackers. Over the last few years, a large
number of security testing techniques have been proposed, including password cracking,
penetration testing, and attacks on specific ports.
Error seeding, as the name implies, involves seeding the code with some known
errors. In other words, some artificial errors are introduced (seeded) into the program.
The number of these seeded errors that are detected in the course of standard testing is
determined. These values, in conjunction with the number of unseeded errors detected
during testing, can be used to predict the following aspects of a program:
The number of errors remaining in the product.
The effectiveness of the testing strategy.
Let N be the total number of defects in the system, and let n of these defects be found
by testing.
Let S be the total number of seeded defects, and let s of these defects be found during
testing. Assuming that seeded defects are detected with the same probability as latent
defects, we get:

n/N = s/S

or

N = S × (n/s)

The number of defects still remaining in the program after testing can be given by:

N − n = n × (S − s)/s
Error seeding works satisfactorily only if the kinds of seeded errors and their frequency
of occurrence closely match the kinds of defects that actually exist. However, it is
difficult to predict the types of errors that exist in a software product. To some extent, the
different categories of latent errors and their frequency of occurrence can be estimated
by analysing historical data collected from similar projects. That is, the data collected
concern the types and frequency of latent errors in earlier related projects. This
gives an indication of the types (and frequency) of errors that are likely to have been
committed in the program under consideration. Based on these data, the different types
of errors can be seeded with the required frequency of occurrence.
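Using the proportionality n/N = s/S, the residual-defect estimate can be computed directly; the defect counts used below are hypothetical illustrations:

```python
# Sketch: estimating residual defects by error seeding, using the
# relations N = S * (n/s) and N - n = n * (S - s)/s.

def estimate_remaining_defects(n, s, S):
    """n: unseeded defects found, s: seeded defects found,
    S: total seeded defects. Returns (estimated total, remaining)."""
    N = S * n / s
    remaining = n * (S - s) / s
    return N, remaining

# Hypothetical numbers: 100 seeded defects, 80 of them found;
# 40 unseeded defects found during the same testing.
N, remaining = estimate_remaining_defects(n=40, s=80, S=100)
print(N, remaining)  # 50.0 estimated total, 10.0 estimated remaining
```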
Test documentation
A piece of documentation that is produced towards the end of testing is the test summary
report. This report normally covers each subsystem and represents a summary of tests which
have been applied to the subsystem and their outcome. It normally specifies the following:
What is the total number of tests that were applied to a subsystem?
Out of the total number of tests how many tests were successful?
How many were unsuccessful, and to what degree were they unsuccessful, e.g.,
was a test an outright failure, or were some of the expected results of
the test actually observed?
Regression testing
Regression testing spans unit, integration, and system testing. Therefore, it is difficult
to classify it into any of these three levels of testing. Instead, it can be considered to
be a separate dimension to these three forms of testing. After a piece of code has been
successfully tested, changes to it later or may be needed for various reasons. Regression
testing is the practice of running an old test suite after each change to the system or after
each bug fix to ensure that no new bug has been introduced due to the change or the
bug fix. However, if only a few statements are changed, then the entire test suite need
not be run—only those test cases that test the functions and are likely to be affected by
the change need to be run.
While resolution testing checks whether the defect has been fixed, regression testing
checks whether the unmodified functionalities still continue to work correctly. Thus,
whenever a defect is corrected and the change is incorporated in the program code, a
danger is that a change introduced to correct an error could actually introduce errors in
functionalities that were previously working correctly. As a result, after a bug-fixing session,
both the resolution and regression test cases need to be run. This is where the additional
effort required to create automated test scripts can pay off. As shown in Figure 10.9, some
test cases may no longer be valid after the change; these are shown as invalid test
cases. The rest are redundant test cases, which check those parts of the program code that
are not at all affected by the change.
FIGURE 10.9 Types of test cases in the original test suite after a change.
Test automation
Testing is usually the most time-consuming and laborious of all software development
activities. This is especially true for the large and complex software products being
developed nowadays. In fact, at present testing cost often exceeds all other development life
cycle costs. With the growing size of programs and the increased importance being given to
product quality, test automation is drawing considerable attention from both industry
and academia. Test automation is a generic term for automating one or more activities of the
test process.
Other than reducing human effort and time, test automation also significantly
improves the thoroughness of testing, because more testing can be carried out using
a large number of test cases within a short period of time without any significant cost
overhead.
The effectiveness of testing depends to a large extent on the exact test case design
strategy used. Considering the large overheads that sophisticated testing techniques
incur, testing in many industrial projects is often carried out using randomly selected
test values. With automation, more sophisticated test case design techniques can be
deployed. Without the use of proper tools, testing large and complex software products
can be extremely time-consuming and laborious. A further advantage of using
testing tools is that automated test results are much more reliable and eliminate human
errors during testing. Regression testing after every change or error correction requires
running many old test cases, and in this situation test automation simplifies running
the test cases again and again. Testing tools hold out the promise of substantial cost and
time reduction in the testing and maintenance phases.
Every software product undergoes significant changes over time. Each time the code
changes, it needs to be tested whether the changes induce any failures in the unchanged
features. Thus, the originally designed test suite needs to be run repeatedly each time the
code changes; of course, additional tests have to be designed and carried out on the
enhanced features. Repeated running of the same set of test cases over and over after
every change is monotonous, boring, and error-prone. Automated testing tools can be of
considerable use in repeatedly running the same set of test cases. Testing tools can
entirely, or at least substantially, eliminate the drudgery of running the same test cases
and also significantly reduce testing costs. A large number of tools are at present available,
both in the public domain and from commercial sources. The tools can be classified into
the following types based on the specific methodology on which they are based.
Capture and playback: With this type of tool, the test cases are executed manually
only once. During the manual execution, the sequence and values of the various inputs
as well as the outputs produced are recorded. On any subsequent occasion, the test
can be automatically replayed and the results checked against the recorded output. An
important advantage of capture and playback tools is that once test data are captured and
the results verified, the tests can be rerun easily and cheaply a large number of times.
Thus, these tools are very useful for regression testing. However, capture and playback
tools have a few disadvantages as well. Test maintenance can be costly when the unit
under test changes, since some of the captured tests may become invalid. Considerable
effort may be required to determine and remove the invalid test cases or to modify the test
input and output data, and new test cases would have to be added for the altered code.
Test script: Test scripts are used to drive an automated test tool. The scripts provide input
to the unit under test and record the output. Testers employ a variety of languages
to express test scripts. An important advantage of test script-based tools is that once the
test script is debugged and verified, it can be rerun easily and cheaply a large number of
times. However, debugging the test script to ensure its accuracy requires significant effort,
and every subsequent change to the unit under test entails effort to identify the impacted
test scripts, modify them, and rerun and reconfirm them.
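A test script in this sense can be sketched in a few lines; the unit under test and the input/expected-output pairs below are hypothetical stand-ins:

```python
# Sketch: a minimal test script that drives a hypothetical unit under
# test, feeds it inputs, and records the outputs for checking.

def unit_under_test(x):
    return x * x          # stand-in for the real unit under test

# The script: (input, expected output) pairs, plus a result log.
script = [(0, 0), (3, 9), (-4, 16)]
log = []

for given, expected in script:
    actual = unit_under_test(given)
    log.append((given, actual, "PASS" if actual == expected else "FAIL"))

for entry in log:
    print(entry)
assert all(status == "PASS" for _, _, status in log)
```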
Random input test: In this type of automated testing tool, test values are randomly
generated to cover the input space of the unit under test. The outputs are ignored, because
analysing them would be extremely expensive; the goal is usually to crash the unit under
test, not to check whether the produced results are correct. An advantage of random input
testing tools is that they are relatively easy to build and use, and this approach can be the
most cost-effective for finding some types of defects. However, random input testing is a
very limited form of testing: it finds only the defects that crash the unit under test, and not
the majority of defects that do not crash the system but simply produce incorrect results.
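A random input testing loop can be sketched as follows; the unit under test and its crash-inducing defect below are hypothetical illustrations:

```python
import random

# Sketch: a random-input testing loop whose only goal is to crash
# the hypothetical unit under test, not to check its outputs.

def unit_under_test(x):
    # A defect: crashes for one narrow region of the input space.
    return 100 // (x - 7)

random.seed(1)                 # fixed seed for reproducibility
crashes = []
for _ in range(1000):
    x = random.randint(0, 20)
    try:
        unit_under_test(x)     # outputs are deliberately ignored
    except Exception as exc:
        crashes.append((x, type(exc).__name__))

print("crashing inputs found:", len(crashes) > 0)
```

Note that a defect which merely returned a wrong value, instead of raising an exception, would go completely undetected by this loop.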
Model-based test: A model is a simplified representation of a program. There can be
several types of models of a program; these models can be either structural models
or behavioural models. Examples of behavioural models are state models and activity
models. State model-based testing generates tests that adequately cover the state space
described by the model.
SUMMARY
In this chapter we discussed the coding and testing phases of the software life cycle.
Most software development organisations formulate their own coding standards and
expect their engineers to adhere to them. On the other hand, coding guidelines serve
as general suggestions to programmers regarding good programming style, but the
implementation of the guidelines is left to the discretion of the individual engineers.
Code review is an efficient way of removing errors as compared to testing, because
code review identifies errors whereas testing identifies failures. Therefore, after
identifying failures, additional effort (debugging) must be expended to locate and fix
the errors.
Exhaustive testing of almost any non-trivial system is impractical. Also, random
selection of test cases is inefficient since many test cases become redundant as they
detect the same type of errors. Therefore, we need to design a minimal test suite
that would expose as many errors as possible.
There are two well-known approaches to testing—black-box testing and white-box
testing. Black box testing is also known as functional testing. Designing test cases
for black box testing does not require any knowledge about how the functions have
been designed and implemented. On the other hand, white-box testing requires
knowledge about internals of the software.
Object-oriented features complicate the testing process, as test cases have to be
designed to detect bugs associated with the new types of features that are specific
to object-oriented programs.
We discussed some important issues in integration and system testing. We observed
that the system test suite is designed based on the SRS document. The two major types
of system testing are functionality testing and performance testing. The functionality
test cases are designed based on the functional requirements and the performance
test cases are designed to test the compliance of the system to the non-functional
requirements documented in the SRS document.