01 Fundamentals of Testing
Validation & Verification
Part of the slides are used with kind permission of Dr Shin Yoo and Dr Yue Jia
COMP0103 [email protected]
Why do we test software?
UCL
COMP0103 [email protected]
Major Software Failures
✤ NASA’s Mars Climate Orbiter: lost in September 1999 due to a units fault (imperial vs. metric) between two integrated components
✤ Ariane 5 explosion
https://www.youtube.com/watch?v=gp_D8r-2hwk
http://www.cas.mcmaster.ca/~baber/TechnicalReports/Ariane5/Ariane5.htm
London Heathrow Terminal 5 Opening
Staff successfully tested the brand-new baggage handling system with over 12,000 test pieces of luggage before the opening to the public
A single real-life scenario caused the entire system to become confused and shut down
Cost of Software Bugs
What is Software Testing?
Level of Testing Goals
With increasing testing process maturity, the goal shifts from ‘to show correctness’ to ‘to show problems’
Software Testing
Software Qualities
✤ Dependability
✤ Correctness
✤ A program is correct if it is consistent with its specification
✤ Seldom practical for non-trivial systems
✤ Reliability
✤ Probability of correct function for some ‘unit’ of behaviour
✤ Relative to a specification and usage profile
✤ Statistical approximation to correctness (100% reliable = correct)
✤ Safety
✤ Preventing hazards (loss of life and/or property)
✤ Robustness
✤ Acceptable (degraded) behaviour under extreme conditions
✤ Performance
✤ Usability
Software Testing
✤ Testing is the process of finding differences between the expected behaviour specified by
system models and the observed behaviour of the implemented system.
✤ Unit testing finds differences between a specification of an object and its realisation as a
component
✤ Structural testing finds differences between the system design model and a subset of
integrated subsystems
✤ Functional testing finds differences between the use case model and the system
✤ When differences are found, the developers identify the defect causing the observed failure and modify the system to correct it; if the system model itself is identified as the cause of the difference, the model is updated to reflect the system
Terminology: Fault, Error, Failure
✤ Fault (defect): an anomaly in the code or design, e.g. a wrong condition
✤ Error: an incorrect internal state caused by activating a fault
✤ Failure: an observable deviation of the system’s behaviour from its specification
Example: Faults, Error, Failure
A patient gives a doctor a list of symptoms: the symptoms are the Failure
Dynamic vs. Static
What About Software Bugs?
How To Deal With Faults
(Source: Object-Oriented Software Engineering: Using UML, Patterns, and Java, 3rd Edition, Prentice Hall, Upper Saddle River, NJ, September 25, 2009)
✤ Fault avoidance
✤ Use Reviews
✤ Fault detection
✤ Fault tolerance
✤ Exception handling
✤ Modular redundancy
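Exception handling as a fault-tolerance mechanism can be sketched as follows. This is a minimal illustration, not from the slides; the class and method names (`SafeDivider`, `divideOrDefault`) are invented for the example.

```java
// Sketch: tolerating a fault at runtime via exception handling.
// Instead of letting a bad input crash the system, the component
// catches the exception and degrades gracefully to a fallback value.
public class SafeDivider {
    static int divideOrDefault(int x, int y, int fallback) {
        try {
            return x / y;              // may throw ArithmeticException when y == 0
        } catch (ArithmeticException e) {
            return fallback;           // fault tolerated: degraded but safe behaviour
        }
    }

    public static void main(String[] args) {
        System.out.println(divideOrDefault(10, 2, -1)); // normal case: 5
        System.out.println(divideOrDefault(10, 0, -1)); // fault tolerated: -1
    }
}
```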
More Terminology
✤ Test Input: a set of input values that are used to execute a given program
✤ Test Oracle: a mechanism for determining whether the actual behaviour of a test input
execution matches the expected behaviour
✤ Test Effectiveness: the extent to which testing reveals faults or achieves other objectives
✤ Testing vs. Debugging: testing reveals faults, while debugging is used to remove them
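The terminology above can be made concrete on a toy system under test. This is a hypothetical sketch (the `max` SUT and names are invented for illustration): the test input is the pair of values, and the comparison against the expected result plays the role of the oracle.

```java
// Sketch: test input + test oracle on a tiny SUT.
public class MaxTest {
    // System under test (illustrative)
    static int max(int a, int b) { return a > b ? a : b; }

    public static void main(String[] args) {
        int a = 3, b = 7;           // test input
        int expected = 7;           // expected behaviour
        int actual = max(a, b);     // test execution
        // Test oracle: does the actual behaviour match the expected behaviour?
        if (actual == expected) {
            System.out.println("PASS");
        } else {
            System.out.println("FAIL: expected " + expected + " but got " + actual);
        }
    }
}
```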
Example
SUT: System Under Test
Testing Activities
Test Design → Test Execution → Test Evaluation
Example
Input x = -2 → SUT → output y = -5 (expected: y > 0)
A Test Case Failed
Input x = -2 → SUT → output y = -5 (expected: y > 0)
The defect behind the failure may lie not only in the SUT itself, but also in the Requirements, the Libs (libraries), the OS, or the Hardware it depends on
A Test Case Failed
…but, when re-executed, sometimes it passes!
Brief Look at Software Lifecycle
Waterfall Model (Royce, 1970)
Requirements
Design
Implementation
Integration
Validation
Deployment
Spiral Model (Boehm, 1988)
Recent Paradigms
✤ Agile?
✤ Test-Driven Development?
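One cycle of test-driven development can be sketched with plain Java checks (in practice a framework such as JUnit would be used; the `isEven` example and names are invented for illustration):

```java
// Sketch of one TDD cycle: write the failing test first (red),
// write just enough code to pass (green), then refactor.
public class TddSketch {
    // Step 2 (green): the minimal implementation that satisfies the tests.
    static boolean isEven(int n) { return n % 2 == 0; }

    public static void main(String[] args) {
        // Step 1 (red): these checks are written BEFORE isEven exists,
        // so the first run fails to compile / fails the test.
        if (!isEven(4)) throw new AssertionError("isEven(4) should be true");
        if (isEven(5)) throw new AssertionError("isEven(5) should be false");
        System.out.println("all tests pass");
        // Step 3 (refactor): clean up the code while keeping the tests green.
    }
}
```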
Testing Activities
(Figure: testing activities split between the Developer and the Client)
Brief Look at Testing Techniques
How Do You Test … ?
Testing Techniques
✤ There is no fixed recipe that always works
Random Testing
✤ Can be either black-box or white-box
✤ Pros:
✤ Cons:
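Random testing can be sketched against the `testMe(x, y) = x / y` example used later in these slides, with the "implicit oracle" that the program should not crash. The harness below is illustrative (the seed, input range, and class name are our choices, not the slides'):

```java
import java.util.Random;

// Sketch: random testing with an implicit (no-crash) oracle.
public class RandomTesting {
    static int testMe(int x, int y) { return x / y; }

    public static void main(String[] args) {
        Random rnd = new Random(42);        // fixed seed for repeatable runs
        int crashes = 0;
        for (int i = 0; i < 1000; i++) {
            int x = rnd.nextInt(201) - 100; // random input in [-100, 100]
            int y = rnd.nextInt(201) - 100;
            try {
                testMe(x, y);               // execute the SUT on random input
            } catch (ArithmeticException e) {
                crashes++;                  // division by zero revealed by chance
            }
        }
        System.out.println("inputs that crashed: " + crashes);
    }
}
```

This illustrates both sides of the slide: random testing is cheap and needs no specification (pro), but it finds the y = 0 fault only by luck and says nothing about subtly wrong outputs (con).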
Combinatorial Testing
✤ Black-box technique
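The motivation for combinatorial testing is the explosion of input-parameter combinations. The sketch below (parameter names and values are invented for illustration) enumerates every configuration exhaustively; combinatorial techniques such as pairwise testing instead pick a much smaller subset that still covers every pair of values.

```java
// Sketch: exhaustive enumeration of test configurations for three
// parameters. 3 * 2 * 4 = 24 configurations even for this tiny example.
public class Combinations {
    static int countAllCombinations() {
        String[] os = {"Linux", "Windows", "macOS"};
        String[] browser = {"Firefox", "Chrome"};
        String[] locale = {"en", "fr", "de", "ja"};
        int count = 0;
        for (String o : os)
            for (String b : browser)
                for (String l : locale)
                    count++;            // each iteration is one test configuration
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countAllCombinations()); // 24
    }
}
```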
Structural Testing
✤ White-box technique
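As a minimal illustration of the white-box idea (the `abs` example is ours, not the slides'), structural testing selects inputs by looking at the code itself, e.g. to achieve branch coverage:

```java
// Sketch: structural testing targets elements of the code itself,
// here both branches of the conditional (branch coverage).
public class AbsValue {
    static int abs(int x) {
        if (x < 0) return -x;   // branch 1: needs a negative input
        return x;               // branch 2: needs a non-negative input
    }

    public static void main(String[] args) {
        // Two test inputs suffice for 100% branch coverage of abs:
        System.out.println(abs(-3)); // covers branch 1
        System.out.println(abs(5));  // covers branch 2
    }
}
```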
Mutation Testing
✤ White-box technique
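A minimal sketch of the mutation idea (the `isAdult` example and the chosen mutation operator are ours): mutation testing evaluates a test suite by whether its tests can distinguish the original program from small syntactic variants (mutants).

```java
// Sketch: a mutant is a small syntactic change to the program.
// A test "kills" the mutant if original and mutant disagree on it.
public class MutationSketch {
    static boolean isAdult(int age) { return age >= 18; }       // original
    static boolean isAdultMutant(int age) { return age > 18; }  // mutant: >= became >

    public static void main(String[] args) {
        // The boundary input 18 makes original and mutant disagree,
        // so a test suite containing it kills this mutant.
        boolean killed = isAdult(18) != isAdultMutant(18);
        System.out.println(killed ? "mutant killed" : "mutant survived");
    }
}
```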
Regression Testing
Model-based Testing
Why Is Testing Hard?
Exhaustive Testing & Oracles
Exhaustive Testing
✤ Can we test each and every program with all possible inputs, and guarantee that it is correct every time? Surely then it IS correct
✤ Example: a program takes three 32-bit integers, tells you whether they can form the three sides of a triangle, and which type if they do
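A plausible implementation of the triangle program is sketched below (this is our illustration, not an official solution). Exhaustively testing it would require trying all (2^32)^3 = 2^96 input combinations, which is hopeless.

```java
// Sketch of the triangle-classification program described above.
public class Triangle {
    static String classify(int a, int b, int c) {
        if (a <= 0 || b <= 0 || c <= 0) return "not a triangle";
        // Triangle inequality; cast to long so a + b cannot overflow int.
        if ((long) a + b <= c || (long) a + c <= b || (long) b + c <= a)
            return "not a triangle";
        if (a == b && b == c) return "equilateral";
        if (a == b || b == c || a == c) return "isosceles";
        return "scalene";
    }

    public static void main(String[] args) {
        System.out.println(classify(3, 4, 5)); // scalene
        System.out.println(classify(2, 2, 2)); // equilateral
        System.out.println(classify(1, 2, 3)); // not a triangle (degenerate)
    }
}
```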
✤ 32-bit integers: between -2^31 and 2^31-1 there are 4,294,967,296 possible values for each input, so three inputs give 2^96 (about 8 × 10^28) combinations
Test Oracle

int testMe(int x, int y) {
    return x / y;
}
✤ In the example, we immediately know something is wrong when we set y to 0: integer division by zero is treated as an error (e.g., an ArithmeticException in Java)
✤ What about those faults that force the program to produce answers that are only slightly wrong?
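One way to catch "slightly wrong" answers without knowing each exact expected output is a property-based oracle. The sketch below is our illustration: Java guarantees that (x / y) * y + (x % y) == x, so a faulty division that is only slightly off would violate the property.

```java
// Sketch: a property-based (partial) oracle for integer division.
public class PropertyOracle {
    static int testMe(int x, int y) { return x / y; }

    // Java guarantees (x / y) * y + (x % y) == x for y != 0;
    // a slightly-wrong quotient breaks this identity.
    static boolean satisfiesDivisionProperty(int x, int y) {
        return testMe(x, y) * y + (x % y) == x;
    }

    public static void main(String[] args) {
        System.out.println(satisfiesDivisionProperty(17, 5)); // true
        System.out.println(satisfiesDivisionProperty(-9, 4)); // true
    }
}
```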
Oracles and Non-Testable Programs
✤ Weyuker observed that many programs are ‘non-testable’, in the
sense that it is nearly impossible to construct an effective oracle for
them
✤ Many numerical algorithms, e.g., multiplication of two large
matrices containing large values
✤ Must somehow compute result independently to validate it
✤ But that independent computation may be just as faulty
✤ Many large distributed real-time programs, e.g., USA’s Strategic
Defence Initiative (SDI), aka ‘Star Wars’
✤ Testing must demonstrate with sufficient confidence that it
would protect the USA from a nuclear attack
Oracles and Reliability Testing
✤ Reliability testing gets around some of the problems of non-testable
programs by applying statistical reasoning to a testing activity
✤ Reliability: The probability of failure-free operation over some stated
period of time
✤ Can be estimated through testing, to a level of precision that depends on
how much testing was performed
✤ The greater the amount of testing, the greater the precision
✤ Butler and Finelli observe that it is physically impossible to attain the
stated reliability targets of many safety-critical systems
✤ Example: achieving ‘nine 9s’ reliability would require centuries of
testing
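The Butler and Finelli point can be made with a back-of-envelope calculation. Assuming a simple exponential failure model (our assumption, not the slides'): to claim a failure rate below λ per hour with confidence c after observing zero failures, roughly -ln(1 - c) / λ failure-free test hours are required.

```java
// Back-of-envelope sketch (assumes an exponential failure model):
// hours of failure-free testing needed to support a failure-rate claim.
public class ReliabilityBudget {
    static double requiredTestHours(double lambda, double confidence) {
        return -Math.log(1 - confidence) / lambda;
    }

    public static void main(String[] args) {
        // A 1e-9 failures/hour target at 99% confidence:
        double hours = requiredTestHours(1e-9, 0.99);
        System.out.printf("%.2e hours (about %.0f years)%n",
                hours, hours / (24 * 365.0));
        // On the order of 10^9 hours, i.e. hundreds of thousands of years,
        // which is why such targets cannot be demonstrated by testing alone.
    }
}
```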
High Dependability vs. Time-to-Market
✤ Mass market products
When To Stop Testing?
✤ When the program has been tested “enough”
✤ Temporal Criteria: the time allocated runs out
✤ Cost Criteria: the budget allocated runs out
✤ Coverage Criteria: a predefined percentage of the elements of a program is
covered by the tests; or test cases covering certain predefined conditions
are selected
✤ Statistical Criteria: a predefined MTBF (mean time between failures) is reached, as judged against a predefined reliability model
✤ Practical Goals
✤ maximising the number of faults found (may require many test cases)
✤ minimising the number of test cases (and therefore the cost of testing)
Competing Goals…
✤ Practical Goals
✤ maximising the number of faults found (may require many test cases)
✤ minimising the number of test cases (and therefore the cost of testing)
http://crest.cs.ucl.ac.uk/about/