
TESTING FOUNDATIONS

In this module, you will investigate a variety of testing principles, models of testing, and types of
systematic testing strategies.
Learning Objectives
- Explain the difference between faults, errors, and failures
- Remember the definitions of several dependability terms and measures
- Interpret test results to determine systemic classes of errors occurring in development, and use this information to focus the testing process
- Describe the "what, where, when, how, and why" principles of testing
- Relate testing goals at different levels of abstraction: unit, subsystem, and system
- Define testable units within a program specification
Lesson No. 1: Dependability Definitions
Hello, my name is Mike Whalen, and I'm going to talk about dependability.
Dependability is what you would expect. We're interested in determining whether or not the
software is dependable, that is whether it delivers a service such that we can rely on it.
Dependability is the quality of delivered service such that reliance can justifiably be placed
on the service.
Service is the system behavior as it's perceived by the user of the system. So in an airplane,
the service of the airplane is to fly people from one destination to another.
A failure occurs when the delivered service deviates from the service specification that
defines what the desired service is. So suppose we have a website where we'd like to be able to buy
things, and when we try to buy something, it returns an error. That would be a failure that the
user can see.
An error is the part of the system state that can lead to a failure. Errors can be latent or
effective: we have some bug buried in our code, and on some execution we actually hit that
bug, and the error becomes effective. Before that, it was latent.
A fault is the root cause of the error. It can be human-caused (e.g., a programmer's mistake) or
physical (e.g., a short circuit).
Now, when we're dealing with mechanical systems, it could be that a piece breaks, so you
actually have a physical fault in that piece. Or you might have an error of cognition, an error of
understanding: the programmer doesn't completely understand what the requirements are for
the system, and so when they start writing the code, they don't do it correctly.
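To make the chain concrete, here is a minimal Python sketch. The `average` function and its missing empty-input check are invented for illustration: the fault is the programmer's faulty assumption, the missing check is a latent error, and it becomes an effective error, and then a visible failure, only on an input that actually hits it.

```python
def average(values):
    # Fault: the programmer assumed the input list is never empty.
    # The missing empty-input check is a latent error: it sits in the
    # code harmlessly until an execution actually reaches it.
    return sum(values) / len(values)

# While inputs meet the assumption, the error stays latent:
print(average([2, 4, 6]))  # 4.0

# On an empty input the error becomes effective, and with no fault
# tolerance in place it surfaces as a failure the user can see:
try:
    average([])
except ZeroDivisionError:
    print("failure: delivered service deviated from the specification")
```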
So just flipping it around and doing it from the other direction: we have programmer mistakes,
which are faults, and those faults are the root cause of latent errors, which are bugs sitting in
the program. Those become effective errors when the program is executing and we actually hit
the line of code that contains the bug. And depending on whether we have some fault tolerance
built into the code, that error may not become visible to the user. But if it does, and it causes the
system to misbehave, then we have a failure. So that could be that the program crashes, or it
returns some error code or other thing that causes the user not to be able to do what they expect
to be able to do. And our goal with testing, but also with a variety of different software
development life cycle processes, is to build dependable software. And in order to do that, we
have to look at four different kinds of approaches to dependability.
Fault avoidance, which is preventing, by construction, certain kinds of faults. So if you look
at different programming languages, C versus Java, for example, they have different kinds of
fault avoidance.
So in C, I can declare an array and then just read past the end of it, and I can have something
called a buffer overflow, which causes lots of security problems. Java, on the other hand, has
array bounds checking, so it's not possible to write code that will cause that failure later
on. So certain kinds of languages actually have built-in fault avoidance. Certain languages such
as Haskell have very rich type systems that allow you, at compile time, to find a lot of the
problems that you might otherwise inject into your program.
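Python behaves like Java here: the runtime checks every index, so the out-of-bounds read that silently returns garbage memory in C is caught immediately. A minimal sketch:

```python
buffer = [0, 1, 2, 3]

# In C, reading buffer[10] could silently return whatever bytes happen
# to sit past the end of the array -- the classic buffer overflow.
# Python, like Java, builds fault avoidance into the language: every
# index is bounds-checked, so the bad read cannot happen silently.
try:
    value = buffer[10]
except IndexError as exc:
    print("fault avoided by the runtime:", exc)
```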
Fault tolerance: So here, what we are going to do is we're going to have redundant subsystems
such that one of them can fail and the rest will continue to operate. So you see this a lot in critical
systems where you have, for example, multiple actuators that control a surface in an airplane.
So if we lose one of the actuators, or it misbehaves for some reason, we can still use the other one
to move that control surface. When you look at big websites, you tend to have lots and lots of
redundant servers, and if one of them crashes, the rest of the servers can handle the remaining
load.
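A toy Python sketch of that failover idea (the handler functions and error type here are invented for illustration, not any particular server framework): one redundant subsystem fails, and the service as a whole keeps operating.

```python
def primary():
    # A hypothetical subsystem that has crashed.
    raise RuntimeError("primary replica is down")

def backup():
    # A redundant subsystem that can take over the same job.
    return "response from backup"

def serve(replicas):
    # Fault tolerance through redundancy: try each replica in turn,
    # so the service keeps operating even though one component failed.
    for handle in replicas:
        try:
            return handle()
        except RuntimeError:
            continue  # this replica failed; fail over to the next one
    raise RuntimeError("all replicas failed")

print(serve([primary, backup]))  # the backup keeps the service alive
```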
Error removal: So in this process, what we're trying to do is get rid of the errors themselves,
and we're going to apply verification to remove latent errors. And finally, we have error
forecasting, which, unlike the other three, is just a way of measuring how likely we are to have
failures based on looking at the behavior of the program.
So if you think about it, under which of these categories is testing?
Okay, hopefully, everyone was able to come up with the error removal.
If we're to look at this graphically, what we'd look at is we have impairments, which are the
things we're trying to avoid, so faults. All programmers make mistakes, and they're going to
eventually introduce errors into the code, but we'd rather that these didn't lead to failures. And so
what we're going to do, is we're going to look at means of achieving dependability. We're going
to look at, in the case of testing, error removal. So we're going to run tests against the software,
and we're going to remove some of those errors from the code. And based on the results of our
testing, we might do error forecasting that says, well, we've run a big test suite for three weeks,
and we haven't found any problems, so we think it's likely to be reliable. But we can also look at
procurement: certain techniques that will prevent faults from being introduced
in the first place. Places like JPL, the automotive industry, and the aerospace industry
have a lot of guidance on what you can and cannot write in your programs, and on
how you design things in order to avoid whole classes of faults.
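The error-forecasting idea mentioned above ("we ran a big test suite for three weeks and found nothing, so we think it's reliable") can be made quantitative. One standard statistical approach, not from the lecture itself, derives a confidence bound from failure-free runs: if the true per-run failure probability were p, the chance of n clean runs is (1 - p)^n, and solving for p gives an upper bound. A sketch:

```python
def failure_rate_upper_bound(clean_runs, confidence=0.95):
    # If the true per-run failure probability were p, the chance of
    # observing clean_runs failure-free runs is (1 - p) ** clean_runs.
    # Solving (1 - p) ** clean_runs = 1 - confidence for p yields an
    # upper bound on p that is consistent with what we observed.
    return 1 - (1 - confidence) ** (1 / clean_runs)

# After 1,000 failure-free runs, we can be 95% confident the per-run
# failure probability is below roughly 0.3%.
print(failure_rate_upper_bound(1000))
```

Note what this does not say: it never proves the software is error-free, it only bounds how often it is likely to fail, which is exactly the forecasting role described above.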
The other thing we can do is be tolerant of faults. So we know that there's a certain level of errors
that we're going to find in code. And we're going to build things around those possibly erroneous
components, in such a way that the system can continue to operate, even in the presence of
errors. And finally, we're going to try and measure our dependability, and we use two different
metrics for doing this. We're going to use both reliability and availability. So, what's the
difference? So availability is the readiness of the software to respond to user requests.
And reliability is continuity of correct service. That seems like those are basically the same
thing, but they're not. Because it turns out that you may have an unreliable system, but if you can
reboot it really fast, it actually is still pretty available. So reliability asks: how long can I run this
thing continuously and have it still work correctly? And availability asks: what are the chances, at
any given time, that the system is available for me? And there are some other measures that are
equally important especially when you start looking at safety critical systems. Safety is the
absence of catastrophic consequences based on failures of the software. And you may think
again, what's the difference between safety and reliability? Well, it turns out, you can have a
system that's very safe, that's very unreliable. So, for example, if your car never starts, it's pretty
safe as long as it's in your driveway, but it's very unreliable.
On the other hand, you can have a fairly reliable system that occasionally is very unsafe. So you
could have a car that, every once in a while, exhibits unintended acceleration. Most of the time,
it's reliable, but when that corner case happens, it's very unsafe. So some other measures that are
important, are integrity, which is the absence of improper system alteration. So when you think
about security and someone taking over your computer, what you have is a failure of integrity.
So they've exploited some buffer overflow, they're able to change the software, and thereby, gain
access to your data. And finally, maintainability: the ability of a system to undergo
modifications and repairs. So, with software, it's always the case that you're upgrading things,
and you're changing the way that the software works if it's successful in being used at scale. So
this maintainability can actually contribute to availability, because if you have to take the software
offline to maintain it and that takes a long time, that's going to decrease your availability.
So then we can turn these into numbers.
So for reliability, we talk about mean time between failures (MTBF): this idea of being able to run the
system continuously for a long time.
Recoverability is how quickly, if the system fails, it can be restored to correct operation. This
is measured as mean time to repair (MTTR).
So then we can put those two numbers together and determine availability: availability is the
mean time between failures divided by the mean time between failures plus the mean time to repair.
Basically, if the system runs for that long on average, and it takes that much extra time to recover
once a failure occurs, that ratio gives you your availability.
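As a quick sketch of that ratio (the hour figures here are made up for illustration), including the unreliable-but-quick-to-reboot case described above:

```python
def availability(mtbf_hours, mttr_hours):
    # Availability = MTBF / (MTBF + MTTR): the fraction of each
    # failure-and-repair cycle that the system spends working.
    return mtbf_hours / (mtbf_hours + mttr_hours)

# An unreliable system that reboots very fast is still quite available:
print(availability(mtbf_hours=10, mttr_hours=0.01))   # ~0.999
# A far more reliable system with slow repairs can be less available:
print(availability(mtbf_hours=1000, mttr_hours=24))   # ~0.977
```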
And when we think about designing software, if you're in a critical environment, you have to plan
for failure. And this doesn't have to mean writing software for airplanes; it could be that you work
at a bank and have hundreds of millions of dollars going through the system.
You have to expect that other software systems are going to lie to you.
That physical actuators and sensors may not behave as expected, and that the hardware you're
running on may also be unreliable. So one of the things that becomes important when you work
in critical systems is being able to determine how robust your system is in the presence of these
failures. And where this comes in to the testing process is that when we look at the expected
behavior for tests, we're going to have to set up a testing environment where we can cause some
of the inputs to be unreliable. And the system should still do the right thing, or we may even
have a stress test where we pull the plug on certain pieces of hardware. This is in fact what
Netflix does to test the reliability and robustness of its systems: it has something
called Chaos Monkey, an automated testing tool that just nukes certain pieces of
software and hardware from time to time, and then they measure how well the system responds.
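In that spirit, here is a toy Python "chaos" experiment (the replica names and in-memory setup are invented for illustration; this is nothing like Netflix's actual tooling): randomly kill one replica, then check that the redundant service as a whole still answers.

```python
import random

def make_replica(name):
    state = {"alive": True}
    def handle():
        if not state["alive"]:
            raise RuntimeError(name + " is down")
        return "response from " + name
    return state, handle

# A toy chaos experiment: randomly nuke one replica, then verify that
# the redundant service as a whole still responds to requests.
random.seed(0)
replicas = [make_replica("server-" + str(i)) for i in range(3)]
victim_state, _ = random.choice(replicas)
victim_state["alive"] = False  # pull the plug on one piece

def serve():
    for _, handle in replicas:
        try:
            return handle()
        except RuntimeError:
            continue  # this replica is dead; fail over to the next
    raise RuntimeError("total outage")

print(serve())  # the surviving replicas still deliver service
```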
So I'm going to talk, just for a minute, about how one piece of guidance for critical systems
looks at planning for failure, and that example is DO-178B, which is used for airplane software.
So, what we do is we categorize pieces of software by what's the worst that can happen if this
thing fails? So there are five levels of criticality. One is catastrophic, which means that if the
software fails, the plane may crash, and in fact is likely to crash. Another one is
hazardous. So this one, it's not going to cause the plane to crash, but it's going to make it really
hard for the crew to fly the airplane. Major, in this case, it's going to significantly increase the
work load of the crew. Minor, in this case, it's an annoyance. And finally, no effect.
And what happens is that we're going to drive the rigor of the testing process based on the
criticality of the software and its robustness. So if we have a Level A piece of software that has
catastrophic failure conditions, we're going to require all kinds of robustness to a variety of
different environmental scenarios. And we're going to define a bunch of different objectives that
the software has to meet. So we're going to talk later on about adequacy criteria for tests. What
kinds of things do we have to do before we consider the system to be adequately tested? And
what we're going to do is we're going to match the software, the level of criticality of the
software to the rigor of the adequacy measure.
So basically, what we're trying to do is to be systematic about
determining how important a piece of software is, and then how much time and effort to spend
verifying it.
And just to recap, what we talked about here is dependability. How much confidence can we
place in a piece of software based on the way it was developed, the testing process that we used
to remove errors from it, and some mechanisms that are built into our design processes, or our
language that prevents certain classes of failures. And the way that we measure
dependability: first, we talk about faults, which are the programmer's errors in
understanding of the situation. Then the latent errors, which are those errors in the brain
turning into errors in the code. And then those turn into effective errors when we're executing the
program and we actually hit that erroneous piece of code, which can turn into failures if we don't
have any fault tolerance techniques in place that can respond to those errors.
And when we look at testing, testing is very important for building dependable software. But it's only
one part of an effective strategy for removing errors and creating dependable
software. We have to look at fault avoidance, fault tolerance,
error removal, and error forecasting.
And in order to create dependable software, it's not even enough for our programs to do the right
thing. When we're talking about critical systems, we have to plan for failure. We have to have
software that's robust. So it's not enough that if all the inputs match our expectations, the system
behaves as intended. We have to be robust to situations where the inputs don't match our
expectations, and be able to respond to those situations.
And so what we're going to do is when we define testing regimens, we're going to determine the
rigor that's necessary based on the criticality of the software that we make.
And just for some references, here is where that information comes from. Thank you.
