How To Break Software: James A. Whittaker
Abstract— This paper describes a number of methods (called “attacks”) to expose design and development flaws in
software. The attacks are manual, exploratory tests designed and executed on-the-fly with little or no overhead.
The attacks were conceived after studying hundreds of real software bugs and generalizing their cause and symptoms.
Two semesters of refinement at the hands of software testing students at the Florida Institute of Technology
have identified dozens of approaches for attacking software with the intent of finding bugs. The attacks have been
very successful, resulting in hundreds of additional bugs— all found as a direct result of the attack strategies— in a
very short period of time with little or no familiarity with the products involved. This paper describes a subset of
the attacks and demonstrates their use to find real bugs in released products.
Introduction
What is it that makes good testers good? What instinct do they possess that guides them so reliably to bugs? Is this
valuable talent teachable?
These questions are the subject of this paper. I believe that good testers are guided by more than instinct. Indeed, it
appears that over the years many testers build an arsenal of standard attacks. Each time they are faced with a new
testing problem they orchestrate their attacks and invariably find bugs. Although these attack strategies rarely get
written down, they serve an important role in manual testing and in mentor-based training of new testers.
We have begun the process of documenting this arsenal by studying real testers and real bugs. In this paper, we
explore a subset of the existing attack strategies that have resulted from this work. Our next challenge is to begin
the process of automating the attacks and to derive measures of their effectiveness.
The attacks fall into at least one of three general categories:
• Input/output attacks
• Data attacks
• Computation attacks
Within each category specific types of attacks can be identified that yield very interesting software failures. In the
sections that follow I describe a number of attack types from each category and include real bugs as examples.
Each bug I demonstrate comes from products developed at Microsoft Corporation. This should not be viewed as
an anti-Microsoft stance on my part. Indeed their reign as the top software company on this planet makes them an
obvious target. But do not assume that Microsoft products are more buggy than those of other vendors. The attacks
described in this paper have broken software from many vendors on almost every conceivable operating platform.
My experience indicates that developers write bugs at a fairly uniform rate regardless of their application domain,
which operating system they use or whether or not they publish their source code. The exception, of course, is web
developers: very little strategy is required to break web software… it is pretty good at dying all on its own.
Attacking Input/Output
Attacks on input and output are what most testers call “black box” testing because no information about internal
data or computation is required to pull them off. Indeed, this is the most common type of testing because looking at
source code is tedious, time consuming and largely unproductive unless you really know what you are looking for
(we’ll discuss what you should be looking for in the next two sections).
I/O attacks include the 10 items listed below, organized by input value, input value combination and input order.
Each attack is discussed next, along with example bugs.
J. Whittaker is an associate professor and chair of software engineering at the Florida Institute of Technology.
His research interests include the technical side of software engineering, specifically, coding, testing and dependability
measurement. He generally steers clear of project management and software process.
done a huge service to our users (not to mention our maintenance developers).
Figure 1 shows an interesting bug my students found in Microsoft Word 2000 in which an error message appeared
twice in a row for no particular reason. This bug was found when attacking the error handling routines by
investigating single values of inputs.
Make sure you force the software to establish default values. Developers very often forget to establish proper
default values when users enter data out of range or configure parameters improperly. Sometimes forcing defaults
means doing nothing at all, an act that can trip up even good developers because it is so unexpected. For example,
in Word 2000 the following dialog has an options menu that, when left unchanged, actually makes controls
disappear when the dialog is redisplayed. Compare the dialog on the left with the one on the right. Notice any missing
controls?
Sometimes forcing defaults requires changing values from their initial settings once and then changing them a
second time to an improper configuration. These back-to-back changes ensure that the default settings can be
re-established once they are changed to other valid values.
Explore allowable character sets for variable input. Some input values are simply problematic, particularly
when you consider that special characters like $, %, #, quotation marks and so forth have special meaning in many
programming languages and often require special handling when they are read as input. If the developer failed to
consider this situation then these inputs may cause the program to fail when they are encountered.
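As a rough sketch of how this attack might be staged, the snippet below generates variations of a base string with one special character placed at the start, middle and end, and feeds them to a stand-in for the code under test. The names `SPECIAL_CHARACTERS`, `special_character_inputs`, `naive_quote` and `find_failures` are illustrative assumptions, not anything taken from the products discussed here.

```python
# Characters that carry special meaning in many languages, shells and formats.
# The exact set is an assumption; extend it for the application under test.
SPECIAL_CHARACTERS = "$%#\"'\\;&|<>`"


def special_character_inputs(base="abc"):
    """Yield `base` with each special character at the start, middle and end."""
    middle = len(base) // 2
    for ch in SPECIAL_CHARACTERS:
        yield ch + base
        yield base[:middle] + ch + base[middle:]
        yield base + ch


def naive_quote(value):
    """Hypothetical code under test: wraps a value in quotes without escaping."""
    return '"' + value + '"'


def find_failures(candidates):
    """Return inputs whose quoted form does not contain exactly two quotes."""
    return [c for c in candidates if naive_quote(c).count('"') != 2]


failures = find_failures(special_character_inputs())
print(failures)
```

Only the inputs containing a double quote break this particular stand-in; a real application would need an oracle suited to its own output.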
Force output size to change by replacing large inputs with small inputs and vice versa. Focusing on the
disposition of output is a lucrative and little-used technique to find bugs. The idea is to think of an output or behavior
that would signify a bug and then try to come up with the inputs that will force that behavior to occur. One
convenient attack along these lines is forcing output areas to be recomputed by changing the length of inputs and input
strings.
A good conceptual example is setting a clock to 9:59 and watching it roll over to 10:00. In the first case the display
area is 4 characters long and in the second it is 5. Going the other way, we establish 12:59 (5 characters) and then
watch the text shrink to 1:00 (4 characters). Too often developers write code to work with the initial case of a blank
display area and are caught out when the display area already has data in it and new data of a different size
is used to replace it.
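The clock scenario can be sketched in a few lines. The `Display` class below is a hypothetical fixed-width display, invented purely for illustration; its `show_buggy` method only works correctly when the area starts out blank, which is exactly the assumption this attack targets.

```python
class Display:
    """Hypothetical fixed-width display that writes characters into cells."""

    def __init__(self, width=8):
        self.cells = [" "] * width

    def show_buggy(self, text):
        # Bug: only the cells covered by the new text are overwritten, so
        # leftovers from a longer previous value stay visible.
        for i, ch in enumerate(text):
            self.cells[i] = ch
        return "".join(self.cells).rstrip()

    def show_fixed(self, text):
        # Clear the whole area first, then write the new text.
        self.cells = [" "] * len(self.cells)
        return self.show_buggy(text)


buggy = Display()
buggy.show_buggy("12:59")
print(buggy.show_buggy("1:00"))   # stale character left behind

fixed = Display()
fixed.show_fixed("12:59")
print(fixed.show_fixed("1:00"))
```

The attack is simply the order of the two calls: a long value first, then a shorter one.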
For example, “WordArt” in PowerPoint has an interesting problem. Suppose we enter a long string as shown below.
Notice that the entire string doesn’t display because it is so long. But that’s not what is really important. Two
things went on when the OK button was pressed. First, the routine computed the size of the output field needed and
then populated the field with the text we entered. Now let’s edit the string and replace it with a single character.
Notice that the display area stays the same size despite the fact that only one character is inserted and the font size
was not changed. Let’s pursue this further. If we edit the string again and type a multi-line string the output is even
more interesting.
I think the point is made and we can move on to the next attack.
Make sure you explore the edges of display areas. This is another attack based on outputs that is very similar to
the previous attack. However, instead of looking for ways to cause the area inside the display to get corrupted, we are
going to concentrate on outside the display area. This time we are going to do things we hope don’t require
recalculation of the display boundaries but simply overflow them.
Considering PowerPoint again, we can draw a textbox and fill it with a superscripted string.
Changing the size of the superscript to a large font causes the top of the exponent to be truncated.
This feature is demonstrated below in conjunction with the following related problem.
Try to force screen refresh problems. This is a major problem for users of modern windows-based GUIs. It is an
even bigger problem for developers: refresh too often and you slow down your application; fail to refresh and you cause
anything from minor annoyances (e.g., requiring the user to force a refresh) to major bugs (preventing the user from
getting work done).
The general idea in searching for refresh problems is to add, delete and move objects around on the screen. This
causes the background object to redisplay, and if it doesn’t do so properly and in a timely fashion, you have just
found the classic refresh bug. It is a good idea to try varying the distance you move an object from its original
location. Move it a little, then move it a lot; move it once or twice, then move it a dozen times.
Continuing with the large superscript example from above, try moving it around on the screen a little at a time.
Note the nasty refresh problem shown below.
Another recurring problem in Office 2000 associated with screen refresh is disappearing text. This is most
annoying in Word just around the page boundaries.
It is interesting to note that just rotating the text box 180 degrees does not reveal the bug. One must follow the
sequence of rotate commands described: rotate 10° (or more) followed by 180°. Undoing the sequence of operations
does not correct the problem either: each time one clicks outside the title area, it disappears.
The reason that input sequencing is such a bug-rich attack strategy is that many operations complete successfully
but leave side-effects that cause future operations to fail. A thorough investigation of input sequences will expose
many of these problems. Sometimes the input sequence doesn’t have to be particularly varied in order to find a bug,
as the next attack shows.
Repeat the same input or input sequence over and over again. This has the effect of gobbling resources and
stressing an application’s stored data space, not to mention uncovering undesirable side-effects. Unfortunately,
most applications are unaware of their own space and time limitations and many developers like to assume that
plenty of resources are always available.
An example of this can be found in Word’s equation editor which seems to be unaware that it can only handle 10
levels of nested brackets.
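A minimal sketch of the repetition attack follows. `BracketEditor` is a hypothetical stand-in that stores open brackets in a fixed-size table and never checks its own limit; it does not claim to reflect how the equation editor is actually implemented. `MAX_DEPTH` and `repeat_until_failure` are likewise assumed names.

```python
MAX_DEPTH = 10  # assumed capacity, mirroring the 10 levels mentioned above


class BracketEditor:
    """Hypothetical editor that tracks open brackets in a fixed-size table."""

    def __init__(self):
        self.slots = [None] * MAX_DEPTH
        self.depth = 0

    def open_bracket(self):
        # Bug: no check that depth is still below MAX_DEPTH.
        self.slots[self.depth] = "("
        self.depth += 1


def repeat_until_failure(action, limit=1000):
    """Apply the same input repeatedly.

    Returns how many applications succeeded before the first failure,
    or None if no failure occurred within `limit` attempts.
    """
    for count in range(limit):
        try:
            action()
        except Exception:
            return count
    return None


editor = BracketEditor()
print(repeat_until_failure(editor.open_bracket))
```

The helper knows nothing about brackets; the same loop can drive any single input or short input sequence.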
Attacking Data
Data is the lifeblood of software; if you manage to corrupt it the software will eventually have to use the bad data
and what happens then may not be pretty. So it is worthwhile to understand how and where data values are
established.
Essentially, data is stored either by reading input and then storing it internally or by storing the result of some
internal computation. So it is through supplying input and forcing computation that we enable data to flow through
the application under test. The attacks on data follow this simple fact as outlined below.
Data Attacks
Attacks by variable value
  1. Force incorrectly typed data to be stored
  2. Force data values to exceed allowable range
Attacks by data element size
  3. Overflow input buffers
  4. Force too many values to be stored
  5. Force too few values to be stored
Attacks by data access
  6. Find alternate ways to modify the same data
Attacks by Variable Value
This class of attacks requires investigation of the data type and allowable values associated with internally stored
data objects. If one has access to the source then this information is readily available. However, significant type
information can be determined through a little exploratory testing and attention to error messages.
Vary the data type used in input fields to find type mismatches. Entering characters where the program expects
integers (and similar attacks) has long proven fruitful, but we have found that such attacks are less successful than
before because of the ease with which type checking and type conversion are handled by modern programming
languages.
Try to exceed allowable ranges of data values. Variable data that is stored is subject to the same attacks as vari-
able data entered as input.
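One way to stage the range attack is to probe values at and just beyond both ends of the allowable range. In the sketch below, `buggy_set_volume` is a hypothetical setter with an intended range of 0 to 100 that forgets to check its lower bound; `boundary_values` and `out_of_range_accepted` are illustrative helper names.

```python
def boundary_values(low, high):
    """Return candidate values at and just beyond both ends of [low, high]."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]


def buggy_set_volume(level):
    """Hypothetical code under test: intended range 0..100, lower bound unchecked."""
    if level > 100:
        raise ValueError("level too high")
    return level


def out_of_range_accepted(setter, low, high):
    """Return the out-of-range boundary values that `setter` failed to reject."""
    accepted = []
    for value in boundary_values(low, high):
        try:
            setter(value)
        except ValueError:
            continue
        if not low <= value <= high:
            accepted.append(value)
    return accepted


print(out_of_range_accepted(buggy_set_volume, 0, 100))
```

Any value in the returned list is one the application stored even though it lies outside the documented range.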
Attacks by Data Element Size
The second class of data attacks is aimed at overflowing and underflowing data structures. In other words, the
attacks attempt to find data that violates the predetermined size constraints of data objects.
The first such attack is the classic buffer overflow.
Try to overflow input buffers. The idea here is to enter long strings to overflow input buffers. This is a favorite
attack of hackers because sometimes the application is still executing a process after it crashes. If a hacker
attaches an executable string to the end of the long input string, the process may execute it.
A buffer overflow in Word 2000 is one such exploitable bug. The bug, in the Find/Replace feature, is shown
below. It is interesting to note that Find field is properly con-
Attacking Computation
Computation Attacks
Attacks by operand
  1. Force computation with illegal operand
  2. Find illegal operand combinations
Attacks by result
  3. Force a computation result to be too large
  4. Force a computation result to be too small
Attacks by feature interaction
  5. Find features that share data poorly
Attacks by Operand
This class of attacks requires investigation of the data type and allowable values associated with operands in one or
more internal computations. If one has access to the source then this information is obtainable. Otherwise, testers
must do their best at determining what computation is taking place and what type of data is being used.
Try to make a computation occur with an illegal operand. Sometimes inputs or stored data are well within the
legal boundaries but are illegal for some types of computation. Division by zero is a good example. Zero is a valid
integer but invalid as the denominator of a division computation.
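The division-by-zero case can be illustrated with a small sketch. `average` is a hypothetical function under test: an empty list is a perfectly legal input, yet it produces a count of zero, which is an illegal divisor. `probe` is an assumed helper name.

```python
def average(values):
    """Hypothetical code under test: divides by the number of values."""
    return sum(values) / len(values)


def probe(func, operand):
    """Run `func` on `operand` and report whether the computation survived."""
    try:
        return ("ok", func(operand))
    except ZeroDivisionError:
        return ("failed", "ZeroDivisionError")


print(probe(average, [2, 4]))
print(probe(average, []))
```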
Try to find a combination of operands that cannot coexist. Computations that have more than one operand are
subject to not only the above attack but also to potential operand conflict.
Attacks by Result
The second class of computation attacks is aimed at overflowing and underflowing data objects that store
computation results.
Try to force the computation of a result that is too large to store. Even simple computations like y=x+1 are
problematic around boundary values. If both x and y are signed 2-byte integers and x has the value 32767 then this
computation will fail because the result will overflow its storage.
Try to force the computation of a result that is too small to store. Same as above but use y=x-1 and assign x
the value –32768.
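The boundary behaviour can be seen directly by storing results in a signed 2-byte integer, whose representable range is -32768 to 32767. The sketch below uses Python's standard `ctypes.c_int16`, which performs no overflow checking and so wraps around silently; `add_int16` is an illustrative helper, not part of any product discussed here.

```python
import ctypes

INT16_MAX = 32767
INT16_MIN = -32768


def add_int16(x, delta):
    """Compute x + delta and store it in a signed 2-byte integer.

    ctypes does no overflow checking, so an out-of-range result wraps.
    """
    return ctypes.c_int16(x + delta).value


print(add_int16(INT16_MAX, 1))    # y = x + 1 at the upper boundary
print(add_int16(INT16_MIN, -1))   # y = x - 1 at the lower boundary
print(add_int16(100, 1))          # well inside the range
```

At the upper boundary the stored result becomes the most negative value, and at the lower boundary it becomes the most positive one, which is the kind of silent corruption this attack is looking for.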
Attacks by Feature Interaction
This last attack category discussed in this paper is perhaps the granddaddy of them all and the one that separates
testing novices from the pros: feature interaction. The problem here is nothing new: different application
features share the same data space and, either through differing assumptions about the disposition of the data or
through the generation of undesirable side-effects, the interaction of the two features causes the application to fail.
But which features share data and could interpret it in conflicting ways is an open question in testing. Right now
we are stuck with trial and error. So this example must suffice.
This example shows an unexpected result when combining footnotes and dual columns on a single page in Word
2000. The problem is that Word computes the page width of a footnote from the reference point of the note. Thus,
if one has two footnotes on the same page, one referenced from a dual column location and one from a single
column location, the single column footnote pushes the dual column footnote to the next page. Also pushed to the next
page is any text between the note’s reference point and the bottom of the page.
The following screen shots illustrate the problem vividly. Where is the second column of text? On the
next page along with the footnote. Can you live with the document looking like this? You’ll have to
unless you find a workaround (which means time spent away from preparing your document).
Conclusion
Simply going through the 21 attacks outlined above should exercise a great deal of an application’s
functionality. Indeed, staging a successful attack usually means experimenting with dozens of
possibilities and pursuing a number of dead-ends. But just because some of this exploration doesn’t
find bugs does not mean that it is not useful. First of all, the time spent using the application familiarizes testers
with the range of possible functionality and leads to new ideas for additional attacks. Second, successful tests are
good news! They indicate that a product is reliable, particularly if that set of tests consists of malicious attacks as
outlined above. If code can withstand this treatment, it may very well withstand whatever users can dish out.
Also, never underestimate the value of having a concrete goal in mind when you are testing. I’ve seen too many
testers waste time poking at a keyboard or making random API calls hoping something breaks. Staging attacks
means formulating clear goals— based specifically on things that could go wrong— and then designing the tests
that investigate the goal. This way, every test has a purpose and progress can be readily monitored.
Finally, remember always that testing should be fun. The attack analogy supports this good-natured view of testing
and adds a little more spice to a very enjoyable pastime. Happy hunting!