Four Stages of Software Engineering
Stage 1: Logic
Stage 1: Began with the work of Alan Turing in the 1930s, shortly before World War 2.
The “Turing Machine” was a thought or paper experiment in how machines might compute things.
This is a description of an actual, working, correct Turing Machine!
Take one Box with a small window cut into it.
Thread some long paper past the window so that it moves one window-sized square at a
time. Call this exposed bit of paper, “the Cell”.
On the inside of the box near the window, draw a square the same size as the window. Call
this, “the Machine”
Print these instructions on the inside of the box:
If the Cell shows “0” and the Machine shows “1” then…
Write “0” in the Machine box.
Move the paper tape one cell to the right.
If the Cell shows “0” and the Machine shows “0” then…
Write “1” in the Machine Box.
Write “1” in the Cell.
If the Cell shows “1” and the Machine shows “1” then…
Move the paper tape one cell to the left.
If the Cell shows “1” and the Machine shows “0” then…
STOP.
Put an idiot in the box (“idiot” is Turing’s original word choice) and tell him to read the
instructions.
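The printed instructions above can be sketched as a short simulation. This is only an illustrative sketch in Python, not part of the original description. Several things are assumptions: the starting conditions (a blank tape of “0” cells and “0” in the Machine box), the reading that a rule which names no write leaves the Cell or the Machine box unchanged, and the names RULES and run.

```python
from collections import defaultdict

# Rule table transcribed from the instructions printed inside the box.
# Key:   (what the Cell shows, what the Machine box shows)
# Value: (new Cell value, new Machine value, paper movement), or None for STOP.
RULES = {
    (0, 1): (0, 0, "R"),   # write 0 in the Machine box, move the paper right
    (0, 0): (1, 1, None),  # write 1 in the Machine box, write 1 in the Cell
    (1, 1): (1, 1, "L"),   # move the paper left
    (1, 0): None,          # STOP
}


def run(tape=None, machine=0, max_steps=1000):
    """Follow the rules until STOP; return (tape, machine, actions taken).

    Assumption: unwritten cells show 0 and the Machine box starts at 0.
    """
    cells = defaultdict(int, tape or {})
    window = 0  # index of the cell currently visible in the window
    steps = 0
    while steps < max_steps:
        rule = RULES[(cells[window], machine)]
        if rule is None:
            return dict(cells), machine, steps
        new_cell, machine, move = rule
        cells[window] = new_cell
        if move == "L":
            window += 1  # paper slides left, so the next cell to the right shows
        elif move == "R":
            window -= 1  # paper slides right, so the next cell to the left shows
        steps += 1
    raise RuntimeError("no STOP reached within max_steps")


final_tape, final_machine, actions = run()
```

Under these assumptions the machine writes a single “1” on the tape and reaches STOP after three actions.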
A Turing Machine can in theory do everything a modern computer can do.
The Turing Machine itself is only a thought experiment, although students of second year
university Logic often must “run” the machine to prove certain things to themselves and
their instructors:
Add 2 and 2 (takes about 80 steps)
Multiply 2 and 3 (takes upwards of 400 steps)
Use this Turing Machine to execute pre-defined sequences of steps that are already
written on the tape (run programs)
Use the Turing Machine to “execute” a different Turing Machine design that uses other
numbers of Window Cells and Machine States (run virtual machines)
All modern and not-so-modern computer languages are based on this Logic.
We do not see this very basic type of calculation because of the many layers of modern
languages that “hide” the extremely low level functions from us.
“Compilers” and “Interpreters” translate modern computer languages into something very
much like Turing Machine language that is custom-tailored to the computer it will run on.
Stage 2: Structure
Take a step back from Turing Machines, Computers, and Computer languages. Is there some way
to make them easier to read and understand?
Stage 2: Structured Programming was “invented” to solve this problem, which it did to a certain
extent.
“Structured” means all programming constructs (all specific computer language functions) fall into
three distinct classes:
Selection (make a decision about something)
Repetition (do something repeatedly, until a condition is met)
Sequence (do a sequence of things)
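The three classes can be seen together in even a very small routine. The following is a minimal sketch in Python; the function name and the task it performs are invented purely for illustration.

```python
def total_of_even_numbers(numbers):
    # Sequence: statements run one after another, starting here.
    total = 0
    # Repetition: do something for each number until the list is exhausted.
    for number in numbers:
        # Selection: decide whether this number should be counted.
        if number % 2 == 0:
            total += number
    return total
```

For example, total_of_even_numbers([1, 2, 3, 4]) adds 2 and 4 and returns 6.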
This approach, when rigorously followed and carefully considered, does result in “better”
programs.
They are easier to write because they are conceived with greater clarity.
They are easier to read.
They are easier to change.
There is still tremendous room for creativity in this method.
Functional example
One might group all Selection code in one place in the program text, all Repetition in
another, and all Sequence in yet a third.
Keep in mind, one single action is a “Sequence” of one.
Each broad section is labeled, and each subsection (for example, “decide which file to
use” or “decide how to open the chosen file”) is also carefully labeled in a way that
indicates how the subsections interrelate.
When something has to be done, it is referred to by its Name. For example,
“Do ChooseFileToOpen” executes the selection block that chooses which file to open.
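The functional grouping described above might look roughly like this in a modern language. This is a sketch only: the file names, the open modes, and the block names other than ChooseFileToOpen are hypothetical.

```python
# --- Selection blocks, grouped together and labeled ---

def choose_file_to_open(today_file_exists):
    """Decide which file to use."""
    if today_file_exists:
        return "orders_today.dat"    # hypothetical file name
    return "orders_archive.dat"      # hypothetical file name


def choose_open_mode(read_only):
    """Decide how to open the chosen file."""
    if read_only:
        return "r"
    return "r+"


# --- Sequence blocks, grouped together and labeled ---

def describe_open_plan(today_file_exists, read_only):
    """Refer to each block by its name, one after another."""
    file_name = choose_file_to_open(today_file_exists)
    mode = choose_open_mode(read_only)
    return f"open {file_name} with mode {mode}"
```

Calling a block by its name, as in describe_open_plan, plays the role of “Do ChooseFileToOpen”.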
Topical example
One might group all File-related blocks into one place, all Screen-related blocks in
another, all Finance-related blocks into a third, and so on.
This is also easy to read and easy to change.
Then, as programs got larger and larger (I have personally worked on a single order entry
program of more than 10,000 lines that opened more than 200 files) they became difficult to read
in spite of having (perhaps) good Structure.
There is simply too much to wade through in all that code.
It becomes difficult to determine the impact an apparently simple change might have on the
rest of the program.
Consider, for example, the well-documented year 2000 bug: programmers, in the
interest of conserving screen and paper real estate, wrote years in only two digits. Come
the year 2000, any program that relied on those years “might” fail to work correctly.
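The two-digit year problem can be shown with a small calculation. This is a minimal sketch in Python; the function names are invented for illustration, and real programs of the era stored and compared years in many different ways.

```python
def years_elapsed_two_digit(start_yy, end_yy):
    """Elapsed years when only the last two digits of each year are kept."""
    return end_yy - start_yy


def years_elapsed_four_digit(start_year, end_year):
    """Elapsed years when the full four-digit year is kept."""
    return end_year - start_year


# From 1999 to 2000 with two-digit years: 00 - 99
wrong = years_elapsed_two_digit(99, 0)
# The same interval with four-digit years: 2000 - 1999
right = years_elapsed_four_digit(1999, 2000)
```

Here wrong is -99 while right is 1. In a large program, finding every place that performs such a calculation is the hard part.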
For example, the UNIX kernel, the heart of UNIX and the heart of Internet communication
software, is very highly structured.
The UNIX kernel is small and compact, consisting of about 60 or so well defined small
blocks of code in a very good structure.
On the average, each block references about 20 other blocks, and each block is referenced
by about 16 other blocks.
Do the math if you want. This very small program is one of the most complex programs
ever written. It is almost impossible to foresee the impact of even the simplest change, so
because of its great importance changes take a long time to be approved and
implemented by an international committee of UNIX kernel overseers.
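For readers who do want to do the math, here is a rough sketch in Python. It takes the averages quoted above at face value. The estimate of how far a change can spread is only an upper bound that ignores overlap between referencing blocks, and the names are invented for illustration.

```python
BLOCKS = 60     # approximate number of blocks, as quoted above
REFS_OUT = 20   # blocks referenced by each block, on average
REFS_IN = 16    # blocks that reference each block, on average

# Total direct references that a reader must keep in mind.
direct_links = BLOCKS * REFS_OUT


def potentially_affected(hops):
    """Upper bound on blocks reachable by following 'is referenced by' links.

    Ignores overlap, and can never exceed the number of other blocks.
    """
    return min(REFS_IN ** hops, BLOCKS - 1)
```

With these figures there are 1,200 direct references. A change to one block may touch 16 others directly, and within two hops the bound already covers all 59 other blocks.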
Stage 3: Heuristics
Take a step back from Structured Design. Is there some way to make large programs and large
collections of programs more manageable? Modern programming languages are so efficient that
programmers can quickly write enormous amounts of code, and that poses enormous code
management problems.
Stage 3: Two important “Heuristics” were defined to guide software engineers toward programs
and systems that were vastly easier to understand and to manage, because their structures were
vastly simpler.
First Heuristic: Minimize coupling.
Coupling refers to the way changes in one block of code (or program) affect another
block of code (or program).
For example, if one program asks for data from another program and expects it to arrive
in a pre-defined string or array of bytes, then these two programs are “coupled” by that
design point! If one program must change the way the data is stored and transmitted,
say by a change as simple as changing the length of one data element in the string or
array, then the other one must be located and changed as well, as must all programs that
are equally coupled to that program.
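The coupling described above can be sketched in a few lines of Python. The record layout (a 10-character name followed by a postal code) and all function names are hypothetical, chosen only to make the point visible.

```python
def pack_record(name, postal_code, name_width=10):
    """Producer: a fixed-width name field followed by the postal code."""
    return name.ljust(name_width)[:name_width] + postal_code


def read_postal_code_coupled(record):
    """Consumer that hard-codes the producer's layout (name is 10 characters)."""
    return record[10:]


def pack_record_named(name, postal_code):
    """Producer that labels each element instead of relying on position."""
    return {"name": name, "postal_code": postal_code}


def read_postal_code_named(record):
    """Consumer that asks for the element by name, whatever its length."""
    return record["postal_code"]


ok = read_postal_code_coupled(pack_record("Ada", "12345"))
# The producer widens the name field to 12 characters; the consumer is unchanged.
broken = read_postal_code_coupled(pack_record("Ada", "12345", name_width=12))
```

Here ok is "12345", but broken picks up two stray spaces in front of the postal code. The named version is not free of coupling, since both sides must still agree on the field names, but a change in one element's length no longer breaks the reader.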
Second Heuristic: Maximize cohesion.
Cohesion, simply put, refers to how focused a program is on doing one thing, whether that
thing is complex or simple.
For example, a block that is defined as GetCustomerAddressFromCustomerID sounds like
it could be very cohesive indeed.
Another example, a block that is defined as UpdateCustomerRecords sounds like it is
trying to do several things that are not necessarily related to one another functionally.
The functions within this block might all have to do with a Customer, but exactly how
closely is Change Customer Address related to Retrieve Customer Order?
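The contrast between the two blocks might be sketched as follows in Python. The data layout and the "action" parameter are assumptions made for illustration. Only the two block names come from the text above.

```python
def get_customer_address_from_customer_id(customers, customer_id):
    """Cohesive: does exactly one thing, and the name says what."""
    return customers[customer_id]["address"]


def update_customer_records(customers, orders, customer_id, action, value=None):
    """Less cohesive: unrelated jobs share one block, chosen by a flag."""
    if action == "change_address":
        customers[customer_id]["address"] = value
        return None
    if action == "retrieve_orders":
        return [order for order in orders if order["customer_id"] == customer_id]
    raise ValueError(f"unknown action: {action}")
```

A caller of the second block has to know which action to ask for and what each one returns, which is a hint that it could be split into separate, more focused blocks.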
This heuristic approach was at first called Modular Programming in the 1960s.
Surprisingly, or perhaps not, Modular Programming is still not consciously and
thoroughly practiced by the vast majority of system developers in the industry, most
of whom serve the Business Systems arena.
It is usually practiced quite well in these areas:
Flight control systems
Computer operating systems
Payroll and Tax systems
Adobe products
Network telecommunications control
It is usually not practiced well in these areas:
Business systems developed in-house
Order Entry, Fulfillment, and Inventory systems
Financial accounting and reporting systems
Marketing systems (product, price, promotion, and placement-on-shelf)
Supply Chain systems developed in-house
Digital Asset management systems
One of the main reasons this approach is not more prevalent is that it is difficult to teach and
to learn.
The ability to think clearly about these two heuristics depends upon learning a new way
to use ordinary language.
There are several related language skills; all have to do with working at a slightly higher
level of abstraction in ordinary communication about what business people do or what
they want to do.
Identifying the most important parts of a statement: Verb, Noun, Adjective,
Adverb, and so on
Analyzing all adjectives, adverbs, and other subordinate constructs into simple
Noun-Verb statements
Equating nouns with business information, and analyzing them semantically to
determine what information is being used and how it is structured.
Equating verbs with four (and only four) information system functions.
o Read (or get) data (or information)
o Write (or send) data
o Update data
o Delete data
Skillful application of this semantic analysis technique, combined with skill in the high-level
design of information systems, can result in highly modular systems.
Highly modular systems are easier to read and understand
Modules are well defined and cohesive
Modules function in ways that are relatively independent of one another—they
are only very slightly coupled at the worst.
Example: Consider a business order entry system with Salesmen, Customers, Products,
and Orders.
There are four primary business nouns, which will correspond to four primary
data objects.
For each data object there are four modules corresponding to Create, Read,
Update, and Delete.
Four times four is sixteen basic system modules—each cohesive, and altogether
only slightly coupled through the database definitions.
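The sixteen basic modules of this example can be enumerated mechanically. The sketch below uses Python; the Verb-Noun naming pattern for the modules is an assumption made for illustration.

```python
from itertools import product

NOUNS = ["Salesman", "Customer", "Product", "Order"]
VERBS = ["Create", "Read", "Update", "Delete"]

# One module per (noun, verb) pair, named in Verb-Noun form.
MODULES = [f"{verb} {noun}" for noun, verb in product(NOUNS, VERBS)]
```

MODULES then holds 16 names, such as "Read Customer" and "Update Order".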
Example extended (1): Consider an Order Create module, where it would be required to
read and perhaps update customer data, create an order header, create order line items,
read product and inventory data, update inventory data, and create drop-ship sub-lines
for order line items.
Note that each task defined above is in the form Verb-Noun, where only the four
approved verbs are used.
Each task can be defined as a separate module, each cohesive and only loosely
coupled with the others.
Note also that the Nouns naturally seem to fall into a hierarchy of relationship.
This is modular analysis based on advanced heuristics, and the resulting system
can be built directly to this design.
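A pared-down version of the Order Create module could be sketched like this in Python. The in-memory dictionary standing in for the database, the field names, and the function names are all assumptions. Drop-ship sub-lines and customer updates are left out to keep the sketch short.

```python
def read_customer(db, customer_id):
    return db["customers"][customer_id]


def read_product(db, product_id):
    return db["products"][product_id]


def update_inventory(db, product_id, quantity_sold):
    db["inventory"][product_id] -= quantity_sold


def create_order_header(db, customer_id):
    order_id = len(db["orders"]) + 1
    db["orders"][order_id] = {"customer_id": customer_id, "lines": []}
    return order_id


def create_order_line(db, order_id, product_id, quantity):
    product = read_product(db, product_id)
    line = {"product_id": product_id, "quantity": quantity, "price": product["price"]}
    db["orders"][order_id]["lines"].append(line)
    update_inventory(db, product_id, quantity)


def create_order(db, customer_id, items):
    """Order Create: a sequence of small Verb-Noun modules."""
    read_customer(db, customer_id)  # raises KeyError for an unknown customer
    order_id = create_order_header(db, customer_id)
    for product_id, quantity in items:
        create_order_line(db, order_id, product_id, quantity)
    return order_id
```

Each helper does one Verb-Noun task, and the modules touch one another only through the shared data definitions.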
Example extended (2): Note the possibility that the module Read Customer Data might
appear in several different places in a hierarchy.
It might be used when looking up a customer’s order history, when creating a new order
for a customer, or when reporting on which customers bought a particular product.
Since the modules have the same names and functions, they are candidates for
making utility modules, where each higher level module refers to one and only one
common utility module for reading customer data.
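The idea of a single utility module shared by several higher-level modules can be sketched as follows in Python. The data layout and the names of the higher-level functions are hypothetical.

```python
def read_customer_data(customers, customer_id):
    """The one and only utility module for reading customer data."""
    return customers[customer_id]


def look_up_order_history(customers, orders, customer_id):
    customer = read_customer_data(customers, customer_id)
    history = [order for order in orders if order["customer_id"] == customer_id]
    return customer["name"], history


def report_customers_who_bought(customers, orders, product):
    buyer_ids = sorted({o["customer_id"] for o in orders if o["product"] == product})
    return [read_customer_data(customers, buyer_id)["name"] for buyer_id in buyer_ids]
```

If the way customer data is stored changes, only read_customer_data needs to be revisited.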
Modular approaches have found their way into modern software development as “Object-
oriented” languages.
“Object-oriented” refers to the way an approach (whether a business analysis approach,
design approach, or code approach) is oriented toward business objects that possess
well-defined attributes.
Business objects are like the nouns mentioned above.
Attributes can be other nouns as in Order and Order Line Item
Or, they are verb-noun objects such as Customer and Read Customer Data.
Notice in the above example, that Read Customer Data is now an attribute of
Customer, and not Order, etc. This means there is now one and only one Read
Customer Data—an attribute (or function, or method) in the Customer object!
The object-oriented approach takes the Verb-Functions that were defined in
Modular design and moves them inside their corresponding Noun-Objects
Having the functions reside inside the objects also alters the way services are
created: the IT “owners” of an object are now responsible for creating all of the
object’s services at the request of the owners of other objects, whereas in purely
Modular approaches each module owner may or may not have had responsibility for
locating and accessing any additional sources of data.
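Moving Read Customer Data inside the Customer object might look like this in Python. The class layout, attribute names, and the shipping_label method are assumptions made for illustration.

```python
class Customer:
    def __init__(self, customer_id, name, address):
        self.customer_id = customer_id
        self.name = name
        self.address = address

    def read_customer_data(self):
        """The single Read Customer Data function, owned by Customer."""
        return {
            "customer_id": self.customer_id,
            "name": self.name,
            "address": self.address,
        }


class Order:
    def __init__(self, order_id, customer):
        self.order_id = order_id
        self.customer = customer   # a noun attribute: the owning Customer
        self.line_items = []       # a noun attribute: Order Line Items

    def add_line_item(self, product, quantity):
        self.line_items.append((product, quantity))

    def shipping_label(self):
        # Order does not read customer records itself; it asks Customer.
        data = self.customer.read_customer_data()
        return f"{data['name']}, {data['address']}"
```

Order requests the service from Customer rather than locating the customer data on its own.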
Modular approaches within programming languages actually go much farther than the
above example, to control how database records are opened, locked, read, updated, and so
on, plus a vast array of other low level functions typically not of interest to the business
user.
Stage 4: Semantics
Let’s step back from Modules now and ask: how can we make this simple enough to use in
real-world business situations where the IT folk might not be all that skilled in the
subtleties of Modular/Heuristic approaches?
Just as Structure was built upon Logic, and then Heuristics were built upon Structure, Semantic
approaches build upon the semantic tools underlying the Heuristic approach.
The primary task is still to break down all communication into its grammatical components,
and then build understanding using the simplest Nouns and Verbs that can be agreed upon.
The task is twofold:
Reach Agreement on needs and solutions to those needs.
Attain Clarity on what is being agreed upon.
Both aspects of the task, agreement and clarity, are necessary.
Agreement alone leaves the question, What did we agree on?
Clarity alone has no basis for doing anything.
Clarity includes and is based on reaching Agreement on the simplest terms being used
and then building a larger picture by combining those terms in unambiguous statements,
paragraphs, specifications, and so on.
One of the largest problems facing analysts and systems managers is the fact that
without clarity and agreement on the simple terms, people will still “agree” on the big
picture even though no two of the people who agreed on it would have agreed on what
the simplest terms used in the agreement mean!
Systems developers are often the worst offenders when it comes to clarity and unambiguous
use of language.
Give it a test: ask your developers and analysts to define these two terms: “requirements,”
and “design”.
Then ask them to explain what constitutes “good” requirements, and “good” design.
!! When interviewing prospective employees for IT positions, be sure to ask them
those questions, and to ask them to provide examples of how they achieved or
contributed to these.
One of the earliest uses of the semantic approach to engineering (and not just Information
System engineering) was the use of Work Breakdown Structures.
Work Breakdown Structures lay out the components and tasks of work that must or
might be done, using unambiguous terminology. It is expected that all employees using
the structure will understand and agree upon the terms used, and all will be quite clear
on the differences among definitions.
IT developers and analysts only rarely use these. An IT developer or analyst typically only
issues rules as to how to interact with him or her, more akin to the structure of
functional or organizational bureaucracy than to the structure of the work being done.
However, nearly all developers and analysts do tend to use the same words, like
“Requirements,” “Design,” “Information,” “Data,” “Function,” “Process,” “Input,” “Good,”
“Validate,” “Program,” “Connection,” and so on.
How To Do It
In order to reach desirable outcomes through communication and agreement, it is necessary
to agree on basic terms and build larger expressions of needs, designs, and so forth from
those and only those basic terms.
It is probably not possible that all words relevant to a particular problem mean exactly
the same thing to everyone involved!
However, it is possible for them to be used in a way that is consistent and unambiguous,
and that will lead to desirable outcomes.
Consider the following argument:
“In order to build a common terminology so that we all know what we are agreeing on, it
is necessary to divide up the task into these basic areas: Requirements, Design,
Construction, Implementation, and Support.”
It kind of makes sense, even though it is unlikely that any two people can agree on
precisely what it means!
Yes, that is true. It illustrates part of the problem of building a common
terminology: is it actually accomplished bottom-up, using simple well-understood
terms, or top-down, using guiding principles and ideas that help motivate people and
help divide up the problem for more clarity?
On the apparent agreement, it is certainly helpful to try to draw a distinction
between requirements-talk and design-talk, for example.
Requirements-talk is between analysts and business partners.
Design-talk is between analysts and designers or developers.
It is the Analyst’s responsibility to ensure that the requirements-talk language
“agrees with” or “corresponds to” the design-talk, or else the design will not satisfy
requirements.
One common trick that Analysts use in this case is to refer to “information” when
talking business, and “data” when talking IT. The reason for this is that information
and data are actually two different things that must correspond in some way.
Hint: Data sits inside computers and has to be managed with the four Verbs of the
Heuristic approach, while information is the way people talk about their business and
must be managed by a more comprehensive semantic model.
The problem is simple, and yet daunting at the same time.
1. Catalog and define all relevant words.
2. Analyze complex words like “validate” into other words that express what is really
happening.
3. Publish the dictionary of terms. Use illustrations, like hierarchical trees, to show how
critical words are interrelated.
4. Hold people to correct usage. Correct incorrect usage. Analysts generally should not
discuss Data with business partners, nor discuss Information with developers.
There are very good reasons for this. People hear different things from different
people they know. If an analyst inadvertently uses “Data” to discuss business objects
with business partners, then the business partners begin to associate “Data-related”
objects with their information, things like “Which server will it be kept on,” which
will confuse the requirements process.
5. Be flexible. Controlling people’s language is a difficult and complex task.
6. Provide translations for different groups of people who are naturally more
comfortable using words in a different way.
7. Ensure that translations across groups retain the necessary clarity for language-
independent design to satisfy all parties.
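A published dictionary with translations for different groups, as in the steps above, could start as something very small. The sketch below is in Python. The two entries, the audience labels "business" and "IT", and the function names are all hypothetical.

```python
# term: (audience that uses the term, equivalent term for the other audience)
DICTIONARY = {
    "customer information": ("business", "customer data"),
    "customer data": ("IT", "customer information"),
}


def translate(term, audience):
    """Return the form of the term that this audience should hear."""
    owner, equivalent = DICTIONARY[term]
    return term if owner == audience else equivalent


def misused_terms(text, audience):
    """List dictionary terms in the text that belong to the other audience."""
    lowered = text.lower()
    return [
        term
        for term, (owner, _) in DICTIONARY.items()
        if owner != audience and term in lowered
    ]
```

A check such as misused_terms could flag a sentence where “customer data” is used with business partners, which supports holding people to correct usage without relying on memory alone.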