Software Engineering Lecture Notes
Paul C. Attie
© Paul C. Attie. All rights reserved.
Contents
I Hoare Logic 11
1 Propositional Logic 13
1.1 Introduction and Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.1.1 Combining Propositions: Logical Connectives . . . . . . . . . . . . . . . . 14
1.1.2 Syntax and Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.1.3 Universal Truth of Propositions . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2.1 Syntax of Propositions — Propositional Formulae . . . . . . . . . . . . . 15
1.2.2 Deductive Systems, Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.2.3 A Deductive System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.2.4 Example Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.2.5 The Simplified Proof Format . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.3 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.3.1 Truth-tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.3.2 Evaluation of Propositions . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.3.3 Satisfiability and Validity, Tautologies . . . . . . . . . . . . . . . . . . . . 27
1.3.4 Semantic Entailment, Soundness, Completeness . . . . . . . . . . . . . . . 28
1.4 Normal Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
II Software Engineering 77
5 Introduction 79
5.1 The Software Construction Problem . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.2 Decomposition and Abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.2.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.2.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.3 Errors in Programs and their Detection . . . . . . . . . . . . . . . . . . . . . . . 83
6 Review of OO Concepts 85
6.1 Java Program Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2 Packages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.3 Variables, references, objects, and mutability . . . . . . . . . . . . . . . . . . . . 85
6.3.1 Mutability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.3.2 Equality and Identity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.3.3 Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.4 Aliasing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.5 Method call . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.6 Type checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
7 Procedural Abstraction 89
7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.1.1 Abstraction by Parametrization . . . . . . . . . . . . . . . . . . . . . . . . 89
7.1.2 Abstraction by Specification . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.2 Specification of a Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
7.2.1 Example Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
7.2.2 Initial and Final Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
7.2.3 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
7.2.4 Example Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
7.2.5 Contract View of Specifications . . . . . . . . . . . . . . . . . . . . . . . . 94
7.3 Designing Procedural Abstractions . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.3.1 Choosing which procedures to implement . . . . . . . . . . . . . . . . . . 94
7.3.2 Desirable qualities of procedure abstractions . . . . . . . . . . . . . . . . . 95
7.4 Example of Functional Decomposition . . . . . . . . . . . . . . . . . . . . . . . . 95
7.5 Another Example of Functional Decomposition . . . . . . . . . . . . . . . . . . . 97
7.6 Behavioral Equivalence of Implementations . . . . . . . . . . . . . . . . . . . . . 98
8 Data Abstraction 99
8.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
8.2 Abstract Data Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
8.3 Specifying Data Abstractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8.4 Using Data Abstractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8.5 Implementing Data Abstractions . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8.5.1 Selecting a representation . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
8.5.2 Implement constructors and methods . . . . . . . . . . . . . . . . . . . . . 101
8.5.3 The Abstraction Function . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
8.5.4 The Representation Invariant . . . . . . . . . . . . . . . . . . . . . . . . . 102
8.5.5 Implementing the abstraction function and representation invariant . . . . 102
8.6 Properties of Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
8.6.1 Benevolent side effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
8.6.2 Exposing the Representation . . . . . . . . . . . . . . . . . . . . . . . . . 103
8.7 Reasoning about data abstractions . . . . . . . . . . . . . . . . . . . . . . . . . . 103
8.8 Example: IntSet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
8.9 Linked Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
10 Testing 121
10.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
10.2 Black Box Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
10.2.1 Testing the cases of a specification . . . . . . . . . . . . . . . . . . . . . . 121
10.2.2 Testing boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . 122
10.3 White Box Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
10.4 Testing Abstract Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
10.5 Unit and Integration Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
10.6 Defensive Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
15 Design 161
15.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
15.2 Design Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
15.2.1 The introductory section . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
15.2.2 The abstraction sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
15.3 The Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
15.3.1 Starting the design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
15.3.2 Designing a target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
15.3.3 Continuing the design: how to select the next target for design . . . . . . 164
Acknowledgments
The material on propositional, predicate, and Hoare Logic is based on Program Construction
and Verification [1] by Roland Backhouse, Prentice-Hall, 1986.
Much of the material in this book is based on Program Development in Java [4], by Barbara
Liskov and John Guttag, Addison-Wesley, 2001.
Part I
Hoare Logic
Chapter 1
Propositional Logic
We saw above that compound propositions are formed from simple propositions using extra
words such as if . . . then (or, in symbolic form, the symbol ⇒). These extra words represent
logical connectives or operators. We shall mainly be concerned with the following five logical
connectives (it is possible to define others): conjunction (∧), disjunction (∨), negation (¬), implication (⇒), and equivalence (≡).
All of the connectives take two propositions as input, except for negation, which takes one.
Conjunction represents the informal concept of “and”. Disjunction represents the informal concept of “inclusive or” (one or the other or both). Negation represents the informal concept of “not,” i.e., the logical “opposite.” Implication represents the informal concept of “if ... then.”
This concept is very important in deducing a conclusion logically from a set of assumptions, or premises. Finally, equivalence represents the informal concept of logical “sameness.”
There are two aspects to propositional logic: syntax and semantics. Syntax refers to the notation
that we use to write propositions. Semantics refers to how we assign “meaning” to propositions.
An analogy can be made with programming: syntax is the programming language in which we write programs (C++, Java, etc.), while semantics is the “behavior” of the program when we run it, i.e., the program’s “meaning”.
A key point is that syntax can be defined entirely independently of semantics, as a “symbol pushing” game, just as a programming language can be defined independently of any discussion of what executing its statements will do, e.g., by giving only a BNF grammar.
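For example, the syntax of propositions can be summarized by a grammar written in the same style as the programming-language grammar used in Chapter 3; this is only an informal sketch of Definition 1, not an official part of it:

proposition:
T | F | <identifier> |
(¬ <proposition>) |
(<proposition> ∧ <proposition>) |
(<proposition> ∨ <proposition>) |
(<proposition> ⇒ <proposition>) |
(<proposition> ≡ <proposition>)

Nothing in this grammar says what the symbols “mean”; it only says which strings count as propositions.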
The whole point of a system of logic is to “prove” statements (propositions for now). That is,
we have some notion of universal truth: some statements are universally true and others are not.
For example, by using informal reasoning based on the informal meanings of the propositional
connectives given above, we intuitively expect the following to be universally true:
(p ∧ q) ≡ (q ∧ p)
(p ∧ q) ⇒ p
p ∨ ¬p
while we do not expect
(p ∧ q) ≡ (q ∨ p)
(p ∨ q) ⇒ p
p ∧ ¬p
to be universally true.
There are two main methods for formally proving that a proposition is “universally true”:
• Syntactic: devise a deductive system, which consists of axioms and rules of inference. A deductive system provides a systematic method of constructing a proof that a proposition is “universally true”. We discuss deductive systems in Section 1.2.2 below.
• Semantic: formalize the definition of “universally true” in a semantic system and then
check the definition directly.
1.2 Syntax
Definition 1 (Proposition)
Propositions are formed as follows:
You are familiar with arithmetic expressions. We can make an analogy between propositions
and arithmetic expressions as follows:
3. If x and y are arithmetic expressions, then so are (x + y), (x × y), (x − y), (x/y)
Figure 1.1: Parse tree of ((p ∧ q) ∨ r): the root has children (p ∧ q) and r, and the node (p ∧ q) has children p and q.
Example 1 If p, q, r are propositions, then so is ((p ∧ q) ∨ r). Figure 1.1 depicts a parse tree for
((p ∧ q) ∨ r), showing how it is built up from p, q, r and (p ∧ q). These are called subpropositions
of ((p ∧ q) ∨ r).
In definition 1, every logical connective has a pair of associated parentheses. These parentheses
are necessary so that a given proposition has a single well-defined meaning. For example,
((p ∧ q) ∨ r) is different from (p ∧ (q ∨ r)); in the state s = {(p, F), (q, F), (r, T)}, the first
proposition evaluates to T while the second evaluates to F (the notions of “state” and “evaluate”
are defined formally later on). Note, however, that the outer parentheses are redundant in both
cases, e.g., ((p ∧ q) ∨ r) is equally well written as (p ∧ q) ∨ r.
In general, having one pair of parentheses for each logical connective tends to result in propo-
sitions with many parentheses, which are consequently hard to read.
Precedence rules establish a convention that allows us to omit many of these parentheses. These
rules are:
1. A set of axioms: these are statements that are assumed to be universally true.
2. A set of rules of inference: these are rules that allow us to conclude that a particular
statement q (the consequent) follows logically from some other statements p1 , . . . , pn (the
premises). In particular, if p1 , . . . , pn have already been shown to be universally true, then
we can conclude that q is also universally true.
A rule of inference gives a “deduction” step: if we have already proven that the premises
p1 , . . . , pn are universally true, then we can now deduce that the consequent q is universally
true by applying the rule. An axiom can be viewed as an inference rule with no premises, since
it states that some q is universally true per se.
For the time being, we can think of a “statement” as being a proposition. However, the notion
of proof applies to other kinds of statements, as we will see in the chapter on first-order logic.
Now given that the axioms are universally true, and that the rules of inference preserve universal
truth, it follows that:
2. conclude new statements only by applying the rules of inference to statements that have
previously been shown to be universally true
then we will never incorrectly conclude that a statement is a universal truth when in fact it is
not. This leads us to the following definition of proof:
Definition 2 (Proof)
A proof is a finite sequence e1 , e2 , . . . , en of statements such that each ei (1 ≤ i ≤ n) is either
an axiom, or follows from earlier statements (ej for 1 ≤ j < i) by application of a rule of
inference.
Remark 1 Every statement that occurs in some proof is a universal truth. Every prefix of a
proof is also a proof.
Suppose that, starting with some proposition p as an assumption, we can deduce another
proposition q using both our proof system and in addition the assumption p. In other words,
each ei in Definition 2 can be either an axiom, or follow from previous statements (ej for
1 ≤ j < i) by applying a rule of inference, or can be just p itself, written as a statement in the
proof without any justification whatsoever. Then, we have proven q using p as an assumption,
and so, we have deduced q from p. The same reasoning applies if we replace the single statement
p by a set of statements p1 , . . . , pn .
This leads to the notion of a deducibility relation between a set of statements p1 , . . . , pn , used as
premises, and a statement q, used as a conclusion. We use the symbol ` for this relation, and
write p1 , . . . , pn ` q if and only if q can be deduced from p1 , . . . , pn .
Definition 3 (`)
p1 , . . . , pn ` q if and only if there exists a finite sequence e1 , e2 , . . . , em of statements such that
em is q and each ei (1 ≤ i ≤ m) is either:
• an axiom, or
• follows from earlier statements (ej for 1 ≤ j < i) by application of a rule of inference, or
• is one of p1 , . . . , pn .
Note that technically, the sequence of statements in the above definition is not necessarily a
proof, since the p1 , . . . , pn are not necessarily axioms.
When p1 , . . . , pn ` q, there may not be (in general) a single rule of inference whose premises
match p1 , . . . , pn and whose conclusion matches q. There will, however, be a proof, of some length, of q
from p1 , . . . , pn .
When q occurs in a proof, and so is universally true, it can be deduced from no assumptions,
and so we write ` q, with an empty left-hand side of the ` symbol.
A rule of inference can now be formally written as p1 , . . . , pn ` q. An axiom is written as ` q.
We regard axioms as statements whose universal truth is accepted on “first principles,” and so
does not need to be proven. An alternative notation writes a rule of inference in fraction form, with the premises p1 , . . . , pn above a horizontal line and the conclusion q below it.
The following is a definition of ` equivalent to the one given above, and which illustrates the
“inductive” nature of proof.
Definition 4 (`)-alternative
p1 , . . . , pn ` q if and only if:
• q is an axiom, or
• there exist q1 , . . . , qm such that:
– q follows from q1 , . . . , qm by applying some rule of inference, and
– for all j from 1 to m : p1 , . . . , pn ` qj
We now present a deductive system, i.e., a set of axioms and rules of inference.
Our system consists of several axioms, and two rules of inference. All of our axioms, apart
from the excluded middle, are equivalence statements, i.e., they give the equivalence of two
propositions.
The Axioms
The rule of transitivity allows us to “string together” two equivalences that have a common
proposition.
13. Rule of Transitivity
If p ≡ q and q ≡ r, then p ≡ r.
Expressed formally, this is:
p ≡ q, q ≡ r ` p ≡ r.
Both of these rules facilitate the decomposition of a proof problem into several simpler “sub-
problems”.
There are several different kinds of statement that can be established using our second deductive
system. First, we show how a proposition can be proven universally true, i.e., how to show ` p.
Proof. Proof of ` (p ⇒ (q ⇒ r)) ⇒ ((p ∧ q) ⇒ r)
1. p⇒q premise
2. ¬p ∨ q (1), implication
3. q ∨ ¬p (2), commutativity
4. ¬¬q ∨ ¬p (3), negation, substitution
5. ¬q ⇒ ¬p (4), implication
Because the “direction of deduction” in a proof is “one way,” from top to bottom, we are
now obliged to carry the entire equivalence statement on every line. Thus there is a lot of
repetition in the above proof. For example, many statements have a part “p ⇒ q” that is never
manipulated. If we use the above format, this will often be the case. The next section presents
a more economical simplified proof format.
In the proof of ` (p ⇒ (q ⇒ r)) ⇒ ((p ∧ q) ⇒ r) above, every statement follows from the
immediately preceding statement. Actually, every statement is equivalent to the immediately
preceding statement. Hence we do not need to number the statements, but merely insert a ≡
sign between each succeeding pair to indicate that these are equivalent. We define this simplified
proof format as follows.
To show that a proposition is valid using the simplified proof format, we show that it is equivalent
to an axiom, or that it is equivalent to T.
Here is Proof 1.2.4 from Section 1.2.4 rewritten in this format.
Proof. Proof of ¬t ∨ t ≡ (p ⇒ (q ⇒ r)) ⇒ ((p ∧ q) ⇒ r)
In the above proof, it is difficult to see how the steps are being decided. Many times, it is easier
to start with the proposition being proven, and to work “backwards”. With the simplified proof
format, this is easy, since ≡ is symmetric. It is, in principle, possible to do this for proofs in
the regular format, but much harder, and usually not useful. When we reverse the steps in
Proof 1.2.5 we get:
Proof. Proof of true ≡ (p ⇒ (q ⇒ r)) ⇒ ((p ∧ q) ⇒ r)
(p ⇒ (q ⇒ r)) ⇒ ((p ∧ q) ⇒ r)
≡ ¬(p ⇒ (q ⇒ r)) ∨ ((p ∧ q) ⇒ r) implication
≡ ¬(¬p ∨ (q ⇒ r)) ∨ (¬(p ∧ q) ∨ r) implication ×2
≡ ¬(¬p ∨ ¬q ∨ r) ∨ (¬(p ∧ q) ∨ r) implication
≡ ¬(¬p ∨ ¬q ∨ r) ∨ (¬p ∨ ¬q ∨ r) DeMorgan
≡ ¬t ∨ t substitution
≡ true axiom of excluded middle
The simplified format can also be used to derive one proposition from another. For example, the earlier deduction of the contrapositive becomes the following chain of equivalences, establishing that p ⇒ q is equivalent to its contrapositive:
p⇒q
≡ ¬p ∨ q implication
≡ q ∨ ¬p commutativity
≡ ¬¬q ∨ ¬p negation, substitution
≡ ¬q ⇒ ¬p implication
This format lets us prove implications, which is very useful in program verification.
1.3 Semantics
1.3.1 Truth-tables
The meaning of the logical connectives can be given using truth-tables. A truth-table for a logical
connective gives the value of a compound proposition formed using the connective in terms of
the values of the simple propositions that are the inputs. As we said above, propositions can
have two values only: true (which will be written as T from now on), and false (which will be
written as F from now on). T and F are called truth-values. The truth-table contains a number
of rows, one for each possible combination of values of the inputs.
Since true is the proposition that is “universally true,” its meaning is just the truth value T:
true
T
Since false is the proposition that is “universally false,” its meaning is just the truth value F:
false
F
p ¬p
T F
F T
Truth-table for negation
Since negation takes one proposition p as input, this table has two rows, one for each possible
value of the input p.
The meaning of conjunction is given by the following table:
p q p∧q
T T T
T F F
F T F
F F F
Since conjunction takes two propositions p, q as input, this table has four rows. Each of the
inputs p, q has two possible values, and so the number of combinations of values is 2 × 2 = 4.
Likewise, the truth-tables for the remaining connectives are as follows:
p q p∨q
T T T
T F T
F T T
F F F
p q p⇒q
T T T
T F F
F T T
F F T
p q p≡q
T T T
T F F
F T F
F F T
A constant proposition is a proposition that does not contain any identifiers. In other words,
constant propositions are composed entirely of the truth values T, F and the logical connectives.
You evaluate a constant proposition by executing the following steps:
2. Evaluate a constant proposition containing exactly one connective by using the truth-tables given in subsection 1.3.1.
3. Evaluate a constant proposition containing more than one connective as follows:
(a) Find all the subpropositions that contain exactly one connective and evaluate them using step 2. Replace each subproposition by the value obtained for it.
(b) Repeat step 3a until you are left with either T or F.
Example 7 The proposition ((¬F) ≡ T) is evaluated as follows. First, the subproposition (¬F)
is evaluated using the truth table for negation (page 23). The result is T. Replacing (¬F) by T,
we obtain (T ≡ T). This is evaluated using the truth table for equivalence (page 24), obtaining
the final result of T.
Now a proposition contains identifiers, in general. Hence, the proposition does not have a
truth-value per se. This is because we cannot determine a truth-value for the proposition
without knowing truth-values for all of the identifiers in the proposition first. For example, the
proposition p ∧ q is neither true nor false in itself; it is true if p and q both happen to be true
(but we don’t know this yet), and false otherwise.
Even though propositions do not have truth-values per se, they can be assigned truth-values.
We assign a truth-value to a proposition by assigning truth-values to all of its propositional
identifiers. Once this is done, the truth-value of the proposition can be determined by replacing
all the identifiers by their assigned values and then evaluating the resulting constant proposition
as shown in subsection 1.3.2.
Propositional identifiers are assigned truth-values by means of a state:
Definition 7 (State)
A state is a function from identifiers to truth-values.
For example, the state s = {(b, T), (c, F)} assigns T to b and F to c. We use the notation s(b) to
denote the value that a state s assigns to an identifier b. If s assigns no value to b, then s(b) is
undefined. A state is sometimes also called a truth-value assignment, or a valuation. We use the
term state because it relates more closely to the application of logic to programming, which is the
focus of this class. Note that a state is somewhat like a row of a truth-table, in that it assigns
a value to every propositional identifier listed in the truth-table.
We say a proposition p is well-defined in state s iff s assigns a truth-value to every identifier in
p. For example, the proposition b ∨ c is well-defined in the state s = {(b, T), (c, F)}, whereas
the proposition b ∨ d is not. We will usually assume that p is well-defined in state s when we
write s(p), and will not mention this assumption explicitly.
If p is well-defined in s, then we use s(p) to denote the truth-value assigned to p by s. s(p) is
evaluated as follows:
1. Replace every identifier in p by the value that s assigns to it.
2. You now have a constant proposition. Evaluate it as shown above in subsection 1.3.2.
Example 8 We evaluate the proposition ((p ∧ q) ∨ r) in the state s = {(p, T), (q, F), (r, F)}.
Replacing p, q, r by their values T, F, F in state s, we obtain the constant proposition ((T∧F)∨F).
From Example 6, we see that this evaluates to F.
Example 9 Truth-table for ((p ∧ q) ∨ r). The row within lines corresponds to example 8.
p q r (p ∧ q) ((p ∧ q) ∨ r)
T T T T T
T T F T T
T F T F T
T F F F F
F T T F T
F T F F F
F F T F T
F F F F F
Example 10 We evaluate the proposition ((¬p) ≡ q) in the state s = {(p, F), (q, T)}. Replac-
ing p, q by their values F, T in state s, we obtain the constant proposition ((¬F) ≡ T). From
Example 7, we see that this evaluates to T.
2. s(¬p) = ¬(s(p))
Since s(p), s(q) are truth-values, it is permissible to use them as inputs to logical connectives.
An important point is that our method of evaluating propositions is compositional: once the
values of the subformulae p, q have been determined, we can use the appropriate truth-table to
find the value of p ∧ q, p ∨ q, etc. Since the (truth) value of a proposition depends only on the
(truth) values of its subpropositions, this is called truth-functional semantics.
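As an illustration, this compositional evaluation can be programmed directly. The following Java sketch (the class and method names are illustrative only, not part of the formal development) represents propositions as objects and evaluates them in a state given as a map from identifiers to truth-values:

import java.util.Map;

/** A minimal sketch of compositional (truth-functional) evaluation of propositions. */
abstract class Prop {
    /** Evaluate this proposition in state s (a map from identifiers to truth-values). */
    abstract boolean eval(Map<String, Boolean> s);
}
class Id extends Prop {
    final String name;
    Id(String name) { this.name = name; }
    boolean eval(Map<String, Boolean> s) { return s.get(name); } // assumes the proposition is well-defined in s
}
class Not extends Prop {
    final Prop p;
    Not(Prop p) { this.p = p; }
    boolean eval(Map<String, Boolean> s) { return !p.eval(s); }
}
class And extends Prop {
    final Prop p, q;
    And(Prop p, Prop q) { this.p = p; this.q = q; }
    boolean eval(Map<String, Boolean> s) { return p.eval(s) && q.eval(s); }
}
class Or extends Prop {
    final Prop p, q;
    Or(Prop p, Prop q) { this.p = p; this.q = q; }
    boolean eval(Map<String, Boolean> s) { return p.eval(s) || q.eval(s); }
}
class EvalDemo {
    public static void main(String[] args) {
        // ((p ∧ q) ∨ r) in the state {(p,T),(q,F),(r,F)} evaluates to F, as in Example 8.
        Prop f = new Or(new And(new Id("p"), new Id("q")), new Id("r"));
        System.out.println(f.eval(Map.of("p", true, "q", false, "r", false))); // prints false
    }
}

The value of a compound proposition is computed only from the values of its subpropositions, exactly as in the truth-functional semantics above.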
Using a deductive system, we formalized the idea of “universally true” by the idea that any
proposition that has a proof is universally true:
We justified this as follows: (1) axioms are universally true, and (2) rules of inference preserve
universal truth, i.e., if the premises are universally true, then so is the conclusion. Then, a
simple inductive argument (on the length of a proof) establishes the above assertion.
However, the notion of “universally true” is still an informal one, so this is not completely
satisfying. Now that we know how to evaluate propositions, we can formalize this notion.
Intuitively, a proposition is “universally true” if it evaluates to true in every state (in which it
is well-defined). We call this formal notion validity:
Definition 9 (Valid)
A proposition p is valid iff for every state s such that s(p) is well-defined, s(p) = T.
Example 12 ¬p ∨ p is a tautology.
(p ⇒ (q ⇒ r)) ≡ ((p ∧ q) ⇒ r) is a tautology.
What about propositions that are “universally false”? The corresponding formal concept is that
of a “contradiction”:
Definition 10 (Contradiction)
A proposition p is a contradiction iff for every state s such that s(p) is well-defined, s(p) = F.
Finally, what about propositions that are neither universally true nor universally false? These
are called “contingencies”:
Definition 11 (Contingency)
A proposition p is a contingency iff there exists a state s in which p is well-defined such that
s(p) = T, and there exists a state t in which p is well-defined such that t(p) = F.
Example 13 p is a contingency.
Definition 12 (Satisfiable)
A proposition p is satisfiable iff there exists a state s in which p is well-defined such that
s(p) = T.
Example 14 ¬p ∧ p is a contradiction.
Exercise 1 Show that p is valid iff ¬p is not satisfiable, i.e., that satisfiability is the dual of
validity.
Show that p is not satisfiable iff p is a contradiction.
Show that p is a contingency iff both p and ¬p are satisfiable.
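The semantic method can be mechanized for a small number of identifiers by enumerating all states. The following Java sketch (illustrative only) checks that (p ⇒ q) ≡ (¬q ⇒ ¬p), the contrapositive law derived syntactically in Section 1.2.5, is valid:

class ValidityCheck {
    // p ⇒ q, encoded by its truth-table as ¬p ∨ q.
    static boolean implies(boolean p, boolean q) { return !p || q; }

    public static void main(String[] args) {
        boolean valid = true;
        // Enumerate every state, i.e., every assignment of truth-values to p and q.
        for (boolean p : new boolean[] { true, false }) {
            for (boolean q : new boolean[] { true, false }) {
                boolean lhs = implies(p, q);
                boolean rhs = implies(!q, !p);   // the contrapositive
                valid = valid && (lhs == rhs);   // ≡ holds in this state iff both sides agree
            }
        }
        System.out.println(valid ? "valid" : "not valid"); // prints "valid"
    }
}

For n identifiers there are 2^n states to check, so this brute-force approach is practical only for small propositions; the deductive approach does not suffer from this blow-up.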
if ` p then p is valid.
In other words, our deductive system admits only proofs of valid propositions. This is actually
the main reason for having deductive systems, to be able to prove that some propositions are
valid. This crucial property of a deductive system is called soundness. We show below that
our two deductive systems presented above are sound.
The converse property:
if p is valid then ` p
is called completeness. It states that if a proposition is valid, then there is proof of that
proposition. Completeness is desirable: a complete deductive system is more “useful” than an
incomplete one. However, completeness is not crucial in the way that soundness is; incomplete
deductive systems can still be useful. Indeed some logics (e.g., second-order logic, Hoare logic for
languages with procedure parameters) are inherently incomplete: it is known that no complete
deductive system exists for such logics.
In a deductive system that is both sound and complete, we have:
` p iff p is valid.
Thus, provability and validity coincide, and we see that validity is the semantic counterpart of
the (syntactic notion of) proof. We would also like a semantic counterpart of p ` q, i.e., of
deducibility. This is given by the relation of “semantic entailment,” which is denoted by the
symbol |=: p1 , . . . , pn |= q if and only if, for every state s in which p1 , . . . , pn and q are all well-defined, if s(p1 ) = T and . . . and s(pn ) = T, then s(q) = T.
We write |= q when there are no pi , i.e., for every state s, s(q) = T. Clearly, |= q just says that
q is valid.
q is valid.
We now generalize the above statements of soundness and completeness as follows:
Soundness: if p1 , . . . , pn ` q then p1 , . . . , pn |= q.
Completeness: if p1 , . . . , pn |= q then p1 , . . . , pn ` q.
For the sake of simplicity, we will prove soundness in the restricted case only, and assume the
simplified proof format, just to give you an idea of how such a proof is carried out.
Proof: For each axiom, check its validity by constructing its truth table and checking that every
row gives a result of T.
For the rule of substitution, we argue that its applications preserve validity, by induction on the number of times that the rule has been used. Suppose that the first k uses
of the rule are sound. Now suppose that ` p ≡ q. From the previous paragraph, and our
inductive hypothesis, we have |= p ≡ q.
Let s be any state whatsoever (we usually say: let s be an arbitrary state). By definition of how
a proposition is evaluated (subsection 1.3.2), s(E(p)) and s(E(q)) are computed by replacing all
occurrences of p, q in E(p), E(q) by s(p), s(q) respectively. But s(p) = s(q) since |= p ≡ q. Hence
s(E(p)) must have the same value as s(E(q)). Thus |= E(p) ≡ E(q) holds.
Now suppose ` p. Thus p occurs in a proof. All proofs in the simplified proof format establish
p ≡ t ∨ ¬t, where t ∨ ¬t is an instance of the axiom of excluded middle, since this is the only
axiom. Thus p ≡ t ∨ ¬t. Now |= t ∨ ¬t. Hence |= p.
Definition 15 (Literal)
A literal is either a propositional identifier or the negation of a propositional identifier.
Chapter 2
Predicate (First-Order) Logic
2.1 Predicates
A predicate is like a proposition, except that propositional identifiers may be replaced by any
expression that has value T or F, e.g.:
1. Predicate symbols: P (v1 , . . . , vn ) expresses that a relation P holds among the n values
v1 , . . . , vn . For example, the arithmetic inequalities =, ≠, <, ≤, >, ≥ are predicates, as in
x1 < x2 .
2. Logical quantifiers: these allow you to express “for all” and “there exists” in formal logic.
These expressions are called atomic predicates. Atomic predicates play a role in predicates analogous to the role that propositional identifiers play in propositions. They provide the expressions that
are evaluated in a given state to produce truth-values. These truth-values are combined using
the logical connectives to produce the final truth-value of a predicate.
Notice that predicates take values (over some domain) as arguments, e.g., x1 < x2 . So, we need
to enlarge our propositional language to be able to denote values. First, we admit constants,
e.g., 21, 56, 0. Second, we admit variables, e.g., x, y, z. Finally, we admit function symbols, e.g.,
f (21), g(x, y), h(y, 56). Note that function symbols are applied to arguments, e.g., f is applied
to 21, g is applied to x, y, etc. A function can be applied to (i.e., take as arguments) constants,
variables, or the results of other function applications, e.g., f (g(x, y)), f (f (21)). Note that a
function can be applied to the result of a previous application of the same function, as in
f (f (21)). This is just how a recursive function works.
Each function symbol takes a fixed number n ≥ 0 of arguments, called its arity. When n = 0,
the function symbol represents a constant, since a function with no arguments always denotes the same value.
Let F be the set of all function symbols in our language. This leads to the definition of the
class of terms:
Definition 18 (Term)
The set of terms is built up as follows:
• A constant is a term.
• A variable is a term.
We used P (v1 , . . . , vn ) above to indicate that relation P holds among the n values v1 , . . . , vn . P
is a predicate symbol, which represents some relation. As with function symbols, each predicate
symbol takes a fixed number n of arguments, i.e., has a fixed arity n. Also, since predicate
symbols denote relations among values, they will take terms as arguments, since terms denote
values. This leads to the definition of atomic predicate. Let P be the set of all predicate symbols
in our language.
Definition 20 (Predicate)
Predicates are formed as follows:
Example 16 If i, j are integer variables and r is a proposition, then ((i < j) ∨ r) is a predicate.
(Parse tree of ((i < j) ∨ r): the root has children i < j and r, and the atomic predicate i < j applies < to the terms i and j.)
The operators, such as <, =, used in atomic predicates have higher precedence than logical
connectives.
We assume as axioms all the familiar properties of arithmetic inequalities. These can be used
in proofs by giving “arithmetic” as the “law” used. Some typical properties that you might use
are:
• ∀ i, j, k ((i < j ∧ j < k ⇒ i < k) ∧ (i ≤ j ∧ j ≤ k ⇒ i ≤ k))
• ∀ i, j ((i ≤ j ∧ j ≤ i) ⇒ i = j)
• ∀ i, j, k (i < j ⇒ i + k < j + k) ∧ (i ≤ j ⇒ i + k ≤ j + k)
2.2 Quantification
We use LQ to stand for either ∀ or ∃. Let p be a formula not containing any quantifiers. In
LQ x p:
• x is the bound variable. x is said to be bound to LQ. All occurrences of x in LQxp are
bound occurrences, i.e., the occurrence of x immediately following LQ, and all occurrences
of x in p.
• p is the quantified predicate.
In LQ x p, the bound variable x is a “place holder” that can be replaced by another variable y
provided that this does not cause capture:
∃x(w = z ∗ x) and ∃y(w = z ∗ y) mean the same thing, namely that w is a multiple of z, but
∃w(w = z ∗ w) means T (i.e., it is valid), since the quantified predicate w = z ∗ w is true for
w = 0. So, replacing x by y preserved meaning, while replacing x by w did not.
Before defining capture, we need to define the notion of free and bound occurrences of variables.
The discussion above gives a definition of bound occurrence that works only when the quantified
predicate p does not itself contain any quantifiers.
If p contains quantifiers over variables other than x, then this does not affect the binding status
of occurrences of x in p. If, however, p contains a quantifier over x, e.g., p is ∃ x p′ , and we have:
∀ x ∃ x p′
then the ∃ x quantifier overrides the ∀ x quantifier. So, we define:
Notice that in LQ x p, the occurrences of x that are bound to LQ x are exactly those occurrences
of x that are free in p (considered by itself).
In other words, the scope of LQ x is that part of p where any occurrence of x would be bound
to LQ x.
1. y occurs in t, and
2. there is a subformula p′ of p such that
(a) there is a free occurrence of x in p′ , and
(b) p′ occurs in the scope of some LQ y quantifier.
In this case, replacing x by t would lead to the capture of y: the occurrences of y in t should be
free, but they actually become bound to the pre-existing quantifier LQ y.
Example 18 Consider the formula p ≜ x < w ∧ ∀ y(x > y) and the term t ≜ a ∗ y + b. Then p[t/x] is
a ∗ y + b < w ∧ ∀ y(a ∗ y + b > y). The occurrence of y in a ∗ y has been captured.
(∀ i : r(i) : p(i)) means: for every value v of i such that r(v) is true, p(v) is also true.
(∃ i : r(i) : p(i)) means: there exists a value v of i such that r(v) is true and p(v) is also true.
Arithmetic expressions are built up from inequalities, the arithmetic operators (+, ∗, −, /, etc.),
and the following:
2. (N i : range : quantified-expression), where i is an integer-valued variable, and range,
quantified-expression are both predicates.
(N i : r(i) : p(i)) is the number of values of i within the range r(i) for which p(i) is true,
i.e., N counts the number of times that p(i) is true within the range r(i). N can be defined in
terms of Σ:
(N i : r(i) : p(i)) = (Σ i : r(i) ∧ p(i) : 1)
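For example, counting the even values of i with 0 ≤ i < 4:
(N i : 0 ≤ i < 4 : i mod 2 = 0) = (Σ i : 0 ≤ i < 4 ∧ i mod 2 = 0 : 1) = 1 + 1 = 2,
since only i = 0 and i = 2 lie in the range and satisfy the counted predicate.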
We use Q for any quantifier except N . Every Q generalizes an associative and commutative
binary operator q to a set of operands given by the range. There are many axioms that can be
used to manipulate quantifiers. We will omit the usual leading ` when giving these, with the
understanding that it is really present. We will use the = symbol for equality as usual, with the
understanding that, for the logical quantifiers, = is the same as ≡.
If the range of quantification is empty, then the result is the identity element of the associated
binary operator:
(∀ i : F : p(i)) = T
(∃ i : F : p(i)) = F
(N i : F : p(i)) = 0
(Σ i : F : f (i)) = 0
(Π i : F : f (i)) = 1
(MIN i : F : f (i)) = ∞
(MAX i : F : f (i)) = −∞
Note the slight abuse of notation: the semantic F represents any predicate (i.e., syntax) p such
that |= (p ≡ false).
a) Change of variable
b) Cartesian Product
a) Range Translation
Example 24 (Σ i : 1 ≤ i ≤ n : i) = (Σ i : 0 ≤ i ≤ n − 1 : i + 1)
where r(i) = 1 ≤ i ≤ n, g(i) = i + 1. Hence the range on the right hand side is: 1 ≤ i + 1 ≤ n,
i.e., 0 ≤ i ≤ n − 1
b) Singleton Range
(Q i : i = k : f (i)) = f (k)
c) Range Splitting
Note that range splitting works correctly when quantification over an empty range is defined
this way: (Q i : r(i) : f (i)) = (Q i : r(i) : f (i)) q (Q i : F : f (i)).
f) Range Disjunction
a) Generalized Associativity
b) Generalized Commutativity
c) Generalized Distributivity
a) ∀-rule
2.4 States
States must now assign appropriate values to all variables (depending on the type of the vari-
ables), and also assign truth values to propositional identifiers.
Variable types will be integer, unless otherwise declared or obvious from the context of use (e.g.,
b := T makes b a boolean).
We use “functional notation.” The type is determined from the definition (we use = for functions
and sets, ≡ for predicates) and context.
Example 37 Sorting.
a results from sorting b in nondecreasing order:
is-sorted(a, b) ≡ perm(a, b) ∧ ordered-nondec(a)
ordered-nondec(a) ≡ (∀ i : 0 ≤ i < n − 1 : a[i] ≤ a[i + 1])
perm(a, b) ≡ (∀ i : 0 ≤ i < n : num(a, a[i]) = num(b, a[i]) ∧ num(a, b[i]) = num(b, b[i]))
num(c, x) = (N i : 0 ≤ i < n : c[i] = x)
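These definitions translate almost directly into executable checks, which is useful, for example, when testing a sorting procedure against its specification (Chapter 10). The following Java sketch is illustrative only; it assumes int arrays a and b of equal length n and mirrors the definitions above:

class SortSpec {
    /** num(c, x): the number of indices i with 0 ≤ i < c.length such that c[i] = x. */
    static int num(int[] c, int x) {
        int count = 0;
        for (int v : c) if (v == x) count++;
        return count;
    }
    /** ordered-nondec(a): a[i] ≤ a[i+1] for all 0 ≤ i < a.length − 1. */
    static boolean orderedNondec(int[] a) {
        for (int i = 0; i + 1 < a.length; i++) if (a[i] > a[i + 1]) return false;
        return true;
    }
    /** perm(a, b): every value of a or b occurs the same number of times in a and in b. */
    static boolean perm(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++)
            if (num(a, a[i]) != num(b, a[i]) || num(a, b[i]) != num(b, b[i])) return false;
        return true;
    }
    /** is-sorted(a, b): a results from sorting b in nondecreasing order. */
    static boolean isSorted(int[] a, int[] b) { return perm(a, b) && orderedNondec(a); }
}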
3. If there is only one quantifier, arithmetic operator, or logical connective, then evaluate it
according to the definitions of quantifiers, arithmetic operators, and logical connectives.
(a) Evaluate all subpredicates that contain exactly one quantifier, arithmetic operator,
or logical connective, and replace them by their values.
(b) Repeat previous step until you are left with either T or F.
2. s(¬p) = ¬(s(p)),
s(p ∧ q) = s(p) ∧ s(q),
s(p ∨ q) = s(p) ∨ s(q),
s(p ⇒ q) = s(p) ⇒ s(q),
s(p ≡ q) = s(p) ≡ s(q)
Example 38 s = {(a[0], 1), (a[1], 5), (a[2], 3), (a[3], 10), (j, 2)}.
s( (∀ i : 0 ≤ i < j : a[i] ≤ a[i + 1]) ) =
(∀ i : s(0) ≤ i < s(j) : s(a[i]) ≤ s(a[i + 1]) ) =
(∀ i : 0 ≤ i < 2 : s(a[i]) ≤ s(a[i + 1]) ) =
s(a[0]) ≤ s(a[1]) ∧ s(a[1]) ≤ s(a[2]) =
1≤5 ∧ 5≤3=
T ∧ F=
F
Definition 27 (Valid)
A predicate p is valid iff for every model M and state s in which p is well-defined, M, s |= p.
Definition 28 (Satisfiable)
A predicate p is satisfiable iff there exists a model M and a state s in which p is well-defined
such that M, s |= p.
The example we used above, “there exist an infinite number of primes,” is closed. By
contrast, “x is a prime” is not closed, since the occurrence of x is free. This leads to a predicate
which has x as an argument:
A common mistake in writing predicates is to write expressions that are not well defined be-
cause they apply operations to the wrong type of argument, e.g., addition applied to boolean
expressions, conjunction applied to arithmetic expressions, etc. When you write a predicate,
check that all of the operations in it have been applied to the correct type of argument.
Chapter 3
Verification of Program Correctness: Hoare-Floyd Logic
We shall use a simplified programming language that consists of assignment statements, if statements,
while statements, and sequential composition of statements (denoted by a semicolon). begin and
end are used to bracket statements. The syntax of our programming language is as follows.
assignment statement:
<variable> := <expression>
if statement:
if <predicate> then <statement> else <statement> endif |
if <predicate> then <statement> endif
while statement:
while <predicate> do <statement> endwhile
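For example, the following small program in this language stores the larger of x and y into m, using sequential composition and a one-way if:

m := x;
if y > m then
    m := y
endif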
For {P } S {Q} to have the meaning given above, we define the validity of {P } S {Q} as follows: {P } S {Q} is valid iff, for every state s such that s(P ) = T, if execution of S started in s terminates in a state t, then t(Q) = T.
Note that no restriction on the final state t is made if s(P ) = F. So, if the precondition is false
initially, then the postcondition may be either true or false when S terminates.
We specify what a program should do by giving a precondition and postcondition for the
program.
Example 39 Search a nonempty array C[0 : n − 1] that is sorted in nondecreasing order for an
existing value X.
Precondition: n > 0 ∧ (∀ i : 0 ≤ i < n − 1 : C[i] ≤ C[i + 1]) ∧ (∃ i : 0 ≤ i < n : C[i] = X).
Postcondition: 0 ≤ pos ≤ n − 1 ∧ C[pos] = X.
Note how, in the last example, the array A is used to store the initial value of array a. In
general, when writing a specification for a program, we will often need to relate the initial
values of program variables to their final values. We shall usually do this as follows:
1. We use the precondition to make a “copy” of the initial values of the variables, e.g., A = a
in example 40 copies the initial value of array a into array A.
2. We use the postcondition to relate the final values of the variables to the initial values,
e.g., is − sorted(a, A) in example 40 states that the final value of a must be the result of
sorting the initial value of a (which is now given by A).
Just as for propositions, we demonstrate the validity of Hoare triples by using a deductive
system. Our deductive system has one proof rule for each type of program statement, together
with two proof rules called the rules of consequence. Hence, to prove a given Hoare triple
valid, there is usually only one proof rule that can be applied at any time. Our proof rules are
presented as rules of inference: if the hypotheses (the part above the line) have been proven to
be valid, then the conclusion (the part below the line) is also valid. The only exception is the
assignment axiom, which has no hypothesis. In other words, any instance of the assignment
axiom can be taken to be valid without first having to prove a hypothesis valid. We now discuss
each proof rule in turn.
x has the value after execution that e has before, so Q(x) is true after iff Q(e) is true before.
Example 41 {x + 1 ≤ 5} x := x + 1 {x ≤ 5}.
This reduces to: {x ≤ 4} x := x + 1 {x ≤ 5}. In other words, if we want x ≤ 5 to be true after
executing x := x + 1, then x ≤ 4 must be true before executing x := x + 1. This conforms to
our intuition about the meaning of x := x + 1.
The hypotheses of the rule require a proof of correctness for both possible cases of execution:
We don’t know in advance which path will be taken, since this depends on the values of the
program variables at run time, which cannot be predicted. Hence, we have to account for both
possibilities, i.e., both paths. The rule works as follows.
Assume that the hypotheses of the rule, namely {P ∧ B} S1 {Q} and {P ∧ ¬B} S2 {Q}, are
both valid. Assume also that precondition P is true immediately before executing the if -
statement. If the first case of execution occurs, i.e., B evaluates to true and S1 is executed,
then we know that P is true immediately before execution of S1 (by our assumption), and that
B is true immediately before execution of S1 (otherwise S1 would not be executed, by definition
of the if -statement). Hence we know that P ∧ B is true immediately before execution of S1 .
Therefore, from {P ∧ B} S1 {Q}, we know that Q is true immediately after execution of S1 .
On the other hand, assume that the second case of execution occurs, i.e., B evaluates to false
and S2 is executed. Then, we know that P is true immediately before execution of S2 (by our
assumption), and that B is false immediately before execution of S2 (otherwise S2 would not be
executed, by definition of the if -statement). Hence we know that P ∧ ¬B is true immediately
before execution of S2 . Therefore, from {P ∧ ¬B} S2 {Q}, we know that Q is true immediately
after execution of S2 .
Therefore, in both cases, we have shown that Q is true after execution of the if -statement.
Our assumptions were: 1) the hypotheses {P ∧ B} S1 {Q} and {P ∧ ¬B} S2 {Q}, and 2) that
precondition P is true immediately before execution of the if -statement. In other words, given
the hypotheses {P ∧ B} S1 {Q} and {P ∧ ¬B} S2 {Q}, then if P is true before execution of the
if -statement, Q will be true after execution of the if -statement.
Another way of saying this is that given the hypotheses {P ∧ B} S1 {Q} and {P ∧ ¬B} S2 {Q},
we have proven {P } if B then S1 else S2 {Q}. This is exactly the two-way-if rule.
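As a small illustration (anticipating Example 46 below), the two-way-if rule lets us conclude, for the program that stores the larger of x and y into z:
{T ∧ x ≥ y} z := x {z = max(x, y)}        {T ∧ ¬(x ≥ y)} z := y {z = max(x, y)}
{T} if x ≥ y then z := x else z := y endif {z = max(x, y)}
Each hypothesis is established from the assignment axiom together with the left consequence-rule (introduced below), using the arithmetic facts x ≥ y ⇒ x = max(x, y) and ¬(x ≥ y) ⇒ y = max(x, y).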
{P ∧ B} S1 {Q} (P ∧ ¬B) ⇒ Q
{P } if B then S1 {Q}
The hypotheses of the rule require a proof of correctness for both possible cases of execution:
We don’t know in advance which path will be taken, since this depends on the values of the
program variables at run time, which cannot be predicted. Hence, we have to account for both
possibilities, i.e., both paths. The rule works as follows.
Assume that the hypotheses of the rule, namely {P ∧ B} S1 {Q} and (P ∧ ¬B) ⇒ Q, are both
valid. Assume also that precondition P is true immediately before executing the if -statement.
If the first case of execution occurs, i.e., B evaluates to true and S1 is executed, then we
know that P is true immediately before execution of S1 (by our assumption), and that B is
true immediately before execution of S1 (otherwise S1 would not be executed, by definition of
the if -statement). Hence we know that P ∧ B is true immediately before execution of S1 .
Therefore, from {P ∧ B} S1 {Q}, we know that Q is true immediately after execution of S1 . On
the other hand, assume that the second case of execution occurs, i.e., B evaluates to false and
no statement is executed. Then, we know that P is true immediately before the if -statement
(by our assumption), and that B is false immediately before the if -statement (otherwise S1
would have been executed, by definition of the if -statement). Hence we know that P ∧ ¬B
is true immediately before the if -statement. Therefore, from (P ∧ ¬B) ⇒ Q, we know that
Q is true immediately before the if -statement. Since execution of the if -statement involves
no change of state, i.e., “no statement is executed,” Q will also be true immediately after the
if -statement.
Therefore, in both cases, we have shown that Q is true immediately after execution of the if -
statement. Our assumptions were: 1) the hypotheses {P ∧B} S1 {Q} and (P ∧¬B) ⇒ Q, and 2)
that precondition P is true immediately before execution of the if -statement. In other words,
given the hypotheses {P ∧ B} S1 {Q} and (P ∧ ¬B) ⇒ Q, then if P is true before execution of
the if -statement, Q will be true after execution of the if -statement.
Another way of saying this is that given the hypotheses {P ∧ B} S1 {Q} and (P ∧ ¬B) ⇒ Q,
we have proven {P } if B then S1 {Q}. This is exactly the one-way-if rule.
P ⇒ Q        {Q} S {R}
{P } S {R}
{Q} S {R} says that if Q is true when execution of S begins, then R will be true
when (and if) execution of S ends. P ⇒ Q says that whenever P is true, then Q will also be
true. Hence, we can conclude that if P is true when execution of S begins, then Q will also
be true at that point (by validity of P ⇒ Q), and so R will be true when (and if) execution
of S ends (by validity of {Q} S {R}). In other words, if P is true when execution of S begins,
then R will be true when (and if) execution of S ends. But this is exactly {P } S {R}. Hence,
by assuming that P ⇒ Q and {Q} S {R} are both valid, we have shown that {P } S {R} is also
valid. This is exactly what the left consequence-rule states.
Example 46 Prove:
{x ≥ y} z := x {z = max(x, y)} (*)
By the assignment axiom:
{x = max(x, y)} z := x {z = max(x, y)}
x ≥ y ⇒ x = max(x, y) is valid by the properties of max.
We conclude (*) by applying the left consequence-rule:
x ≥ y ⇒ x = max(x, y)
{x = max(x, y)} z := x {z = max(x, y)}
{x ≥ y} z := x {z = max(x, y)}
{P } S {Q}        Q ⇒ R
{P } S {R}
If P guarantees that R is true after execution of S1 , and R guarantees that Q is true after
execution of S2 , then P guarantees that Q is true after execution of S1 followed by execution
of S2 .
This rule works in the following way. Assume that the hypotheses {P } S1 {R} and {R} S2 {Q}
are both valid. Assume also that precondition P is true immediately before executing S1 ; S2 .
{P } S1 {R} says that if P is true when execution of S1 begins, then R will be true when (and
if) execution of S1 ends. Hence we know that R will in fact be true after execution of S1 ,
since we assume P is true before. Since S2 follows S1 sequentially, we conclude that R is true
immediately before execution of S2 . {R} S2 {Q} says that if R is true when execution of S2
begins, then Q will be true when (and if) execution of S2 ends. Hence we know that Q will in
fact be true after execution of S2 , since we have shown that R is true before.
Our assumptions were: 1) the hypotheses {P } S1 {R} and {R} S2 {Q}, and 2) that precondi-
tion P is true immediately before execution of S1 ; S2 . In other words, given the hypotheses
{P } S1 {R} and {R} S2 {Q}, then if P is true before execution of S1 ; S2 , Q will be true after
execution of S1 ; S2 .
Another way of saying this is that given the hypotheses {P } S1 {R} and {R} S2 {Q}, we have
proven {P } S1 ; S2 {Q}. This is exactly the rule of sequential composition.
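For example, taking the intermediate assertion R to be x = 2, the rule of sequential composition gives:
{x = 1} x := x + 1 {x = 2}        {x = 2} y := 2 ∗ x {y = 4}
{x = 1} x := x + 1; y := 2 ∗ x {y = 4}
Each hypothesis follows from the assignment axiom together with the left consequence-rule, since x = 1 ⇒ x + 1 = 2 and x = 2 ⇒ 2 ∗ x = 4.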
Example 47 Prove
If the truth of I is preserved by any iteration of the loop, then, if I is true initially, it will still
be true upon termination of the while -loop. Also, the looping condition B will be false upon
termination of the loop. The predicate I is called the invariant of the while -loop.
This rule works as follows. Assume that the hypothesis of the while -rule, namely {I ∧B} S {I},
is valid. Assume also that I is true immediately before executing the while -loop. {I ∧B} S {I}
means that if I ∧ B is true before any execution of S (i.e., any iteration of the while -loop),
then I will be true upon termination of S. Since we assume that I is true immediately before
executing the while -loop, we conclude, by validity of {I ∧ B} S {I}, that I will be true after
the first iteration of the loop, if the first iteration is actually executed, since I ∧ B will be true
before the first iteration (B must be true, otherwise the first iteration would not be executed
by definition of the while -loop). Since the end of the first iteration is also the start of the
second iteration, we can also conclude that I will be true before the second iteration of the loop.
Hence, if the second iteration is executed, then by validity of {I ∧ B} S {I}, I will be true at
the end of the second iteration. Proceeding in this way, we can show that, no matter how many
iterations of the loop are actually executed, I will always be true at the beginning and the end
of any iteration. Now when (and if) the loop terminates, the resulting state will be the state
after some iteration. Hence I will be true upon termination of the loop. Also, we know that
¬B is true upon termination of the loop, since otherwise the loop would not have terminated.
Our assumptions were: 1) the hypothesis {I ∧B} S {I}, and 2) that I is true immediately before
execution of the while -loop. In other words, given the hypothesis {I ∧ B} S {I}, then if I is
true before execution of the while -loop, I ∧ ¬B will be true after execution of the while -loop.
Another way of saying this is that given the hypothesis {I ∧ B} S {I}, we have proven
{I} while B do S {I ∧ ¬B}. This is exactly the while rule.
A proof tableau is a way of summarizing a proof of correctness in a single compact form, rather
than as a large number of applications of the proof rules given above. Given a program S
together with its specification, expressed as a precondition P and postcondition Q, we construct
a proof tableau for {P } S {Q} as follows:
1. Write down the program S together with its precondition P and postcondition Q
3. For each assignment statement x := e with postcondition R(x), apply the assignment
axiom to obtain a precondition R(e)
4. For each if -statement if B then S1 else S2 endif with precondition P ′ and postcondition Q′ :
5. Repeat steps 3 through 4 until the tableau is complete (see definition 31 below).
6. For each pair of predicates P ′ , P ′′ such that P ′′ immediately follows P ′ in the tableau (i.e.,
with no statement in between them), extract the verification condition P ′ ⇒ P ′′ .
2. The precondition for every assignment statement in the tableau is derived from the post-
condition by applying the assignment axiom.
If execution is started in a state that satisfies the precondition of the program, then, when
program control is “at” the location of a particular predicate in the tableau, that predicate
is guaranteed to be true at that point.
In particular, if and when execution of the program terminates, then control will be “at”
the postcondition, and so the postcondition will be true at that point. This is exactly what
correctness of the program requires: that the postcondition be true upon termination.
We shall prove that the following program is correct with respect to the precondition P (k, sum)
and postcondition Q(sum). Here a[0..(n − 1)] is an array of integer. As our first step (step 1
above), we write down the program below, together with the precondition and postcondition.
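The program in question is the standard array-summation loop; a sketch of step 1, assuming (as the verification conditions below suggest) that the loop condition B is k ≠ n, is:

P(k, sum): {k = 0 ∧ sum = 0 ∧ n ≥ 0}
while B: k ≠ n do
    sum := sum + a[k];
    k := k + 1
endwhile
Q(sum): {sum = (Σ i : 0 ≤ i < n : a[i])}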
The next step is to write the invariant in each of the four places, as given in step 2 above. We
use the invariant I(k, sum) : sum = Σ(i : 0 ≤ i < k : a[i]). This results in the following tableau.
endwhile
{I(k, sum) ∧ ¬B}
Q(sum): {sum = (Σ i : 0 ≤ i < n : a[i])}
We now apply the assignment axiom to the assignment statement k := k+1 and its postcondition
I(k, sum), resulting in the following tableau:
This gives us the postcondition I(k + 1, sum) for the assignment statement sum := sum +
a[k]. Hence we apply the assignment axiom again, this time to sum := sum + a[k] and its
postcondition I(k + 1, sum).
The tableau is now complete. We now extract the following verification conditions (step 6
above):
1) k = 0 ∧ sum = 0 ∧ n ≥ 0 ⇒ I(k, sum)
2) I(k, sum) ∧ B ⇒ I(k + 1, sum + a[k])
3) I(k, sum) ∧ ¬B ⇒ sum = (Σ i : 0 ≤ i < n : a[i])
We prove that the verification conditions are valid predicates using the laws of equivalence and
the rules of substitution and transitivity. For condition 1, we proceed as follows.
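In outline, using the empty-range rule from Section 2.3:
I(0, sum)
≡ sum = (Σ i : 0 ≤ i < 0 : a[i])        definition of I, with k = 0
≡ sum = 0                               empty range: (Σ i : F : f (i)) = 0
so I(k, sum) follows from the conjuncts k = 0 and sum = 0 of the precondition, and condition 1 is valid.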
The following program assigns to m (upon termination) the smallest value that occurs in array
a. Here a[0..(n − 1)] is an array of integer. The precondition is n ≥ 1, which means that array a
contains at least one element (a[0]). The postcondition is m = (MIN i : 0 ≤ i < n : a[i]), which
states that m has the minimum value that occurs in array a.
P: {n ≥ 1}
j := 1;
m := a[0];
while B1 : j ≠ n do
if B2 : m > a[j] then
m := a[j]
else
skip
endif ;
j := j + 1
endwhile
Q(m): {m = (MIN i : 0 ≤ i < n : a[i])}
Here skip is a statement which has no effect (it’s like a “no op”). We can think of skip as being
the same as the statement x := x (where x is any variable of the program under consideration).
For x := x, the assignment axiom tells us that {Q(x)} x := x {Q(x)} is valid. In other words,
the precondition of skip is the same as its postcondition. We shall use this from now on, in
effect treating skip as an assignment statement that leaves the variable it assigns to unchanged.
The next step is to write the invariant in each of the four places, as given in step 2 above. We
use the invariant I(j, m) : m = (MIN i : 0 ≤ i < j : a[i]). This results in the following tableau.
P: {n ≥ 1}
j := 1;
m := a[0];
{invariant I(j, m) : m = (MIN i : 0 ≤ i < j : a[i])}
while B1 : j ≠ n do
{I(j, m) ∧ B1 }
if B2 : m > a[j] then
m := a[j]
else
skip
endif ;
j := j + 1
{I(j, m)}
endwhile
{I(j, m) ∧ ¬B1 }
Q(m): {m = (MIN i : 0 ≤ i < n : a[i])}
Next, we apply the assignment axiom to the assignment statement j := j + 1 and its postcon-
dition I(j, m), resulting in the following tableau:
P: {n ≥ 1}
j := 1;
m := a[0];
{invariant I(j, m) : m = (MIN i : 0 ≤ i < j : a[i])}
while B1 : j ≠ n do
{I(j, m) ∧ B1 }
if B2 : m > a[j] then
m := a[j]
else
skip
endif ;
{I(j + 1, m)}
j := j + 1
{I(j, m)}
endwhile
{I(j, m) ∧ ¬B1 }
Q(m): {m = (MIN i : 0 ≤ i < n : a[i])}
Since the if -statement now has a precondition (namely I(j, m) ∧ B1 ) and a postcondition
(namely I(j + 1, m)), we can apply step 4 of our procedure above. This results in the following
tableau.
P: {n ≥ 1}
j := 1;
m := a[0];
{invariant I(j, m) : m = (MIN i : 0 ≤ i < j : a[i])}
while B1 : j ≠ n do
{I(j, m) ∧ B1 }
if B2 : m > a[j] then
{I(j, m) ∧ B1 ∧ B2 }
m := a[j]
{I(j + 1, m)}
else
{I(j, m) ∧ B1 ∧ ¬B2 }
skip
{I(j + 1, m)}
endif ;
{I(j + 1, m)}
j := j + 1
{I(j, m)}
endwhile
{I(j, m) ∧ ¬B1 }
Q(m): {m = (MIN i : 0 ≤ i < n : a[i])}
We now apply the assignment axiom to the assignment statement m := a[j] and its postcon-
dition I(j + 1, m). We also write down the precondition for the skip, which is the same as its
postcondition.
P: {n ≥ 1}
j := 1;
m := a[0];
{invariant I(j, m) : m = (MIN i : 0 ≤ i < j : a[i])}
while B1 : j ≠ n do
{I(j, m) ∧ B1 }
if B2 : m > a[j] then
{I(j, m) ∧ B1 ∧ B2 }
{I(j + 1, a[j])}
m := a[j]
{I(j + 1, m)}
else
{I(j, m) ∧ B1 ∧ ¬B2 }
{I(j + 1, m)}
skip
{I(j + 1, m)}
endif ;
{I(j + 1, m)}
j := j + 1
{I(j, m)}
endwhile
{I(j, m) ∧ ¬B1 }
Q(m): {m = (MIN i : 0 ≤ i < n : a[i])}
Next, we apply the assignment axiom to the assignment statement m := a[0] and its postcon-
dition I(j, m).
P: {n ≥ 1}
j := 1;
{I(j, a[0])}
m := a[0];
{invariant I(j, m) : m = (MIN i : 0 ≤ i < j : a[i])}
while B1 : j ≠ n do
{I(j, m) ∧ B1 }
if B2 : m > a[j] then
{I(j, m) ∧ B1 ∧ B2 }
{I(j + 1, a[j])}
m := a[j]
{I(j + 1, m)}
else
{I(j, m) ∧ B1 ∧ ¬B2 }
{I(j + 1, m)}
skip
{I(j + 1, m)}
endif ;
{I(j + 1, m)}
j := j + 1
{I(j, m)}
endwhile
{I(j, m) ∧ ¬B1 }
Q(m): {m = (MIN i : 0 ≤ i < n : a[i])}
Finally, we apply the assignment axiom to the assignment statement j := 1 and its postcondition
I(j, a[0]). The tableau is now complete:
P: {n ≥ 1}
{I(1, a[0])}
j := 1;
{I(j, a[0])}
m := a[0];
{invariant I(j, m) : m = (MIN i : 0 ≤ i < j : a[i])}
while B1 : j ≠ n do
{I(j, m) ∧ B1 }
if B2 : m > a[j] then
{I(j, m) ∧ B1 ∧ B2 }
{I(j + 1, a[j])}
m := a[j]
{I(j + 1, m)}
else
{I(j, m) ∧ B1 ∧ ¬B2 }
{I(j + 1, m)}
skip
{I(j + 1, m)}
endif ;
{I(j + 1, m)}
j := j + 1
{I(j, m)}
endwhile
{I(j, m) ∧ ¬B1 }
Q(m): {m = (MIN i : 0 ≤ i < n : a[i])}
So far, we have concerned ourselves with conditional correctness only: if a program terminates,
then the final state will satisfy the postcondition. It is also crucial to prove that the program
does in fact terminate. Towards this end we define the notation ⟨P⟩ S ⟨Q⟩ to have the following meaning: if execution of S is started in a state satisfying P, then execution of S terminates, and the final state satisfies Q. Because termination is guaranteed, this is called total correctness. For ⟨P⟩ S ⟨Q⟩ to have the meaning given above, we define ⟨P⟩ S ⟨Q⟩ to be valid if and only if every execution of S that starts in a state satisfying P terminates in a state satisfying Q.
Total correctness requires two things: 1) the program terminates, and 2) the final state satisfies
the postcondition. It is usually easier to prove each of these properties separately. We already
know how to express (2), it is just conditional correctness (see definition 30).
To express (1), we use ⟨P⟩ S ⟨T⟩, which states that every execution of S that starts in a state satisfying P terminates. In other words, no constraint is placed on the final state (since any state whatsoever satisfies T); the only requirement is termination.
There is an important relationship between total correctness, conditional correctness, and termination. To see it, we first restate these as follows:
Total correctness: if S is started in a state satisfying P, then S terminates and the final state satisfies Q.
Conditional correctness: if S is started in a state satisfying P, and S terminates, then the final state satisfies Q.
Termination: if S is started in a state satisfying P, then S terminates.
Comparing the statement of "conditional correctness + termination" above with that of total correctness (definition 33), we see that they are the same. We summarize this as the mnemonic equation:
total correctness = conditional correctness + termination
3.6.3 Proving Termination: The Proof Rule for Termination of while -loops
From our informal understanding of how our programs are executed, we easily see that the
only source of non-termination is the while -loop. That is, if a program fails to terminate, the
only possible reason is that some while -loop in the program is “stuck” and is being executed
forever. Hence, to prove termination, we only need to introduce one more proof rule, which is
the following:
{I ∧ B} S {I},
I ∧ B ⇒ ϕ ≥ 0,
⟨I ∧ B ∧ ϕ = C⟩ S ⟨ϕ < C⟩
------------------------------------
⟨I⟩ while B do S ⟨T⟩
To prove that a program terminates, we construct a proof tableau similar to tableaux for
conditional correctness, but we use ⟨P⟩ instead of {P} for the predicates that are inserted into
the tableau. We can interpret a valid proof tableau for termination as follows:
If execution is started in a state that satisfies the precondition of the program, then, when
program control is “at” the location of a particular predicate in the tableau, that predicate
is guaranteed to be true in that program state.
Also, for every Hoare triple ⟨P⟩ S ⟨Q⟩ in the tableau, if control reaches P, then control is
guaranteed to eventually reach Q. (This guarantees termination.)
P(n): ⟨n ≥ 0⟩
⟨I(0)⟩
S1: k := 0;
⟨I(k)⟩
S2: f := 1;
⟨invariant I(k): 0 ≤ k ≤ n⟩
/* variant ϕ(k): n − k */
S3: while B : k ≠ n do
⟨I(k) ∧ k ≠ n ∧ ϕ(k) = C⟩
⟨I(k + 1) ∧ ϕ(k + 1) < C ∧ ϕ(k) ≥ 0⟩
k := k + 1;
⟨I(k) ∧ ϕ(k) < C⟩
f := f ∗ k;
⟨I(k) ∧ ϕ(k) < C⟩
endwhile
⟨T⟩
Verification conditions:
1) n ≥ 0 ⇒ I(0)
2) I(k) ∧ B ∧ ϕ(k) = C ⇒ I(k + 1) ∧ ϕ(k + 1) < C ∧ ϕ(k) ≥ 0
Proving (2) establishes {I(k) ∧ B} S {I(k)} and ⟨I(k) ∧ B ∧ ϕ(k) = C⟩ S ⟨ϕ(k) < C⟩ and
I(k) ∧ B ⇒ ϕ(k) ≥ 0 (where S = "k := k + 1; f := f ∗ k" is the loop body).
Once (2) is proven, we can apply the proof rule for termination of while -loops, and conclude
⟨I(k)⟩ S3 ⟨T⟩. Together with (1), this gives us ⟨n ≥ 0⟩ S1 ; S2 ; S3 ⟨T⟩.
Proof of (1):
n ≥ 0 ⇒ I(0)
≡ n ≥ 0 ⇒ 0 ≤ 0 ≤ n /* replace I(0) by its definition */
≡ n ≥ 0 ⇒ 0 ≤ n
≡ T
Proof of (2):
I(k) ∧ B ∧ ϕ(k) = C ⇒ I(k + 1) ∧ ϕ(k + 1) < C ∧ ϕ(k) ≥ 0
≡ /* replace I, ϕ by their definitions */
0 ≤ k ≤ n ∧ k ≠ n ∧ n − k = C ⇒ 0 ≤ k + 1 ≤ n ∧ n − (k + 1) < C ∧ n − k ≥ 0
≡
0 ≤ k < n ∧ n − k = C ⇒ −1 ≤ k ≤ n − 1 ∧ (n − k) − 1 < C ∧ n − k ≥ 0
≡
0 ≤ k < n ∧ n − k = C ⇒ −1 ≤ k < n ∧ C − 1 < C ∧ n − k ≥ 0
≡
T
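For illustration (ours, not from the notes), the factorial loop can be written in Java with the variant function ϕ(k) = n − k checked at run time: before each iteration ϕ is non-negative, and each iteration strictly decreases it (run with java -ea):
public class Factorial {
    // Computes n!; the asserts mirror the termination argument for the while-loop.
    static long fact(int n) {
        if (n < 0) throw new IllegalArgumentException("precondition n >= 0 violated");
        int k = 0;
        long f = 1;
        while (k != n) {
            int variantBefore = n - k;     // phi(k) = n - k
            assert variantBefore >= 0;     // I and B imply phi >= 0
            k = k + 1;
            f = f * k;
            assert n - k < variantBefore;  // each iteration strictly decreases phi
        }
        return f;
    }

    public static void main(String[] args) {
        System.out.println(fact(5)); // prints 120
    }
}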
The procedure to construct a proof tableau for termination is the same as that for conditional correctness (section 3.5), except that step 2 is replaced by a step that also records the variant function ϕ and the termination assertions ⟨I ∧ B ∧ ϕ = C⟩ at the start of the loop body and ⟨ϕ < C⟩ at its end, as in the tableau above.
When the postcondition contains quantifications, the invariant can sometimes be obtained from
the postcondition by making the range of quantification depend on the program variables:
1. Replace a constant bound in the range of quantification (e.g., n) by a program variable (e.g., k); the resulting predicate is the candidate invariant.
2. Initialize the variable so that the range is initially empty (or trivial), which makes the invariant easy to establish; each iteration of the loop then extends the range.
3. When the range has been extended to that in the postcondition, the program can terminate.
Examples of invariants that we derived in this way are the invariants in the following programs:
linear search, array sum, array minimum, bubble sort.
Chapter 4
Verification of Programs Containing Procedures
Our programming language (see section 3.1) so far lacks the facility of defining procedures. We
now remedy this deficiency by extending our programming language with procedures.
The syntax of procedure declaration and invocation is as follows:
Procedure declaration:
procedure pname(value fv; value-result fr) : pbody
Procedure invocation:
call pname(ave, ar)
where fv, fr, and ar are lists of variables, and ave is a list of expressions.
Procedures take two types of parameters: value parameters (denoted by the keyword value ),
and value-result parameters (denoted by the keyword value−result ). Value parameters are
treated as constants within the procedure body, i.e., they cannot be changed. They are used only
to pass values into the procedure. Value-result parameters can be changed in the procedure
body. They are used both to pass values into the procedure and to return values computed by
the procedure to the invoking program.
The parameters that are used to write the procedure declaration are called formal parameters.
Formal parameters can be further subdivided into formal value parameters (given by the list fv
of variables) and formal value-result parameters (given by the list fr of variables). The param-
eters that are passed to the procedure in a procedure invocation are called actual parameters.
Actual parameters can be further subdivided into actual value parameters (given by the list ave
of expressions) and actual value-result parameters (given by the list ar of variables). Note that
since the formal value parameters do not return a value to the invoking program, the actual
value parameters can be expressions instead of variables, since we do not need a variable to
return the changed value.
• Prove conditional correctness of the procedure body (in terms of the formal parameters).
• Replace the formal parameters by the actual parameters to conclude the conditional cor-
rectness of procedure invocations.
Effectively, this just “simulates” what happens when a procedure is invoked — the formals get
replaced by the actuals. Although this strategy works in general, problems can arise in certain
peculiar situations:
Replacing the formals by the actuals, we conclude {T} call inc1(n, n) {n = n + 1}. Obviously
this is invalid, since (n = n + 1) ≡ F.
Replacing the formals by the actuals, we conclude {T} call copy(a, b, c) {c = ?}. Should ? be
a or b? We don't know: the final value of the formal parameter y is not well-defined.
To avoid problems such as those illustrated above, we make the following assumptions:
• The formal and actual parameters match with respect to number and type.
• The formal parameters are pairwise distinct (and hence the formal parameters have well-
defined initial values).
• All variables other than formal parameters are local to the procedure (i.e., no global
variables).
• No mutual recursion (although simple recursion, where a procedure invokes itself, will be
dealt with).
The proof rule for conditional correctness of (nonrecursive) procedure invocation is:
{P(fv, fr)} pbody {Q(fv, fr)}
---------------------------------------------
{P(ave, ar)} call pname(ave, ar) {Q(ave, ar)}
This states that if pbody is conditionally correct with respect to precondition P(fv, fr) and
postcondition Q(fv, fr), then the procedure invocation call pname(ave, ar) is conditionally
correct with respect to precondition P(ave, ar) and postcondition Q(ave, ar) (i.e., P and Q
with the formal parameters replaced by the actual parameters).
Applying the proof rule, we conclude {P(a)} call fact(a, b) {Q(a, b)}.
Replacing P, Q by their definitions, we get {a ≥ 0} call fact(a, b) {b = a!}.
Value parameters cannot be changed — their value is always the initial value. Value-result
parameters can be changed, and so their value at some point in the procedure body is not
necessarily the initial value. In many situations (e.g., incrementing a variable, sorting an array)
we need to be able to relate the final value of a value-result parameter to its initial value in
order to specify correctness as a precondition and postcondition. We shall do this by recording
the initial value of the value-result parameter in an “upper case” variable (that is not changed).
This variable is then passed to the procedure as a value parameter.
Applying the proof rule, we conclude {P (Y, y)} call inc2(Y, y) {Q(Y, y)}.
Replacing P, Q by their definitions, we get {y = Y } call inc2(Y, y) {y = Y + 1}.
These extra value parameters are never referenced in code (see the above example), i.e., they don't
affect execution. In the actual program, they can be omitted. Since these variables are used
only to carry out the proof, and not to affect program execution, they are called ghost, or
auxiliary variables.
where
sorted(a, b, n) ≡ perm(a, b, n) ∧ ordered-nondec(a, n)
ordered-nondec(a, n) ≡ ∀(i : 0 ≤ i < n − 1 : a[i] ≤ a[i + 1])
perm(a, b, n) ≡ ∀(i : 0 ≤ i < n : num(a, a[i], n) = num(b, a[i], n))
num(c, x, n) = (N i : 0 ≤ i < n : c[i] = x)
merged(b, c, a, ℓ, m, n) ≡
/* ℓ, m, n are the sizes of b, c, a respectively */
∀(i : 0 ≤ i < n : num(a, a[i], n) = num(b, a[i], ℓ) + num(c, a[i], m))
merge(B, C, A, mid, n − mid, n, b, c, a) is a procedure that takes two arrays b, c that are sorted
in non-decreasing order and merges them into an array a that is also sorted in non-decreasing
order.
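As an illustration (this code is ours, not from the notes), a straightforward Java version of such a merge, assuming b[0..l-1] and c[0..m-1] are sorted in non-decreasing order and a has length l + m:
// Merges the sorted arrays b (length l) and c (length m) into a (length l + m), so that
// a is sorted in non-decreasing order and contains exactly the elements of b and c.
static void merge(int[] b, int[] c, int[] a) {
    int l = b.length, m = c.length;
    int i = 0, j = 0, k = 0;
    while (i < l && j < m) {
        if (b[i] <= c[j]) a[k++] = b[i++];
        else              a[k++] = c[j++];
    }
    while (i < l) a[k++] = b[i++];  // copy any remaining elements of b
    while (j < m) a[k++] = c[j++];  // copy any remaining elements of c
}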
• Prove termination of the procedure body (in terms of the formal parameters).
• Replace the formal parameters by the actual parameters to conclude termination of the
procedure invocation.
There are two main differences from the method for proving conditional correctness. First,
the postcondition is simply T, since we do not care about the actual final state. Second, we
construct a tableau for termination rather than a tableau for partial correctness (the tableau is
constructed for the procedure body and its pre/post-conditions).
The proof rule for termination of (nonrecursive) procedure invocation is:
⟨P(fv, fr)⟩ pbody ⟨T⟩
---------------------------------
⟨P(ave, ar)⟩ call pname(ave, ar) ⟨T⟩
This states that if pbody terminates with respect to precondition P(fv, fr), then the procedure
invocation call pname(ave, ar) terminates with respect to precondition P(ave, ar) (i.e., P with
the formal parameters replaced by the actual parameters).
Verification conditions:
1) n ≥ 0 ⇒ I(0)
2) I(k) ∧ B ∧ ϕ(k) = C ⇒ I(k + 1) ∧ ϕ(k + 1) < C ∧ ϕ(k) ≥ 0
Proving (2) establishes {I(k) ∧ B} S {I(k)} and ⟨I(k) ∧ B ∧ ϕ(k) = C⟩ S ⟨ϕ(k) < C⟩ and I(k) ∧
B ⇒ ϕ(k) ≥ 0 (where S = "k := k + 1; f := f ∗ k" is the loop body). Once these are proven, we
can apply the proof rule for termination of while -loops, and conclude ⟨I(k)⟩ while B do S ⟨T⟩.
Together with (1), this gives us ⟨n ≥ 0⟩ fact-body ⟨T⟩ (where fact-body is the body of procedure
fact). Given that ⟨n ≥ 0⟩ fact-body ⟨T⟩ is valid, we then apply the proof rule for termination
of nonrecursive procedures, and conclude ⟨a ≥ 0⟩ call fact(a, b) ⟨T⟩, i.e., all invocations with
actual parameter a non-negative terminate.
Hence, applying the proof rule, we conclude ⟨0 ≤ c = ϕ(c)⟩ call rfact(c, d) ⟨T⟩.
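For illustration only (the declaration of rfact is not reproduced in these notes), a recursive factorial in Java; its termination argument is the one just described: the argument is non-negative and strictly decreases on each recursive call.
// Recursive factorial; terminates for every n >= 0 because the argument of each
// recursive call is non-negative and strictly smaller than n (the variant).
static long rfact(int n) {
    if (n < 0) throw new IllegalArgumentException("precondition n >= 0 violated");
    if (n == 0) return 1;
    return n * rfact(n - 1);
}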
Part II
Software Engineering
Chapter 5
Introduction
The software construction problem is: how do we construct a large program that is correct, i.e., one that provides the required functionality?
So, how do we know what the required functionality is? By writing a requirements specification.
Writing the specification is the first phase of the software life cycle.
Once the specification is written, it remains to produce a design that satisfies the specification,
and then to implement the design to produce a working program.
What are the major challenges in constructing a large program that satisfies a given specifica-
tion?
Writing the specification: It is difficult to write a specification that correctly reflects the
“requirements” of the users, or customers, of the software system. Reasons for this are:
1. The requirements are vague and informal, while a specification must be formal.
Translating informal ideas into formal descriptions is inherently difficult and error-
prone.
2. Often the customer does not have a complete idea of what they want, and this idea often
changes upon using the system. Thus, writing a specification is really an iterative
process, for example: write an initial (incomplete) spec, produce a partial prototype,
and revise the spec based on customer feedback from using the prototype.
Program correctness: Once the specification is written, there is the challenge of ensuring that
the program behaves as given by the specification. This is a known difficult challenge,
as software is “discrete.” Unlike the physical structures that are constructed by the
older Engineering disciplines, e.g., civil and mechanical engineering, software does not
“degrade gracefully.” A single rusted bolt or some crumbled concrete will not cause a
bridge or building to come crashing down. Furthermore, the signs of physical deterioration
are usually evident, giving time for repair and maintenance effort. With software, a
single incorrect line of code in a one million line program can cause completely incorrect
behavior. Furthermore, such behavior may manifest suddenly and unpredictably, after
extensive testing and months or years of trouble-free operation. There are many examples
of sudden failure of deployed software, see the “Risks of Computing to the Public” website
at http://catless.ncl.ac.uk/risks.
Cost of development: Third, there is the challenge of carrying out the work at reasonable
cost. When the specification, design, and program are all small, on the order of a few
hundred lines at most, they can all be written by a single programmer or a small team.
When these are large, however, the work of writing them must be partitioned among many
individuals and teams. Experience has shown that controlling and coordinating the in-
teraction among many teams, and in particular keeping the time spent in communication
and coordination to a reasonable limit, are major challenges to large software projects. As
Fred Brooks demonstrated in “The Mythical Man Month” [2], such costs can completely
overwhelm a software project if not controlled properly.
A good decomposition breaks the problem into subproblems such that:
1. Each subproblem can be understood and solved independently, and
2. The solutions to the subproblems combine to give a solution to the original problem.
The key to successful problem decomposition is abstraction: the idea of abstraction is that we
temporarily ignore some aspects of the problem so that we can concentrate on other aspects.
In the context of software, the most important form of abstraction is abstraction by specifica-
tion: we forget about coding and implementation (how the task is to be accomplished) and
concentrate on the specification (what is the task to be accomplished). That is, we decom-
pose module specifications, breaking a module down into several modules, and also introducing
“helper” modules as needed. This is the main activity in the design phase of the software life
cycle. When design is over, the result is a set of module specifications, along with documen-
tation about the interactions among the various modules, (module dependency diagram etc).
Implementation of each module specification can then proceed in parallel.
5.2.1 Example
Consider the construction of a program similar to AUBSiS. After writing the specification, we
start the design phase by outlining some of the major “top level” modules. Typically, these will
be modules that initialize the system, and that handle direct interaction with the users of the
system. During the design phase, we only provide module specifications, and leave coding and
implementation to later phases. We proceed to decompose the top level module specifications
by introducing additional “helper” modules that assist the top level modules in carrying out
the functions required by their specifications. Thus, the functionality of a module is actually
decomposed and split over several modules.
Let us look at a concrete example of this. Suppose we have a module (or a method in a module)
that registers a student in a course. Actually, we register for a particular section of a course,
being held in a particular semester. This is an important concept, and so we introduce a helper
module Section to represent it.
Such a method could be an instance method in a class Student, with a header along the lines
sketched below, where the return value indicates the status of the attempted register operation
(e.g., successful, or failed due to some reason). We assume that the class Student has already
been introduced and specified.
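One plausible form of the header and its specification (the name register, the status-code convention, and the use of the Section class introduced above are our assumptions, not fixed by these notes) is:
public class Student {
    // ...other instance variables and methods of Student...

    // EFFECTS: attempts to register this student in section cl; the returned value is a
    //          status code (e.g., 0 for success, a non-zero code describing the failure otherwise)
    public int register(Section cl) {
        // implementation sketched later in this section
        return 0;
    }
}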
At this stage, we need to write the specification for Section. Informally, we require that a
Section object provides the following information: the course name, the section number, and
the semester, and that this information be valid, i.e., correspond to a section of the course that
is actually offered in the given semester. We will also require other attributes, such as current
number of registered students, allocated classroom, etc. along with methods for getting and
setting these. For our current purpose, we assume the following instance variables:
String courseName
int sectionNum
String semester
What else do we need? We must check the condition for registering a course, and if this is
satisfied, actually carry out the registration. The condition for registering a course is:
1. the student has taken and passed the prerequisites for the course, and
2. the section has space available, and
3. the total number of credits that the student will be registered for, after adding the course, does not exceed the allowed maximum (17).
To check condition 1, we introduce a module that represents the course catalogue, and therefore
would provide prerequisite information. We include in the specification of this module the
following method
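The method itself is not reproduced here; a plausible sketch (the class name CourseCatalogue and the method name getPrerequisites are our assumptions) is:
import java.util.ArrayList;
import java.util.List;

public class CourseCatalogue {
    // EFFECTS: returns the names of the prerequisite courses of the course named courseName
    public static List<String> getPrerequisites(String courseName) {
        // ... implementation (e.g., a database lookup) omitted ...
        return new ArrayList<String>();
    }
}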
Note that this method comes with its own specification, given as a:
1. Precondition (requires clause): a condition (predicate) over the arguments to the method,
and the instance variables (in case of an instance method), and a
2. Postcondition (effects clause): a condition relating the initial values of the method argu-
ments, the initial values of the instance variables (in case of an instance method) to the
final values of the method arguments, the final values of the instance variables (in case of
an instance method), and the returned value.
The precondition and postcondition can be given informally, in English, as shown above, or
formally, in predicate logic.
Thus, when a module is a class containing several methods, the module specification consists
of:
1. An informal overview statement that describes the purpose of the class, and
2. A specification (requires, modifies, and effects clauses) for each constructor and method of the class.
Returning to our example, to finish checking condition 1 after obtaining the list of course
prerequisites, we need to check that the student has actually passed all of the prerequisites. This
requires access to the transcript of the student. We thus introduce a new module Transcript
that represents transcripts. Unlike for courses, we let each transcript be an object, so that we
are actually defining a data abstraction. This is because there is only one course catalogue,
which we assume is stored in a database (hence we access it using static methods), while there
are many transcripts, and we are creating more dynamically at run time, as new students enter
AUB. Thus, it makes sense to create each new transcript as an object. Within class Transcript,
we specify the following method:
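The method is not reproduced in these notes; a plausible sketch (the name grade and its signature are our assumptions, guided by the later discussion of its effects clause) is:
public class Transcript {
    // EFFECTS: returns the highest grade recorded in this transcript for the course named
    //          courseName, or -1 if the student owning this transcript has never taken the course
    public int grade(String courseName) {
        // ... implementation omitted ...
        return -1;
    }
}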
Note the use of the keyword this to refer to the transcript object itself. It is now possible
to check that the student has passed all the prerequisites using the following implementation
sketch:
// IMPLEMENTATION SKETCH:
// obtain all of the prerequisites for course cl.courseName
// for each prerequisite, obtain the grade and check that it is >= 60
Note that we also added the specification for method register. (In reality this specification
would have been written at the beginning of the development).
Condition 2 can be checked in a similar manner, using class Section, and condition 3 can be
checked using class Transcript and the course catalogue. If all conditions are satisfied, the
student can be registered, using class Section. This leads to the following implementation
sketch:
// IMPLEMENTATION SKETCH:
// obtain all of the prerequisites for course cl.courseName
// for each prerequisite, obtain the grade and check that it is >= 60
// check that section cl has available space
// check that the total credits that student this will be registered
// for after adding cl is <= 17
// if all conditions are met, add this to the list of students that
// are registered in section cl
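Putting the pieces together, a Java rendering of this sketch might look as follows. It relies on the hypothetical methods sketched above (CourseCatalogue.getPrerequisites and Transcript.grade), plus assumed methods hasSpace, credits, and addStudent on Section and currentCredits on Transcript; none of these names are fixed by the notes.
// Inside class Student; transcript is assumed to be an instance variable of type Transcript.
public int register(Section cl) {
    // for each prerequisite of cl.courseName, obtain the grade and check that it is >= 60
    for (String prereq : CourseCatalogue.getPrerequisites(cl.courseName)) {
        if (transcript.grade(prereq) < 60) return 1;               // a prerequisite is not passed
    }
    if (!cl.hasSpace()) return 2;                                   // section is full
    if (transcript.currentCredits() + cl.credits() > 17) return 3;  // credit limit exceeded
    cl.addStudent(this);                                            // all conditions met: register
    return 0;                                                       // success
}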
5.2.2 Discussion
The final design consists of specifications for all modules (classes, interfaces, and static methods),
and implementation sketches for all methods. The development of module specifications and
implementation sketches proceeds together, as the development of an implementation sketch for
a method A provides insight into what are the appropriate helper methods B, C, D, . . . which
method A invokes, and what should the specifications of these methods be.
The second part of the course (programming in the large) will discuss the development of
specifications and designs in detail.
Internal errors: a single module fails to satisfy its specification due to an error in the design
or coding of that module.
Interface errors: a method A invokes a method B. The designers of A assume that B has the
specification S, whereas the designers of B actually worked to a different specification S 0 .
This can happen due to miscommunication between teams, or due to using natural language
(English), which is inherently ambiguous, to write specifications. In this situation, even
though every module satisfies “its” specification, the program as a whole does not work
correctly.
As an example of an interface error, suppose that the specification of method grade above
is mis-communicated, and the word "highest" is omitted from the effects clause. The
result will be that students who are eligible to take a course will nevertheless be prevented
from doing so.
How can these errors be prevented or, at the least, detected after their occurrence? We will
explore two major approaches:
Testing Given a specification for a method, one can execute the method on a particular input
and observe if the output satisfies the specification. This is called a test case. Usually we
use a test suite, which consists of a “sufficiently large” number of test cases. Testing can
detect errors, but unlike rigorous reasoning, cannot show that errors are absent. Testing
can also detect trivial syntactic errors, such as a mismatch of file name and class name, the absence
of needed import statements, etc.
To detect internal errors, we use unit testing, which tests a single method A against its
specification. If method A calls another method B, then we use a stub to simulate the
execution of B. A stub can be a piece of code that is a simplified implementation of B,
e.g., return a constant result, or it can query the user, who “manually simulates” the
execution of B and then enters the result. If a test case produces an output that violates
the specification of A, then provided that the stubs have been implemented correctly, this
indicates an error in the implementation of A.
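For concreteness (an illustration of ours, with all names hypothetical), here is a tiny unit test of a method A (priceWithTax) against its specification, using a stub for the method B (rateFor) that it calls; run with java -ea so that the assert is checked.
interface TaxRateService { double rateFor(String region); }     // specification of B (assumed)

class StubTaxRateService implements TaxRateService {
    public double rateFor(String region) { return 0.10; }       // stub: returns a constant result
}

class Billing {
    private final TaxRateService rates;
    Billing(TaxRateService rates) { this.rates = rates; }
    // EFFECTS: returns net plus the tax for the given region
    double priceWithTax(double net, String region) { return net * (1 + rates.rateFor(region)); }
}

public class BillingTest {
    public static void main(String[] args) {
        Billing b = new Billing(new StubTaxRateService());       // A is tested in isolation
        // one test case: with the stub's 10% rate, the expected output is 110.0
        assert Math.abs(b.priceWithTax(100.0, "X") - 110.0) < 1e-9 : "priceWithTax violates its spec";
        System.out.println("test passed");
    }
}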
To detect interface errors, we use integration testing, which tests several methods together.
This of course requires that the implementations (code) of all these methods are complete,
whereas unit testing requires only the code for the one method being tested. E.g., once
methods A and B have been unit tested and we are reasonably sure that they are correct,
an integration test case that produces an output which violates the specification of A will
most likely indicate an interface error between A and B.
Chapter 6
Review of OO Concepts
Java programs are constructed from classes and interfaces. Classes are used to define new data types
(constructors and instance methods) and also to provide procedures (static methods). Interfaces
are used to provide specifications, which are implemented by other classes. An interface has
no implementations for any of its methods; it only provides method headers.
The “behavioral” part of the method specification, given by the precondition and postcondition,
must be provided as comments.
6.2 Packages
Packages provide encapsulation and naming. A group of related classes and interfaces will be
all placed in the same package. Classes and methods not declared public can only be accessed
from within the same package (encapsulation).
Each package has a hierarchical name that is unique with respect to the names of all other
packages. Classes and interfaces within a package have names relative to the package name.
Thus, there cannot be any naming conflicts between classes/interfaces in different packages.
Local variables of a method reside on the stack. Variables are partitioned into value types and
reference types. A value type simply contains a value from some domain. In Java, the value
types are byte, short, int, long, float, double, char, boolean. These are also called primitive
types.
A reference type contains a reference to an object (more specifically, to the collection of instance
variables of the object’s class definition). The object itself is stored on the heap. In addition,
arrays are also reference types.
6.3.1 Mutability
Objects can be mutable (changeable) or immutable. There are pros and cons to either choice.
Immutable objects are indicated by declaring all of their instance variables with the final
keyword.
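A small illustration (ours, not from the course text): a class whose objects are immutable because all instance variables are final and no method changes the object's state.
public final class Point {
    private final int x;   // final: cannot be reassigned after construction
    private final int y;

    public Point(int x, int y) { this.x = x; this.y = y; }

    public int getX() { return x; }
    public int getY() { return y; }

    // "Mutating" operations return a new object instead of changing this one.
    public Point translate(int dx, int dy) { return new Point(x + dx, y + dy); }
}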
Equality testing in Java is subtle, and can produce unexpected results if you are not familiar
with all of the details.
Java provides an operator == for testing equality of variables. For primitive (value) types, this
does what you would expect: it compares the values. So x == y returns true if and only if x
and y happen to have the same value when x == y is evaluated. The values of x and y at
earlier or later times in the program execution are not relevant to the result.
For reference types (except Strings), the operator == compares the references, i.e., the addresses
in memory. Thus, a==b returns true if and only if a and b refer to the very same object. That
is, for reference types, == is actually a test of equality of the reference, or in other words identity
of the object referred to. By contrast, for value types == is actually a test of equality of the
values.
So, what if we wish to compare two objects (of the same class) for equality rather than for iden-
tity? That is, test if the corresponding instance variables of the objects have the same values?
We do this by implementing an equals method in the class. Java provides implementations for
equals for many of the data types that are provided as part of the standard implementation of
Java, e.g., String and HashSet.
To summarize, for reference types, we have reference equality, given by ==, which is a built in
Java operator, and object equality, given by the equals instance method, which each class must
implement for itself.
Examples
Consider
int[] a = {1,2,3};
int[] b = {1,2,3};
System.out.println(a==b);
This prints false, since a and b are different arrays. That they happen to have the same value
does not affect the result of ==, since arrays are reference types. The same holds for objects of
other classes: == compares references, while equals (if the class implements it) compares contents.
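For example, with a hypothetical class Pair that implements equals (our illustration):
class Pair {
    final int a, b;
    Pair(int a, int b) { this.a = a; this.b = b; }
    @Override public boolean equals(Object o) {
        if (!(o instanceof Pair)) return false;
        Pair p = (Pair) o;
        return a == p.a && b == p.b;   // object equality: compare instance variables
    }
    @Override public int hashCode() { return 31 * a + b; }
}

public class EqualityDemo {
    public static void main(String[] args) {
        Pair p = new Pair(1, 2);
        Pair q = new Pair(1, 2);
        System.out.println(p == q);      // false: p and q are different objects
        System.out.println(p.equals(q)); // true: their instance variables have the same values
    }
}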
6.3.3 Strings
Strings are reference types (i.e., objects) in Java. Because Java provides built-in language
support for strings, the declarations of Strings and some operators (e.g., +) appear superficially
like those of primitive types.
Strings are immutable; a string cannot be changed. Thus, if we write
String a = "once";
a = "in";
The second statement does not change the string "once"; instead, it changes the reference a
to point to "in" rather than to "once."
Strings behave differently from other reference types with respect to == because Java interns
string literals. Thus
String u = "123";
String v = "123";
System.out.println(u == v);
prints true, since Java identifies the two occurrences of “123” and replaces them by just one
occurrence, and makes both u and v point to this same occurrence. Also,
String t = "12";
String w = t + "3";
System.out.println(u == w);
Java will print out false, even though u and w both have value "123", since the concatenation
t + "3" creates a new String object at run time, distinct from the interned literal. However,
System.out.println(u.equals(w));
prints true, since equals compares the character content of the strings.
6.4 Aliasing
Consider the following code, where (for concreteness, since the declarations of s and r are not
shown here) we take s and r to be two separately created empty sets of strings:
HashSet<String> s = new HashSet<String>();
HashSet<String> r = new HashSet<String>();
System.out.println(s==r);
s.add("once");
System.out.println(r.size());
r = s;
System.out.println(s==r);
s.add("in");
System.out.println(r.size());
The first two print statements will output false and 0, since s and r refer to different objects, and
so changes to s do not affect r. Next, we execute r = s which causes s and r to reference the
same object, which is called aliasing. Now the subsequent print will output true, as expected.
Now, changes to s will also affect r, so that the s.add method call will cause the size of both
s and r to increase by 1. Hence the last print statement outputs 2, as expected.
Java is strongly typed. The Java compiler checks all assignments and method calls to ensure that
they are type correct. If it finds a type violation, then compilation fails with an error message.
Type checking relies on variable and object declarations and method headers, to provide the
information necessary to actually perform the type checking. Java types are organized in a
hierarchy. Java allows implicit conversion between some primitive types. Java also allows
overloading of method names, by allowing in the same class several method definitions with the
same name, but with different headers (i.e., different parameter and return types). See sections
2.4 and 2.6 of the course text for details and examples.
Chapter 7
Procedural Abstraction
7.1 Overview
A procedure “packages” some code, and provides an “interface” to the code via a formal param-
eter list and a parameter passing mechanism. This enables two different kinds of abstraction.
The packaged code can be “reused” via multiple calls. Each call binds a (different, in general)
set of actual parameters to the formal parameters.
When designing and coding the procedure, we write code that manipulates only the formal
parameters. Thus, we do not concern ourselves with the identity of the actual parameters that
are provided in a call to the procedure.
Thus, we abstract from the identity of actual parameters.
Since a procedure defines a specific piece of code that has well defined entry and exit points,
we can define a specification for this piece of code.
A specification states what the procedure does, without stating how it does it.
It is (usually) much shorter and easier to read than an implementation.
Thus, we abstract from the details of implementation of a procedure.
For example, there is only one definition of the array sorting problem, but many different
algorithms for sorting.
Abstraction by specification does not require a procedure mechanism, it can be used on any
piece of code, e.g., that is part of a larger code segment. Nevertheless, it is most useful when
used with a modularity mechanism such as a procedure or method definition.
Benefits of abstraction by specification:
• Locality: When reading and reasoning about the implementation of some procedure
A that calls another procedure B, we only need to look at B’s specification, not at its
implementation.
Importance of Locality
This can be tricky with recursive and mutually recursive procedures; one needs to understand
induction on (the height of nodes in) recursion trees.
The decomposition of a given task among several methods is called functional decomposi-
tion.
• Coding and Implementation: When coding A, you only need to know the specification
of every B that A invokes. If you needed to know the implementation of B, then you would
also need to know the implementation of all procedures that B itself invokes, etc.. Thus:
1. Different people can code A and B independently, as long as they agree on the speci-
fication of B.
2. We can understand the whole program one procedure at a time, rather than “all at
once.” Much easier!
Importance of Modifiability
1. Syntactic part, a.k.a. Header: procedure name, formal parameter list, type of result,
e.g., float sqrt(float x).
2. Semantic part: a description of the procedure's behavior, given by the following clauses:
(a) requires clause: defines a precondition, i.e., a condition on the actual parameters,
which must hold when the procedure is invoked.
(b) modifies clause: lists all the actual parameters that are modified, e.g., via call by
reference.
(c) effects clause: describes the behavior of the procedure for all inputs that satisfy
the requires clause. It defines a postcondition, i.e., a predicate which relates the
initial and final values of the actual parameters, the initial and final values of the
instance variables (in case of an instance method), and the value returned by
the procedure.
If the precondition is identically true, then the requires clause is omitted. If the procedure does
not modify any data, then the modifies clause is omitted. The effects clause is never omitted.
The effects clause says nothing about the behavior of the procedure when the precondition is
initially false.
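As an illustration (ours, using the comment style adopted later in these notes), a specification for the sqrt header mentioned above might read as follows; MathOps is a hypothetical interface used only to host the header.
interface MathOps {
    // REQUIRES: x >= 0
    // EFFECTS:  returns an approximation r of the square root of x, i.e., r*r is close to x
    float sqrt(float x);
}
Since sqrt modifies no data, the modifies clause is omitted, as described above.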
When the precondition is not identically true, then either the users must ensure that the pro-
cedure is always called with the precondition true, or the procedure must test the precondition
at run time, and take some action if the precondition is found to be false. For example, raise an
exception, or (if the procedure parameters are input interactively from a user) put up a dialog
box requesting corrective action from the user.
A specification for a class, in addition to containing a specification for each method of the class
(including the constructors, if any), will also include an overview clause, which describes
informally the overall purpose of the class.
The following is a partial specification of a class that provides a number of sorting and searching
operations for integer arrays (see Fig 3.4 in the text).
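The figure itself is not reproduced in these notes; the following sketch (ours) conveys the kind of specification being discussed, though the course text's exact wording may differ.
public class Arrays {
    // OVERVIEW: provides sorting and searching operations on integer arrays

    // EFFECTS: if x occurs in a, returns an index i such that a[i] == x; otherwise returns -1
    public static int search(int[] a, int x) { return -1; /* body not part of the specification */ }

    // REQUIRES: a is sorted in ascending order
    // EFFECTS:  if x occurs in a, returns an index i such that a[i] == x; otherwise returns -1
    public static int searchSorted(int[] a, int x) { return -1; /* body not part of the specification */ }

    // MODIFIES: a
    // EFFECTS:  sorts a into ascending order
    public static void sort(int[] a) { /* body not part of the specification */ }
}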
search does not have a requires clause, so its precondition is true. Hence it can be called
with parameters of any value. searchSorted has a requires clause that gives a nontrivial
precondition: the array must be sorted in ascending order. If searchSorted is called with
array parameter a which does not satisfy this, then there is no guarantee for the result: the
effects clause will not necessarily hold. Both these methods do not contain a modifies clause
in their specification, and so they do not modify the array parameter. The sort method does
modify the array parameter, as indicated by the modifies and effects clauses.
Note that the specifications of search and searchSorted are underdetermined: if a value
occurs more than once in the array, then the index of any occurrence is acceptable. Thus the
specification permits multiple implementations, e.g., searchSorted can be implemented using
both linear and binary search.
The effects clause in general relates the initial values (i.e., the values when the method is
invoked) of parameters and instance variables (in the case of instance methods) to the final
values (i.e., the values when the method returns) of parameters and instance variables (in the
case of instance methods) and the return value. To distinguish between initial and final values,
we use the suffix pre on identifiers to indicate the initial value, and the suffix post to indicate
final values. This is a slight deviation from the course text, which uses the unadorned variable
name for the initial value, and post for the final value.
Thus, the specification for sort above can be rewritten using this convention, along the following lines.
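(Our sketch of the rewritten specification; the course text's exact wording may differ.)
// MODIFIES: a
// EFFECTS:  a_post is sorted in ascending order and is a permutation of a_pre
public static void sort(int[] a) { /* body not part of the specification */ }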
7.2.3 Methodology
Here is an implementation of the above module specification for class Arrays (see Figs 3.5 and
3.6 in the text). (The method headers shown below are assumed, and the body of quickSort is
not shown; see the figures in the text for the complete code.)
public class Arrays {
    public static void sort(int[] a) {
        if (a == null) return;
        quickSort(a, 0, a.length-1);
    }
    // quickSort(a, low, high) recursively sorts a[low..high]: it calls partition on the segment
    // and then sorts the two resulting parts (body omitted here).
    private static int partition(int[] a, int i, int j) {
        int x = a[i];
        while (true) {
            while (a[j] > x) j--;
            while (a[i] < x) i++;
            if (i < j) { // need to swap
                int temp = a[i]; a[i] = a[j]; a[j] = temp;
                j--; i++; }
            else return j;
        }
    }
}
The requires clause (precondition) is an obligation of the client (calling procedure) to supply pa-
rameters that satisfy the precondition. If the client fails to do this, then there are no guarantees
as to the effects of the execution of the called procedure (returned value and/or modifications
to reference parameters and instance variables).
The effects clause (postcondition) is an obligation on the implementer (called procedure) to
satisfy the postcondition in those cases where the client supplies parameters that satisfy the
precondition.
Once the specification has been agreed on, the obligations of both parties are fixed. Implemen-
tation of the calling and called procedures can then proceed in parallel. This is a key to the
development of large software by many teams working in parallel.
A procedure should encapsulate a well-defined and useful function. E.g., for the quicksort im-
plementation shown above, the procedures quickSort and partition are appropriate, as they
each do a useful function needed in the quicksort algorithm: quickSort organizes the recur-
sive calls, and partition partitions array segments into lower and upper parts. Decomposing
partition into smaller procedures (e.g., for the inner loops) would not be helpful.
Minimally constraining
Generality
The precondition must be strong enough so that we can design a procedure that will terminate
with the postcondition true. It is desirable to handle as many inputs as possible, so we make the
precondition as weak as possible, subject to this constraint. Recall that if the precondition is
false, then the procedure should not be executed, but rather the user is notified (if the procedure
is interactive) or an exception is raised (if the procedure is not interactive).
Simplicity
A procedure should have a well-defined purpose that is independent of its context of use. That
is, it implements an algorithm, such as quicksort, or partition, rather than part of an algorithm,
such as “swap two elements.”
How do we implement values, a procedure that returns the number of distinct values occurring in array a? Clearly, we need to scan through array a at least once. As we
scan, we will count the number of distinct values seen “so far”. More precisely, suppose that
we scan from bottom to top, i.e., from i=0 to i=a.length-1. Then at some arbitrary position
i, we should have counted all the distinct values in a[0],...,a[i-1]. So, in incrementing i,
we must check if a[i] is a new distinct value, that is, if a[i] occurs or not in a[0...i-1].
We will do this using a helper procedure contains. This discussion leads to the following
implementation sketch
(Note: from now on, we will use the notation a[0...i-1] to indicate the sequence of elements a[0],...,a[i-1]. This is known as an array section.)
// IMPLEMENTATION SKETCH
// iterate over array a from index i=0 to i = a.length-1
// at iteration i:
// check if a[i] occurs in a[0,...,i-1] using helper method "contains"
// if not, increment a counter (which is initially 0)
// return the counter
Thus our functional decomposition is that the implementation of values consists of a loop
that scans through a, maintaining a count of distinct values seen so far. Each new value is
determined to be distinct or not by the helper procedure contains. Thus, the specification for
contains is: it requires 0 ≤ k ≤ b.length, and its effects clause states that it returns true if v
occurs in b[0...k-1], and false otherwise.
Now that the specifications for both procedures have been determined, their implementation
can proceed in parallel, by different programmers. For values, we obtain the following imple-
mentation.
// IMPLEMENTATION SKETCH
// iterate over array a from index i=0 to i = a.length-1
// at iteration i:
// check if a[i] occurs in a[0,...,i-1] using helper method "contains"
// if not, increment a counter (which is initially 0)
// return the counter
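The Java code itself is not reproduced in these notes; a minimal version that follows the sketch, using contains as specified above (the header static int values(int[] a) is our assumption), might be:
// EFFECTS: returns the number of distinct values that occur in array a
static int values(int[] a) {
    int count = 0;
    for (int i = 0; i < a.length; i++) {
        // a[i] is a new distinct value iff it does not occur in a[0...i-1]
        if (!contains(a, i, a[i])) count++;
    }
    return count;
}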
As mentioned above, a very important point is that the implementation of values can be
understood by looking only at the specification of contains. The implementation of contains
does not need to be consulted at all, and may not even exist when the implementation of values
is written. If contains itself happened to invoke a third procedure, then even the specification
of this third procedure would not need to be consulted to understand the implementation of
values.
static boolean contains(int[] b, int k, int v) { // (header assumed; checks whether v occurs in b[0...k-1])
int i = 0;
while(i < k) {
if (v == b[i]) return(true);
i = i+1;
}
return(false);
}
We will implement median by sorting a copy of array a and then returning the element with
index (a.length-1)/2, since in a sorted array with distinct values, the median is the middle
element. Note that we cannot sort a itself, since there is no modifies clause, so we are not
permitted to modify a. Here is the implementation:
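(The version below is our reconstruction following the description above; the text's version may differ in details. The copy is made by hand rather than by modifying a, since the specification has no modifies clause.)
// REQUIRES: a is non-empty and contains distinct values
// EFFECTS:  returns the median value of a; a itself is not modified
static int median(int[] a) {
    int[] copy = new int[a.length];
    for (int i = 0; i < a.length; i++) copy[i] = a[i];  // copy a, since we may not modify it
    sort(copy);                                          // the sort specified earlier in this chapter
    return copy[(a.length - 1) / 2];                     // middle element of the sorted copy
}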
Assume that sort refers to the procedure whose specification and implementation is given
above. To understand the above implementation, we only need to read the specification of
sort. We do not need to read its implementation; the fact that sort calls quicksort is entirely
irrelevant. Hence, the specification and implementation of quicksort are irrelevant, as are the
specification and implementation of any methods that quicksort calls, e.g., partition.
If we changed the implementation of sort so that it called a method heapsort which worked
using heapsort instead of quicksort, then our reasoning about the correctness of median would
not change at all: only the specification of sort is relevant. Thus, a specification acts as
a “logical firewall”: it cuts the chain of dependency in reasoning about the correctness of
implementations.
If we did not use specifications, then to reason about the correctness of the implementation of
median, we would have to look at the implementation of sort, and then the implementation of
quickSort, and then the implementation of partition! It is thus clear that the use of module
specifications saves a tremendous amount of work when reasoning about the correctness of
implementations. It also makes unit testing possible.
Chapter 8
Data Abstraction
8.1 Overview
Consider the above implementation of a set using an array. If we allow any module to access
the array directly, then when we change the implementation to a linked-list, all modules that
use the set object must be modified to process the new representation properly. To insulate the
“using” modules from such changes of representation, we use the following data hiding idea:
1. implement some operations that access the array, and provide the standard set operations,
e.g., is-member?, insert, remove, and
2. require that all accesses to the set object be via these operations.
Thus, when the representation is changed from an array to a linked list, only the implmentations
of the set operations need to be changed to process the new representation properly. The “using”
modules do not need to be modified.
Since the number of using modules is usually far greater than the number of object operations,
this is very important in saving effort and localizing code changes.
Such an implementation of set is an example of an abstract data type: a data type whose objects
can be created and accessed only via a fixed set of operations, so that the representation is hidden
from users of the type. The Liskov & Guttag book uses the term "data abstraction" for abstract
data type. These mean the same thing.
The specification of a data abstraction consists of:
1. A class header, (e.g., public class IntSet) which starts with an overview statement
describing informally the data type being specified, i.e., describing the objects.
2. A list of constructors: methods that initialize an object.
3. A list of instance methods: methods that provide access to an object.
Each constructor and method is a procedure, and so is specified using a requires clause and
an effects clause.
Consider the abstract data type “set of integers.” An example specification for this data type,
IntSet, is given on p. 81 of Liskov & Guttag.
We can use the specification of a data abstraction in two ways. As clients, we only need to know
the specification of an abstract data type in order to use it; we do not need to know the
implementation. As implementers, the specification tells us exactly what behavior our
constructors and methods must provide.
To implement a data abstraction, we proceed as follows:
1. Select a representation
2. Define the abstraction function and representation invariant (see below for detailed
discussion)
3. Implement constructors to initialize the representation properly, i.e., so that the represen-
tation invariant is true after the object is initialized
4. Implement methods to use/modify the representation properly, i.e., so that the represen-
tation invariant is preserved by each method call
A representation is a set of variables that are used to store the state of an instance of the data
type, i.e., they are the instance variables of an object which is an instance of the data type.
The type of each of these variables can be a primitive type (provided by the programming
language being used), or it can be another abstract data type.
Selecting a good representation is important, since some representations are much better than
others, e.g., Arabic numerals are much better than roman numerals.
Criteria for a good representation:
1. It must enable all of the specified operations (constructors and methods) to be imple-
mented with reasonable efficiency
2. It must enable the most frequent operations to be executed quickly. Thus, the right
representation may depend on the pattern of usage, e.g, for IntSet:
an array is better if there are many accesses and few insertions
a linked list is better if there are many insertions and few accesses
To implement data hiding, we declare all instance variables to be private, i.e., not accessible
by code outside of the defining class. Thus, all access to instance variables is mediated by the
methods of the class itself.
These have already been specified. Implementation consists of providing code for each con-
structor and method that conforms to its specification. Additional “helper” methods may also
be specified and implemented. These are usually declared private, and are only used by the
public constructors and methods that provide the external interface of the data type.
In Liskov & Guttag, examples of implementation of data types are given on p. 88 for IntSet,
and p. 91–92 for Poly, a data abstraction for polynomials. Note the addition of the private
getIndex method in the implementation of IntSet.
Let o be an object of some class C which implements an abstract data type. At any time,
the instance variables of o have particular values. This collection of values is called a concrete
state.
A concrete state represents a (single) value of the abstract data type, i.e., an abstract state.
Different concrete states can represent the same abstract state, e.g., in the implementation of
IntSet, both vectors [12] and [21] represent {1, 2}. Thus, the relation between concrete and
(Note: more accurately, a concrete state is an assignment to each instance variable of a value from its type.)
abstract states is a function, and is called the abstraction function. E.g., for IntSet, with
representation object a vector v[0...(v.size − 1)], an abstraction function is:
set(v) = {x | ∃i : 0 ≤ i < v.size : x = v[i]}
Note that, e.g., set([1 2]) = set([2 1]) = {1, 2}. The order in which 1 and 2 appear in the
vector is irrelevant to the abstract value represented. This is documented by definition of the
set itself, since set([1 2]) = set([2 1]). Hence, the abstraction function tells us which aspects of
the implementation are “internal” (e.g., the order of elements) and which affect the abstract
value (e.g., the elements themselves). This is what abstraction is: deciding which information
should appear externally, “at the interface”, and which should be hidden in the implementation.
A key requirement for the implementation is that the abstract and concrete operations must
“commute” w.r.t. the abstraction function. Let AF be an abstraction function, c be a concrete
state, abstract−op be an abstract operation, and concrete−op the corresponding concrete
operation, i.e, method. Then,
AF(concrete−op(c)) = abstract−op(AF(c)).
For example, let s be an object of type IntSet, and let AF(s) = {1, 7, 11}. Then s.insert(3)
should result in a value for s such that AF(s) = {1, 3, 7, 11}, i.e., it should have the same effect
as AF(s) = AF(s) ∪ {3}.
The abstraction function can be implemented as a method that outputs (as a string) the value
of the abstract state that is represented by the current concrete state. This can be useful, e.g.,
for debugging.
The representation invariant can be implemented as a method that checks if the invariant is
true for the current concrete state. If so, it outputs “true”, and otherwise it outputs “false”.
This is also useful for debugging.
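For instance, for the array-based IntSet used later in this chapter (els and top are its instance variables), these two methods might look roughly as follows. This is only a sketch; the exact representation invariant checked by repOk depends on the design, so here it only checks the array bounds.
// Implements the abstraction function: returns the abstract set represented by els[0..top-1].
public String toString() {
    StringBuilder sb = new StringBuilder("{");
    for (int i = 0; i < top; i++) {
        if (i > 0) sb.append(", ");
        sb.append(els[i]);
    }
    return sb.append("}").toString();
}

// Implements a representation invariant check: els is non-null and 0 <= top <= els.length.
public boolean repOk() {
    return els != null && 0 <= top && top <= els.length;
}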
A “query only” method can change the concrete object as a “side effect” if that does not change
the abstract value, i.e., it can change from c to c′ if AF(c) = AF(c′). This may be useful to speed
up subsequent operations.
If the implementation makes an instance variable available to code outside of the implementa-
tion, then the implementation exposes the representation.
This could happen, e.g., if (1) the instance variables are not declared private, or (2) the
instance variables are private, but a reference (pointer) to the instance variables is returned by
some method of the implementation.
If the representation is exposed, then external code could inadvertently make the representation
invariant false. This destroys the assumptions under which the implementation has been coded,
and may lead to incorrect results in subsequent method calls to the implementation. E.g.,
consider if the IntSet implementation has duplicate elements inserted into v.
So, exposing the representation is very bad. It destroys modularity. Consider it a design/coding
error.
To show that the representation invariant I always holds, we must:
1. show that each constructor establishes I, i.e., the newly constructed object satisfies I when the constructor returns, and
2. show that if an instance method that makes changes (a mutator) is invoked on an object
that satisfies I, then the object still satisfies I when the method returns (note: it is OK
to violate I in the middle of a method call). Also, any objects of the same type that
are constructed, e.g., as return values, or that are modified, e.g., as parameter reference
types, must also satisfy I upon termination of the method.
1. overview statement and definitions for the rep invariant and abstraction function
2. method specifications
3. code sketches
import java.util.Scanner;
public class IntSet {
// Instance variables
private int[] els;  // the elements of the set are els[0], ..., els[top-1]
private int top;
// CONSTRUCTORS
public IntSet() {
// EFFECTS: Creates an empty IntSet, i.e., AF = emptyset
els = new int[1];  // (body assumed: any positive initial capacity works)
top = 0;
}
// METHODS
private int getIndex(int x) {
// EFFECTS: if x occurs in els[0:top-1], returns an index i with els[i] = x; otherwise returns -1
// (the header and the search loop below are assumed; they are determined by the assertions that follow)
int i = 0;
while (i < top && els[i] != x) i++;
if (i == top)
//{x notin els[0:top-1]}
return -1;
else
//{0 <= i < top and els[i]=x}
return i;
//endif
}
//EFFECTS: AF_post = AF - x
//MODIFIES: els, top
/* CODE SKETCH
* while some occurrence of x remains in els
* remove it
* if top <= els.length/4
* halve the size of els
*/
//{REP and AF = S}
int i;
while (true) {
//{invariant: REP and (AF = S or AF = S-x) }
i = getIndex(x); //Next occurrence of x to be removed.
if (i == -1) //No occurrence, so return.
//{REP and (AF = S or AF = S-x) and x notin S}
//{REP and AF = S - x} //Satisfies EFFECTS.
break;
public void insert(int x) { // (header assumed from the specification and body below)
//EFFECTS: AF_post = AF U x
//MODIFIES: els, top
/* CODE SKETCH
* if no space available in els
* double size of els
* endif
* insert x
*/
//{REP and AF = S}
if (top == els.length) { //No space available in els.
//{REP and AF = S and 0 < top = els.length}
int[] a = new int[2*top];
//Loop to implement a[0:top-1] = els[0:top-1]
for(int i = 0; i < top; i++) a[i] = els[i];
//{REP(a,top) and AF(a,top) = S and 0 < top < a.length = 2*top}
els = a;
//{REP and AF = S and top < els.length}
}
//else
//{REP and AF = S and top != els.length}
//{REP and AF = S and top < els.length}
//endif
//{REP and AF = S and top < els.length}
els[top] = x;
top = top + 1;
//{REP and AF = S U x}
}
return(st);
}
String op = input.next();
int arg = input.nextInt();
if (op.equals("isIn")) {
System.out.println("isIn(" + arg + ") = " + s.isIn(arg));
}
else if (op.equals("insert")) {
System.out.println("insert(" + arg + ")");
s.insert(arg);
System.out.println(s.repToString());
System.out.println(s.toString());
System.out.println("rep is " + s.repOk());
} else if (op.equals("remove")) {
System.out.println("remove(" + arg + ")");
s.remove(arg);
System.out.println(s.repToString());
System.out.println(s.toString());
System.out.println("rep is " + s.repOk());
}
}
}
}
We now show how to specify and implement singly linked lists and their various operations.
First we provide a specification. Note that Node is an inner class of LinkedList.
/* ABSTRACTION FUNCTION:
* The abstraction function gives the sequence of values stored in the
* successive nodes. It is defined recursively. + is sequence
* concatenation and lambda is the empty sequence.
*
* AF(h) = h.val + AF(h.nxt)
* AF(null) = lambda
*
*
*
* REPRESENTATION INVARIANT REP(h)
 * The representation invariant requires that lists be acyclic: a node
* cannot point to an "earlier" node in the list.
* We first define a function reach(n) that gives all the nodes that are
* "reachable" from a node n:
*
* reach(n) = n.nxt union reach(n.nxt)
* reach(null) = emptyset
*
* Then
*
* acyclic(n) = n notin reach(n)
*
* states that node n is not part of a cycle, since otherwise n would be
* reachable from itself. The rep. invariant states that every node in the
* list (including the head node h) is not part of a cycle:
*
* REP(h): (forall n : n in h union reach(h) : acyclic(n))
*
* We use h union reach(h) in the range since h is not necessarily in
* reach(h). Also, REP(h) permits h = null. This is necessary,
* since h = null represents an empty list, which we otherwise could
* not represent if we required h != null as part of REP(h).
*
* Abbreviations: AF = AF(h), REP = REP(h)
*/
// CONSTRUCTORS
public LinkedList() {
//EFFECTS: Creates an empty linked list.
h = null;
}
public LinkedList(int i) {
//EFFECTS: Creates a linked list consisting of a single node containing i
h = new Node();  // create the node (assumed: Node has a no-argument constructor; without this, assigning to h.val would dereference null)
h.val = i;
h.nxt = null;
}
// METHODS
}
}
How do we implement insert and delete? We formalize the specification of insert as follows.
//{AF(h) = L}
insert;
//{AF(h) = i + L}
}
where L is a constant of type “sequence of integers”, which includes the empty sequence. We
expand the postcondition, using the definition of AF:
h.val = i ∧ AF(h.nxt) = L
We must introduce a new node v to hold the inserted value i. Hence we require v.val = i. This
is easy to establish using the assignments Node v = new Node() and v.val = i.
The postcondition is h.val = i ∧ AF(h.nxt) = L. We can make h.val = i true (upon termination)
by setting h to v. However, this does not in general make AF(h.nxt) = L true. So, we calculate
the precondition needed:
//{v.val = i /\ AF(v.nxt) = L}
h = v;
//{h.val = i /\ AF(h.nxt) = L}
v.val = i is established by the previous piece of code. We can make AF(v.nxt) = L true by
exploiting the precondition AF(h) = L: just set v.nxt to h. So, working backwards, we get:
//{v.val = i /\ AF(h) = L}
v.nxt = h;
//{v.val = i /\ AF(v.nxt) = L}
h = v;
//{h.val = i /\ AF(h.nxt) = L}
Now we add the code to create v and set v.val, add the header, and simplify the postcondition
at the end to obtain the complete annotated insert method:
//{AF(h) = L}
Node v = new Node();
//{AF(h) = L}
v.val = i;
//{v.val = i /\ AF(h) = L}
v.nxt = h;
//{v.val = i /\ AF(v.nxt) = L}
h = v;
//{h.val = i /\ AF(h.nxt) = L}
//{AF(h) = i + L}
}
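Putting the pieces together (and assuming the method is declared as a void instance method named insert, since the header was not shown above), the complete method is:

public void insert(int i) {
  //EFFECTS: Adds i to the front of this, so that AF(h) = i + L, where L is the old value of AF(h).
  //{AF(h) = L}
  Node v = new Node();
  v.val = i;
  v.nxt = h;
  h = v;
  //{AF(h) = i + L}
}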
Note that we have to be careful when using the assignment axiom with pointer structures. For
example, the postcondition h.val = i ∧ AF(h.nxt) = L suggests the assignment h.nxt := h,
since replacing h.nxt by h in AF(h.nxt) = L results in AF(h) = L, which is the precondition,
i.e.,
//{AF(h) = L}
h.nxt = h;
//{h.val = i /\ AF(h.nxt) = L}
However, this is obviously wrong since it does not actually use the value i to be inserted.
One problem is that h.nxt = h violates the representation invariant REP(h): it creates a cycle
consisting of the single node h. Our solution above does not violate REP(h).
The lesson is that we have to (1) be careful with pointers, (2) check that our rep. invariant
makes sense, (3) check that our code preserves the rep. invariant, and (4) keep in mind that
development like the above is only “semi formal”, and prone to logical error if we are not careful.
Developing sound proof rules for pointer structures is still a research problem. Some rules have
been developed, but they are quite difficult to apply, and result in very detailed and tedious
tableaux and proofs.
To implement delete we formalize the specification as follows.
//{AF(h) = i + L}
delete;
//{AF(h) = L}
}
//{AF(h) = i + L}
//{h.val = i /\ AF(h.nxt) = L}
h = h.nxt;
//{AF(h) = L}
}
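A corresponding Java method is sketched below; the header is an assumption, and the method must only be called on a nonempty list, since the precondition AF(h) = i + L implies h != null:

public void delete() {
  //REQUIRES: this is nonempty, i.e., h != null
  //{AF(h) = i + L}
  h = h.nxt;
  //{AF(h) = L}
}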
We will develop a representation for unordered binary trees, and illustrate it using a program
to sum up the nodes of the tree (each of which contains an integer value).
Each node of the tree is given by:
We will manipulate the instance variables i, l, r using references and assignments, i.e., we assume
that our tree traversal method is part of a class that has access to these instance variables. If
not, we can always replace references and assignments by the obvious getter and setter methods.
/* ABSTRACTION FUNCTION:
* The abstraction function gives the tree of the values stored in the
* nodes.
*/
// CONSTRUCTORS
public BinaryTree(int v) {
//EFFECTS: Creates a binary tree consisting of a single node containing v
r.val = v;
r.left = null;
r.right = null;
}
We must visit each node in the tree at least once and compute the sum of the values stored at
all the nodes. The specification is as follows:
A tree is a naturally recursive data structure and many algorithms are most naturally expressed
as recursion on the left and right subtrees. We can sum a tree by recursively computing the
sum of the left and right subtrees and then adding the value of the root. Using the proof rule
for conditional correctness of recursive procedures, we can assume that the recursive calls work
correctly. Termination is easy: since the recursion is on subtrees, we can simply use the number
of nodes in the tree as the variant function: this is obviously always ≥ 0, and it decreases on
each recursive call. We obtain the following (where ret is an auxiliary variable denoting the
returned value):
//{REP(r)}
if (r == null)
//{REP(r) /\ r=null}
return 0;
//{ret = 0 = SUM(r) = (SIGMA n : emptyset : n.val)}
else {
//{r != null /\ REP(r)}
//{REP(r.left)}
sumL = add(r.left);
//{sumL = SUM(r.left) = (SIGMA n : n in r.left union desc(r.left) : n.val)}
}
}
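For reference, here is an unannotated sketch of the complete method. It assumes the tree nodes have fields val, left, and right as above, that the node class is called Node, and that the recursive helper is named add, matching the call add(r.left) in the fragment:

private static int add(Node r) {
  // Returns the sum of the values stored in the subtree rooted at r;
  // the empty tree (r == null) sums to 0.
  if (r == null)
    return 0;
  else {
    int sumL = add(r.left);  // by the recursion hypothesis, sumL = SUM(r.left)
    int sumR = add(r.right); // by the recursion hypothesis, sumR = SUM(r.right)
    return sumL + sumR + r.val;
  }
}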
Chapter 9
Iterator Abstraction
9.1 Overview
An iterator is used to provide a client with access to every element of a collection, without
giving the client access to the representation of the collection:
For example, suppose we wish to access all the elements of an IntSet (see chapter 4 for discussion
of IntSet), one after the other, e.g., to compute their sum. We could do this by obtaining the
vector els that represents the IntSet and accessing els[0] through els[top − 1]. However, this
exposes the representation and destroys the encapsulation of IntSet. If the representation were
later changed to a search tree (e.g., to permit more efficient searching), then the client code
that uses els would also have to be changed.
The Iterator interface provides the following methods:
1. boolean-valued hasNext, which returns true iff there are more elements in the collection
to access, and false otherwise, and
2. Object valued next, which returns the next element in the collection, and raises an
exception if there are no more elements to access. See Figure 6.3 (page 129) in the course
textbook.
3. remove, which removes from the underlying collection the last element returned by the
iterator. It has void return type.
An iterator returns a special kind of data object called a generator, which keeps track of the
state of a particular iteration (there can be several iterations over the same collection, each
represented by a different generator). For example, the following method can be added to an
implementation of IntSet.
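A sketch of such a method follows; the name elements matches the discussion below, it uses java.util.Iterator, and the details are otherwise an assumption about the omitted code:

public Iterator elements() {
  //EFFECTS: Returns a generator that will produce every element of this
  //         (as Integers), each exactly once, in arbitrary order.
  //REQUIRES: this must not be modified while the generator is in use.
  return new IntGenerator(this);
}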
It returns an instance of an inner class IntGenerator, whose definition is also added to IntSet.
The class IntGenerator implements the Iterator interface:
The generator object that an iterator returns has a type that is a subtype of Iterator, and it provides the
actual implementations of hasNext and next. As stated above, the generator is defined by an
inner class that implements the Iterator interface.
The behavior of the generator is defined by the specification of the iterator; the generator has
no specification of its own:
So, the generator in this case will return each element of IntSet exactly once, and in an
arbitrary order. The generator has a precondition, given by the first requires clause, that the
els array does not contain duplicates. Also, note that there is a second requires clause, at the
end. This gives constraints on the use of the generator: the code using the generator must not
modify IntSet while the generator is being used, i.e., after the generator has returned some, but
not all, of the elements. Since this is a requirement on the use of the generator rather than a
precondition on the data input to the generator, it is written separately from the first requires
clause.
An Iterator or a generator usually does not modify this (the enclosing object), but it can,
if the object is mutable. In this case, the specification of the Iterator should state what
the modification is and whether it is done by the Iterator (e.g., the code implementing the
modification is in the body of elements) or by the generator (e.g., the code implementing the
modification is in the body of next or hasNext).
A data abstraction can have several iterators. For example, we could provide a second iterator
that produces the elements in increasing order.
Here is how we can use an iterator to compute the sum of the elements in an IntSet. Note
that the next method returns an object of type Object, and so this has to be cast into an
Integer object, and then the value can be extracted using the intValue method.
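The client code might look roughly as follows (assuming the iterator method is called elements, as above):

public static int setSum(IntSet s) {
  //EFFECTS: Returns the sum of the elements of s.
  int sum = 0;
  Iterator g = s.elements();
  while (g.hasNext()) {
    Integer x = (Integer) g.next(); // next returns an Object, so cast it to Integer
    sum = sum + x.intValue();
  }
  return sum;
}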
Here is an implementation for an iterator that returns the elements of an IntSet. The iterator
assumes that els does not contain duplicates, and so can be used only with the implementations
whose representation invariant includes the no duplicates condition. Note that the generator
IntGenerator is defined as a private static inner class. Hence clients do not have access to the
type IntGenerator, e.g., they cannot declare variables of this type. Clients also cannot access
the instance variables s and n. Clients can only access IntGenerator via its iterator, in which
case they get a reference to an IntGenerator which they can use to invoke the hasNext and
next methods.
The generator IntGenerator has a representation invariant, which states that n is always
between 0 and top. Note that top is referred to as s.top, since that is what the code of
IntGenerator uses. So, you can refer to the instance variables of the containing object (IntSet
in this case) to state the representation invariant and abstraction function for a generator.
The abstraction function for a generator is generic: it always gives the current state of the
iteration. In this case, it gives the subset of IntSet consisting of the elements that remain to be
returned.
// inner class
private static class IntGenerator implements Iterator {
private IntSet s; // the IntSet being iterated
private int n; // index of the next element to consider
IntGenerator(IntSet is) {
// REQUIRES: is != null
// Representation invariant: 0 <= n <= s.top
// Abstraction function:
// AF(n, s.els, s.top) = { x | (exists i : n <= i < s.top : s.els[i] = x) }
s = is;
n = 0;
}
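The method bodies were omitted above; given the representation invariant and abstraction function just stated, they might look as follows (the exception choices follow the java.util.Iterator contract and are otherwise an assumption):

public boolean hasNext() {
  // There are more elements to return iff n has not yet reached s.top.
  return n < s.top;
}

public Object next() {
  // Returns the next element of s (as an Integer) and advances the iteration.
  if (n >= s.top) throw new java.util.NoSuchElementException();
  Integer result = new Integer(s.els[n]);
  n = n + 1;
  return result;
}

public void remove() {
  // This generator does not support removal.
  throw new UnsupportedOperationException();
}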
Chapter 10
Testing
The lecture notes are based on chapter 10 of Program Development in Java, by Barbara
Liskov and John Guttag, Addison-Wesley, 2001.
10.1 Overview
Test a program: run the program on some input, see if the output is “as expected,” i.e., if input
satisfies precondition, then output should satisfy postcondition. A single pair (input, expected
output) is called a test case. A set of test cases is called a test suite, or test set.
Tests are applied either to the main loop of a program or to individual modules (procedures
and abstract data types).
Black-box testing is testing that is based only on the specification. Knowledge of the code is (purposely) not used.
Advantages of black box testing:
• Testing is not influenced by the code: if the code omits certain cases that should be
handled, a test suite derived from that code would also omit those cases, whereas a specification-based suite will not.
• Testing is robust w.r.t. changes in the implementation: once a test suite has been devel-
oped, the same suite can continue to be used if changes in the code are made (but not if
changes in the specification are made).
• Only knowledge of the specification is needed to interpret the results of testing; knowledge
of the code is not required.
The test suite should contain cases for “typical” values and also cases for “atypical” values.
For example, if the input is a set, then typical values will be sets containing several elements.
Boundary values will be the empty set and a singleton set (set containing one element). If an
input is an integer, then there should be test cases for the maximum and minimum possible
value (e.g., 2^15 − 1 and −2^15 for 16-bit two's complement integers). If feasible, all combinations
of maximum and minimum values for all numerical inputs should be tested.
White-box testing (also called glass-box testing) takes both specification and code into
account. The idea is to “cover” the code as thoroughly as possible. There are three main
notions of coverage:
1. Statement coverage: every statement of the program is executed by at least one test case.
2. Branch coverage: every direction of a branch is executed by at least one test case.
3. Path coverage: every possible path through the code is taken in at least one test case.
In practice, 100% coverage is impractical. The program may contain “dead code,” i.e., code
that is unreachable, and determining whether code is unreachable is an undecidable problem. For statement
and branch coverage, we aim for a “high percentage”. If the existence of dead code is ruled
out manually, then we can aim for 100% coverage, but in large programs this may require an
impractically large number of test cases.
Complete path coverage is usually impractical or impossible. A sequence of n if statements has
2^n paths. A loop with a variable number of iterations, like a while loop, has an infinite number
of paths, one for each number of iterations that the loop could execute. So, we usually settle
for the following:
• For a loop with a fixed number of iterations (assumed > 2), use a test case that iterates the
loop twice. This checks that the transition from the end of one iteration to the beginning
of the next works properly.
• For a loop with a variable number of iterations, use test cases for 0, 1, and 2 iterations.
For the 0 iteration test case (i.e., the loop terminates immediately), include one test case
for each disjunct of ¬B, where B is the looping condition.
• For recursive procedures, include a test case that causes the procedure to return without
making any recursive calls (to check that termination is handled properly), and a test case
that causes exactly one recursive call (to check that recursive calls are handled properly).
This approach is a compromise: it may fail to detect errors (e.g., that happen after 3 iterations
of a loop) but it is also practical: it does not require excessive computational time and space.
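As an illustration, suppose we are testing a method that sums the elements of an array (a loop with a variable number of iterations). A suite along the above lines contains cases for 0, 1, and 2 iterations; the method and driver below are hypothetical and only show the shape of such a suite:

class SumTest {
  static int sum(int[] a) {
    // Returns the sum of the elements of a.
    int s = 0;
    for (int i = 0; i < a.length; i++) s = s + a[i];
    return s;
  }

  public static void main(String[] args) { // run with: java -ea SumTest
    assert sum(new int[] {}) == 0;      // 0 iterations: the loop exits immediately
    assert sum(new int[] {5}) == 5;     // 1 iteration
    assert sum(new int[] {5, -3}) == 2; // 2 iterations: checks the iteration-to-iteration transition
    System.out.println("all tests passed");
  }
}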
10.4 Testing Abstract Data Types
We test each method of the abstract data type as discussed above. However, we include the
representation invariant I in the postcondition of constructors (so they have the form Q∧I), and
the representation invariant in the precondition and postcondition of methods (so preconditions
have the form P ∧ I and postconditions have the form Q ∧ I).
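For example, a test case for IntSet.insert can check both the postcondition and the representation invariant, using the repOk method seen earlier (the no-argument constructor and the exact assertions are illustrative):

IntSet s = new IntSet(); // assumed constructor creating an empty set
s.insert(3);
assert s.repOk();   // representation invariant holds after insert
assert s.isIn(3);   // postcondition: the inserted element is in the set
assert !s.isIn(4);  // an element that was never inserted is not in the set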
So far we have discussed testing of a single procedure. However, a procedure calls other proce-
dures, in general. This means that these other procedures are also being tested, and so is the
interface between the calling procedure and the callee (many errors happen at the interface).
Sometimes, we wish to isolate and test a single procedure A by itself. This is because:
• If a test reveals a problem, the error will most likely be in A itself, rather than in a
procedure arbitrarily down the “call chain” from A.
However, there is a problem: how can A actually execute if the procedures it calls have not
yet been coded? The answer is that we provide stubs to take the place of these as-yet-unavailable
procedures. There are two ways of doing this:
• Insert code that interacts with a user who manually “simulates” the effect of the called
procedure by providing appropriate results of the call.
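For example, if A calls a procedure lookupPrice that has not yet been coded, a stub can ask the person running the test to supply the result (lookupPrice and the details below are hypothetical):

static int lookupPrice(String item) {
  // STUB: the tester manually simulates the real procedure by typing in a result.
  System.out.print("stub lookupPrice(" + item + "); enter result: ");
  return new java.util.Scanner(System.in).nextInt();
}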
When we test procedure A in this manner, we are performing a unit test. If a unit test returns
an incorrect value, then there are three possible explanations:
• procedure A itself contains an error, or
• the input data of the test is incorrect, e.g., does not satisfy the precondition of A, or
• some stub has produced a result that is not a good enough approximation, and this has
led to the error.
So, the possibilities for where the error is are much more restricted than when all procedures
have been implemented and stubs are not used. In the latter case, the error can be in any
procedure reachable (transitively) by procedure calls from A.
Once every abstraction has undergone unit testing, it is still a good idea to test the entire
program (or large parts of it) at once. This is called integration testing. The purpose
of integration testing is to ensure that everything “fits together,” so that there are no “interface
errors” (e.g., a procedure that A calls does not do the “right thing,” i.e., its
specification needs to be changed).
We have seen that specifications are used to guide design, implementation, and testing. Another
use for specifications is in defensive programming, i.e., instrument the program with checks that
may detect errors:
• For each procedure, check the precondition when the procedure is called, and the post-
condition just before the procedure returns.
Some checks may be too expensive to leave in the production version of a program. For example,
a binary search procedure has the precondition: “the array is sorted”. Checking this increases
the time complexity from logarithmic to linear in the size of the array, and obviously defeats
the benefit of binary search (just use linear search in this case!).
We should, however, retain all checks during testing, since performance is not an issue (unless
we are testing the performance itself, or doing profiling, in which case we disable the checks).
If a check is sufficiently fast, then we can leave it in the production version, since there are still
likely to be undiscovered errors.
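As a sketch (the method names are illustrative), the sortedness check for binary search can be written as a linear scan and enabled only while testing:

static boolean isSorted(int[] a) {
  // Linear-time check of the binary search precondition.
  for (int i = 0; i + 1 < a.length; i++)
    if (a[i] > a[i + 1]) return false;
  return true;
}

static int search(int[] a, int x) {
  //REQUIRES: a is sorted in nondecreasing order
  assert isSorted(a); // defensive check; disabled in production by running without -ea
  int lo = 0, hi = a.length - 1;
  while (lo <= hi) {
    int mid = (lo + hi) / 2;
    if (a[mid] == x) return mid;
    else if (a[mid] < x) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1; // x is not in a
}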
Chapter 11
Requirements Specifications
11.1 Overview
We will use a data model to describe a state space. This consists of:
– Nodes: each node represents a named set of items, e.g., File, Dir. Data domains
are sets that do not have supersets. Data domains are disjoint (from each other).
– Edges: these represent relations over the sets denoted by nodes
– designations for the intended meaning of each set and relation. A designation is
an informal description, written in English, and not in mathematical notation. A
good designation enables one to recognize elements of a set.
– Additional constraints:
∗ Definitions of derived relations
∗ global constraints
Fixed Set The membership is fixed for all time. The set does not gain or lose members.
Notation is a box with double bars on both the left and right sides.
Static Set An element cannot switch from being in the set to being in a different set, or vice
versa. For example, in a file system, the set of files is static, since a file cannot become
a directory, and a directory cannot become a file. Notation is a box with double bars on
the left side.
Size of sets
Constraints on the size of a set are indicated in the lower right corner of the box representing
the set.
Subset edges
Subset edges depict the subset relation. Subset edges in the graph are shown by a closed
arrowhead. Subset edges go from the subset to the superset. If the arrowhead is filled, then the
sets are equal. If the arrowhead is not filled, then the subset is a proper subset, i.e., it is not
equal to the superset.
Two subsets can share the same subset edge (arrowhead) into their superset, in which case the subsets are disjoint.
Relation edges
Relation edges depict binary relations. Relation edges in the graph are shown by an open
arrowhead. They are labeled with the name of the relation. Relation edges go from a source
node to a target node. Relations with the same source node must have different names. The
inverse of a relation is indicated in brackets preceded by ∼, e.g., r1(∼ r2) indicates that r2 is
the inverse of r1.
Multiplicity of relations
We annotate the source end of a relation to indicate the size of the preimage, and the target end
of a relation to indicate the size of the image. We use the following multiplicity annotations: *,
+, ?, and !, which have the following meanings:
* : 0 or more
+ : 1 or more
! : exactly one
? : 0 or 1
So, for example, annotating the target end of relation r with ! is equivalent to the statement:
∀x ∈ X : |Image_r(x)| = 1
and annotating the source end of relation r with ! is equivalent to the statement:
∀y ∈ Y : |PreImage_r(y)| = 1.
Similarly for the other multiplicity annotations *, +, and ?.
Mutability
If Image_r(x) can change, then we say that the target end of relation r is mutable. Otherwise it
is immutable. We indicate immutability by placing a “|” at the target end, which is equivalent
to the statement:
∀x ∈ X : Image_r(x)_post = Image_r(x)_pre
If PreImage_r(y) can change, then we say that the source end of relation r is mutable. Otherwise it
is immutable. We indicate immutability by placing a “|” at the source end, which is equivalent
to the statement:
∀y ∈ Y : PreImage_r(y)_post = PreImage_r(y)_pre
• Part 1 : For each set and relation, a short (1 sentence) designation of its intended
meaning.
The way to recognize a derived relation is to attempt to define it in terms of the other relations
in the graph. E.g., in a file system, the child relation can be defined as the inverse of the parent
relation.
Thus, a derived relation is really a definition, and does not by itself add any new global
constraints. So, it does not restrict behavior in any way. It does however “inherit” the
global constraints that mention the entities in terms of which the relation is defined.
It is thus almost a misnomer to consider a derived relation to be a “constraint.”
Form 2 : global constraints that restrict behavior, i.e., that enforce the intended meaning
of the sets and relations. We may give some global constraints separately from the data
model (and after the data model has been given).
In terms of the state space given by the data model, a global constraint limits the set of reachable
states, i.e., those states that occur in some computation. For example, the constraint that all
account balances must be positive really means that there can be no reachable state in which
some account balance is negative.
A requirements specification describes the data model of the system and the operations that a
system provides to the users/application domain. There are two kinds of operations:
static operations are invoked when the application is not already running (e.g. “start up”)
• An effects clause (postcondition): this describes the effect of executing the opera-
tion
All operations (static and dynamic) must preserve the constraints (graphical and textual) of
the data model. That is:
if an operation is started in a state that satisfies all the constraints, then it must terminate in
a state that satisfies all the constraints.
Reasons for modifying the definition of an operation:
The data model is internal to the machine, as the user cannot observe it directly. The operations
define the interface between the user and the machine, since the user invokes the operations and
receives their results. Thus, the operations can expose some (but not, in general, all) aspects
of the data model to the user. This could be formalized as a “projection” operation.
For an interactive program, we need to consider these two issues before writing a specification.
Issue 1 : If the program is interactive, then the operations take string arguments, and produce
no results, since they produce output on a screen, which technically is a “side effect”.
Issue 2 : We must (in general) define the data formats that are used in communicating with the
user. These are always constraints on strings.
In an interactive program, operations must be “total,” since a user can always invoke the
operation interactively. Thus, the precondition is replaced by a checks clause. If the check
fails, then the operation does not execute normally, but informs the user of the problem via a
dialog box. Thus, we only need to consider the modifications made by the effects clause
under the condition that the checks clause is true.
Chapter 12
Example Specification for a Web Search Engine
Acknowledgment: these lecture notes are based on chapters 12 and 13 of Program Develop-
ment in Java, by Barbara Liskov and John Guttag, Addison-Wesley, 2001.
The requirements specification is given in Section 12.4 of the textbook. A rough sketch is:
• Fetch documents from websites (given by URL’s) and add to the existing collection of
documents.
Textbook Figure 12.12 gives the data model graph. The main domains are:
S For a match m, m.sum contains the total count of occurrences of all keywords in the
corresponding document (m.doc):
(∀ m : m ∈ Match : m.sum = sumAll(m.doc, Key))
M2 Match contains exactly the set of documents that satisfy the matches predicate, i.e.,
every document in Match satisfies matches, and every document that satisfies matches is
in Match:
(∀ m : m ∈ Match : matches(m.doc)) ∧
(∀ d : d ∈ Doc : matches(d) ⇒ (∃ m : m ∈ Match : d = m.doc))
Figure 12.1 and textbook Figure 12.14 give the operations, which are:
• makeCurrent(t): Set Cur to the document with title t. We assume that documents have
unique titles.
For our purposes (i.e., to implement Engine without dealing with WWW and Internet issues) we
will replace the operation addDocuments(u) by the operation addDocFromFile(f) which reads
in a single document from a named file f that resides in the same directory as Engine. Define
document(f) to be the document that corresponds to file f. We introduce the data set FILES,
which stores the names of files whose documents have already been added.
C is an implicit precondition for all operations. We do not include it in the checks clause since
it is not explicitly checked. Instead:
• the operations are designed so that C holds upon termination of every operation.
The engine has a private file that contains the list of uninteresting words.
Static Operations
startEngine()
effects: Starts the engine running with NK containing the words in the private file.
All other sets are empty.
Dynamic Operations
query(String w)
checks: w ∉ NK ∧ WORD(w)
effects
English: sets Key = {w} and makes Match contain the documents that match w,
ordered as required. Clears CurMatch.
Formal: C ∧ Keypost = {w} ∧ CurMatchpost = ∅
queryMore(String w)
checks: Key ≠ ∅ ∧ w ∉ NK ∧ WORD(w) ∧ w ∉ Key
effects
English: Adds w to Key and makes Match be the documents already in Match that
additionally match w. Orders Match properly. Clears CurMatch.
Formal: C ∧ Keypost = Keypre ∪ {w} ∧ CurMatchpost = ∅
makeCurrent(String t)
checks: t ∈ Title
effects:
English: Makes Cur contain the document with title t.
Formal: C ∧ Curpost = {d : d.title = t}.
makeCurMatch(String i)
checks: i represents a natural number that is an index in Match,
i.e., 0 ≤ i ≤ |Match| − 1.
effects
English: Makes CurMatch contain the i’th entry in Match.
Formal: C ∧ CurMatchpost = {m : m ∈ Matchpre ∧ m.ind = i}.
addDocFromFile(String f )
checks: f names a file in the current directory that is not in FILES
effects
English: Adds f to FILES and the document in file f to Doc.
If Key is nonempty and the document matches the keywords, then
adds the document to Match and clears CurMatch.
Formal: C ∧ FILESpost = FILESpre ∪ {f} ∧ d = document(f) ∧
matches(d) ⇒ Matchpost = Matchpre ∪ {m} where m.doc = d
Doc findDoc(String t)
effects: If t ∉ Title throws NotPossibleException,
else returns the document with title t, i.e., document d such that d.title = t.
Preserves C.
Chapter 13
Extended Example: A Student Information System
13.1 Operations
We will present the operations first, introducing data sets, basic relations, and derived relations,
as needed to express the operation specifications.
We have in mind that a class is a particular instance of a course, i.e., the course given in a
particular semester, and a particular section.
NEXT(sm) is a predicate which is true iff sm is the semester for which class registration is in
effect. Note that most of the time (but not always, e.g., during add/drop period), this will be
the “next” semester.
We also need the following basic relations:
id: gives the identity number of a student
semester: gives the semester that a class is held in
course: gives the course that a class corresponds to
prerequisite: gives the prerequisites for a course
registered: gives the classes that a student is registered in
hours: gives the number of credit hours of a course
And the following derived functions:
classesPassed(Student s) gives the classes that student s has passed
Dropping a class is done w.r.t. the “current” semester, as opposed to the “next” semester
for registering, so we define CURRENT(sm) to be a predicate which is true iff sm is the current
semester.
We also need the following derived function:
loadCur(Student s) ≜ (Σ cl : cl ∈ s.registered ∧ CURRENT(cl.semester) : cl.course.hours)
which gives the load of a student s in the current semester.
To compute a grade point average, we divide the total points (grades) by the total hours. Grades
are obtained from the student's transcript, which we now define to consist of a set of entries,
one per class completed. We also need to know the major of a course, to compute the major
average.
So, we add the following data sets:
Grade : Grade
Entries : Transcript entries
Major : Major
Transcript : Transcript
and the following basic relations:
trans: gives the transcript of a student
entries: gives the entries of a transcript
class: gives the class corresponding to a transcript entry
grade: gives the grade corresponding to a transcript entry
cmaj: gives the major of a course
smaj: gives the major of a student
Since GPA calculation is similar for all types of GPA (semester, major, and cumulative), we
abstract it by providing a single helper function:
avg(Student s, Set of Class CL) ≜
//returns the GPA of student s for the set of classes CL
points := (Σ cl : cl ∈ CL : grade(s, cl) ∗ cl.course.hours)
hours := (Σ cl : cl ∈ CL : cl.course.hours)
return points/hours
grade(Student s, Class cl) ≜
//returns the grade that student s obtained in class cl
let e ∈ s.trans.entries be such that e.class = cl
return e.grade
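For example, a student who obtained grade 80 in a 3-credit-hour class and grade 90 in a 4-credit-hour class has avg = (80·3 + 90·4)/(3 + 4) = 600/7 ≈ 85.7.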
We assume that student s has completed a class iff there exists an entry for that class in the
transcript of s:
classesCompleted(Student s) ≜ {cl | ∃ e ∈ s.trans.entries : e.class = cl}
gives the classes completed by student s.
We add the following derived functions, which are needed for the previous section.
classesPassed(Student s) ≜ {cl | ∃ e ∈ s.trans.entries : e.class = cl ∧ e.grade ≥ 60}
gives the classes that student s has passed
coursesPassed(Student s) ≜ {c | (∃ cl ∈ classesPassed(s) : cl.course = c)}
gives the courses that student s has passed
Now, all we have to do is write helper functions to select the needed sets of classes: those com-
pleted in a particular semester, those completed for a major course, and all classes completed.
We do this in the following subsections.
Semester GPA
Major GPA
classesCompletedMaj(Student s) ≜
{cl | cl ∈ classesCompleted(s) ∧ cl.course.cmaj = s.smaj}
majGPA(Student s) ≜ avg(s, classesCompletedMaj(s))
operation checkMajGPA(IdNum n)
checks
English: n is the id number of a student s
Formal: n. ∼ id ∈ Student
effects
English: returns major average of student s
Formal: s := n. ∼ id;
return majGPA(s)
Cumulative GPA
operation checkGPA(IdNum n)
checks
English: n is the id number of a student s
Formal: n. ∼ id ∈ Student
effects
English: returns cumulative average of student s
Formal: s := n. ∼ id;
return GPA(s)
I will give an operation for checking the major courses needed for graduation. Courses in other
categories (e.g., humanities) can be handled in a similar manner.
We need the derived function
majorCoursesPassed(Student s) ≜ {c | c ∈ coursesPassed(s) ∧ c.cmaj = s.smaj}
which gives the set of major courses that student s has passed.
operation checkMajNeeded(IdNum n)
checks
English: n is the id number of a student s
Formal: n. ∼ id ∈ Student
effects
English: returns number of major courses that student s must still pass in order to graduate
Formal: s := n. ∼ id;
return |majCoursesReqd(s.smaj) − majorCoursesPassed(s)|
Here majCoursesReqd(s.smaj) is a function defined by the course catalogue that gives the set of
major courses required for each major. Note that we are ignoring the more complicated reality,
where some specific major courses are required, while others are electives.
operation checkProbation(IdNum n)
checks
English: n is the id number of a student s
Formal: n. ∼ id ∈ Student
effects
English: returns probation status of student s
Formal: x := |probationSemesters(n. ∼ id)|
if x = 0 return “no probation”
else if x = 1 return “probation I”
else if x = 2 return “probation II”
else if x = 3 return “dropped from the faculty”
operation checkDeansHonorList(IdNum n)
checks
English: n is the id number of a student s
Formal: n. ∼ id ∈ Student
effects
English: returns Deans honor list status of student s
Formal: let sm be the last semester that has completed (i.e., classes finished),
excluding summer semester;
s := n. ∼ id;
if semGPA(s, sm) ≥ 85 return “on honor list”
else return “not on honor list”
We now present all of the data sets, basic relations, and derived relations, in one place, and
give the data model graph.
Class : Class
CourseName : CourseName
Course : Course
Entries : Entries
Grade : Grade
IdNum : IdNum
Major : Major
Name : Name
Student : Student
Semester : Semester
Transcript : Transcript
Note that we added CourseName and Name . These will be needed for other functions, e.g.,
printing out a complete transcript.
In practice, you would draw the relevant parts of the graph as you introduce data sets and basic
relations.
Principle: try to write the operations first. First write the checks and effects clauses informally
in English. Then, attempt to formalize them. This will lead you to discover the formal
entities needed (data sets, basic relations, and derived relations) in order to express the checks
and effects clauses formally.
As you determine these entities, add them to the current data model graph, so that the graph
is built up gradually, as you write one operation after another.
semGPA(Student s, Semester sm) ≜
points := (Σ cl : cl ∈ classesCompletedSem(s, sm) : cl. ∼ class.grade)
hours := (Σ cl : cl ∈ classesCompletedSem(s, sm) : cl.course.hours)
return points/hours
majGPA(Student s) ≜
points := (Σ cl : cl ∈ classesCompletedMaj(s) : cl. ∼ class.grade)
hours := (Σ cl : cl ∈ classesCompletedMaj(s) : cl.course.hours)
return points/hours
GPA(Student s) ≜
points := (Σ cl : cl ∈ classesCompleted(s) : cl. ∼ class.grade)
hours := (Σ cl : cl ∈ classesCompleted(s) : cl.course.hours)
return points/hours
Chapter 14
Example Requirements Specification for a File System
Relations:
name: maps a DirEntry to a Name
contents: maps a DirEntry to an FSObject
first: selects the first name in a pathname
rest: selects the rest of a pathname, i.e., the pathname with the first name removed
We use the standard notation for relation application: x.R = Image_R(x) = {y | R(x, y)}. We
also write ∼ R for the inverse of a relation: ∼ R = {(y, x) | (x, y) ∈ R}.
[Figure 14.1: the file system data model graph. It shows the sets FSObject, Root, File, Dir, Cur, DirEntry, Name, PathName, and BitString, the relations parent (∼children), entries, contents, name, first, rest, pn, and insides, and their multiplicity annotations.]
Derived relations are defined in terms of the existing relations and sets. These do not change
the behavior given by the model, but just introduce more convenient terminology.
The most obvious relation in a file system is the one which tells us which directory another
directory is contained in. We call this the parent of the latter directory.
for d ∈ Dir: d.parent ≜ {d2 | d2 ∈ Dir ∧ (∃e : e ∈ DirEntry : e ∈ d2.entries ∧ d = e.contents)}
Initially we must define d.parent as a set, since it is unclear that there is only one parent for
each directory. This must follow from various constraints that we impose.
In the data model graph, contents has a multiplicity constraint of ? on its source end, meaning
that every FSObject is pointed to by at most one DirEntry. Also, entries has a multiplicity
constraint of ! on its source end, meaning that every DirEntry is an entry of exactly one direc-
tory. Hence, every FSObject can have at most one parent. We indicate this by a multiplicity
constraint of ? on the source end of the parent relation, which is therefore implied by the
definition of parent and the multiplicity constraints on the sources of contents and entries.
Textually, this constraint is:
∀ d : d ∈ Dir : |d.parent| ≤ 1 (P0)
Note that we give contents a multiplicity constraint of ? instead of ! on its source end only
to accommodate the Root, which must have no parent, and so cannot be pointed to by an entry.
Thus, the source multiplicity of ? instead of ! for parent is only to accommodate the root.
We will also need constraints stating that non-Root directories have at least one parent, since
the above can be satisfied by having 0 parents for all directories, obviously not what we want.
We give these below.
d.pn contains all pathnames that name a path starting from directory d
Note the recursive nature of this definition: a pathname from d is either (1) the name of a file
or directory in d, (p.rest = ∅) or (2) the name of a directory in d followed by a pathname from
that directory (p.rest ∈ e.contents.pn).
14.3 Constraints
Rough sketch: non-root objects have at least one parent, and the root has no parent.
Root.parent = ∅ (P1)
∀ d : d ∈ Dir − {Root} : |d.parent| ≥ 1 (P2)
Question: are these constraints sufficient to guarantee a “well-formed” file-system, i.e., one
where the parent relation is a single tree with root Root?
Answer: No
Rough sketch: a directory is not it’s own ancestor, where the ancestors of a directory are its
parent, the parent of its parent, etc.. So, we need to first define a helper function that computes
the ancestors of a directory. We can define ancestors as the transitive closure of parent:
ancestors(d : Dir)
if d = Root then ∅
else {d.parent} ∪ ancestors(d.parent)
Note the recursive nature of this definition: the ancestors of a non-Root FSObject are its parent
together with the ancestors of its parent.
Note also that this relation is well-defined even if the parent relation is cyclic, since transitive
closure is defined for any binary relation. In other words, the ancestors of d are all the directories
that can be reached by following parent “edges” starting from d. This is just graph reachability,
and is obviously well-defined.
As a recipe for computation, however, the above definition would not terminate when executed
over a cyclic parent relation, but it nevertheless defines a unique relation. A “bottom up”
evaluation of ancestors, one that detects the fixed point, would terminate. Formally, we can write
ancestors(d) = µZ . (d.parent ∪ Z.parent), where µ is the “least fixed-point” operator.
The point of this discussion is for you to realize that, in writing a specification, your relations
need only be well-defined; they need not prescribe a terminating computation.
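To make the bottom-up view concrete, here is a small Java sketch of ancestors as plain graph reachability; the Dir class with a single parent field is hypothetical (it assumes at most one parent per directory, as guaranteed by P0), and the sketch terminates even if the parent edges form a cycle:

import java.util.HashSet;
import java.util.Set;

class Dir {
  Dir parent; // null if this directory has no parent (i.e., it is the root)

  static Set<Dir> ancestors(Dir d) {
    // All directories reachable from d by following parent edges.
    // Each directory is added at most once, so the loop terminates even on a cycle.
    Set<Dir> result = new HashSet<Dir>();
    Dir cur = d.parent;
    while (cur != null && !result.contains(cur)) {
      result.add(cur);
      cur = cur.parent;
    }
    return result;
  }
}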
We can now formally state the acyclicity constraint:
∀ d : d ∈ Dir : d ∉ ancestors(d) (A)
Rough sketch: every directory except the root has the root as an ancestor
∀ d : d ∈ Dir : d ≠ Root ⇒ Root ∈ ancestors(d) (R)
Question: are the acyclicity and reachability constraints sufficient to guarantee a well-formed
file-system?
Answer: yes.
∀ f : f ∈ File :
∀ d1, d2 : d1, d2 ∈ Dir ∧ d1 ≠ d2 :
¬(∃ e1, e2 : e1, e2 ∈ DirEntry :
e1 ∈ d1.entries ∧ f = e1.contents ∧
e2 ∈ d2.entries ∧ f = e2.contents)
We notice the repetition of the phrase “e ∈ d.entries ∧ f = e.contents” three times in the
above constraints. Also, this phrase has a well-defined meaning: file f occurs in directory d.
This suggests that this phrase should be packaged as a helper definition:
occurs_in(f, d) ≜ (∃ e : e ∈ DirEntry : e ∈ d.entries ∧ f = e.contents)
∀ f : f ∈ File :
∀ d1, d2 : d1, d2 ∈ Dir ∧ d1 ≠ d2 :
¬(occurs_in(f, d1) ∧ occurs_in(f, d2))
This is a significant improvement on the first version of these two constraints, as it is much
shorter and much more readable. Even better is to combine both constraints into a single one,
which states that each file occurs in exactly one directory:
∀ f : f ∈ File : |{d : d ∈ Dir ∧ occurs_in(f, d)}| = 1
Finally, we extend the parent relation to files:
for f ∈ File : f.parent ≜ {d : d ∈ Dir ∧ occurs_in(f, d)}
By the previous constraint, this is a singleton set, and so the parent of a file is unique. We thus
indicate the parent relation in the data model graph (Figure 14.1) as going from FSObject to
Dir (the text indicates it going from Dir to Dir).
Rough sketch: a directory does not contain two entries with the same name.
∀ d : d ∈ Dir :
∀ e1, e2 : e1, e2 ∈ DirEntry ∧ e1, e2 ∈ d.entries ∧ e1 ≠ e2 :
e1.name ≠ e2.name
Notice that e1, e2 ∈ d.entries implies that e1 and e2 have type DirEntry, i.e., e1, e2 ∈
DirEntry. Hence we can shorten the above to:
∀ d : d ∈ Dir :
∀ e1, e2 : e1, e2 ∈ d.entries ∧ e1 ≠ e2 :
e1.name ≠ e2.name
Rough sketch: a directory does not contain two entries (with different names) for the same
FSObject. This constraint is implied by the multiplicity annotation on the source of entries.
However, it is sufficiently important that we restate it explicitly.
∀ d : d ∈ Dir :
∀ e1, e2 : e1, e2 ∈ d.entries ∧ e1 ≠ e2 :
e1.contents ≠ e2.contents
We must show that all global constraints hold before and after every operation execution. That
is, constraints are really system-wide invariants. We say that an operation O preserves a con-
straint C iff whenever the operation is executed with the constraint true initially, it terminates
with the constraint true, i.e., {C} O {C} is a valid Hoare triple. It is permissible for the con-
straint to be temporarily violated in the middle of the operation’s execution. To show that a
constraint C is an invariant, we show that (1) C holds initially, i.e., when the system is started,
and (2) every operation preserves C. Likewise, to show that several constraints C1 , . . . , Cn are
invariant, we show that (1) C1 ∧ · · · ∧ Cn holds initially, i.e., when the system is started, and
(2) every operation preserves C1 ∧ · · · ∧ Cn . That is, we treat them like one “large” constraint
which is their conjunction.
The work required to verify these conditions depends on the number and complexity of the
constraints. Thus we wish to have as few and as simple constraints as possible. Now suppose
that a new constraint C′ is implied by C1, . . . , Cn, i.e., C1 ∧ · · · ∧ Cn ⇒ C′. Then, if C1 ∧ · · · ∧ Cn
holds initially, then so does C1 ∧ · · · ∧ Cn ∧ C′. Also, if {C1 ∧ · · · ∧ Cn} O {C1 ∧ · · · ∧ Cn}, then
{C1 ∧ · · · ∧ Cn ∧ C′} O {C1 ∧ · · · ∧ Cn ∧ C′}. Hence, no extra work needs to be done for constraint
C′!
Thus, we partition the set of constraints into a set of basic constraints, and a set of implied
constraints, such that every implied constraint follows from the conjunction of the basic con-
straints. In general, there are several ways to do this partitioning. We seek to minimize the
number and complexity of the basic constraints, since this affects the amount of verification
work required.
Consider the constraints P0, P1, P2, A, and R given above. We can show
P0 ∧ P1 ∧ P2 ∧ A ⇒ R
P0 ∧ A ∧ R ⇒ P1 ∧ P2
We thus have a choice of basic constraint sets: {P0, P1, P2, A} and {P0, A, R}. The second
has fewer constraints, but the first has simpler ones, since R mentions the entire ancestors set
of a directory, whereas P1, P2 mention only the parent. Thus we choose {P0, P1, P2, A}.
Note however that P0 follows from the source multiplicity constraints of entries and contents.
So we will replace P0 by these constraints in the basic set.
All of the operations of a file system specification are interactive, i.e., they are invoked by a user
rather than by another program. Hence, all parameters must be strings since they are supplied
as input from the keyboard (see Section 12.2 of the text). Often, such string inputs must be in
a certain format. For our file system, we define restrictions on names of files and directories,
and on pathnames, as follows:
NAME(n): true iff n is a nonempty string of printable characters not containing /.
Note that a NAME (i.e., an n such that NAME(n) holds) does not include the case of /, i.e., it is a
regular name. Thus, Name is the set of regular names (strings for which NAME is true) together
with the root name /.
PATHNAME(n): true iff n is a nonempty sequence of proper NAME’s separated by / and beginning
with either / or with a NAME, and not ending with /.
A pathname starting with / is called absolute, otherwise it is called relative.
As usual, we give the precondition and the effect, for each operation. Operations are partitioned
into two categories: static operations, which are invoked when the application is not already
running, e.g., to create and initialize a new instance of the application, and dynamic operations,
that are invoked after the application is running.
We have one static operation:
operation start()
effects: Creates a new file system consisting of Root only, which is empty
The remaining operations are all dynamic, i.e., they are invoked within a particular file system.
We partition these operations into three sets: (1) operations within the current directory Cur,
(2) operations that take an absolute pathname as input, and (3) operations that take a relative
pathname as input.
We usually give the effects as a relation between the initial (before the operation executes)
and final (after the operation executes) values of the data in the data model graph. For any
data item x, we use xpre for the initial value, and xpost for the final value. For simple operations
like inserting and removing elements from sets, there is not much difference between this style
and using assignment statements to indicate the effects. However, for operations that involve
a complex series of steps, such as sorting an array, there is a significant difference. For array
sorting, “declarative” specification that relates initial and final values is the only reasonable
approach, since any “operational” specification (sequence of assignment statements) is in effect
a particular sorting algorithm, which is really an implementation rather than a specification,
since it is constrained to a particular method for sorting an array, of which there are many.
The first operation we define is for creating a new subdirectory within the current directory.
operation createDirInCur(String n)
checks: NAME(n) and there is a current directory c and c has no subdirectory with name n
effects: creates a new directory that is a subdirectory of c, and has name n and is empty
(∃e : e 6∈ DirEntrypre :
(∃d : d 6∈ Dirpre :
DirEntrypost = DirEntrypre ∪ {e} ∧
Dirpost = Dirpre ∪ {d} ∧
c.entriespost = c.entriespre ∪ {e} ∧
e.namepost = n ∧
e.contentspost = d ∧
d.entriespost = ∅
)
)
operation createDirInCur(String n)
checks
English: NAME(n) and there is a current directory c such that c contains no FSObject
with name n
Formal: NAME(n) ∧ Cur = {c} ∧ (∀ e : e ∈ DirEntry ∧ e ∈ c.entries : e.name ≠ n)
effects
English: creates a new directory that is a subdirectory of c, and has name n
Formal:
let e be a new entry not in DirEntrypre
let d be a new directory not in Dirpre
DirEntrypost = DirEntrypre ∪ {e} ∧
Dirpost = Dirpre ∪ {d} ∧
c.entriespost = c.entriespre ∪ {e} ∧
e.namepost = n ∧
e.contentspost = d ∧
d.entriespost = ∅
The next operation deletes an empty subdirectory from the current directory.
operation deleteDirInCur(String n)
checks
English: NAME(n) and there is a current directory c such that c has an empty subdirectory
with name n
Formal: NAME(n) ∧ Cur = {c} ∧
(∃ e : e ∈ c.entries : e.name = n ∧ e.contents.entries = ∅)
effects
English: removes the entry for the subdirectory with name n from its parent c
Formal:
let e be such that e ∈ c.entries ∧ e.name = n
let d = e.contents
c.entriespost = c.entriespre − {e} ∧
DirEntrypost = DirEntrypre − {e} ∧
Dirpost = Dirpre − {d}
We now present several operations that take absolute pathnames as input. The first such
operation is makeCurFromRoot(String p) which changes the current directory to one that is
given by the absolute pathname p. In addition to the format restriction, we must check that p
is a valid pathname in the filesystem, i.e., that it names a sequence of directories (starting from
Root) that actually exist in the filesystem. We also need a helper function that determines the
last directory in this sequence.
operation makeCurFromRoot(String p)
checks: p is an absolute pathname leading from Root to some directory d
effects: makes d the current directory
We first consider how to formalize the checks clause of makeCurFromRoot. Consider an ab-
solute path p = /n1 /n2 / . . . /n`−1 /n` . We need to check that p “leads from” Root to some
directory d, i.e., each name ni along p names a directory di that exists as a subdirectory of
the directory di−1 named by the previous name ni−1 . Note that we must be careful to distin-
guish between names, entries, and directories, as different relations apply to them, and they
have different properties. Being careful about this distinction (and refering to the data model
graph) will help us avoid writing undefined expressions, such as d.name (where d is a directory)
or n.parent (where n is a name). Also, remember that names are not globally unique, but
directories are (by definition).
Let us first define the “leads from” condition for the first pair of names along p, namely /
and n1. (Notice that the convention of using a leading / to denote the Root is awkward because
it makes the meaning of / context sensitive: the leading / denotes Root, but all the other /’s
are just separators between the successive names along the pathname.) We require that Root
contains some directory (call it d1) which has name n1. Using the data model sets and relations
that we have already defined, we see that we need to use a directory entry (call it e) that
associates directory d1 (i.e., e.contents = d1) with name n1 (i.e., e.name = n1), so that d1 has
name n1. Also, e must be in the entries of Root (e ∈ Root.entries) so that d1 is a subdirectory
of Root. Putting this together, the correct condition is:
∃ e : e ∈ DirEntry : e ∈ Root.entries ∧ e.contents = d1 ∧ e.name = n1
In a similar way, we define the “leads from” condition for an arbitrary pair of successive names
ni and ni+1 . We assume, “inductively,” that the directory di that is named by ni has been
determined. We then check:
∃ ei : ei ∈ DirEntry : ei ∈ di .entries ∧ ei .contents = di+1 ∧ ei .name = ni+1
Note that ei is uniquely determined by di and ni+1 , since the naming constraint allows at most
one entry in a directory with a given name. Now the next directory di+1 in the sequence is
determined as di+1 = ei .contents. The base case of the induction is of course given by Root
and n1 , i.e., consider d0 = Root.
So, here is the definition of the condition that an absolute pathname is valid.
Definition (valid absolute pathname, absolute path valid(p), dir from root(p)).
A string p is a valid absolute pathname iff the following all hold:
1. PATHNAME(p)
2. p.first = /
3. Let p = /n1/n2/ . . . /nℓ. Then there exist directories d0, d1, . . . , dℓ such that
(a) d0 = Root
(b) ∀ i : 0 ≤ i ≤ ℓ − 1 :
∃ ei : ei ∈ DirEntry : ei ∈ di.entries ∧ ei.name = ni+1 ∧ ei.contents = di+1
we define the predicate absolute path valid(p) to hold iff p is a valid absolute pathname.
We also define the function dir from root(String p) to return dℓ when
absolute path valid(p) holds, and to be undefined otherwise. We say that p deter-
mines dℓ.
Notice that dir from root(p) is not defined for all values of the string p, but only for those
such that absolute path valid(p) holds. That is, dir from root(p) is a partial function.
Using partial functions is perfectly fine, and often necessary, as long as you are careful to apply
them only to arguments for which they are defined. In the definition of operations, this is
ensured by using the appropriate precondition (checks or requires clause).
We can now give the full definition of makeCurFromRoot.
operation makeCurFromRoot(String p)
checks
English: p is a valid absolute pathname
Formal: absolute path valid(p)
effects
English: sets the current directory to the directory that p determines
Formal: Curpost = dir from root(p)
The next operation deletes an empty subdirectory which is given by an absolute pathname.
operation deleteDirFromRoot(String p)
checks:
English: p is a valid absolute pathname that determines an empty directory d
Formal: absolute path valid(p) ∧ dir from root(p).entries = ∅
effects:
English: removes the entry for d from its parent
Formal:
let d = dir from root(p)
let e ∈ DirEntry be such that e.contents = d
let d′ = e. ∼ entries
d′.entriespost = d′.entriespre − {e} ∧
DirEntrypost = DirEntrypre − {e} ∧
Dirpost = Dirpre − {d}
Note the use of relation inverse in e. ∼ entries to indicate the directory d′ that contains
directory entry e. Alternatively, we could have used d′ = d.parent.
The next operation creates a subdirectory with name n inside a directory that is given by an
absolute pathname. The subdirectory is initially empty.
Definition (valid relative pathname, relative path valid(p), dir from cur(String p)).
A string p is a valid relative pathname iff the following conditions all hold:
1. PATHNAME(p)
2. Cur consists of a single directory c (i.e., is nonempty) and p.first = / iff c = Root.
3. Let p = n1/n2/ . . . /nℓ or p = /n1/n2/ . . . /nℓ, as the case may be. Then there exist
directories d0, d1, . . . , dℓ such that
(a) d0 = c
(b) ∀ i : 0 ≤ i ≤ ℓ − 1 :
∃ ei : ei ∈ DirEntry : ei ∈ di.entries ∧ ei.name = ni+1 ∧ ei.contents = di+1
We define the predicate relative path valid(p) to hold iff p is a valid relative pathname.
We also define the function dir from cur(String p) to return dℓ when relative path valid(p)
holds, and to be undefined otherwise. We say that p determines dℓ.
The next operation is makeCurFromCur(String p) which changes the current directory to one
that is given by a relative pathname.
operation makeCurFromCur(String p)
checks
English: p is a valid relative pathname
Formal: relative path valid(p)
effects
English: sets the current directory to the directory that p determines
Formal: Curpost = dir from cur(p)
The next operation deletes an empty subdirectory which is given by a relative pathname.
operation deleteDirFromCur(String p)
checks:
English: p is a valid relative pathname that determines an empty directory d
Formal: relative path valid(p) ∧ dir from cur(p).entries = ∅
effects:
English: removes the entry for d from its parent
Formal:
let d = dir from cur(p)
let e ∈ DirEntrypre be such that e.contents = d
let d′ = e. ∼ entries
d′.entriespost = d′.entriespre − {e} ∧
DirEntrypost = DirEntrypre − {e} ∧
Dirpost = Dirpre − {d}
The next operation creates a subdirectory with name n inside a directory that is given by a
relative pathname. The subdirectory is initially empty.
We now notice that we can simplify the specification considerably by having a single operation
handle both absolute and relative pathnames. The starting point for this is to define the notion
of a valid pathname, which can be either absolute or relative.
Definition (valid pathname, path valid(p)). A string p is a valid pathname iff either p
is a valid absolute pathname or p is a valid relative pathname. We define the predicate
path valid(String p) ≜ absolute path valid(p) ∨ relative path valid(p).
Definition (dir from path(p)). We define the helper function dir from path(String p) as
follows:
Note that if absolute path valid(p) and relative path valid(p) both hold, then
dir from root(p) = dir from cur(p). If d = dir from path(p), we say that p determines
d.
The next operation is makeCur(String p) which changes the current directory to one that is
given by a (relative or absolute) pathname.
operation makeCur(String p)
checks
English: p is a valid pathname
Formal: path valid(p)
effects
English: sets the current directory to the directory that p determines
Formal: Curpost = dir from path(p)
operation deleteDir(String p)
checks:
English: p is a valid pathname that determines an empty directory d
Formal: path valid(p) ∧ dir from path(p).entries = ∅
effects:
English: removes the entry for d from its parent
Formal:
let d = dir from path(p)
let e ∈ DirEntrypre be such that e.contents = d
let d′ = e. ∼ entries
d′.entriespost = d′.entriespre − {e} ∧
DirEntrypost = DirEntrypre − {e} ∧
Dirpost = Dirpre − {d}
The next operation creates a subdirectory with name n inside a directory that is given by a
pathname. The subdirectory is initially empty.
So far we have not dealt with moving or copying file system objects.
The following operation moves an object from the current directory to the directory d given by
a pathname (either absolute or relative).
Finally, we wish to specify recursive copy: copy a directory, including all of its children files and
(recursively) all of its children subdirectories.
Consider first the base case, namely files. What does it mean for a file f′ to be a copy of
another file f? It means that they have the same name, and the same “contents”, but occur
in different directories. We introduce the relation insides to refer to the “contents” of a file,
i.e., what appears when the file is viewed. insides maps a file f to a bitstring, which gives
what is “inside” the file. Note that we would usually use contents for this, but that term is
already used here with a different meaning. We have included insides in the data model graph
in Figure 14.1, and also the Data Set BitString, for Bitstrings.
We now formalize the notion of “copy of a file”, as a relation fcopy(f, f 0 : File):
fcopy(f, f 0 : File) ,
f 6= f 0 ∧ f. ∼ contents.name = f 0 . ∼ contents.name ∧ f.insides = f 0 .insides
So, fcopy(f, f′) holds when f and f′ are different objects with the same name and the same
insides. By the naming constraint, f and f′ must have different parent directories, since a
directory cannot contain two or more objects with the same name. Note the use of f.∼contents
to obtain the DirEntry that points to f . This works because of the constraint in Section 14.3.4:
every file occurs in exactly one directory, and so is pointed to by exactly one DirEntry. Similarly,
the constraints in Sections 14.3.2 and 14.3.3 imply that every directory (except Root) occurs
in exactly one directory, and so is pointed to by exactly one DirEntry. Thus, we can give a
multiplicity indicator of ? for the source of contents (! for all objects except Root, and 0 for
Root), as shown in Figure 14.1. The textbook gives a constraint of *.
Using fcopy, we can define dcopy(d, d′ : Dir) which holds when d′ is a copy of d:
dcopy(d, d′ : Dir) ≜
d ≠ d′ ∧
d.∼contents.name = d′.∼contents.name ∧
(∀ f : f ∈ File ∧ f ∈ d.children : (∃ f′ : f′ ∈ File ∧ f′ ∈ d′.children : fcopy(f, f′))) ∧
(∀ f′ : f′ ∈ File ∧ f′ ∈ d′.children : (∃ f : f ∈ File ∧ f ∈ d.children : fcopy(f, f′))) ∧
(∀ d1 : d1 ∈ Dir ∧ d1 ∈ d.children : (∃ d1′ : d1′ ∈ Dir ∧ d1′ ∈ d′.children : dcopy(d1, d1′))) ∧
(∀ d1′ : d1′ ∈ Dir ∧ d1′ ∈ d′.children : (∃ d1 : d1 ∈ Dir ∧ d1 ∈ d.children : dcopy(d1, d1′)))
where children ≜ ∼parent. So, dcopy(d, d′) holds when d and d′ are different objects with
the same name and every file in d has a copy in d′ (and vice-versa) and, recursively, every
subdirectory in d has a copy in d′ and vice-versa.
Using the above, we define ocopy(o, o′ : FSObject) which defines copying for FSObjects in
general, i.e., either files or directories.
ocopy(o, o′ : FSObject) ≜ (o, o′ ∈ File ∧ fcopy(o, o′)) ∨ (o, o′ ∈ Dir ∧ dcopy(o, o′))
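To see the recursion in fcopy and dcopy concretely, the following is a minimal Java sketch of the two relations. The classes FileObj and DirObj and their fields are hypothetical stand-ins for the data model; the sketch only illustrates the definitions above and is not part of the specification.

import java.util.Vector;

// Hedged sketch: hypothetical minimal classes mirroring the data model,
// used only to illustrate the fcopy/dcopy definitions above.
class FileObj { String name; String insides; }
class DirObj  { String name; Vector<FileObj> childFiles = new Vector<>(); Vector<DirObj> childDirs = new Vector<>(); }

class CopyCheck {
    // fcopy(f, f'): different objects, same name, same insides
    static boolean fcopy(FileObj f, FileObj g) {
        return f != g && f.name.equals(g.name) && f.insides.equals(g.insides);
    }

    // every file in d has a copy among the files in e
    static boolean filesCovered(Vector<FileObj> d, Vector<FileObj> e) {
        for (FileObj f : d) {
            boolean found = false;
            for (FileObj g : e) if (fcopy(f, g)) { found = true; break; }
            if (!found) return false;
        }
        return true;
    }

    // every subdirectory in d has a copy among the subdirectories in e
    static boolean dirsCovered(Vector<DirObj> d, Vector<DirObj> e) {
        for (DirObj c : d) {
            boolean found = false;
            for (DirObj x : e) if (dcopy(c, x)) { found = true; break; }
            if (!found) return false;
        }
        return true;
    }

    // dcopy(d, d'): different objects, same name, and the files and subdirectories
    // are covered in both directions; the recursion mirrors the definition above
    static boolean dcopy(DirObj d, DirObj e) {
        return d != e && d.name.equals(e.name)
            && filesCovered(d.childFiles, e.childFiles)
            && filesCovered(e.childFiles, d.childFiles)
            && dirsCovered(d.childDirs, e.childDirs)
            && dirsCovered(e.childDirs, d.childDirs);
    }
}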
We can now specify the copyObj operation, where descendants ≜ ∼ancestors:
Chapter 15
Design
15.1 Overview
By the structure of a program, we mean: (1) the set of modules (classes, data abstractions,
procedures) of the program, and (2) information about how the modules are connected to each
other.
Goals of design:
• To define the structure of a program that satisfies the specification and is reasonably
efficient. The program, together with the underlying system software (OS, compilers, etc)
and hardware, constitutes the machine.
1. Start with the specification. For each operation given in the specification, “implement”
the operation by providing a procedure (i.e., a method) for it. The user then invokes this
procedure. The requires and effects clauses of this procedure are those given by the
specification of the operation.
The specification operations define an initial set of (procedural) abstractions that must
be implemented. Starting with this initial set, the design proceeds as follows.
2. Repeatedly pick a target abstraction and “design” it, i.e., do the following:
(a) Identify helping abstractions, or helpers. These are auxiliary modules that are useful
in implementing the target.
Often during design, we choose between different alternatives for the structure. An incorrect
choice means that we have to “back up” to an earlier stage of design to correct the mistake,
and then work forwards from that point.
• The introductory section: this lists all the abstractions introduced so far, and gives the
relationships between them, as a module dependency diagram.
• An entry for each abstraction, containing the following four parts:
1. A Module Specification giving the functional behavior:
(a) For a procedural abstraction, this is a precondition and postcondition.
(b) For a data abstraction, this is a precondition/postcondition for each method,
stated in terms of the abstraction function and the method's parameters. We do
not need the definition of the abstraction function; we just rely on the value of
the abstraction function being in the abstract domain, e.g., AF_post = AF_pre ∪ {x}
for the insert method of IntSet.
2. Description of performance constraints, e.g., running time and space, as discussed in
CMPS 212 and 256.
3. Information about how the abstraction will be implemented:
(a) For a procedural abstraction, this is an implementation sketch.
(b) For a data abstraction, this consists of:
– Definitions of the abstraction function and representation invariant.
– An implementation sketch for each method.
The implementation sketches can (and often will) refer to helper abstractions.
4. Miscellaneous information.
The initial abstractions are identified using the requirements specification (see Section 15.3)
and listed. As helper abstractions are added, these are added to the list.
The module dependency diagram is a directed graph as follows:
• The nodes are abstractions, i.e., procedural and data abstractions. Procedural abstrac-
tions are implemented by methods in some class. A data abstraction is implemented as a
class by itself.
– Using arc: goes from a source abstraction to a helper abstraction. Indicates that the
implementation of the source uses the helper. Drawn with an open head.
– Extension arc: indicates a subtype relationship, which is defined only between data
abstractions. Goes from a data abstraction that is a subtype to the data abstraction
that is its supertype. Drawn with a closed head.
So, we use the operations in the specification to guide us as to what modules to introduce
initially. Modules that we introduce later are helpers.
If we did a good job in requirements analysis, the specification will reflect the structure of the
problem. Thus, we let the problem structure determine the program structure. This is a major
design principle.
Next, we construct an initial module dependency diagram and enter it into the design notebook.
A good way to lay out the diagram is to place a using module above the modules it uses.
Next, we choose a target abstraction and invent helpers for it, etc. We repeat until we have
sketched an implementation for every module. The sketch can be in English (as a list of steps)
or in pseudocode.
To identify the helpers needed to implement a target abstraction, list the (sub)tasks that the
implementation must accomplish (the order only needs to be approximately correct at this
stage). Use the list to guide the introduction of helper abstractions. The main criterion for this
is: seek to hide details of processing that are not at the current level of the design.
For each helper data abstraction introduced, we specify the operations within it that are needed
by the target whose implementation we are designing. If later targets “reuse” this helper, then
we may add operations to the helper which are useful to the later target’s implementation.
To summarize:
Key issue: what detail is appropriate at each level? Answer: largely a matter of judgment.
Each module’s implementation should not be “too large”, or it will be difficult to understand,
maintain, and modify. The different parts of the implementation (at each level) should all be at
roughly the same “level of detail.” Finally, each module should have only a single, well-defined
purpose.
The “higher level” modules should be concerned primarily with organizing the computation,
while the “lower level” modules deal with the details of manipulating the actual data, and
performing small, well-understood steps, such as sorting, searching, inserting, deleting, etc.
15.3.3 Continuing the design: how to select the next target for design
First, we identify all the candidate abstractions: those whose specification is complete but which
have not yet been designed.
The specification of an abstraction is complete when we are (reasonably) sure that no more
operations need to be added to the abstraction.
Second, we choose among the candidate abstractions according to the following criteria:
• We are uncertain about how to implement an abstraction. Designing this abstraction will
often reveal problems, e.g., it may be impossible to implement the abstraction efficiently
with the given hardware and software resources. The sooner these are caught, the better.
• An abstraction may be very central to the overall design, and so investigating it will give
insight into the overall design, which will help catch design errors.
• We may wish to finish designing one part of the system. We could then even start
implementing this part, if we are reasonably sure that the design of this part will not be
changed due to problems with the design of other parts.
Chapter 16
Example Design for a Web Search Engine
• The operations now return the match results (to the UI) instead of updating the Match
data structure.
• Preconditions are handled by throwing an exception, i.e., evaluate the precondition and
throw an exception if it is false.
• The makeCurrent(t) and makeCurMatch(i) operations are omitted, since they can be han-
dled by UI. Operation findDoc(t) is added, to help the UI in implementing makeCurrent(t).
Figure 16.1 and Textbook Figure 13.3 give the module specification for Engine. There is a
single constructor, which initializes the search engine. We continue to use the data model of
Section 12.4, and retain its terminology. WORD(w) is a predicate that is true iff the string w
consists entirely of alphabetic characters. The instance methods are:
As indicated, the first three methods return a result that is a Query, i.e., the result of matching
keywords against the collection of documents. The last method returns a document (Doc).
These are major data abstractions. Thus, we introduce helper modules to define and implement
them:
We start the design by sketching the module dependency diagram (MDD). Note that Query
uses Doc, and so we have the MDD given in Figure 16.2 and Textbook Figure 13.5.
Figures 16.3, 16.4 and Textbook Figure 13.4 give an initial specification for the helper data
abstractions Doc and Query. Note that these contain only some getter methods, as the other
functionality required from these abstractions is unclear at this point. It will become clear as we
implement the methods of Engine and other helpers.
Next, we sketch implementations of the instance methods of Engine.
Principle: do not introduce a procedure for each task. Instead, look for abstractions, especially
data abstractions, to take care of and hide details.
What is appropriate detail at each level? There are several answers:
• It is a matter of judgment.
• Modules should not be too large. A data abstraction should be a few pages. A single
method should be about one page.
A list of the tasks that the implementation of queryFirst(w) must carry out is (see also p.
316):
class Engine {
overview: An engine has a state as described in the search engine data model.
The methods throw the NotPossibleException when there is a problem;
the exception contains a string explaining the problem.
All instance methods modify the state of this.
constructors
Engine () throws NotPossibleException
effects: If the uninteresting words cannot be read from the private file
throws NotPossibleException else creates NK and initializes the
application state appropriately.
methods
Query queryFirst(String w) throws NotPossibleException
effects: If ¬WORD(w) or w ∈ NK throws NotPossibleException else
sets Key = {w}, performs the query, and returns the result.
Query queryMore(String w) throws NotPossibleException
effects: If ¬WORD(w) or w ∈ NK or Key = ∅ or w ∈ Key throws
NotPossibleException else adds w to Key and returns the query result.
Query addDocFromFile(String f ) throws NotPossibleException
effects: If f is not the name of a file in the current directory that contains a document,
or f ∈ FILES, throws NotPossibleException, else adds the document in file f to Doc.
If Key is nonempty and the document matches the keywords, then
adds the document to Match and clears CurMatch.
Doc findDoc(String t) throws NotPossibleException
effects: If t ∉ Title throws NotPossibleException
else returns the document with title t.
}
[Figure 16.2: initial module dependency diagram, with nodes Engine (version 1), Query (version 1), and Doc (version 1).]
class Doc {
overview: A document contains a title and a text body.
methods
String title()
effects: Returns the title of this.
String body()
effects: Returns the body of this.
}
class Query {
overview: provides information about the keywords of a query and the documents
that match those keywords. size returns the number of matches.
Documents can be accessed using indexes between 0 and size-1 inclusive.
Documents are ordered by the number of matches they contain,
with document 0 containing the most matches.
methods
String[ ] keys()
effects: Returns the keywords of this.
int size()
effects: Returns a count of the documents that match the query.
Doc fetch(int i) throws IndexOutOfBoundsException
effects: If 0 ≤ i < size returns the i'th matching document else
throws IndexOutOfBoundsException
}
Now consider how to implement each task. In particular, what abstractions can we introduce
to help carry out tasks while hiding the details that are not appropriate at the current level?
We do not have to consider these tasks in order. Tasks 1–3 are straightforward. We consider
task 4 first.
number of documents, which could be large. Neither should it be linear in the size of the
documents. These are reasonable requirements, since there exists a well-known data structure
that provides constant-time lookup, provided there is enough memory, namely a hash table.
Since main memory is cheap, it is reasonable to assume that our hash table will be large enough
for constant (expected) time lookup. Note that we should not worry about the full details of
implementing search at this stage, but some preliminary considerations, to check that we will
be able to implement with the required efficiency, are worthwhile.
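To make this concrete, here is a minimal sketch of the inverted-index idea behind ITable, assuming java.util.Hashtable; the class and field names are ours, and DocCnt (a document paired with an occurrence count) anticipates the name used later in the design.

import java.util.Hashtable;
import java.util.Vector;

// Hedged sketch of the inverted-index idea behind ITable: map each interesting
// word to the documents containing it, together with an occurrence count.
// Doc is the document abstraction specified in this design.
class DocCnt {
    final Doc doc;
    final int cnt;
    DocCnt(Doc doc, int cnt) { this.doc = doc; this.cnt = cnt; }
}

class ITable {
    private final Hashtable<String, Vector<DocCnt>> index = new Hashtable<>();

    // record that word w occurs cnt times in document d
    void add(String w, Doc d, int cnt) {
        index.computeIfAbsent(w, k -> new Vector<>()).add(new DocCnt(d, cnt));
    }

    // constant (expected) time lookup of the documents that contain w
    Vector<DocCnt> lookup(String w) {
        return index.getOrDefault(w, new Vector<>());
    }
}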
We add information to ITable when a document is added. This is better than adding the
information when queryFirst(w) is first executed (why?). Now, queryFirst(w) just looks at
ITable.
Task 5: Just count the number of occurrences while you are searching the document
Task 6: Sort the matches by count. To hide the details of sorting, etc. we introduce a data
abstraction: MatchSet
Have MatchSet handle tasks 3–6, since they are “tightly coupled.”
Task 2: check if w is an interesting word. We need a data structure to store all the uninteresting
words (those in NK) and to compare them with w. We already have a data structure (ITable)
for comparing w with the interesting words in documents. We can reuse ITable to compare w
with words in NK. Hence we rename ITable to WordTable to better describe its new function
(handling both interesting and uninteresting words).
Task 1: check that input w is a word. We simply check that w is a sequence of alphabetic
characters, possibly including hyphens.
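As a small illustration, one possible reading of this check as a Java helper (the exact rule, e.g., the treatment of hyphens, is a design choice):

// Hedged sketch of the WORD(w) check of Task 1: a nonempty run of alphabetic
// characters, with hyphens allowed between alphabetic runs.
static boolean isWord(String w) {
    return w != null && w.matches("[A-Za-z]+(-[A-Za-z]+)*");
}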
So, queryFirst(w) can be implemented simply by calling:
a method of WordTable
a method of MatchSet
Now, we only look at documents that matched previous keywords. In more detail:
We have keywords Key = {w0 , w1 , . . . , wk−1 } and matching documents (MatchSet) in order
d0 , d1 , . . . , dn−1 .
We wish to add the next keyword wk to the query.
Our choices are:
1. Do a separate match on wk , i.e., get a second MatchSet just for wk and intersect it with
the existing MatchSet d0 , d1 , . . . , dn−1 .
Both approaches are correct, i.e., satisfy the specification of queryMore(w). (2) may give
better performance than (1), since e.g., it becomes possible to do the merge and intersection
incrementally using hash tables. Hence we pick (2). Since it will be implemented in MatchSet,
we defer the implementation sketch until the discussion of MatchSet.
We cannot use WordTable, since it will match t with the body of documents. So, we introduce
a new abstraction to deal with titles: TitleTable. As with WordTable, we add information to
TitleTable when a document is added to the collection.
16.4 Next major step: document and specify all the abstractions introduced so far
• WordTable: see Figure 16.6 and Textbook Figure 13.7. The constructor handles uninter-
esting words. Provides methods isInteresting(w): check if w is an interesting word, and
addDoc(d): add document d to WordTable. The module specification for WordTable is
incomplete. We will add more methods later, as needed.
• TitleTable: see Figure 16.7 and Textbook Figure 13.7. Stores documents and their titles.
Provides methods addDoc(d): add document d to TitleTable (checks for duplicate titles),
and lookup(t): find the document with title t.
• MatchSet: The methods of MatchSet construct queries, and the methods of Query return
information about already constructed queries. Since these are very tightly coupled, it
makes sense to merge these two abstractions. Call the resulting abstraction Query also.
The specification is given in Figure 16.8 and Textbook Figure 13.8. New queries are
handled by the constructor. Uninteresting keywords generate empty matches. There are
no performance constraints, but we want fast lookup, so use hash tables.
• Doc: The second version of the specification is given in Figure 16.9 and Textbook Figure
13.8. The constructor converts a string to Doc and determines that title and body are
present.
The resulting MDD is given in Figure 16.5 and textbook Figure 13.9.
[Figure 16.5: module dependency diagram, with nodes Engine (version 1), TitleTable (version 1), Query (version 2), WordTable (version 1), and Doc (version 2).]
Candidates: those abstractions suitable as the next target (not necessarily all abstractions)
Engine is not a candidate since it has already been designed
Doc and WordTable are not candidates since modules that use them have not been designed,
and so their specifications may be incomplete, e.g., they may need more methods
Hence the candidates are: getDocs, TitleTable, Query
How to choose? Guidelines are:
Issues:
class WordTable {
overview: Keeps track of both interesting and uninteresting words. The uninteresting
words are obtained from a private file.
Records the number of times each interesting word occurs in each document.
constructors
WordTable () throws NotPossibleException
effects: If the private file cannot be read throws NotPossibleException else initializes
the table to contain all the words in the file as uninteresting words (NK).
methods
boolean isInteresting(String w)
effects: If w is null or ¬WORD(w) or w ∈ NK
returns false else returns true
helps: Engine.queryFirst(w), Engine.queryMore(w)
void addDoc(Doc d)
requires: d is not null
modifies: this
effects: Adds all the interesting words of d to this with a count of their
number of occurrences.
helps: Engine.addDocFromFile(f )
}
class TitleTable {
overview: Keeps track of documents with their titles.
constructors
TitleTable ()
effects: Initializes this to be an empty table.
methods
void addDoc(Doc d) throws DuplicateException
requires: d is not null
modifies: this
effects: If a document with d’s title is already in this throws
DuplicateException else adds d with its title to this.
helps: Engine.addDocFromFile(f )
Doc lookup(String t) throws NotPossibleException
effects: If t is null or there is no document with title t in this
throws NotPossibleException else returns the document with title t
helps: Engine.findDoc(t)
}
class Query {
overview: provides information about the keywords of a query and the documents
that match those keywords. size returns the number of matches.
Documents can be accessed using indexes between 0 and size-1 inclusive.
Documents are ordered by the number of matches they contain,
with document 0 containing the most matches.
constructors
Query()
effects: Returns the empty query.
helps: Engine.addDocFromFile(f )
Query(WordTable wt, String w)
requires: wt and w are not null
effects: Makes a query for the single keyword w.
helps: Engine.queryFirst(w)
methods
void addKey(String w) throws NotPossibleException
requires: w is not null
modifies: this
effects: If this is empty or w ∈ Key throws NotPossibleException else
modifies this to contain the query for Key ∪ {w},
i.e., w plus the keywords already in the query
helps: Engine.queryMore(w)
void addDoc(Doc d)
requires: d is not null
modifies: this
effects: If this is not empty and d contains all the keywords of this
adds it to this as a query result else does nothing
helps: Engine.addDocFromFile(f )
String[ ] keys()
effects: Returns the keywords of this.
int size()
effects: Returns a count of the documents that match the query.
Doc fetch(int i) throws IndexOutOfBoundsException
effects: If 0 ≤ i < size returns the i'th matching document else
throws IndexOutOfBoundsException
}
class Doc {
overview: A document contains a title and a text body.
constructors
Doc(String d) throws NotPossibleException
effects: if d cannot be processed as a document throws NotPossibleException
else makes this be the Doc corresponding to d.
helps: Engine.addDocFromFile(f )
methods
String title()
effects: Returns the title of this.
String body()
effects: Returns the body of this.
}
In addDoc(Doc d) we need to get the title of d, to check for duplicate titles. We use the method
Doc.title().
Fast lookup: use a hash table. The key is a string (the title of a document). The value is a Doc.
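As a concrete illustration of this choice, a minimal sketch of the TitleTable representation, assuming the Doc, DuplicateException, and NotPossibleException abstractions specified above (we assume each exception takes an explanatory string):

import java.util.HashMap;
import java.util.Map;

// Hedged sketch of TitleTable's representation: a hash table keyed by document
// title, with the Doc as the value.
class TitleTable {
    private final Map<String, Doc> docs = new HashMap<>();

    // If a document with d's title is already in the table, throw; else add d.
    void addDoc(Doc d) throws DuplicateException {
        String t = d.title();                   // Doc.title() supplies the key
        if (docs.containsKey(t))
            throw new DuplicateException("duplicate title: " + t);
        docs.put(t, d);
    }

    // Return the document with title t, or throw if t is null or absent.
    Doc lookup(String t) throws NotPossibleException {
        Doc d = (t == null) ? null : docs.get(t);
        if (d == null)
            throw new NotPossibleException("no document with title: " + t);
        return d;
    }
}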
Task 1: Query(WordTable wt, String w), compute a new query with keyword w
We must:
Task 1 should be done by WordTable, which was introduced exactly for the purpose of searching
documents for keywords. So, we introduce a lookup(w) method of WordTable, which returns
the documents that match w.
For task 2, sort based on occurrence count: the number of documents could be large and varying
(as documents are added), hence we must be efficient. So, use a binary search tree. Note that
we could do task 2 within the lookup(w) method, but it is better for lookup(w) to return the
documents unsorted, and to sort them in another module (see p. 327).
Q: why? A: first, separation of concerns. Second, sorting in lookup(w) does not help with
queryMore(w).
4. Sort these documents by the total occurrence count of all keywords in the query.
For task 1, we could use lookup in WordTable, assuming d is processed first. But, this returns
a long list of documents, and we must then do a linear search to check if d is present. Also, we
have to repeat this for each new keyword. This is inefficient. A better solution is:
In the addDoc method of WordTable, generate a hashtable Table for the words of
document d only. Table maps each word to the number of times that it occurs in
d. Now simply look up each of w0 , w1 , . . . , wk−1 in Table. Table is an argument
to the addDoc method of Query. Hence, we change the specifications of Query and
WordTable.
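A hedged sketch of the revised addDoc is as follows; Doc.words(), isInteresting(w), and the canon(w) procedure introduced later are assumed from the design, and recording the counts in the collection-wide table is elided:

import java.util.Hashtable;
import java.util.Iterator;

// Hedged sketch of the revised WordTable.addDoc(d): build a table for the words
// of document d only, mapping each interesting word of d to its occurrence count
// in d. This table is the argument later passed to Query.addDoc.
Hashtable<String, Integer> addDoc(Doc d) {
    Hashtable<String, Integer> perDoc = new Hashtable<>();
    Iterator it = d.words();                     // yields the words of d in order
    while (it.hasNext()) {
        String w = canon((String) it.next());    // canonical form, e.g., lowercase
        if (!isInteresting(w)) continue;         // skip uninteresting words and nonwords
        perDoc.merge(w, 1, Integer::sum);        // count occurrences in d only
    }
    // ... also add these per-document counts to the collection-wide table ...
    return perDoc;
}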
Now consider observers, i.e., the fetch(i) method of Query. A search tree does not help. We
really need an array, so that we can index on i. Also, an array is easily sorted. So, replace the
search tree by an array (actually by a Java vector).
Implementation sketches for some of the methods of Query are given in Textbook Figure 13.10.
The revised specification for Query is given in Textbook Figure 13.11. In Figure 16.10 we show
the revised specification along with the implementation sketches.
Notice how the representation invariant is crucial for the correctness of the methods of Query,
and that these methods preserve the representation invariant and all the global constraints from
the data model.
Having done TitleTable and Query, only WordTable is left.
The revised specification for WordTable is given in Figure 16.11 and Textbook Figure 13.11.
addDoc(Doc d) adds the interesting words of d to WordTable, and also returns a hashtable H
for d only. H maps each word in d to its number of occurrences in d only. H is used in the
addDoc method of Query.
class Query {
overview: as before
WordTable k;
Vector matches; //Vector of DocCnt objects
String[ ] keys; //The keywords used in the current query
constructors
Query()
effects: Returns the empty query.
Query(WordTable wt, String w)
requires: wt and w are not null
effects: Makes a query for the single keyword w.
helps: Engine.queryFirst(w)
implementation sketch:
look up the key in the WordTable by invoking wt.lookup(w)
sort the matches using quickSort
methods
void addKey(String w) throws NotPossibleException
requires: w is not null
modifies: this
effects: If this is empty or w ∈ Key throws NotPossibleException else
modifies this to contain the query for Key ∪ {w},
i.e., w plus the keywords already in the query
helps: Engine.queryMore(w)
implementation sketch:
add w to keys
look up w in the WordTable
store the information about matches in a hash table
for each current match, look up the document in the hash table and
if it is there, store it in a vector
sort the vector using quickSort
void addDoc(Doc d, Hashtable h)
requires: d is not null and h maps strings (the interesting words in d)
to integers (the occurrence count of the word in d).
modifies: this
effects: If each keyword of this is in h, adds d to the matches of this.
helps: Engine.addDocFromFile(f )
implementation sketch:
use the argument h to get the number of occurrences of each keyword (∈ Key)
if the document d contains all the keywords, compute the total occurrence
count sum for all keywords and insert the ⟨d, sum⟩ pair in the vector of matches.
String[ ] keys()
effects: Returns the keywords of this, i.e., Key.
int size()
effects: Returns a count of the documents that match the query.
Doc fetch(int i) throws IndexOutOfBoundsException
effects: If 0 ≤ i < size returns the i'th matching document else
throws IndexOutOfBoundsException
}
class WordTable {
overview: Keeps track of both interesting and uninteresting words. The uninteresting
words are obtained from a private file.
Records the number of times each interesting word occurs in each document.
constructors
WordTable () throws NotPossibleException
effects: If the file cannot be read throws NotPossibleException else initializes
the table to contain all the words in the file as uninteresting words.
methods
boolean isInteresting(String w)
effects: If w is null or a nonword or an uninteresting word
returns false else returns true
helps: Engine.queryFirst(w), Engine.queryMore(w)
Hashtable addDoc(Doc d)
requires: d is not null
modifies: this
effects: Adds all the interesting words of d to this with a count of their
number of occurrences. Also returns a hashtable mapping each
interesting word in d to its number of occurrences.
helps: Engine.addDocFromFile(f )
Vector lookup(String k)
requires: k is not null.
effects: Returns a vector of DocCnt objects, where each DocCnt gives a document
Doc that contains k, together with Cnt, the occurrence count of k in Doc.
helps: Query.Query(wt, w)
}
class Doc {
overview: A document contains a title and a text body. Doc is immutable and
provides an iterator.
constructors
Doc(String d) throws NotPossibleException
effects: if d cannot be processed as a document throws NotPossibleException
else makes this be the Doc corresponding to d.
methods
String title()
effects: Returns the title of this.
String body()
effects: Returns the body of this.
Iterator words()
effects: Returns a generator that will yield all the words in the document
as strings in the order they appear in the text.
}
Words must be converted to canonical forms (e.g., all lowercase) so that word matching is
accurate. Use a single canon procedure, so that the definition of “canonical” can be easily
changed.
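For example, a minimal sketch of such a canon procedure, assuming that the canonical form is simply lowercase:

// Hedged sketch: a single place where the canonical form of a word is defined,
// so the definition of "canonical" can be changed easily. Here, canonical = lowercase.
static String canon(String w) {
    return w.toLowerCase();
}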
To finalize the design, we add implementation sketches to the methods of Engine, as shown in
Figure 16.13. The resulting MDD is given in Figure 16.14. We omit some details w.r.t. sorting,
which are provided in the textbook. Also, we omit the actual methods in Engine since the
implementation sketches will be assigned as homework. The method specifications remain the
same.
class Engine {
overview: An engine has a state as described in the search engine data model.
The methods throw the NotPossibleException when there is a problem;
the exception contains a string explaining the problem.
All instance methods modify the state of this.
WordTable wt;
TitleTable tt;
Query q;
String[ ] urls;
constructors
Engine () throws NotPossibleException
effects: If the uninteresting words cannot be read from the private file
throws NotPossibleException else creates NK and initializes the
application state appropriately.
implementation sketch:
wt := new WordTable()
tt := new TitleTable()
q := null
urls is initially empty
methods
Query queryFirst(String w) throws NotPossibleException
effects: If ¬WORD(w) or w ∈ NK throws NotPossibleException else
sets Key = {w}, performs the query, and returns the result.
implementation sketch:
q := new Query(wt, w)
return q
Query queryMore(String w) throws NotPossibleException
effects: If ¬WORD(w) or w ∈ NK or Key = ∅ or w ∈ Key throws
NotPossibleException else adds w to Key and returns the query result.
implementation sketch:
q.addKey(w)
return q
Query addDocFromFile(String f ) throws NotPossibleException
effects: If f is not the name of a file in the current directory that contains a document,
or f ∈ FILES, throws NotPossibleException, else adds the document in file f to Doc.
If Key is nonempty and the document matches the keywords, then
adds the document to Match and clears CurMatch.
implementation sketch:
add f to FILES
for the document d read from file f do
tt.addDoc(d)
h := wt.addDoc(d)
if q ≠ null then q.addDoc(d, h)
if q = null then q := new Query()
return q
Doc findDoc(String t) throws NotPossibleException
effects: If t ∉ Title throws NotPossibleException
else returns the document with title t.
implementation sketch:
return tt.lookup(t)
}
[Figure 16.14: final module dependency diagram, with nodes Engine (version 2), TitleTable (version 1), Query (version 3), WordTable (version 2), and Doc (version 3).]
Chapter 17
Example: Text Justification
17.1 Specification
We wish to specify, design, and write a program that reads a single paragraph of left-justified
English text from a text file “in.txt” and writes a fully justified (both left and right justified)
version to a text file "out.txt." The files in.txt and out.txt should be present in the same directory
as the program.
The fully justified output paragraph must satisfy the following requirements:
1. The non-whitespace text is not changed.
2. The output has a uniform line length of 80 characters, for all lines except possibly the
last. Include spaces in the count, but not the newline character.
3. The number of spaces between words on the same line are as uniform as possible.
4. A line should not contain more “extra” blanks than the length of the first word of the
following line.
4. Define the excess space in a line to be the sum of lengths of all interword spacings, minus
the number of interword spacings. For each line except the last, the excess space should
be less than the length of the first word in the following line.
To formalize the requires and effects clauses, we must introduce the appropriate technical vo-
cabulary. First, we need to state what constitutes a word, whitespace, a line, and a paragraph,
as these are the basic constructs. Then, we need to define some attributes of these constructs:
the words in a line, the lines in a paragraph, the length of lines in a paragraph, the
spacing of words in a line, etc.
We will follow an approach of defining predicates and functions to state the above. Often, the
definitions will be recursive, with a base case and a recursive ("inductive") case. This follows
the natural inductive structure that much data, including text paragraphs, has.
We introduce the following notation:
+ : the string concatenation operator
␣ : the space character
'\n' : the newline character
1. We do not permit leading blanks in a line. This makes it easier to state the condition
that the input be left-justified, since this condition is implicit in the definition of a line.
A more general solution would allow leading spaces and state the left-justification condi-
tion separately. This would allow, e.g., for paragraph indentation, but would make the
definitions more complex.
2. We include punctuation following a word as part of the word itself. (Q: why is this
3. line and para are recursively defined. The base case is a single word, not the empty string.
This is for two reasons:
• Allowing empty lines and paragraphs would make some function definitions return
more than one value, i.e., they would define a relation and not a function. This
would be very awkward. Fixing this would require repeatedly adding conjuncts like
w ≠ ε and ℓ ≠ ε (where ε denotes the empty string), etc., which would make our
definitions longer and more verbose. It is more concise to state the non-emptiness
requirement in one place: the definitions of word and line.
• Empty lines and paragraphs do not correspond to our intuition of what a single
paragraph looks like.
That is, the contents of in.txt are either empty or constitute a paragraph.
We now formalize the effects clause. We start with condition (1): The sequence of words of the
output is the same as that of the input. To formalize this, we need to recurse over the sequence
of words of the input and output (i.e., the sequence of words in a paragraph), and check that
they are equal. Let out be the output, considered as a single string.
We use standard terminology for sequences: hd for the first element, and tl for the remaining
elements. We now define these. Note that we use the predicates line and para as “types”.
hd(para : p) ≜ w such that
word(w) ∧ [p = w + '\n' ∨ (∃ x, p2 : (blank(x) ∨ x = '\n') ∧ para(p2) : p = w + x + p2)]
tl(para : p) ≜ p2 such that
para(p2) ∧ (∃ w, x : word(w) ∧ (blank(x) ∨ x = '\n') : p = w + x + p2)
The above two definitions ignore the line structure of a paragraph, since the line structure of
a paragraph is irrelevant here: we are only interested in the sequence of words that make up
the paragraph. Note that tl(p) is defined only when p consists of at least two words, and so we
must be careful to use tl(p) only for such p.
We formalize condition (1) of the effects clause as:
(1) the predicate sameWords(in, out), where
sameWords(para : p, p′) ≜
(∃ w : word(w) : p = p′ = w + '\n') ∨ // base case
(hd(p) = hd(p′) ∧ sameWords(tl(p), tl(p′))) // inductive case.
Conditions (2) and (3) concern individual lines, and so can be handled by a recursive definition
that follows the structure of para. The requirement in (2) that line length is exactly 80 can
be stated directly, since lines are strings, and string length is assumed to be defined. To state
the requirement in (3) of uniform line spacing, we need to extract the length of the interword
186 CHAPTER 17. EXAMPLE: TEXT JUSTIFICATION
spacings in a line. A nice way to do this is to define functions that return the lengths of the
smallest and largest interword spaces. Then uniformity means that the difference between these
is at most 1.
minSp(line : ℓ) ≜ c such that
(∃ w : word(w) : ℓ = w + '\n' ∧ c = +∞) ∨ // base case
(∃ w, x, ℓ2 : word(w) ∧ blank(x) ∧ line(ℓ2) : ℓ = w + x + ℓ2 ∧ c = min(|x|, minSp(ℓ2)))
// inductive case
The minimum interword spacing in a line ℓ is plus infinity if the line consists of a single word.
Otherwise, it is the minimum of the length of the first interword spacing, and the minimum
(found recursively) of the interword spacings in the rest of the line (ℓ2).
maxSp(line : ℓ) ≜ c such that
(∃ w : word(w) : ℓ = w + '\n' ∧ c = −∞) ∨ // base case
(∃ w, x, ℓ2 : word(w) ∧ blank(x) ∧ line(ℓ2) : ℓ = w + x + ℓ2 ∧ c = max(|x|, maxSp(ℓ2)))
// inductive case
The maximum interword spacing in a line ℓ is minus infinity if the line consists of a single word.
Otherwise, it is the maximum of the length of the first interword spacing, and the maximum
(found recursively) of the interword spacings in the rest of the line (ℓ2).
Note how these definitions follow the inductive structure of the definition of line. This makes
them easy to write correctly.
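To make these recursive definitions concrete, here is a minimal, non-recursive Java sketch that computes the minimum and maximum interword spacing of a line given as a string (no leading blanks, newline stripped); the method name is ours.

// Hedged sketch: compute the min and max interword spacing of a line of words
// separated by runs of blanks. Returns {min, max}; for a one-word line the
// result is {Integer.MAX_VALUE, Integer.MIN_VALUE}, mirroring +infinity/-infinity above.
static int[] spacing(String line) {
    int min = Integer.MAX_VALUE, max = Integer.MIN_VALUE;
    int i = 0;
    while (i < line.length()) {
        if (line.charAt(i) == ' ') {
            int start = i;
            while (i < line.length() && line.charAt(i) == ' ') i++;
            int len = i - start;            // length of this interword spacing
            min = Math.min(min, len);
            max = Math.max(max, len);
        } else {
            i++;
        }
    }
    return new int[] { min, max };
}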
While having a minimum spacing of +∞ is fine from a definitional point of view, it seems
strange to have a line in the output with such a spacing. So, we examine this carefully. Such
a line consists of one word. This is certainly allowed in the input. How about the output? A
line in the output must be fully justified, i.e., it must start and end with a word. To consist of
one word w, therefore, the word w must have length exactly 80. What if the length is a little
different, e.g., slightly more or slightly less than 80?
• If w has length > 80 then the line that consists of w violates condition (2) of the effects
clause. Since condition (1) prohibits breaking up a word, there is no way for these two
conditions to be satisfied at the same time. We conclude that all words must have length
≤ 80. Obviously, this is a constraint on the input, and so belongs in the requires clause.
• If w has length just less than 80, say 75, then there is still a problem. If the words before
and after w have length > 5, then the line that contains w cannot be justified properly: it
will contain space characters either at its beginning or at its end.
• We now observe that, to guarantee that a line can be fully justified, it must contain at
least two words: one to start the line, and one to end it. This guarantees that there will
be no space characters at the beginning or the end of the line. Since we cannot control
the final location of words, as this depends on the length of all words, we see that the only
reasonable condition which guarantees at least two words per line is a length restriction:
every word must have length < 40. As noted in (1), this condition belongs in the requires
clause.
Thus we modify the requires clause to add the condition that all words have length < 40. We
can do this in two ways. The simpler is to modify the definition of word to add the upper
bound of < 40 to the length:
word(string : s) ≜ (∀ i : 0 ≤ i < |s| : alph(s[i])) ∧ 0 < |s| < 40
Note that the first method does not require a change in the requires clause, since the modification
of word is automatically taken into account: para depends on word.
We now formalize the conjunction of (2) and (3) as
(2,3) the predicate justL(out) where
justL(para : p) ≜
(line(p) ∧ minSp(p) = maxSp(p) = 1 ∧ |p| ≤ 80) ∨ // base case
(∃ ℓ, p2 : line(ℓ) ∧ para(p2) : p = ℓ + p2 ∧ justL(p2) ∧ |ℓ| = 80 ∧ maxSp(ℓ) − minSp(ℓ) ≤ 1)
// inductive case.
To formalize (4), we have to extract the total “excess” space in a line, and the number of
interword spacings (i.e., positions between words, i.e., number of words minus 1). This is done
using definitions that follow the structure of line.
totSp(line : ℓ) ≜ c such that
(∃ w : word(w) : ℓ = w + '\n' ∧ c = 0) ∨ // base case
(∃ w, x, ℓ2 : word(w) ∧ blank(x) ∧ line(ℓ2) : ℓ = w + x + ℓ2 ∧ c = |x| + totSp(ℓ2))
// inductive case
nWords(line : ℓ) ≜ c such that
(∃ w : word(w) : ℓ = w + '\n' ∧ c = 1) ∨ // base case
(∃ w, x, ℓ2 : word(w) ∧ blank(x) ∧ line(ℓ2) : ℓ = w + x + ℓ2 ∧ c = 1 + nWords(ℓ2))
// inductive case
excess(line : ℓ) ≜ totSp(ℓ) − (nWords(ℓ) − 1) gives the total "excess space" in line ℓ
To state (4), we need to talk about successive pairs of lines. We follow the recursive structure
of para. Thus we formalize (4) as
(4) the predicate goodSp(out) where
goodSp(para : p) ≜
line(p) ∨ // base case
(∃ ℓ, ℓ2, p2 : line(ℓ) ∧ line(ℓ2) ∧ [p2 = ε ∨ para(p2)] :
p = ℓ + ℓ2 + p2 ∧ goodSp(ℓ2 + p2) ∧ excess(ℓ) < |hd(ℓ2)|) // inductive case
¹I would like to acknowledge Ara Hayrabedian for pointing out this modification.
EFFECTS (Formal): The output text out, considered as a single string, must satisfy:
(in = ε ∧ out = ε) ∨ (in ≠ ε ∧ para(out) ∧ sameWords(in, out) ∧ justL(out) ∧ goodSp(out)).
Having formalized the specification, we now turn to the design and the implementation sketch.
Since this specification is somewhat complex, we introduce an intermediate step between the
specification and the implementation (code) sketch. We first write down a rough list of the
tasks that must be accomplished, without worrying about the ordering of these tasks.
Task list:
• Eliminate excess spaces in the input (i.e., more than one space between successive words).
• Check excess space in every line versus the length of the first word on the next line, and
move the word up if necessary.
We can now write the implementation sketch. We include the header of the top-level method
justify. We implicitly include in every step below the requirement that it preserves the
sequence of words in the input, i.e., sameWords(in, s) holds for all intermediate results s. The
phrase “Scan through . . . ” means to scan through from the beginning (character indexed at 0)
to the end.
public static void justify()
Implementation sketch
From the above implementation sketch, we observe that the whitespace that exists in the input
has no effect on the output. Such an observation can be made from the specification, but
is typically easier to make in the implementation sketch. Making such observations earlier
minimizes the amount of revision and re-work that must be done. Hence, a good data structure
to store the input is an arraylist of words; we do not have to store the whitespace!
Rather than write actual code at this point, we will produce a second, more refined implemen-
tation sketch, based on the above discussion and on the decision to store the words in an
ArrayList.
public static void justify()
Implementation sketch (2’nd level)
1. Read the entire input file word by word into arraylist a, so that a[0], a[1], . . . constitutes
the sequence of words in the input.
3. Scan through b:
For each line ℓ except the last:
Compute the number x of extra spaces needed for the length of ℓ to be 80.
Insert this number of spaces into ℓ as follows:
let w be the number of interword spacings in ℓ;
compute q, r such that x = wq + r ∧ 0 ≤ r < w;
add q spaces to each interword spacing;
add 1 space to the first r interword spacings.
Call the resulting arraylist of lines c.
c satisfies: goodSp(c) ∧ justL(c).
We see that step (2) above has replaced steps (2,3,4) in the first sketch, so that the choice of
this data structure has resulted in considerable simplification. We also notice that step (2) is
given in considerably less detail than step (3). Step (3) is ready for coding, while step (2) is
not. So we produce one more refinement of the implementation sketch.
public static void justify()
Implementation sketch (3’rd level)
1. Read the entire input file word by word into arraylist a, so that a[0], a[1], . . . constitutes
the sequence of words in the input.
3. Scan through b:
For each line ℓ except the last:
Compute the number x of extra spaces needed for the length of ℓ to be 80.
Insert this number of spaces into ℓ as follows:
let w be the number of interword spacings in ℓ;
compute q, r such that x = wq + r ∧ 0 ≤ r < w;
add q spaces to each interword spacing;
add 1 space to the first r interword spacings.
Call the resulting arraylist of lines c.
c satisfies: goodSp(c) ∧ justL(c).
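Before turning to the code, here is a hedged Java sketch of the padding computation of step 3 for a single line, assuming the line is given as its list of words and the target length is 80; the method name is ours, and the actual code below may differ.

import java.util.List;

// Hedged sketch of step 3 for one line: pad a line (given as its list of words)
// to exactly L characters by distributing extra spaces as evenly as possible.
// Requires at least two words in the line (guaranteed by the word-length < 40
// constraint in the requires clause).
static String padLine(List<String> words, int L) {
    int len = words.get(0).length();
    for (int j = 1; j < words.size(); j++)
        len += 1 + words.get(j).length();     // length with single-space separators
    int x = L - len;                          // extra spaces needed
    int w = words.size() - 1;                 // number of interword spacings
    int q = x / w, r = x % w;                 // x = w*q + r, 0 <= r < w
    StringBuilder s = new StringBuilder(words.get(0));
    for (int j = 1; j <= w; j++) {
        int gap = 1 + q + (j <= r ? 1 : 0);   // first r spacings get one extra space
        for (int k = 0; k < gap; k++) s.append(' ');
        s.append(words.get(j));
    }
    return s.toString();
}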
17.3 Code
import java.io.*;
import java.util.*;
/* REQUIRES: There exists a text file with name in.txt in the current
* directory. The contents of the file are a sequence of words,
* separated by a whitespace (blank or newline characters). A word is
* a nonempty string consisting solely of alphanumeric characters
* (a--z, A--Z, 0--9) and punctuation (period, comma). Every word has
* length < 40. The text is left-justified: every line starts with a
* word.
*
*
* EFFECTS: Define a line to be the sequence of characters between
* either the beginning of the file and a newline character, or
* between two successive newline characters. Define an interword
* spacing to be the sequence of space characters between successive
* words in the same line.
*
* Prints to standard output a text that satisfies the following conditions:
*
* 1. The sequence of words of the output is the same as that of the
* input. Words are not to be broken across a line.
*
* 2. Every line except the last of the output must have a length of
* 80 characters. The last line must have a length $\le 80$
* characters.
*
* 3. For each line except the last, any two interword spacings differ
* in length by at most 1. In the last line, all interword spacings
* have length 1.
*
* 4. Define the excess space in a line to be the sum of lengths of
* all interword spacings, minus the number of interword spacings.
* For each line except the last, the excess space should be less
* than the length of the first word in the following line.
*/
/* IMPLEMENTATION SKETCH
* (Note: identifiers are enclosed in $..$. Predicates are as in solution to problem set 2).
*
* 1. Read the entire input file word by word into arraylist $a$, so that
* $a[0], a[1],....$ constitutes the sequence of words in the input.
*
* 2. Create a new arraylist $b$.
* Scan through $a$ from bottom to top:
//1. Read the entire input file word by word into arraylist $a$
//   (src reads words from in.txt and a collects them; their declarations are
//    in code elided here)
String w;
while (src.hasNext()) {
    w = src.next();
    a.add(w);
}
//{sameWords(b,a) /\ goodSp(b) /\
// (FORALL i : 0 <= i < b.size() : |b[i]| <= L)
//3. Pad out each line except the last to a length of exactly L
l = b.get(i); //Last line. Copy over with 1 space in between successive words.
String s = l.get(0); //Base case: first word.
for(int j = 1; j < l.size(); j++)
s = s + " " + l.get(j); //Inductive step: successive words.
//{sameWords(s,l)}
int len = 0;
len = l.get(0).length();
for(int i = 1; i < l.size(); i++)
len = len + 1 + l.get(i).length();
//{len = |l| <= L} see code for item 2 above
ww.set(l.size()-1);
xx.set(L - len);
public static void divide(MuInt xx, MuInt ww, MuInt qq, MuInt rr) {
    // REQUIRES: xx >= 0 /\ ww > 0
    // MODIFIES: qq, rr
    // EFFECTS: xx = ww * qq_post + rr_post /\ 0 <= rr_post < ww
    int w = ww.get();
    //<0 <= xx>
    int x = xx.get();
    // one direct way to satisfy the EFFECTS clause (the original continuation
    // of this method is elided by a page break):
    qq.set(x / w);
    rr.set(x % w);
}
// ... (intervening code elided) ...
justify();
}
}
Bibliography
[2] Frederick Brooks. The Mythical Man-Month. Addison Wesley, 1995. 20th anniversary
edition.
[3] C. A. R. Hoare. An axiomatic basis for computer programming. Commun. ACM, 12(10):576–
580, October 1969.
[4] Barbara Liskov and John Guttag. Program Development in Java. Addison Wesley, June
2000.
Index
false
  truth-table for, 23
true
  truth-table for, 23
assignment axiom, 47, 54
axiom, 17
calculus, 17
conclusion, 14, 47
Conditional correctness, 45
conjunction, 14
  truth-table for, 23
conjunctive normal form, 30
contingency, 28
contradiction, 27
disjunction, 14
  truth-table for, 24
disjunctive normal form, 30
double-implication, 14
  truth-table for, 24
Hoare triple, 45, 63
  validity of, 46, 63
hypothesis, 47
implication, 14
  truth-table for, 24
literal, 30
logical connective, 14
logical operators, 14
negation, 14
  truth-table for, 23
postcondition, 45, 46
precedence rules, 16
precondition, 45, 46
predicate, 31
  atomic, 31
  constant, 41
premise, 14
proof, 18
proof tableau, 53
  complete, 53
  valid, 54
proposition, 13
  compound, 13
  constant, 24
  simple, 13
propositional formula, 15
quantification, 34
quantifier
  existential, 35
  logical, 35
  universal, 35
rule of inference, 17, 47
rule of substitution, 20
satisfiable, 28, 42
state, 25
subproposition, 16
symbolic manipulation, 17
symbols, 13
tautology, 27
truth-table, 23, 26
truth-value, 23
truth-value assignment, 26
valid, 27, 42
valuation, 26
verification condition, 53
well-defined, 26
yields, 18