978-3-642-03359-9
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
University of Dortmund, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Stefan Berghofer Tobias Nipkow
Christian Urban Makarius Wenzel (Eds.)
Theorem Proving
in Higher Order Logics
Volume Editors
Stefan Berghofer
Tobias Nipkow
Christian Urban
Makarius Wenzel
E-mail: {berghofe,nipkow,urbanc,wenzelm}@in.tum.de
CR Subject Classification (1998): F.4, F.3, F.1, D.2.4, B.6.3, B.6.1, D.4.5, G.4, I.2.2
ISSN 0302-9743
ISBN-10 3-642-03358-X Springer Berlin Heidelberg New York
ISBN-13 978-3-642-03358-2 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2009
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12727186 06/3180 543210
Preface
TPHOLs is the premier forum for interactive theorem proving. ITP 2010 will be
part of the Federated Logic Conference, FLoC, in Edinburgh.
Programme Chairs
Tobias Nipkow TU München, Germany
Christian Urban TU München, Germany
Programme Committee
Thorsten Altenkirch David Aspinall Jeremy Avigad
Gilles Barthe Christoph Benzmüller Peter Dybjer
Jean-Christophe Filliâtre Georges Gonthier Mike Gordon
Jim Grundy Joe Hurd Reiner Hähnle
Gerwin Klein Xavier Leroy Pete Manolios
César Muñoz Michael Norrish Sam Owre
Larry Paulson Frank Pfenning Randy Pollack
Sofiène Tahar Laurent Théry Freek Wiedijk
Local Organisation
Stefan Berghofer
Makarius Wenzel
External Reviewers
Naeem Abbasi Martin Giese Zhaohui Luo
Behzad Akbarpour Alwyn Goodloe Kenneth MacKenzie
Knut Akesson Thomas Göthel Jeff Maddalon
June Andronick Osman Hasan Lionel Mamane
Bob Atkey Daniel Hedin Conor McBride
Stefan Berghofer Hugo Herbelin James McKinna
Yves Bertot Brian Huffman Russell O’Connor
Johannes Borgstrom Clément Hurlin Steven Obua
Ana Bove Ullrich Hustadt Anne Pacalet
Cristiano Calcagno Rafal Kolanski Florian Rabe
Harsh Raju Chamarthi Alexander Krauss Bernhard Reus
Benjamin Chambers Sava Krstic Norbert Schirmer
Nils Anders Danielsson Cesar Kunz Stefan Schwoon
William Denman Stéphane Lescuyer Jaroslav Sevcik
Peter Dillinger Rebekah Leslie Thomas Sewell
Bruno Dutertre Pierre Letouzey Natarajan Shankar
Invited Papers
Let’s Get Physical: Models and Methods for Real-World Security Protocols
David Basin, Srdjan Capkun, Patrick Schaller, and Benedikt Schmidt
Invited Tutorials
HOL Light: An Overview
John Harrison
Regular Papers
Hints in Unification
Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, and Enrico Tassi
Psi-calculi in Isabelle
Jesper Bengtson and Joachim Parrow
1 Introduction
Situating Adversaries in the Physical World. There are now over three decades of
research on symbolic models and associated formal methods for security protocol
verification. The models developed represent messages as terms rather than bit
strings, take an idealized view of cryptography, and focus on the communication
of agents over a network controlled by an active intruder. The standard intruder
model used, the Dolev-Yao model, captures the above aspects. Noteworthy for
our work is that this model abstracts away all aspects of the physical environ-
ment, such as the location of principals and the speed of the communication
medium used. This is understandable: the Dolev-Yao model was developed for
authentication and key-exchange protocols whose correctness is independent of
the principals’ physical environment. Abstracting away these details, effectively
by identifying the network with the intruder, results in a simpler model that is
adequate for verifying such protocols.
With the emergence of wireless networks, protocols have been developed whose
security goals and assumptions differ from those in traditional wireline networks.
A prominent example is distance bounding [1,2,3,4,5], where one device must
determine an upper bound on its physical distance to another, potentially un-
trusted, device. The goal of distance bounding is neither message secrecy nor
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 1–22, 2009.
Verifying distance bounding protocols. Our starting point in this paper is a family
of distance bounding protocols proposed by Meadows [4]. The family is defined
by a protocol pattern containing a function variable F , where different instances
of F result in different protocols. We present two security properties, which
distinguish between the cases of honest and dishonest participants. For each
property, we reduce the security of a protocol defined by an instance of F to
conditions on F . Afterwards, we analyze several instances of F , either showing
that the conditions are fulfilled or presenting counterexamples to the security
properties.
This protocol family is interesting as a practically-relevant case study in apply-
ing our framework to formalize and reason about nontrivial physical protocols.
Moreover, it also illustrates how we can extend our framework (originally defined
over a free term algebra) to handle protocols involving equationally-defined op-
erators on messages and how this can be done in a general way. Altogether,
we have worked with five different protocols and two different message theories.
To support this, we have used Isabelle’s locales construct to formalize an ab-
stract message theory and a general theory of protocols. Within the locales, we
prove general, protocol-independent facts about (abstract) messages, which hold
when we subsequently instantiate the locales with our different concrete message
theories and protocols.
Contributions. First, we show that our framework for modeling physical security
protocols can be extended to handle protocols involving equationally-defined op-
erators. This results in a message theory extended with an XOR operator and
a zero element, consisting of equivalence classes of messages with respect to the
equational theory of XOR. We use normalized terms here as the representatives
of the equivalence classes. With this extension, we substantially widen the scope
of our approach. Note that this extension is actually independent of our “phys-
ical” refinement of communication and also could be used in protocol models
based on the standard Dolev-Yao intruder.
Second, we show how such extensions can be made in a generic, modular way.
Noteworthy here is that we could formulate a collection of message-independent
and protocol-independent facts that hold for a large class of intended extensions.
An example of such a fact is that the minimal message-transmission time between
two agents A and B determines a lower bound on the time difference between
A creating a fresh nonce and B learning it.
Finally, physical protocols often contain time-critical steps, which must be op-
timized to reduce computation and communication time. As a result, these steps
typically employ low-level operations like XOR, in contrast to more conventional
protocols where nanosecond time differences are unimportant. Our experience
indicates that the use of such low-level, equationally-defined operators results
in substantial additional complexity in reasoning about protocols in compari-
son to the standard Dolev-Yao model. Moreover, the complexity is also higher
because security properties are topology dependent and so are attacks. Attacks
now depend not only on what the attackers know, but also their own physical
properties, i.e., the possible constellations of the distributed intruders. Due to
this complexity, pencil-and-paper proofs quickly reach their limits. Our work
highlights the important role that Formal Methods can play in the systematic
development and analysis of physical protocols.
2 Background
2.1 Isabelle/HOL
Isabelle [8] is a generic theorem prover with a specialization for higher-order logic
(HOL). We will avoid Isabelle-specific details in this paper as far as possible or
explain them in context, as needed.
We briefly review two aspects of Isabelle/HOL that are central to our work.
First, Isabelle supports the definition of (parameterized) inductively-defined sets.
An inductively-defined set is defined by sets of rules and denotes the least set
closed under the rules. Given an inductive definition, Isabelle generates a rule
for proof by induction.
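As an informal illustration (Python rather than Isabelle), the "least set closed under the rules" can be computed for a finite universe by applying the rules repeatedly until nothing new appears. The names `least_closed_set` and `add_two` and the bound 10 are hypothetical and serve only this sketch; Isabelle itself defines such sets logically and does not enumerate them.

```python
def least_closed_set(axioms, rules, universe):
    """Least subset of `universe` that contains `axioms` and is closed under `rules`.

    Each rule maps the current set to the elements it allows us to add.
    """
    current = {x for x in axioms if x in universe}
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # Pass an immutable snapshot so a rule cannot observe partial updates.
            derived = {x for x in rule(frozenset(current)) if x in universe}
            if not derived <= current:
                current |= derived
                changed = True
    return current


def add_two(known):
    """Rule: if n is in the set, then n + 2 is in the set."""
    return {n + 2 for n in known}


# Rules: 0 is in the set; n in the set implies n + 2 in the set.
evens_below_ten = least_closed_set({0}, [add_two], range(10))
```

Here `evens_below_ten` contains exactly the even numbers below 10. The induction principle that Isabelle generates corresponds to the observation that any property preserved by every rule holds for all elements reached by this iteration.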
Second, Isabelle provides a mechanism, called locales [11], that can be used
to structure generic developments, which can later be specialized. A locale can
be seen as either a general kind of proof context or, alternatively, as a kind of
parameterized module. A locale declaration contains:
– a name, so that the locale can be referenced and used,
– typed parameters, e.g., ranging over relations or functions,
– assumptions about the parameters (the module axioms), and
– functions defined using the parameters.
In the context of a locale, one can make definitions and prove theorems that
depend on the locale’s assumptions and parameters. Finally, a locale can be
interpreted by instantiating its parameters so that the assumptions are theorems.
After interpretation, not only can the assumptions be used for the instance, but
also all theorems proved and definitions made in the locale’s context.
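A loose analogy in Python may help readers unfamiliar with locales. The class below plays the role of a locale with two parameters (`op`, `unit`), assumptions (the monoid laws), and one definition made in its context (`fold`). All names are hypothetical, and the analogy is imperfect: Isabelle requires the assumptions to be proved at interpretation time, whereas this sketch can only test them on sample values.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class MonoidLocale:
    """Hypothetical 'locale': parameters `op` and `unit`, monoid laws as assumptions."""
    op: Callable[[int, int], int]
    unit: int

    def assumptions_hold(self, samples: Iterable[int]) -> bool:
        """Test (not prove) associativity and neutrality on the given samples."""
        xs = list(samples)
        assoc = all(self.op(self.op(a, b), c) == self.op(a, self.op(b, c))
                    for a in xs for b in xs for c in xs)
        neutral = all(self.op(a, self.unit) == a == self.op(self.unit, a) for a in xs)
        return assoc and neutral

    def fold(self, items: Iterable[int]) -> int:
        """A definition made 'inside the locale': it uses only the parameters."""
        result = self.unit
        for item in items:
            result = self.op(result, item)
        return result


# Two 'interpretations' of the same locale with different parameters.
xor_monoid = MonoidLocale(op=lambda a, b: a ^ b, unit=0)
add_monoid = MonoidLocale(op=lambda a, b: a + b, unit=0)
```

Both instances reuse `fold` without redefining it, which mirrors how theorems and definitions from a locale become available for each interpretation.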
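[Figure: distance bounding protocol pattern between verifier V and prover P. Setup phase: V → P: V, request. Measurement phase: V → P: NV; P → V: F(NV, NP, P). Validation phase: P → V: P, NP, NV, MAC_{K_VP}(P, NP, NV).]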
3 Formal Model
In this section, we present our model of physical protocols. To support the verifi-
cation of multiple protocols, we use locales to parameterize our model both with
respect to the concrete protocol and message theory. Figure 2 depicts the theories
we formalized in Isabelle and their dependencies. Some of these theories are concrete
to begin with (e.g., Geometric Properties of ℝ³), whereas other theories consist
of locales or their interpretations. For example, the Abstract Message Theory
contains a locale describing message theories, which is interpreted in our two con-
crete message theories (Free and XOR). In the theory Parametrized Communica-
tion Systems, we abstractly define the set of valid traces as a set of (parametric)
inductive rules. In formalizations of concrete protocols using either of the two
concrete message theories, we can therefore use both message-theory indepen-
dent and message-theory specific facts by importing the required theories.
[Figure 2: Isabelle theories and their dependencies. Recoverable node labels: Abstract Message Theory, Protocol Independent Properties.]
Agents are either honest agents or dishonest intruders. We model each kind
using the natural numbers nat. Hence there are infinitely many agents of each
kind.
datatype agent = Honest nat | Intruder nat
We refer to agents using capital letters like A and B. We also write HA and HB
for honest agents and IA and IB for intruders, when we require this distinction.
In contrast to the Dolev-Yao setting, agents’ communication abilities are subject
to the network topology and physical laws. Therefore, we cannot reduce a set of
dishonest users at different locations to a single one.
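For readers who prefer an executable reading, the agent datatype can be mirrored in Python as two tagged classes. This is only a sketch: the class names follow the Isabelle constructors, while the field name `ident` and the helper `is_honest` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Honest:
    """Honest agent, identified by a natural number."""
    ident: int


@dataclass(frozen=True)
class Intruder:
    """Dishonest agent, identified by a natural number."""
    ident: int


def is_honest(agent) -> bool:
    """True for honest agents, False for intruders."""
    return isinstance(agent, Honest)


# Intruders with different identifiers stay distinct agents; they are not
# merged into one global attacker as in the Dolev-Yao setting.
intruders = {Intruder(0), Intruder(1)}
```

Because the two classes are distinct types, `Honest(0)` and `Intruder(0)` never compare equal, which matches the disjointness of the two constructors.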
3.2 Messages
Instead of restricting our model to a concrete message theory, we first define a
locale that specifies a collection of message operators and their properties. In
the context of this locale, we prove a number of properties independent of the
protocol and message theory. For example, $\mathit{cdist}_{LoS}(A, B)$ is a lower bound on
the time required for a nonce freshly created by A to become known by another
agent B, since the nonce must be transmitted. For the results in [6], we have
instantiated the locale with a message theory similar to Paulson’s [7], modeling
a free term algebra. In Section 3.6, we describe the instantiation with a message
theory that includes the algebraic properties of the XOR operator, which we use
in Section 4.
The theory of keys is shared by all concrete message theories and reuses
Paulson’s formalization. Keys are represented by natural numbers. The function
inv : key → key partitions the set of keys into symmetric keys, where inv k = k,
and asymmetric keys. We model key distributions as functions from agents to
keys, e.g., the theory assumes that $K_{AB}$ returns a shared symmetric key for a
pair of agents A and B.
This formalizes that every interpretation of the Message Theory locale de-
fines the seven given message construction functions and three functions on
message sets. A Nonce is tagged with a unique identifier and the name of the
agent who created it. This ensures that independently created nonces never col-
lide. Indeed, even colluding intruders must communicate to share a nonce. The
constructor Crypt denotes signing, asymmetric, or symmetric encryption, depending
on the key used. We also require that functions for pairing (MPair),
hashing (Hash), integers (Int), and reals (Real) are defined. We use the abbreviations
$\langle A, B \rangle$ for MPair A B and $\{m\}_k$ for Crypt k m. Moreover, we define
$\mathit{MAC}_k(m) = \mathit{Hash}\,\langle \mathit{Key}\ k, m \rangle$ as the keyed MAC of the message m and
$\mathit{MACM}_k(m) = \langle \mathit{MAC}_k(m), m \rangle$ as the pair consisting of m and its MAC. Additionally,
every interpretation of Message Theory must define the functions
subterms, parts, and dm. These respectively formalize the notions of subterms,
extractable subterms, and the set of messages derivable from a set of known messages
by a given agent. In the free message theory, subterms corresponds to syntactic
subterms, for example x ∈ subterms({Hash x}) while x ∉ parts({Hash x}).
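The difference between subterms and parts in the free message theory can be made concrete with a small Python sketch. The tuple encoding (`("hash", m)`, `("pair", a, b)`, `("crypt", k, m)`) is a hypothetical stand-in for the Isabelle constructors, and the sketch assumes that the key of an encryption is not itself a message, so neither function descends into it.

```python
def subterms(msg):
    """Syntactic subterms of a message, including the message itself."""
    result = {msg}
    if isinstance(msg, tuple):
        tag = msg[0]
        if tag == "hash":
            result |= subterms(msg[1])
        elif tag == "pair":
            result |= subterms(msg[1]) | subterms(msg[2])
        elif tag == "crypt":
            # ("crypt", key, body): assumption, only the body is a message.
            result |= subterms(msg[2])
    return result


def parts(msg):
    """Extractable subterms: like subterms, but hashes are opaque."""
    result = {msg}
    if isinstance(msg, tuple):
        tag = msg[0]
        if tag == "pair":
            result |= parts(msg[1]) | parts(msg[2])
        elif tag == "crypt":
            result |= parts(msg[2])
        # "hash": the hashed message cannot be extracted, so stop here.
    return result


hashed = ("hash", "x")
sealed = ("crypt", "k", ("pair", "m", hashed))
```

With these definitions, `"x"` is a subterm of `hashed` but not one of its parts, while `"m"` is a part of `sealed`.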
We assume that the following properties hold for any interpretation of parts.
These properties allow us to derive most of the lemmas about parts from Paulson’s
formalization [7] in our abstract setting. For example,
$$\frac{\{m\}_k \in \mathit{subterms}(\mathit{dm}_A(H))}{\{m\}_k \in \mathit{subterms}(H) \;\lor\; \mathit{Key}\ k \in \mathit{parts}(H)}$$
A trace is a list of timed events, where a timed event (t, e) ∈ real × event pairs
a time-stamp with an event.
A timed event $(t^S, \mathit{Send}\ Tx_A^i\ m\ L)$ denotes that the agent A has sent the
message m using his transmitter $Tx_A^i$ at time $t^S$ and has associated the protocol
data L with the event. The list of messages L models local state information
and contains the messages used to construct m. The sender may require these
messages in subsequent protocol steps. Storing L with the Send event is necessary
since we support non-free message construction functions like XOR where a
function’s arguments cannot be recovered from the function’s image alone.
A send event like the above may result in multiple timed Recv-events of the
form $(t^R, \mathit{Recv}\ Rx_B^j\ m)$, where the time-stamps $t^R$ and the receivers $Rx_B^j$ must
be consistent with the network topology. Note that the protocol data stored in
L when sending the message does not affect the events on the receiver’s side.
A Claim-event models a belief or conclusion made by a protocol participant,
formalized as a message. For example, after successfully completing a run of a
distance bounding protocol with a prover P , the verifier V concludes at time t
that d is an upper bound on the distance to P . We model this by adding the
timed event $(t, \mathit{Claim}\ V\ \langle P, \mathit{Real}\ d \rangle)$ to the trace. The protocol is secure if the
conclusion holds for all traces containing this claim event.
Note that the time-stamps used in traces and the rules use the notion of
absolute time. However, agents’ clocks may deviate arbitrarily from absolute
time. We must therefore translate the absolute time-stamps to model the local
views of agents. We describe this translation in Section 3.4.
$$tr \in Tr \qquad t^R \ge \mathit{maxtime}(tr) \qquad (t^S, \mathit{Send}\ Tx_A^i\ m\ L) \in tr$$
Each agent can derive all messages in the set $\mathit{dm}_A(\mathit{knows}_A(tr))$ by applying the
derivation operator to the set of known messages. We use the subterms function
to define the set of messages used in a trace tr.
A nonce is fresh for a trace tr if it is not in used(tr). Note that since a nonce is
not fresh if its hash has been sent, we cannot use parts instead of subterms in
the above definition.
We now describe the rules used to inductively define the set of traces Tr for a sys-
tem parameterized by a protocol proto, an initial knowledge function initKnows,
and the parameters from the abstract message theory. The base case, modeled
by the Nil rule in Figure 3, states that the empty trace is a valid trace for
all protocols. The other rules describe how valid traces can be extended. The
rules model the network behavior, the possible actions of the intruders, and the
actions taken by honest agents following the protocol steps.
Each step function takes the local view and time of an agent as input and returns
all possible actions consistent with the protocol specification.
There are two types of possible actions, which model an agent either sending
a message with a given transmitter id and storing the associated protocol data
or making a claim.
Note that message reception has already been modeled by the Net-rule.
An action associated with an agent and a message can be translated into the
corresponding trace event using the translateEv function.
A protocol step is therefore of type agent × trace × real → (action × msg) set.
Since our protocol rule Proto (described below) is parameterized by the proto-
col, we define a locale Protocol that defines a constant proto of type step set
and inductively define Tr in the context of this locale.
Since the actions of an agent A only depend on his own previous actions and
observations, we define A’s view of a trace tr as the projection of tr on those
events involving A. For this purpose, we introduce the function occursAt, which
maps events to associated agents, e.g., $\mathit{occursAt}(\mathit{Send}\ Tx_A^i\ m\ L) = A$.
Since the time-stamps of trace events refer to absolute time, the view function
accounts for the offset of A’s clock by translating times using the ctime function.
Given an agent and an absolute time-stamp, the uninterpreted function ctime :
agent × real → real returns the corresponding time-stamp for the agent’s clock.
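The projection described above can be sketched in Python as follows. Events are encoded as tuples whose second component is the agent, which is a hypothetical encoding chosen for this example. The source leaves ctime uninterpreted; the constant per-agent offsets used here are an assumption made only to obtain a runnable example.

```python
def occurs_at(event):
    """Agent associated with an event; events are (kind, agent, ...) tuples."""
    return event[1]


def view(agent, trace, ctime):
    """Agent's view of a trace: its own events, time-stamped with its local clock."""
    return [(ctime(agent, t), ev) for (t, ev) in trace if occurs_at(ev) == agent]


# Assumption: each local clock differs from absolute time by a fixed offset.
OFFSETS = {"A": 5.0, "B": -2.0}


def ctime(agent, t):
    """Translate absolute time t into the agent's local clock time."""
    return t + OFFSETS[agent]


trace = [
    (0.0, ("send", "A", 0, "NV", [])),   # (kind, agent, transmitter id, message, data)
    (1.0, ("recv", "B", 0, "NV")),       # (kind, agent, receiver id, message)
    (3.0, ("claim", "A", "done")),       # (kind, agent, message)
]

view_of_a = view("A", trace, ctime)
view_of_b = view("B", trace, ctime)
```

Agent A sees only its own send and claim events, and all time-stamps in its view are shifted by its clock offset.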
Using the above definitions, we define the Proto-rule in Figure 3. For a given
protocol, specified as a set of the step functions, the Proto rule describes all
possible actions of honest agents, given their local views of a valid trace tr at
a given time t. If all premises are met, the Proto-rule appends the translated
event to the trace. Note that agents’ behavior, modeled by the function step, is
based only on the local clocks of the agents, i.e., agents cannot access the global
time. Moreover, the restriction that all messages must be in $\mathit{dm}_{H_A}(\mathit{knows}_{H_A}(tr))$
ensures that agents only send messages derivable from their knowledge.
Our first lemma specifies a lower bound on the time between when an agent
first uses a nonce and another agent later uses the same nonce. The lemma holds
whenever the initial knowledge of all agents does not contain any nonces.
Lemma 3.1. Let A be an arbitrary (honest or dishonest) agent, N an arbitrary
nonce, and $(t^S_A, \mathit{Send}\ Tx_A^i\ m_A\ L_A)$ the first event in a trace tr where
$N \in \mathit{subterms}(\{m_A\})$. If tr contains an event $(t, \mathit{Send}\ Tx_B^j\ m_B\ L_B)$ or
$(t, \mathit{Recv}\ Rx_B^j\ m_B)$ where $A \neq B$ and $N \in \mathit{subterms}(\{m_B\})$, then $t - t^S_A \ge \mathit{cdist}_{LoS}(A, B)$.
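To see what Lemma 3.1 rules out, the following Python sketch checks its conclusion on a concrete trace. It is a test on one trace, not a proof. Several assumptions are made for illustration: events are flattened to `(time, kind, agent, message)` tuples sorted by time, messages are nested tuples, and the line-of-sight communication distance is taken to be the Euclidean distance divided by the speed of light.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second (assumed signal speed)


def cdist_los(loc, a, b):
    """Assumed line-of-sight communication time between agents a and b, in seconds."""
    return math.dist(loc[a], loc[b]) / SPEED_OF_LIGHT


def mentions(msg, nonce):
    """True if `nonce` occurs anywhere inside the nested-tuple message `msg`."""
    if msg == nonce:
        return True
    return isinstance(msg, tuple) and any(mentions(part, nonce) for part in msg)


def respects_lemma_3_1(trace, loc, creator, nonce):
    """Check t - t_first >= cdist_los(creator, B) for every other agent B using the nonce."""
    first = next((t for (t, kind, agent, msg) in trace
                  if kind == "send" and agent == creator and mentions(msg, nonce)),
                 None)
    if first is None:
        return True  # the creator never sent the nonce, so there is nothing to check
    return all(t - first >= cdist_los(loc, creator, agent)
               for (t, kind, agent, msg) in trace
               if agent != creator and mentions(msg, nonce))


loc = {"A": (0.0, 0.0, 0.0), "B": (300.0, 0.0, 0.0)}  # B is 300 m away from A
plausible = [(0.0, "send", "A", ("pair", "N1", "A")),
             (2e-6, "recv", "B", ("pair", "N1", "A"))]
too_fast = [(0.0, "send", "A", ("pair", "N1", "A")),
            (5e-7, "recv", "B", ("pair", "N1", "A"))]
```

The trace `too_fast` violates the bound because a signal needs roughly one microsecond to cover 300 m, so B cannot know the nonce after half a microsecond.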
Our next lemma holds whenever agents’ keys are not parts of protocol messages
and concerns when MACs can be created. Note that we need the notion of
extractable subterms here since protocols use keys in MACs, but never send
them in extractable positions.
Lemma 3.2. Let A and B be honest agents and C a different, possibly dishonest
agent. Furthermore, let $(t^S_C, \mathit{Send}\ Tx_C^i\ m_C\ L_C)$ be an event in the trace
tr where $\mathit{MAC}_{K_{AB}}(m) \in \mathit{subterms}(\{m_C\})$ for some message m and a shared
secret key $K_{AB}$. Then, for E either equal to A or B, there is a send event
$(t^S_E, \mathit{Send}\ Tx_E^j\ m_E\ L_E) \in tr$ where $\mathit{MAC}_{K_{AB}}(m) \in \mathit{subterms}(\{m_E\})$ and
$t^S_C - t^S_E \ge \mathit{cdist}_{LoS}(E, C)$.
Note that the lemmas are similar to the axioms presented in [4]. The proofs of
these lemmas can be found in our Isabelle/HOL formalization [14].
The Free Message Type. We first define the free term algebra of messages.
Messages are built from agent names, integers, reals, nonces, keys, hashes, pairs,
encryption, exclusive-or, and zero.
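To faithfully model $\bar{\oplus}$, we require the following set of equations E:
$$\begin{aligned}
(x \mathbin{\bar{\oplus}} y) \mathbin{\bar{\oplus}} z &\approx x \mathbin{\bar{\oplus}} (y \mathbin{\bar{\oplus}} z) && \text{(A)} \\
x \mathbin{\bar{\oplus}} y &\approx y \mathbin{\bar{\oplus}} x && \text{(C)} \\
x \mathbin{\bar{\oplus}} \mathrm{ZERO} &\approx x && \text{(U)} \\
x \mathbin{\bar{\oplus}} x &\approx \mathrm{ZERO} && \text{(N)}
\end{aligned}$$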
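$$\frac{}{\mathit{reduced}\ (\mathrm{NONCE}\ a\ na)}\ \text{Nonce} \qquad\qquad \frac{\mathit{reduced}\ h}{\mathit{reduced}\ (\mathrm{HASH}\ h)}\ \text{Hash}$$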
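$$\begin{aligned}
x \oplus^{\downarrow} \mathrm{ZERO} &= x && (1) \\
\mathrm{ZERO} \oplus^{\downarrow} x &= x && (2) \\
(a_1 \mathbin{\bar{\oplus}} a_2) \oplus^{\downarrow} (b_1 \mathbin{\bar{\oplus}} b_2) &= \text{if } a_1 = b_1 \text{ then } a_2 \oplus^{\downarrow} b_2 && (3) \\
&\phantom{=}\ \text{else if } a_1 < b_1 \text{ then } a_1 \oplus^{?} (a_2 \oplus^{\downarrow} (b_1 \mathbin{\bar{\oplus}} b_2)) && (4) \\
&\phantom{=}\ \text{else } b_1 \oplus^{?} ((a_1 \mathbin{\bar{\oplus}} a_2) \oplus^{\downarrow} b_2) && (5) \\
(a_1 \mathbin{\bar{\oplus}} a_2) \oplus^{\downarrow} b &= \text{if } a_1 = b \text{ then } a_2 && (6) \\
&\phantom{=}\ \text{else if } a_1 < b \text{ then } a_1 \oplus^{?} (a_2 \oplus^{\downarrow} b) && (7) \\
&\phantom{=}\ \text{else } b \mathbin{\bar{\oplus}} (a_1 \mathbin{\bar{\oplus}} a_2) && (8) \\
a \oplus^{\downarrow} (b_1 \mathbin{\bar{\oplus}} b_2) &= (b_1 \mathbin{\bar{\oplus}} b_2) \oplus^{\downarrow} a && (9) \\
a \oplus^{\downarrow} b &= \text{if } a = b \text{ then } \mathrm{ZERO} && (10) \\
&\phantom{=}\ \text{else if } a < b \text{ then } a \mathbin{\bar{\oplus}} b \text{ else } b \mathbin{\bar{\oplus}} a && (11)
\end{aligned}$$
Fig. 5. Definition of $\oplus^{\downarrow}$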
We have proved the following facts about reduction: (1) if reduced x then
$(x{\downarrow}) = x$, (2) reduced $(x{\downarrow})$, and (3) $x \to_E (x{\downarrow})$. Using these facts we establish:
Lemma 3.3. For all messages x and y, $x =_E y$ iff (if and only if) $(x{\downarrow}) = (y{\downarrow})$.
Furthermore, if reduced x and reduced y, then $x =_E y$ iff x = y.
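The idea behind Lemma 3.3, deciding equality modulo E by comparing normal forms, can be tried out with a short Python sketch. Note that this is not the recursive definition of ⊕↓ from Fig. 5: it substitutes a simpler flatten, cancel, and sort procedure that also yields one canonical representative per equivalence class under (A), (C), (U), and (N). The tuple encoding and all function names are hypothetical.

```python
from collections import Counter

ZERO = "ZERO"


def xor_leaves(term):
    """Flatten nested ("xor", a, b) terms into a list of normalized non-xor leaves."""
    if isinstance(term, tuple) and term[0] == "xor":
        return xor_leaves(term[1]) + xor_leaves(term[2])
    return [normalize(term)]


def normalize(term):
    """Canonical representative of `term` modulo the XOR equations (A), (C), (U), (N)."""
    if not isinstance(term, tuple):
        return term  # atoms such as nonces, agent names, or ZERO
    if term[0] != "xor":
        # Other constructors: normalize the arguments, keep the constructor.
        return (term[0],) + tuple(normalize(arg) for arg in term[1:])
    counts = Counter(xor_leaves(term))
    # (N): leaves occurring an even number of times cancel; (U): ZERO is dropped.
    odd = sorted((leaf for leaf, n in counts.items() if n % 2 == 1 and leaf != ZERO),
                 key=repr)  # (A), (C): fix one order and one bracketing
    if not odd:
        return ZERO
    result = odd[-1]
    for leaf in reversed(odd[:-1]):
        result = ("xor", leaf, result)
    return result


def equal_mod_e(x, y):
    """Equality modulo E, decided by comparing normal forms."""
    return normalize(x) == normalize(y)
```

For example, `normalize(("xor", "a", ("xor", "b", "a")))` yields `"b"`, and two differently bracketed sums of the same atoms normalize to the same right-nested term.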
The Message Type, Parts, and dm. Given the above lemma, we use the
function ↓ and the predicate reduced to characterize $=_E$. Isabelle’s typedef mechanism
allows us to define the quotient type msg with {m | reduced m} as the
representing set. This defines a new type msg with a bijection between the representing
set in fmsg and msg given by the function Abs_msg : fmsg → msg
and its inverse Rep_msg : msg → fmsg. Note that $=_E$ on fmsg corresponds to
object-logic equality on msg. This is reflected in the following lemma.
Lemma 3.4. For all messages x and y, x = y iff Rep_msg(x) $=_E$ Rep_msg(y).
We define functions on msg by using the corresponding definitions on fmsg and
the embedding and projection functions. That is, we lift the message constructors
to msg using the ↓ function. For example:
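In the following, we write 0 for Zero and x ⊕ y for Xor x y. We define a function
fparts on fmsg that returns all extractable subterms of a given message, e.g.
m ∈ fparts({CRYPT k m}), but m ∉ fparts({HASH m}).
$$\frac{m \in M}{m \in \mathit{dm}_A(M)}\ \text{inj} \qquad \frac{}{0 \in \mathit{dm}_A(M)}\ \text{zero} \qquad \frac{m \in \mathit{dm}_A(M)}{\mathit{Hash}\ m \in \mathit{dm}_A(M)}\ \text{hash}$$
$$\frac{\langle m, n \rangle \in \mathit{dm}_A(M)}{m \in \mathit{dm}_A(M)}\ \text{fst} \qquad \frac{\langle m, n \rangle \in \mathit{dm}_A(M)}{n \in \mathit{dm}_A(M)}\ \text{snd}$$
$$\frac{m \in \mathit{dm}_A(M) \quad n \in \mathit{dm}_A(M)}{\langle m, n \rangle \in \mathit{dm}_A(M)}\ \text{pair} \qquad \frac{m \in \mathit{dm}_A(M) \quad n \in \mathit{dm}_A(M)}{m \oplus n \in \mathit{dm}_A(M)}\ \text{xor}$$
$$\frac{m \in \mathit{dm}_A(M) \quad \mathit{Key}\ k \in \mathit{dm}_A(M)}{\{m\}_k \in \mathit{dm}_A(M)}\ \text{enc} \qquad \frac{}{\mathit{Nonce}\ A\ n \in \mathit{dm}_A(M)}\ \text{nonce}$$
$$\frac{}{\mathit{Int}\ n \in \mathit{dm}_A(M)}\ \text{int} \qquad \frac{}{\mathit{Real}\ n \in \mathit{dm}_A(M)}\ \text{real}$$
The function parts on msg that is used to instantiate the function of the same
name in the message-theory locale is then defined as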
4 Protocol Correctness
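$$\frac{(t^R_P, \mathit{Recv}\ Rx_P^r\ NV) \in tr \qquad \mathit{Nonce}\ P\ NP \notin \mathit{used}(tr)}{(\mathit{SendA}\ r\ [NV, \mathit{Nonce}\ P\ NP],\ F(NV, \mathit{Nonce}\ P\ NP, \mathit{Agent}\ P)) \in \mathit{mdb}_2(P, tr, t)}$$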
Authentication: After a prover P has answered a verifier’s challenge with a
rapid-response, he authenticates the response with the corresponding MAC.
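$$\frac{\begin{array}{c}(t^R_P, \mathit{Recv}\ Rx_P^r\ NV) \in tr \\ (t^S_P, \mathit{Send}\ Tx_P^r\ F(NV, \mathit{Nonce}\ P\ NP, \mathit{Agent}\ P)\ [NV, \mathit{Nonce}\ P\ NP]) \in tr\end{array}}{(\mathit{SendA}\ r\ [\,],\ \mathit{MACM}_{K_{VP}}(NV, \mathit{Nonce}\ P\ NP, \mathit{Agent}\ P)) \in \mathit{mdb}_3(P, tr, t)}$$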
Claim: Suppose the verifier receives a rapid-response in the measurement phase
at time $t^R_1$ and the corresponding MAC in the validation phase, both involving
the nonce that he initially sent at time $t^S_1$. The verifier therefore concludes that
$(t^R_1 - t^S_1) * c/2$ is an upper bound on the distance to the prover P, where c denotes
2 We have formalized each step in Isabelle/HOL using set comprehension, but present
the steps here as rules for readability. For each rule r, the set we define by comprehension
is equivalent to the set defined inductively by the rule r.
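The bound computed in the Claim step is simple round-trip arithmetic, which the following Python sketch spells out. It assumes that c is the speed of light in metres per second, which this excerpt does not state explicitly, and the function name is hypothetical. Both time-stamps are readings of the verifier's own clock, so only their difference matters.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second; assumed value of c


def distance_upper_bound(t_send, t_recv, c=SPEED_OF_LIGHT):
    """Upper bound (t_recv - t_send) * c / 2 on the verifier-prover distance.

    The signal travels to the prover and back, hence the division by two.
    Any processing delay at the prover only makes the bound more conservative.
    """
    if t_recv < t_send:
        raise ValueError("the response cannot arrive before the challenge was sent")
    return (t_recv - t_send) * c / 2


# A round trip of two microseconds bounds the distance by roughly 300 metres.
bound = distance_upper_bound(0.0, 2e-6)
```

A round-trip time of 2 µs therefore yields a bound of about 299.8 m, which illustrates why nanosecond-level timing matters in the measurement phase.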
The set of traces Tr is inductively defined by the rules Nil, Fake, Net, and
Proto. Note that the same set of traces can be inductively defined by the Nil,
Fake, and Net rules along with rules describing the individual protocol steps.
See [6] for more details on these different representations.
Definition 4.1. A distance bounding protocol is secure for honest provers (hp-secure)
iff whenever $\mathit{Claim}\ V\ \langle P, \mathit{Real}\ d \rangle \in tr$, then $d \ge |\mathit{loc}_V - \mathit{loc}_P|$.
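The condition in Definition 4.1 can be checked on concrete data with a few lines of Python. The sketch below only tests a finite list of claims against known locations, whereas the definition quantifies over all valid traces, so it illustrates the property rather than establishing it. The claim encoding `(verifier, prover, d)` and the function names are hypothetical.

```python
import math


def claim_holds(loc, verifier, prover, d):
    """True if d is an upper bound on the Euclidean distance between the two agents."""
    return d >= math.dist(loc[verifier], loc[prover])


def claims_are_sound(claims, loc):
    """Check every (verifier, prover, d) claim against the agents' locations."""
    return all(claim_holds(loc, v, p, d) for (v, p, d) in claims)


loc = {"V": (0.0, 0.0, 0.0), "P": (3.0, 4.0, 0.0)}  # distance between V and P is 5
sound_claims = [("V", "P", 5.0), ("V", "P", 7.5)]
unsound_claims = [("V", "P", 4.9)]
```

A claim of 4.9 is rejected because the prover is actually 5 units away, which is the kind of outcome an attack on a distance bounding protocol tries to cause.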
Overall, proving the properties (P0)–(P4) for a given function and applying
Theorems 4.4 and 4.5 is much simpler than the corresponding direct proofs.
However, finding the correct properties and proving these theorems for the XOR
message theory turned out to be considerably harder than proofs for comparable
theorems about a fixed protocol in the free message theory. This additional
complexity mainly stems from the untyped protocol formalization necessary to
realistically model the XOR operator.
References
10. Perrig, A., Tygar, J.D.: Secure Broadcast Communication in Wired and Wireless
Networks. Kluwer Academic Publishers, Norwell (2002)
11. Ballarin, C.: Interpretation of locales in Isabelle: Theories and proof contexts. In:
Borwein, J.M., Farmer, W.M. (eds.) MKM 2006. LNCS (LNAI), vol. 4108, pp.
31–43. Springer, Heidelberg (2006)
12. Porter, B.: Cauchy’s mean theorem and the Cauchy-Schwarz inequality. The Archive
of Formal Proofs, Formal proof development (March 2006)
13. Clulow, J., Hancke, G.P., Kuhn, M.G., Moore, T.: So near and yet so far: Distance-
bounding attacks in wireless networks. In: Buttyán, L., Gligor, V.D., Westhoff, D.
(eds.) ESAS 2006. LNCS, vol. 4357, pp. 83–97. Springer, Heidelberg (2006)
14. Schmidt, B., Schaller, P.: Isabelle Theory Files: Modeling and Verifying Physical
Properties of Security Protocols for Wireless Networks,
http://people.inf.ethz.ch/benschmi/ProtoVeriPhy/
15. Delzanno, G., Ganty, P.: Automatic Verification of Time Sensitive Cryptographic
Protocols. In: Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp.
342–356. Springer, Heidelberg (2004)
16. Evans, N., Schneider, S.: Analysing Time Dependent Security Properties in CSP
Using PVS. In: Cuppens, F., Deswarte, Y., Gollmann, D., Waidner, M. (eds.)
ESORICS 2000. LNCS, vol. 1895, pp. 222–237. Springer, Heidelberg (2000)
17. Ács, G., Buttyán, L., Vajda, I.: Provably Secure On-Demand Source Routing in
Mobile Ad Hoc Networks. IEEE Transactions on Mobile Computing 5(11), 1533–
1546 (2006)
18. Yang, S., Baras, J.S.: Modeling vulnerabilities of ad hoc routing protocols. In:
SASN 2003: Proceedings of the 1st ACM Workshop on Security of Ad Hoc and
Sensor Networks, pp. 12–20. ACM, New York (2003)
19. Courant, J., Monin, J.: Defending the bank with a proof assistant. In: Proceedings
of the 6th International Workshop on Issues in the Theory of Security (WITS
2006), pp. 87–98 (2006)
20. Paulson, L.: Defining functions on equivalence classes. ACM Transactions on Com-
putational Logic 7(4), 658–675 (2006)
21. Basin, D., Constable, R.: Metalogical frameworks. In: Huet, G., Plotkin, G. (eds.)
Logical Environments, pp. 1–29. Cambridge University Press, Cambridge (1993);
Also available as Technical Report MPI-I-92-205
22. Basin, D., Matthews, S.: Logical frameworks. In: Gabbay, D., Guenthner, F. (eds.)
Handbook of Philosophical Logic, 2nd edn., vol. 9, pp. 89–164. Kluwer Academic
Publishers, Dordrecht (2002)
23. Basin, D., Matthews, S.: Structuring metatheory on inductive definitions. Infor-
mation and Computation 162(1–2) (October/November 2000)
24. Nipkow, T.: Reflecting quantifier elimination for linear arithmetic. Formal Logical
Methods for System Security and Correctness, 245 (2008)
VCC: A Practical System
for Verifying Concurrent C
1 Introduction
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 23–42, 2009.
Weak Typing. Almost all critical system software today is written in C (or
C++). C has only a weak, easily circumvented type system and explicit memory
(de)allocation, so memory safety has to be explicitly verified. Moreover, address
arithmetic enables many nasty programming tricks that are absent from typesafe
code.
Still, most code in a well-written C system adheres to a much stricter type
discipline. The VCC memory model [2] leverages this by maintaining in ghost
memory a typestate that tracks where the “valid” typed memory objects are.
On each memory reference and pointer dereference, there is an implicit assertion
that the resulting object is in the typestate. System invariants guarantee that valid
objects do not overlap in any state, so valid objects behave like objects in a
modern (typesafe) OO system. Well-behaved programs incur little additional
annotation overhead, but nasty code (e.g., inside of the memory allocator) may
require explicit manipulation of the typestate.
While C is flexible enough to be used in a very low-level way, we still want
program annotations to take advantage of the meaningful structure provided by
well-written code. Because C structs are commonly used to group semantically
related data, we use them by default like objects in OO verification methodolo-
gies (e.g., as the container of invariants). Users can introduce additional (ghost)
levels of structure to reflect additional semantic structure.
using shadow page tables (SPTs). The SPTs, along with the hardware transla-
tion lookaside buffers (TLBs) (which asynchronously gather and cache virtual
to physical address translations), implement a virtual TLB. This simulation is
subtle for two reasons. First, the hardware TLB is architecturally visible, be-
cause (1) translations are not automatically flushed in response to edits to page
tables stored in memory, and (2) translations are gathered asynchronously and
nonatomically (requiring multiple reads and writes to traverse the page tables),
creating races with system code that operates on the page tables. Even the se-
mantics of TLBs are subtle, and the hypervisor verification required constructing
the first accurate formal models of the x86/x64 TLBs. Second, the TLB simula-
tion is the most important factor in system performance; simple SPT algorithms,
even with substantial optimization, can introduce virtualization overheads of
50% or more for some workloads. The hypervisor therefore uses a very large and
complex SPT algorithm, with dozens of tricky optimizations, many of which
leverage the freedoms allowed by the weak TLB semantics.
3 VCC Methodology
VCC extends C with annotations giving function pre- and post-conditions, asser-
tions, type invariants, and ghost code. Many of these annotations are similar to
those found in ESC/Java [8], Spec# [9], or HAVOC [10]. With contracts in place,
VCC performs a static modular analysis, in which each function is verified in iso-
lation, using only the contracts of functions that it calls and invariants of types
used in its code. But unlike the aforementioned systems, VCC is geared towards
sound verification of functional properties of low-level concurrent C code.
We illustrate VCC by specifying hypervisor partitions, the data structure
that holds the state needed to execute a guest operating system. Listing 1 shows a much
simplified but annotated definition of the data structure. (The actual struct has
98 fields.)
typedef enum { Undefined, Initialized, Active, Terminating } LifeState;
Type Invariants. Type definitions can have type invariants, which are one- or
two-state predicates on data. Other specifications can refer to the invariant of
object o as inv(o) (or inv2(o)). VCC implicitly uses invariants at various
locations, as will be explained in the following subsections.
The invariant of the Partition struct of Listing 1 says that lifeState must
be one of the valid ones defined for a partition, and that if the signaled bit is
set, lifeState is Active.
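The invariant just described can be mirrored as an ordinary run-time predicate. The following is a minimal sketch, not VCC syntax: the struct PartitionModel and the function partition_inv are hypothetical names, and the reading that every state except Undefined is valid for a partition is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { Undefined, Initialized, Active, Terminating } LifeState;

/* Hypothetical, much reduced stand-in for the Partition struct. */
typedef struct {
    bool signaled;
    LifeState lifeState;
} PartitionModel;

/* Run-time rendering of the invariant described in the text.
   Assumption: every state except Undefined counts as valid. */
bool partition_inv(const PartitionModel *p) {
    assert(p != NULL);
    bool valid_state = p->lifeState == Initialized ||
                       p->lifeState == Active ||
                       p->lifeState == Terminating;
    /* signaled ==> lifeState == Active, written as !signaled || ... */
    bool signaled_ok = !p->signaled || p->lifeState == Active;
    return valid_state && signaled_ok;
}
```

VCC establishes such a predicate statically for all closed objects; the run-time check above only illustrates its logical content.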
Ghosts. A crucial concept in the VCC specification language is the division into
operational code and ghost code. Ghost code is seen only by the static veri-
fier, not the regular compiler. Ghost code comes in various forms: Ghost type
definitions are types, which can either be regular C types, or special types for
verification purposes like maps and claims (see Sect. 3.2). Ghost fields of ar-
bitrary type can be introduced as specially marked fields in operational types.
These fields do not interfere with the size and ordering of the operational fields.
Likewise, static or automatic ghost variables of arbitrary type are supported.
Like ghost fields, they are marked special and do not interfere with operational
variables. Ghost parameters of arbitrary type can pass additional ghost state
information in and out of the called function. Ghost state updates perform op-
erations on only the ghost memory state. Any flow of data from the ghost state
to the operational state of the software is forbidden.
One application of ghost code is maintaining shadow copies of implementation
data of the operational software. Shadow copies usually introduce abstractions,
e.g., representing a list as a set. They are also introduced to allow for atomic
update of the shadow, even if the underlying data structure cannot be updated
atomically. The atomic updates are required to enforce protocols on the overall
system using two-state invariants.
[Figure 1: diagram of object meta-states, with a Sequential and a Concurrent part. Labels recoverable from the original: nested, closed(o), owner(o), ref_cnt(o)==0, ref_cnt(o)==1; transitions claim and unclaim.]
structures. The edges of the trees indicate ownership, that is, an aggregate / sub-
object relation. The roots of trees in the ownership forest are objects representing
threads of execution. The set of objects directly or transitively owned by an
object is called the ownership domain of that object.
We couple ownership and type invariants. Intuitively, a type invariant can
depend only on state in its ownership domain. We later relax this notion. Of
course ownership relationships change over time and type invariants cannot al-
ways hold. We thus track the status for each object o in meta-state: owner(o)
denotes the owner of an object, owns(o) specifies the set of objects owned by o
(the methodology ensures that owner() and owns() stay in sync), closed(o)
guarantees that o’s invariant holds.
Figure 1 shows the possible meta-states of an object (the Concurrent part
of this figure will be explained in Sect. 3.2).
Ghost operations such as wrap and unwrap update the state as depicted in Fig. 1;
note that unwrapping an object moves its owned objects from nested to wrapped, and
wrapping the object moves them back.
The function part_send_signal() from Listing 1 respects this meta-state
protocol. The function precondition requires part to be wrapped, i.e., part’s in-
variant holds. The function body first unwraps part, which suspends its
30 E. Cohen et al.
invariant; next, its fields are written. To establish the postcondition, part
is wrapped again; at this point, VCC checks that all invariants of part hold.
The write clauses work accordingly: write access to the root of an owner-
ship domain enables writing to the entire ownership domain. In our example,
writes(part) gives the part_send_signal function write access to all fields
of part (and the objects part owns), and tells the caller that state updates are
confined to the ownership domain of part. Additionally, one can always write
to objects that are fresh.
spec(claim_t db_claim;)
invariant(keeps(db_claim) && claims_obj(db_claim, db))
} Partition;
Listing 4. Admissibility, volatile fields, shadow fields
Admissibility. A state transition is legal iff, for every object o that is closed in the
transition’s prestate or poststate, if any field of o is updated (including the “field”
indicating closedness) the two-state invariant of o is preserved. An invariant of
an object o is admissible iff it is satisfied by every legal state transition. Stated
differently, an invariant is admissible if it is preserved by every transition that
preserves invariants of all modified objects. Note that admissibility depends only
on type definitions (not function specs or code), and is monotonic (i.e., if an
invariant has been proved admissible, the addition of further types or invariants
cannot make it inadmissible). VCC checks that all invariants are admissible.
Thus, when checking that a state update doesn’t break any invariant, VCC has
to check only the invariants of the updated objects.
Some forms of invariants are trivially admissible. In particular, an invari-
ant in object o that mentions only fields of o is admissible. This applies to
idx < MAXPART. For db->partitions[idx]==this, let us assume that db->
partitions[idx] changes across a transition (other components of that ex-
pression could not change). We know db->partitions[idx] was this in the
prestate. Assume for a moment that we know db was closed in both the prestate
and the poststate. Then we know that db->partitions[idx] was unchanged, or it was
NULL in the prestate (but this != NULL), or this was open in the poststate:
all three cases are contradictory. So if we knew that db stays closed, the
invariant would be admissible.
Claims. The required knowledge is provided by the claim object, owned by the
partition and stored in the ghost field db_claim. A claim, as it is used here, can
be thought of as a handle that keeps its claimed object from opening. If an object
o has a type that is marked with vcc(claimable), the field ref_cnt(o) tracks
the number of outstanding claims on o. An object cannot be unwrapped
if this count is positive, and a claim can only be created when the object is closed.
Thus, when a claim to an object exists, the object is known to be closed.4
4 Claims can actually be implemented using admissible two-state invariants. We
decided to build them into the annotation language for convenience.
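The counting discipline behind claims can be illustrated with a small run-time model. This is a sketch under stated assumptions: VCC tracks this meta-state statically in ghost code, and the names ObjModel, take_claim, drop_claim, unwrap_obj and claim_implies_closed are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical run-time model of the meta-state VCC tracks statically. */
typedef struct {
    bool closed;       /* models closed(o) */
    unsigned ref_cnt;  /* models ref_cnt(o): outstanding claims on o */
} ObjModel;

/* A claim can only be created while the object is closed. */
bool take_claim(ObjModel *o) {
    assert(o != NULL);
    if (!o->closed) return false;
    o->ref_cnt++;
    return true;
}

/* Giving up a claim decrements the count; fails if none is outstanding. */
bool drop_claim(ObjModel *o) {
    assert(o != NULL);
    if (o->ref_cnt == 0) return false;
    o->ref_cnt--;
    return true;
}

/* Unwrapping is refused while any claim is outstanding. */
bool unwrap_obj(ObjModel *o) {
    assert(o != NULL);
    if (!o->closed || o->ref_cnt > 0) return false;
    o->closed = false;
    return true;
}

/* The property claims rely on: a positive count implies closedness. */
bool claim_implies_closed(const ObjModel *o) {
    assert(o != NULL);
    return o->ref_cnt == 0 || o->closed;
}
```

Every operation in the model preserves claim_implies_closed, which is the reason a claim holder may assume that the claimed object is closed.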
VCC: A Practical System for Verifying Concurrent C 33
bv_lemma(forall(int i, j; uint64_t v; 0 <= i && i < 64 && 0 <= j && j < 64 ==>
i != j ==> (ISSET(j, v) <==> ISSET(j, v | (1ULL << i)))));
atomic(part, db, c) {
speconly(part->signaled = true;)
InterlockedBitSet(&db->allSignaled, idx);
}
}
Listing 5. Atomic operation
Atomic Blocks. Listing 5 shows how objects can be concurrently updated. The
signaling function now only needs a claim to the partition, passed as a ghost
parameter, and does not need to list the partition in its writes clause. In fact,
the writes clause of the signaling function is empty, reflecting the fact that from
the caller perspective, the actions could have been performed by another thread,
without the current thread calling any function. A thread can read its own non-
volatile state; it can also read non-volatile fields of closed objects, in particular
objects for which it holds claims. On the other hand, the volatile fields can
only be read and written inside of atomic blocks. Such a block identifies the
objects that will be read or written, as well as claims that are needed to establish
closedness of those objects. It can contain at most one physical state update or
read, which is assumed to be performed atomically by the underlying hardware.
In our example, we set the idx-th bit of the allSignaled field using a dedicated
CPU instruction (it also returns the old value, but we ignore it). On top of that,
the atomic block can perform any number of updates of the ghost state. Both
physical and ghost updates can only be performed on objects listed in the header
of the atomic block. The resulting state transition is checked for legality, i.e., we
check the two-state invariants of updated objects across the atomic block. The
beginning of the atomic block is the only place where we simulate actions of
other threads; technically this is done by forgetting everything we used to know
about volatile state. The only other possible state updates are performed on
mutable (and thus open) objects and thus are automatically legal.
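The single physical update inside such an atomic block can be sketched in portable C11. The function bit_test_and_set below is a hypothetical stand-in for the InterlockedBitSet intrinsic used in the listing, and the 64-bit width of the word is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical portable stand-in for an interlocked bit-set intrinsic:
   atomically sets bit idx of *word and reports whether it was set before. */
bool bit_test_and_set(_Atomic uint64_t *word, unsigned idx) {
    assert(idx < 64);
    uint64_t mask = UINT64_C(1) << idx;
    /* One atomic read-modify-write; all other bits are left unchanged. */
    uint64_t old = atomic_fetch_or(word, mask);
    return (old & mask) != 0;
}
```

The returned old value corresponds to the result that the example in the text ignores.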
In VCC, concurrency primitives (other than atomic operations) are verified (or
just specified), rather than being built in. As an example we present the acqui-
sition of a reader-writer lock in exclusive (i.e., writing) mode.5 In this example,
claims are used to capture not only closedness of objects but also properties of
their fields.
The data structure LOCK (cf. Listing 6) contains a single volatile implemen-
tation variable called state. Its most significant bit holds the write flag that is
set when a client requests exclusive access. The remaining bits hold the number
of readers. Both values can be updated atomically using interlocked operations.
Acquiring a lock in exclusive mode proceeds in two phases. First, we spin on
setting the write bit of the lock atomically. After the write bit has been set, no
new shared locks may be taken. Second, we spin until the number of readers
reaches zero. This protocol is formalized using lock ghost fields and invariants.
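The two phases can be sketched with C11 atomics, leaving out all ghost state. This is an illustration under stated assumptions rather than the verified implementation: the state word is assumed to be 32 bits wide, and the helper names write_flag, reader_count, try_set_write_bit, readers_drained and acquire_exclusive are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WRITE_BIT UINT32_C(0x80000000)  /* most significant bit: write flag */

/* Hypothetical helpers that decode the two parts of the state word. */
bool write_flag(uint32_t state) { return (state & WRITE_BIT) != 0; }
uint32_t reader_count(uint32_t state) { return state & ~WRITE_BIT; }

/* Phase 1, one attempt: set the write bit; succeed only if it was clear. */
bool try_set_write_bit(_Atomic uint32_t *state) {
    assert(state != NULL);
    uint32_t old = atomic_fetch_or(state, WRITE_BIT);
    return !write_flag(old);
}

/* Phase 2, one attempt: readers have drained once the count reads zero. */
bool readers_drained(_Atomic uint32_t *state) {
    assert(state != NULL);
    return reader_count(atomic_load(state)) == 0;
}

/* Both phases as spin loops; returns once exclusive access is obtained. */
void acquire_exclusive(_Atomic uint32_t *state) {
    while (!try_set_write_bit(state)) { /* another writer holds the bit */ }
    while (!readers_drained(state)) { /* wait for readers to leave */ }
}
```

Splitting each phase into a single-attempt function keeps every step at one atomic access, which matches the restriction that an atomic block contains at most one physical update or read.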
5 For details and full annotated source code see [11].
[Figure 3: relation between the lock's implementation and ghost variables. Rows recoverable from the original: initialized, Write(state), writing, Readers(state), ref_cnt(self_claim).]
The lock contains four ghost variables: a pointer protected_obj identifying
the object protected by the lock, a flag initialized that is set after
initialization, a flag writing that is one when exclusive access has been granted
(and no reader holds the lock), and a claim self_claim. The use of self_claim
is twofold. First, we tie its reference count to the implementation variables of
the lock. This allows restricting changes of these variables by maintaining claims
on self_claim. Second, it is used to claim lock properties, serving as a proxy
between the claimant and the lock. For this purpose it claims the lock and is
owned by it. It thus becomes writable and claimable in atomic operations on the
lock without requiring it or the lock to be listed in function writes clauses.
Figures 2 and 3 contain a graphical representation of the lock invariants.
Figure 2 shows the setup of ownership and claims. The lock access claim is cre-
ated after initialization. It ensures that the lock remains initialized and allocated,
and clients use it (or a derived claim) when calling lock functions. During non-
exclusive access each reader holds a read access claim on the protected object
and the lock, and the lock owns the protected object, as indicated in gray. Dur-
ing exclusive access the protected object is owned by the client and there may be
no readers. Figure 3 depicts the dynamic relation between implementation and
ghost variables. As long as the write bit is zero, shared locks may be acquired
and released, as indicated by the number of readers. The write bit is set when
the acquisition of an exclusive lock starts. In this phase the number of readers
must decrease. When it reaches zero, exclusive lock acquisition can complete by
activating the writing flag. For each reader and each request for write access
(which is at most one) there is a reference on self_claim.
Listing 7 shows the annotated code for acquisition of an exclusive lock. The
macro claimp wrapped around the parameter lock_access_claim means that
lock_access_claim is a ghost pointer to a wrapped claim; the always clause
says that this claim is wrapped, is not destroyed by the function, and that its
invariant implies that the lock is closed and initialized (and hence, will remain
so during the function call). After the function returns it guarantees that the
protected object is unreferenced, wrapped, and fresh (and thus, writable).
  do
    atomic (lock, lock_access_claim) {
      done = !Write(InterlockedOr(&lock->state, 0x80000000));
      speconly(if (done) {
        write_bit_claim = claim(lock->self_claim, lock->initialized &&
          stays_unchanged(lock->self_claim) && Write(lock->state));
      })
    }
  while (!done);
  do
    invariant(wrapped0(write_bit_claim))
    atomic (lock, write_bit_claim) {
      done = Readers(lock->state)==0;
      speconly(if (done) {
        giveup_closed_owner(lock->protected_obj, lock);
        unclaim(write_bit_claim, lock->self_claim);
        lock->writing = 1;
      })
    }
  while (!done);
}
In the first loop of the implementation we spin until the write bit can be
set atomically (via the InterlockedOr intrinsic), i.e., until, within one atomic block, the
write bit has been seen as zero and then set to one. In the terminating iteration of the loop
we create a temporary claim write_bit_claim, which references the self claim
and states that the lock stays initialized, that the self claim stays, and that the
write bit of the lock has been set. VCC checks that the claimed property holds
initially and is stable against interference. The former is true by the passed-in
lock access claim and the state seen and updated in the atomic operation; the
latter is true because as long as there remains a reference to the self claim,
the writing flag cannot be activated and the write bit cannot be reset. Also,
the atomic update satisfies the lock invariant.
The second loop waits for the readers to disappear. If the number of readers
has been seen as zero, we remove the protected object from the ownership of
the lock, discard the temporary claim, and set the writing flag. All of this
can be justified by the claimed property and the lock’s invariant. Setting the
writing flag is allowed because the write bit is known to be set. Furthermore, the
writing flag is known to be zero in the pre-state of the atomic operation because
the reference count of the self claim, which is referenced by write_bit_claim,
cannot be zero. This justifies the remaining operations.
VCC reuses the Spec# tool chain [9], which has allowed developing a compre-
hensive C verifier with limited effort. In addition we developed auxiliary tools
to support the process of verification engineering in a real-world effort.
CCI. The VCC compiler is built using Microsoft Research’s Common Compiler
Infrastructure (CCI) libraries [12]. VCC reads annotated C and turns the input
into CCI’s internal representation to perform name resolution, type checking, and error
checking, as any normal C compiler would.
Source Transformations and Plugins. Next, the fully resolved input program un-
dergoes multiple source-to-source transformations. These transformations first
simplify the source, and then add proof obligations stemming from the method-
ology. The last transformation generates the Boogie source.
VCC provides a plugin interface, where users can insert and remove transformations,
including the final translation. Currently two plugins have been
implemented: one to generate contracts for assembly functions from their C
counterparts, and one to build a new methodology based on separation logic [13].
Boogie. Once the source code has been analyzed and found to be valid, it is
translated into a Boogie program that encodes the input program according to
our formalization of C. Boogie [14] is an intermediate language that is used by a
number of software verification tools including Spec# and HAVOC. Boogie adds
minimal imperative control flow, procedural and functional abstractions, and
types on top of first-order predicate logic. The translation from annotated C to
Boogie encodes both static information about the input program, like types and
their invariants, and dynamic information like the control flow of the program
and the corresponding state updates. Additionally, a fixed axiomatization of C
memory, object ownership, type state, and arithmetic operations (the prelude)
is added. The resulting program is fed to the Boogie program verifier, which
translates it into a sequence of verification conditions. Usually, these are then
passed to an automated theorem prover to be proved or refuted. Alternatively,
they can be discharged interactively. The HOL-Boogie tool [15] provides support
for this approach based on the Isabelle interactive theorem prover.
Z3. Our use of Boogie targets Z3 [16], a state-of-the-art first-order theorem
prover that supports satisfiability modulo theories (SMT). VCC makes heavy use
of Z3’s fast decision procedures for linear arithmetic and uses the slower fixed-
length bit vector arithmetic only when explicitly invoked by VCC’s bv_lemma()
mechanism (see Listing 5 for an example). These lemmas are typically used when
reasoning about overflowing arithmetic or bitwise operations.
Z3 Axiom Profiler. In the former case a closer inspection of the quantifier
instantiation pattern can help to determine inefficiencies in the underlying
axiomatization of C or the program annotations. This is facilitated by the Z3 Axiom
Profiler, which makes it possible to analyze the quantifier instantiation patterns to detect,
e.g., triggering cycles.
Visual Studio. All of this functionality is directly accessible from within the Vi-
sual Studio IDE, including verifying only individual functions. We have found
that this combination of tools enables the verification engineer to efficiently de-
velop and debug the annotations required to prove correctness of the scrutinized
codebase.
5 VCC Experience
The methodology presented in this paper was implemented in VCC in late 2008.
Since this methodology differs significantly from earlier approaches, the annota-
tion of the hypervisor codebase had to start from scratch. As of June 2009, four-
teen verification engineers are working on annotating the codebase and verifying
functions. Since November 2008 approx. 13 500 lines of annotations have been
added to the hypervisor codebase. About 350 functions have been successfully
verified resulting in an average of two verified functions per day. Additionally,
invariants for most public and private data types (consisting of about 150 structures
or groups) have been specified and proved admissible. This means that
currently about 20% of the hypervisor codebase has been successfully verified
using our methodology.
A major milestone in the verification effort is having the specifications of
all public functions from all layers, so that the verification of the different layers
requires no interaction between the verification engineers, since all required information
has been captured in the contracts and invariants. This milestone has been
reached or will be reached soon for seven modules. Also, for three modules,
more than 50% of the functions have already been successfully verified.
We have found that having acceptable turnaround times for verify-and-fix
cycles is crucial to maintain productivity of the verification engineers. Currently
VCC verifies most functions in 0.5 to 500 seconds with an average of about 25
seconds. The longest running function needs ca. 2 000 seconds to be verified.
The all-time high was around 50 000 seconds for a successful proof attempt.
In general failing proof attempts tend to take longer than successfully verifying a
function. A dedicated test suite has been created to constantly monitor verifica-
tion performance. Performance has improved by one to two orders of magnitude.
Many changes have contributed to these improvements, ranging from changes in
our methodology, the encoding of type state, our approach to invariant checking,
the support of out parameters, to updates in the underlying tools Boogie and
Z3. With these changes, we have, for example, reduced the verification time for
the 50 000s function down to under 1 000s.
Still, in many cases the verification performance is unacceptable. Empirically,
we have found that verification times of over a minute start having an impact
on the productivity of the verification engineer, and that functions that require
one hour or longer are essentially intractable. We are currently working on many
levels to alleviate these problems: improvements in the methodology, grid-style
distribution of verification condition checking, parallelization of proof search for a
single verification condition, and other improvements of SMT solver technology.
6 Related Work
References
1. Verisoft XT: The Verisoft XT project (2007), http://www.verisoftxt.de
2. Cohen, E., Moskal, M., Schulte, W., Tobies, S.: A precise yet efficient memory
model for C. In: SSV 2009. ENTCS. Elsevier Science B.V., Amsterdam (2009)
3. Flanagan, C., Freund, S.N., Qadeer, S.: Thread-modular verification for shared-
memory programs. In: Le Métayer, D. (ed.) ESOP 2002. LNCS, vol. 2305, pp.
262–277. Springer, Heidelberg (2002)
4. Jacobs, B., Piessens, F., Leino, K.R.M., Schulte, W.: Safe concurrency for aggregate
objects with invariants. In: Aichernig, B.K., Beckert, B. (eds.) SEFM 2005, pp.
137–147. IEEE, Los Alamitos (2005)
5. Maus, S., Moskal, M., Schulte, W.: Vx86: x86 assembler simulated in C powered
by automated theorem proving. In: Meseguer, J., Roşu, G. (eds.) AMAST 2008.
LNCS, vol. 5140, pp. 284–298. Springer, Heidelberg (2008)
6. Advanced Micro Devices (AMD), Inc.: AMD64 Architecture Programmer’s Man-
ual: Vol. 1-3 (2006)
7. Intel Corporation: Intel 64 and IA-32 Architectures Software Developer’s Manual:
Vol. 1-3b (2006)
8. Flanagan, C., Leino, K.R.M., Lillibridge, M., Nelson, G., Saxe, J.B., Stata, R.:
Extended static checking for Java. SIGPLAN Notices 37(5), 234–245 (2002)
9. Barnett, M., Leino, K.R.M., Schulte, W.: The Spec# programming system: An
overview. In: Barthe, G., Burdy, L., Huisman, M., Lanet, J.-L., Muntean, T. (eds.)
CASSIS 2004. LNCS, vol. 3362, pp. 49–69. Springer, Heidelberg (2005)
10. Microsoft Research: The HAVOC property checker,
http://research.microsoft.com/projects/havoc
11. Hillebrand, M.A., Leinenbach, D.C.: Formal verification of a reader-writer lock
implementation in C. In: SSV 2009. ENTCS, Elsevier Science B.V., Amsterdam
(2009); Source code, http://www.verisoftxt.de/PublicationPage.html
12. Microsoft Research: Common compiler infrastructure,
http://ccimetadata.codeplex.com/
13. Botinčan, M., Parkinson, M., Schulte, W.: Separation logic verification of C programs
with an SMT solver. In: SSV 2009. ENTCS. Elsevier Science B.V., Amsterdam
(2009)
14. Barnett, M., Chang, B.Y.E., Deline, R., Jacobs, B., Leino, K.R.M.: Boogie: A mod-
ular reusable verifier for object-oriented programs. In: de Boer, F.S., Bonsangue,
M.M., Graf, S., de Roever, W.-P. (eds.) FMCO 2005. LNCS, vol. 4111, pp. 364–387.
Springer, Heidelberg (2006)
15. Böhme, S., Moskal, M., Schulte, W., Wolff, B.: HOL-Boogie: An interactive
prover-backend for the Verifying C Compiler. Journal of Automated Reasoning
(to appear, 2009)
16. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg
(2008)
17. Owicki, S., Gries, D.: Verifying properties of parallel programs: An axiomatic ap-
proach. Communications of the ACM 19(5), 279–285 (1976)
18. Ashcroft, E.A.: Proving assertions about parallel programs. Journal of Computer
and System Sciences 10(1), 110–135 (1975)
19. Jones, C.B.: Tentative steps toward a development method for interfering pro-
grams. ACM Transactions on Programming Languages and Systems 5(4), 596–619
(1983)
20. O’Hearn, P.W.: Resources, concurrency, and local reasoning. Theoretical Computer
Science 375(1-3), 271–307 (2007)
21. Reynolds, J.C.: Separation logic: A logic for shared mutable data structures. In:
LICS 2002, pp. 55–74. IEEE, Los Alamitos (2002)
22. Bornat, R., Calcagno, C., O’Hearn, P.W., Parkinson, M.J.: Permission accounting
in separation logic. In: Palsberg, J., Abadi, M. (eds.) POPL 2005, pp. 259–270.
ACM, New York (2005)
23. Vafeiadis, V., Parkinson, M.J.: A marriage of rely/guarantee and separation logic.
In: Caires, L., Vasconcelos, V.T. (eds.) CONCUR 2007. LNCS, vol. 4703, pp. 256–
271. Springer, Heidelberg (2007)
24. Leino, K.R.M., Müller, P.: A basis for verifying multi-threaded programs. In:
Castagna, G. (ed.) ESOP 2009. LNCS, vol. 5502, pp. 378–393. Springer, Heidelberg
(2009)
25. Leino, K.R.M., Schulte, W.: Using history invariants to verify observers. In: De
Nicola, R. (ed.) ESOP 2007. LNCS, vol. 4421, pp. 80–94. Springer, Heidelberg
(2007)
26. Klein, G.: Operating system verification – An overview. Sādhanā: Academy Pro-
ceedings in Engineering Sciences 34(1), 27–69 (2009)
27. Journal of Automated Reasoning: Operating System Verification 42(2–4) (2009)
28. Hohmuth, M., Tews, H.: The VFiasco approach for a verified operating system. In:
2nd ECOOP Workshop in Programming Languages and Operating Systems (2005)
29. Heiser, G., Elphinstone, K., Kuz, I., Klein, G., Petters, S.M.: Towards trustworthy
computing systems: Taking microkernels to the next level. SIGOPS Oper. Syst.
Rev. 41(4), 3–11 (2007)
30. Ni, Z., Yu, D., Shao, Z.: Using XCAP to certify realistic systems code: Machine
context management. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007. LNCS,
vol. 4732, pp. 189–206. Springer, Heidelberg (2007)
31. Alkassar, E., Hillebrand, M.A., Leinenbach, D.C., Schirmer, N.W., Starostin, A.,
Tsyban, A.: Balancing the load: Leveraging a semantics stack for systems verifica-
tion. Journal of Automated Reasoning: Operating System Verification 27, 389–454
Without Loss of Generality
John Harrison
1 Introduction
Mathematical proofs sometimes state that a certain assumption can be made ‘without
loss of generality’, often abbreviated to ‘WLOG’. The phrase suggests that although
making the assumption at first sight only proves the theorem in a more restricted case, this
does nevertheless justify the theorem in full generality. What is the intuitive justification
for this sort of reasoning? Occasionally the phrase covers situations where we neglect
special cases that are obviously trivial for other reasons. But more usually it suggests
the exploitation of symmetry in the problem. For example, consider Schur’s inequality,
which asserts that for any nonnegative real numbers a, b and c and integer k ≥ 0 one
has 0 ≤ a^k (a − b)(a − c) + b^k (b − a)(b − c) + c^k (c − a)(c − b). A typical proof might
begin:
Without loss of generality, let a ≤ b ≤ c.
If asked to spell this out in more detail, we might say something like:
Since ≤ is a total order, the three numbers must be ordered somehow, i.e. we
must have (at least) one of a ≤ b ≤ c, a ≤ c ≤ b, b ≤ a ≤ c, b ≤ c ≤ a,
c ≤ a ≤ b or c ≤ b ≤ a. But the theorem is completely symmetric between
a, b and c, so each of these cases is just a version of the other with a change of
variables, and we may as well just consider one of them.
Suppose that we are interested in formalizing mathematics in a mechanical theorem
prover. Generally speaking, for an experienced formalizer it’s rather routine to take an
existing proof and construct a formal counterpart, even though it may require a great
deal of work to get things just right and encourage the proof assistant to check all the
details. But with such ‘without loss of generality’ constructs, it’s not immediately ob-
vious what the formal counterpart should be. We can plausibly suggest two possible
formalizations:
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 43–59, 2009.
© Springer-Verlag Berlin Heidelberg 2009
44 J. Harrison
– The phrase may be an informal shorthand saying ‘we should really do 6 very similar
proofs here, but if we do one, all the others are exactly analogous and can be left to
the reader’.
– The phrase may be asserting that ‘by a general logical principle, the apparently
more general case and the special WLOG case are in fact equivalent (or at least the
special case implies the general one)’.
The former point of view can be quite natural in a computer proof assistant. If we
have a proof script covering one of the 6 cases, we might simply perform a 6-way
case-split and for each case use a duplicate of the initial script, changing the names of
variables systematically in an editor. Indeed, if we have a programmable proof assistant,
it would be more elegant to write a general parametrized proof script that we could use
for all 6 cases with different parameters. This sort of programming is exactly the kind
of thing that LCF-style systems [3] like HOL [2] are designed to make easy via their
‘metalanguage’ ML, and sometimes its convenience makes it irresistible. However, this
approach is open to criticism on at least three grounds:
– Ugly/clumsy
– Inefficient
– Not faithful to the informal proof.
Indeed, it seems unnatural, even with the improvement of using a parametrized script, to
perform essentially the same proof 6 different times, and if each proof takes a while to
run, it could waste computer resources. And it is arguably not what the phrase ‘without
loss of generality’ is meant to conjure up. If the book had intended that interpretation, it
would probably have said something like ‘the other cases are similar and are left to the
reader’. So let us turn to how we might formalize and use a general logical principle.
This asserts that for any property P of two real numbers, if the property is symmetric
between those two numbers (∀x y. P x y ⇔ P y x) and assuming x ≤ y the property
holds (∀x y. x ≤ y ⇒ P x y), then we can conclude that it holds for all real numbers
(∀x y. P x y). In order to tackle the Schur inequality we will prove a version for
three variables. Our chosen formulation is quite analogous, but using a more minimal
formulation of symmetry between all three variables:
REAL_WLOG_3_LE =
|- (∀x y z. P x y z ⇒ P y x z ∧ P x z y) ∧
(∀x y z. x <= y ∧ y <= z ⇒ P x y z)
⇒ (∀x y z. P x y z)
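The symmetry hypothesis of this theorem can be illustrated numerically for the Schur expression. The sketch below is an informal sanity check, not a proof: it evaluates the expression for given arguments and for the same arguments sorted so that a ≤ b ≤ c, and the two values agree. The function names pow_nat, schur, sort3 and schur_wlog are hypothetical.

```c
#include <assert.h>

/* x^k for a natural-number exponent, by repeated multiplication. */
double pow_nat(double x, unsigned k) {
    double r = 1.0;
    while (k-- > 0) r *= x;
    return r;
}

/* The expression bounded below by zero in Schur's inequality. */
double schur(double a, double b, double c, unsigned k) {
    return pow_nat(a, k) * (a - b) * (a - c)
         + pow_nat(b, k) * (b - a) * (b - c)
         + pow_nat(c, k) * (c - a) * (c - b);
}

/* Sorts three values in place so that *a <= *b <= *c. */
void sort3(double *a, double *b, double *c) {
    double t;
    assert(a && b && c);
    if (*a > *b) { t = *a; *a = *b; *b = t; }
    if (*b > *c) { t = *b; *b = *c; *c = t; }
    if (*a > *b) { t = *a; *a = *b; *b = t; }
}

/* Evaluates the expression on the sorted arguments; by symmetry the
   result agrees with schur(a, b, c, k) up to rounding. */
double schur_wlog(double a, double b, double c, unsigned k) {
    sort3(&a, &b, &c);
    return schur(a, b, c, k);
}
```

Sorting uses only swaps of adjacent arguments, which are exactly the two transpositions named in the symmetry hypothesis of REAL_WLOG_3_LE.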
Now let us see how to use this to prove Schur’s inequality in HOL Light, which we
formulate as follows:
The first step in the proof is to strip off the additional variable k (which will not
play a role in the symmetry argument), use backwards chaining with the WLOG the-
orem REAL_WLOG_3_LE, and then break the resulting goal into two subgoals, one
corresponding to the symmetry and the other to the special case.
Although this looks rather large, the proof simply exploits the fact that addition and
multiplication are associative and commutative via routine logical reasoning, so we can
solve it by:
MESON_TAC[REAL_ADD_AC; REAL_MUL_AC]
We have now succeeded in reducing the original goal to the special case:
and so we can claim that the foregoing proof steps correspond almost exactly to the
informal WLOG principle. We now rewrite the expression into a more convenient form:
The form of this expression is now congenial, so we can simply proceed by repeat-
edly chaining through various monotonicity theorems and then use linear arithmetic
reasoning to finish the proof:
REPEAT(FIRST(map MATCH_MP_TAC
[REAL_LE_ADD; REAL_LE_MUL; REAL_LE_MUL2]) THEN
ASM_SIMP_TAC[REAL_POW_LE2; REAL_POW_LE; REAL_SUB_LE] THEN
REPEAT CONJ_TAC) THEN
ASM_REAL_ARITH_TAC
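The same reduction can be sanity-checked numerically outside the prover. The sketch below assumes the usual textbook statement of Schur's inequality for non-negative a, b, c and a natural-number exponent k (the formal goal is not reproduced in this excerpt), and uses exact integer arithmetic on a small grid. It is evidence for the shape of the argument, not a proof.

```python
from itertools import product

def schur(a, b, c, k):
    """Left-hand side of Schur's inequality (assumed standard form)."""
    return (a**k * (a - b) * (a - c)
            + b**k * (b - a) * (b - c)
            + c**k * (c - a) * (c - b))

values = range(0, 7)      # small grid of non-negative integers
exponents = range(0, 5)

# Symmetry hypothesis: the two swaps leave the expression unchanged.
symmetric = all(
    schur(a, b, c, k) == schur(b, a, c, k) == schur(a, c, b, k)
    for a, b, c in product(values, repeat=3)
    for k in exponents
)

# Special case: only sorted triples a <= b <= c.
ordered_case = all(
    schur(a, b, c, k) >= 0
    for a, b, c in product(values, repeat=3) if a <= b <= c
    for k in exponents
)

# General case over the whole grid.
general_case = all(
    schur(a, b, c, k) >= 0
    for a, b, c in product(values, repeat=3)
    for k in exponents
)

print(symmetric, ordered_case, general_case)
```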
invariance properties: the conservation of angular momentum arises from invariance un-
der rotations, while conservation of energy arises from invariance under shifts in time,
and so on [8].
One of the most important ways in which such invariances are used in proofs is to
make a convenient choice of coordinate system. In our formulation of Euclidean space
in HOL Light [6], geometric concepts are all defined in analytic terms using vectors,
which in turn are expressed with respect to a standard coordinate basis. For example,
the angle formed by three points is defined in terms of the angle between two vectors:
|- angle(a,b,c) = vector_angle (a - b) (c - b)
which is defined in terms of norms and dot products using the inverse cosine function
acs (degenerating to π/2 if either vector is zero):
|- vector_angle x y =
if x = vec 0 ∨ y = vec 0 then pi / &2
else acs((x dot y) / (norm x * norm y))
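For readers who want to experiment with these definitions, here is a floating-point Python transcription. It is a sketch only: vectors are plain tuples, the function names mirror the HOL Light constants, and the clamping of the cosine is our addition to guard against rounding error (it has no counterpart in the formal definition).

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def is_zero(u):
    return all(c == 0 for c in u)

def vector_angle(x, y):
    """Angle between x and y, degenerating to pi/2 if either is zero."""
    if is_zero(x) or is_zero(y):
        return math.pi / 2
    cosine = dot(x, y) / (norm(x) * norm(y))
    # Clamp to [-1, 1]: rounding can push the quotient slightly outside.
    return math.acos(max(-1.0, min(1.0, cosine)))

def angle(a, b, c):
    """Angle at b formed by the points a, b, c."""
    return vector_angle(sub(a, b), sub(c, b))

# A 3-4-5 right triangle: the three angles should sum to pi.
A, B, C = (0.0, 0.0), (4.0, 0.0), (0.0, 3.0)
angle_sum = angle(B, A, C) + angle(A, B, C) + angle(B, C, A)
print(angle_sum)
```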
This means that whenever we state geometric theorems, most of the concepts ul-
timately rest on a particular choice of coordinate system and standard basis vectors.
When we are performing high-level reasoning, we can often reason about geometric
concepts directly using lemmas established earlier without ever dropping down to the
ultimate representation with respect to the standard basis. But when we do need to
reason algebraically in terms of coordinates, we often find that a different choice of
coordinate system would make the reasoning much more tractable.
The simplest example is probably choosing the origin of the coordinate system. If a
proposition ∀x. P [x] is invariant under spatial translation, i.e. changing x to any a + x,
then it suffices to prove the special case P [0], or in other words, to assume without loss
of generality that x is the origin. The reasoning is essentially trivial: if we have P [0]
and also ∀a x. P [x] ⇒ P [a + x], then we can deduce P [x + 0] and so P [x]. In HOL
Light we can state this as the following general theorem, asserting that if P is invariant
under translation and we have the special case P [0], then we can conclude ∀x. P [x]:
WLOG_ORIGIN =
|- (∀a x. P(a + x) ⇔ P x) ∧ P(vec 0) ⇒ (∀x. P x)
Thus, when confronted with a goal, we can simply rearrange the universally quanti-
fied variables so that the one we want to take as the origin is at the outside, then apply
this theorem, giving us the special case P [0] together with the invariance of the goal
under translation. For example, suppose we want to prove that the angles of a triangle
that is not completely degenerate all add up to π radians (180 degrees):
‘∀A B C. ~(A = B ∧ B = C ∧ A = C)
⇒ angle(B,A,C) + angle(A,B,C) + angle(B,C,A) = pi‘
‘∀B C.
~(vec 0 = B ∧ B = C ∧ vec 0 = C)
⇒ angle(B,vec 0,C) + angle(vec 0,B,C) + angle(B,C,vec 0) =
pi‘
and another goal for the invariance of the property under translation by a:
‘∀a A. (∀B C.
~(a + A = B ∧ B = C ∧ a + A = C)
⇒ angle(B,a + A,C) +
angle(a + A,B,C) + angle(B,C,a + A) = pi) ⇔
(∀B C.
~(A = B ∧ B = C ∧ A = C)
⇒ angle(B,A,C) + angle(A,B,C) + angle(B,C,A) = pi)‘
We will not dwell more on the detailed proof of the theorem in the special case where
A is the origin, but will instead focus on the invariance proof. In contrast to the case of
Schur’s inequality, this is somewhat less easy and can’t obviously be deferred to basic
first-order automation. So how do we prove it?
At first sight, things don’t look right: it seems that we ought to have translated not
just A but all the variables A, B and C together. However, note that for any given a the
translation mapping x ↦ a + x is surjective: for any y there is an x such that a + x = y
(namely x = y − a). That means that we can replace universal quantifiers over vec-
tors, and even existential ones too, by translated versions. This general principle can be
embodied in the following HOL theorem, easily proven automatically by MESON_TAC:
QUANTIFY_SURJECTION_THM =
|- ∀f:A->B.
(∀y. ∃x. f x = y)
⇒ (∀P. (∀x. P x) ⇔ (∀x. P (f x))) ∧
(∀P. (∃x. P x) ⇔ (∃x. P (f x))) ∧
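The first two conjuncts can be checked exhaustively on a small finite model. In the sketch below, translation modulo 5 stands in for vector translation (an assumption made purely so that the domain is finite), and every predicate on the domain is enumerated. A non-surjective constant map is included to show that the hypothesis is needed.

```python
from itertools import product

N = 5
domain = range(N)

def translate(a):
    """x |-> a + x on the integers modulo N; surjective for every a."""
    return lambda x: (a + x) % N

def all_predicates():
    """Every predicate on the domain, encoded as a truth table."""
    for bits in product([False, True], repeat=N):
        yield lambda x, bits=bits: bits[x]

def quantifiers_agree(f):
    """Check (forall x. P x) <=> (forall x. P (f x)), and likewise for exists."""
    for P in all_predicates():
        if all(P(x) for x in domain) != all(P(f(x)) for x in domain):
            return False
        if any(P(x) for x in domain) != any(P(f(x)) for x in domain):
            return False
    return True

surjective_ok = all(quantifiers_agree(translate(a)) for a in domain)
constant_map_ok = quantifiers_agree(lambda x: 0)  # not surjective
print(surjective_ok, constant_map_ok)
```

The constant map fails because a predicate true only at 0 satisfies the translated universal statement but not the original one.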
We can apply it with a bit of instantiation and higher-order rewriting to all the uni-
versally quantified variables on the left-hand side of the equivalence in the goal and
obtain:
‘∀a A.
(∀B C.
~(a + A = a + B ∧ a + B = a + C ∧ a + A = a + C)
⇒ angle(a + B,a + A,a + C) +
angle(a + A,a + B,a + C) +
angle(a + B,a + C,a + A) = pi) ⇔
(∀B C.
~(A = B ∧ B = C ∧ A = C)
⇒ angle(B,A,C) + angle(A,B,C) + angle(B,C,A) = pi)‘
[. . . ] formal proofs by symmetry are much harder than anticipated. It was nec-
essary to give a total of nearly a hundred lemmas, showing that the symmetries
preserve all of the relevant structures, all the way back to the foundations.
Indeed, this process seems unpleasant enough that we should consider automating it,
and for geometric invariants this is just what we have done.
While we usually aim to prove that numerical functions of vectors (e.g. distances
or angles) or predicates on vectors (e.g. collinearity) are completely invariant under
translation, for operations returning more vectors, we normally want to prove that the
translation can be ‘pulled outside’, e.g.
Then a translated formula can be systematically mapped into its untranslated form
by applying these transformations in a bottom-up fashion, pulling the translation up
through vector-producing functions like midpoint and then systematically eliminat-
ing them when they reach the level of predicates or numerical functions of vectors.
Our setup is somewhat more ambitious in that it applies not only to properties of
vectors but also to properties of sets of vectors, many of which are also invariant under
translation. For example, recall that a set is convex if whenever it contains the points x
and y it also contains each intermediate point between x and y, i.e. each ux + vy where
0 ≤ u, v and u + v = 1:
|- ∀s. convex s ⇔
(∀x y u v.
x IN s ∧ y IN s ∧ &0 <= u ∧ &0 <= v ∧ u + v = &1
⇒ u % x + v % y IN s)
as are many other geometric or topological predicates (bounded, closed, compact, path-
connected, . . . ) and numerical functions on sets such as measure (area, volume etc.
depending on dimension):
As with points, for functions that return other sets of vectors, our theorems state
rather that the ‘image under translation’ operation can be pulled up through the function,
e.g.
We include in the list other theorems of the same type for the basic set operations, so
that they can be handled as well, e.g.
With this done, it remains only to rewrite with the invariance theorems taken from
the list invariant_under_translation in a bottom-up sweep. If the intended
result uses only these properties in a suitable fashion, then this should automatically
reduce the invariance goal to triviality. The user does not even see it, but is presented
instead with the special case. (If the process of rewriting does not solve the invariance
goal, then that is returned as an additional subgoal so that the user can either help the
proof along manually or perhaps observe that a concept is used for which no invariance
theorem has yet been stored.) For example, if we set out to prove the formula for the
volume of a ball:
‘∀z:real^3 r. &0 <= r
⇒ measure(cball(z,r)) = &4 / &3 * pi * r pow 3‘
Here is an example with a more complicated quantifier structure and a mix of sets
and points. We want to prove that for any point a and nonempty closed set s there is a
closest point of s to a. (A set is closed if it contains all its limit points, i.e. all points that
can be approached arbitrarily closely by a member of the set.) We set up the goal:
g ‘∀s a:real^N.
closed s ∧ ~(s = {})
⇒ ∃x. x IN s ∧
(∀y. y IN s ⇒ dist(a,x) <= dist(a,y))‘;;
and with a single application of our tactic, we can suppose the point in question is the
origin:
# e(GEOM_ORIGIN_TAC ‘a:real^N‘);;
val it : goalstack = 1 subgoal (1 total)
|- ∀f. orthogonal_transformation f ⇔
linear f ∧ (∀v w. f v dot f w = v dot w)
|- ∀f. linear f ⇔
(∀x y. f(x + y) = f x + f y) ∧
(∀c x. f(c % x) = c % f x)
|- orthogonal_transformation f ⇔
linear f ∧ orthogonal_matrix(matrix f)
|- orthogonal_matrix(Q) ⇔
transp(Q) ** Q = mat 1 ∧ Q ** transp(Q) = mat 1
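The matrix and dot-product characterizations can be compared on a concrete example. The sketch below uses exact rational arithmetic and a rotation matrix built from the 3-4-5 triangle. The matrix and test vectors are our own choice for illustration, and the helper names only loosely mirror the HOL Light constants.

```python
from fractions import Fraction as F

# A rotation with rational entries (cos = 3/5, sin = 4/5).
Q = [[F(3, 5), F(-4, 5)],
     [F(4, 5), F(3, 5)]]
IDENTITY = [[F(1), F(0)],
            [F(0), F(1)]]

def transp(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def apply(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# orthogonal_matrix Q: both products with the transpose give the identity.
is_orthogonal = (matmul(transp(Q), Q) == IDENTITY
                 and matmul(Q, transp(Q)) == IDENTITY)

# The induced map preserves dot products, as an orthogonal transformation should.
v, w = [F(2), F(-7)], [F(5), F(1)]
preserves_dot = dot(apply(Q, v), apply(Q, w)) == dot(v, w)
print(is_orthogonal, preserves_dot)
```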
|- ∀a b:real^N.
norm(a) = norm(b)
⇒ ∃f. orthogonal_transformation f ∧ f a = b
|- ∀a b:real^N.
2 <= dimindex(:N) ∧ norm(a) = norm(b)
⇒ ∃f. orthogonal_transformation f ∧
det(matrix f) = &1 ∧
f a = b
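In two dimensions the rotation whose existence is asserted can be written down explicitly. The following sketch constructs it for integer vectors of equal, non-zero norm using exact rationals. The function `rotation_taking` is hypothetical and only illustrates the existence statement in the plane; it is not how the HOL Light theorem is proved.

```python
from fractions import Fraction

def rotation_taking(a, b):
    """Return the 2x2 rotation matrix R with R a = b.

    Assumes a and b are integer pairs with |a| = |b| != 0.
    """
    aa = a[0] * a[0] + a[1] * a[1]
    bb = b[0] * b[0] + b[1] * b[1]
    if aa != bb or aa == 0:
        raise ValueError("need two vectors of equal, non-zero norm")
    cos_t = Fraction(a[0] * b[0] + a[1] * b[1], aa)   # (a . b) / |a|^2
    sin_t = Fraction(a[0] * b[1] - a[1] * b[0], aa)   # (a x b) / |a|^2
    return [[cos_t, -sin_t],
            [sin_t, cos_t]]

def apply(m, v):
    return tuple(sum(x * y for x, y in zip(row, v)) for row in m)

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

a, b = (1, 7), (5, 5)          # both have squared norm 50
R = rotation_taking(a, b)
print(apply(R, a), det(R))
```

The determinant is 1 because (a . b)^2 + (a x b)^2 = |a|^2 |b|^2 in the plane, so the construction yields a rotation rather than a reflection.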
|- ∀f s. linear f
⇒ convex hull IMAGE f s = IMAGE f (convex hull s)
Some apply to all injective linear maps, e.g. those for closedness of a set:
|- ∀f s. linear f ∧ (∀x y. f x = f y ⇒ x = y)
⇒ (closed (IMAGE f s) ⇔ closed s)
Some apply to all bijective (injective and surjective) linear maps, e.g. those for
openness of a set:
|- ∀f s. linear f ∧
(∀x y. f x = f y ⇒ x = y) ∧ (∀y. ∃x. f x = y)
⇒ (open (IMAGE f s) ⇔ open s)
Some apply to all norm-preserving linear maps, e.g. those for angles:
Note that a norm-preserving linear map is also injective, so this property also suffices
for all those requiring injectivity. For a function f : ℝ^N → ℝ^N this property is precisely
equivalent to being an orthogonal transformation:
|- ∀f:real^N->real^N.
orthogonal_transformation f ⇔
linear f ∧ (∀v. norm(f v) = norm v)
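The nontrivial direction of this equivalence follows from the polarization identity, which expresses dot products through norms. A short derivation, in standard notation rather than HOL Light syntax, for a linear norm-preserving f:

```latex
\begin{align*}
f(v)\cdot f(w)
  &= \tfrac12\bigl(\lVert f(v)+f(w)\rVert^2 - \lVert f(v)\rVert^2 - \lVert f(w)\rVert^2\bigr)
     && \text{(polarization)} \\
  &= \tfrac12\bigl(\lVert f(v+w)\rVert^2 - \lVert f(v)\rVert^2 - \lVert f(w)\rVert^2\bigr)
     && \text{(linearity of } f\text{)} \\
  &= \tfrac12\bigl(\lVert v+w\rVert^2 - \lVert v\rVert^2 - \lVert w\rVert^2\bigr)
     && \text{(norm preservation)} \\
  &= v\cdot w
     && \text{(polarization)}
\end{align*}
```

The converse is immediate by taking w = v.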
However, it is important for some other related applications (an example is below)
that we make theorems applicable to maps where the dimensions of the domain and
codomain spaces are not necessarily the same.
Finally, the most restrictive requirement applies to just one theorem, the one for the
vector cross product. This has a kind of chirality, so may have its sign changed by a
general orthogonal transformation. Its invariance theorem requires a rotation of type
ℝ³ → ℝ³:
|- ∀f x y. linear f ∧
(∀x. norm(f x) = norm x) ∧ det(matrix f) = &1
⇒ (f x) cross (f y) = f(x cross y)
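The role of the determinant condition can be seen on two concrete integer matrices: a quarter turn about the z-axis (determinant 1) and a mirror in the xy-plane (determinant -1). The sketch below is illustrative only, with matrices and vectors chosen by us.

```python
def cross(x, y):
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def apply(m, v):
    return tuple(sum(r * c for r, c in zip(row, v)) for row in m)

def neg(v):
    return tuple(-c for c in v)

ROTATION = ((0, -1, 0),
            (1, 0, 0),
            (0, 0, 1))      # quarter turn about z, det = 1
REFLECTION = ((1, 0, 0),
              (0, 1, 0),
              (0, 0, -1))   # mirror in the xy-plane, det = -1

x, y = (1, 2, 3), (-4, 0, 5)

# A rotation commutes with the cross product ...
rotation_commutes = (cross(apply(ROTATION, x), apply(ROTATION, y))
                     == apply(ROTATION, cross(x, y)))

# ... whereas a reflection flips its sign.
reflection_flips = (cross(apply(REFLECTION, x), apply(REFLECTION, y))
                    == neg(apply(REFLECTION, cross(x, y))))
print(rotation_commutes, reflection_flips)
```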
We actually store the theorem in a slightly peculiar form, which makes it easier to
apply uniformly in a framework where we can assume a transformation is a rotation
except in dimension 1:
We can implement various tactics that exploit our invariance theorems to make vari-
ous simplifying transformations without loss of generality:
The first two work in much the same way as the earlier tactic for choosing the ori-
gin. We apply the general theorem, modify all the other quantified variables and then
rewrite with invariance theorems. We can profitably think of the basic processes in
such cases as instances of general HOL theorems, though this is not actually how they
are implemented. For example, we might say that if for each x we can find a ‘trans-
form’ (e.g. translation, or orthogonal transformation) f such that f (x) is ‘nice’ (e.g. is
zero, or a multiple of some basis vector), and can also deduce for any ‘transform’ that
P (f (x)) ⇔ P (x), then proving P (x) for all x is equivalent to proving it for ‘nice’ x.
(The theorem that follows is automatically proved by MESON.)
However, in some more general situations we don’t exactly want to show that P(f(x))
⇔ P(x), but rather that P(f(x)) ⇔ P′(x) for some related but not identical property
P′, for example if we want to transfer a property to a different type. For this reason, it
is actually more convenient to observe that we can choose a ‘transform’ from a ‘nice’
value rather than to it, i.e. rely on the following:
The advantage of this is that in our approach based on rewriting by applying invariance
theorems, the new property P′ can emerge naturally from the rewriting of
P(f(x)), instead of requiring extra code for its computation. Even in cases where the
generality is not needed, we typically use this structure, i.e. choose our mapping from a
‘nice’ value.
6 An Extended Example
Let us see a variety of our tactics at work on a problem that was, in fact, the original
motivation for most of the work described here.
‘∀u1:real^3 u2 p a b.
~(u1 = u2) ∧
plane p ∧
{u1,u2} SUBSET p ∧
dist(u1,u2) <= a + b ∧
abs(a - b) < dist(u1,u2) ∧
&0 <= a ∧
&0 <= b
⇒ (∃d1 d2. {d1,d2} SUBSET p ∧
&1 / &2 % (d1 + d2) IN affine hull {u1, u2} ∧
dist(d1,u1) = a ∧
dist(d1,u2) = b ∧
dist(d2,u1) = a ∧
dist(d2,u2) = b)‘
The first step is to assume without loss of generality that the plane p is {(x, y, z) |
z = 0}, i.e. the set of points whose third coordinate is zero, following which we manually
massage the goal so that the quantifiers over u1, u2, d1 and d2 carry explicit
restrictions:
Now we apply another WLOG tactic to reduce the problem from ℝ³ to ℝ², and again
make a few superficial rearrangements:
# e(PAD2D3D_TAC THEN
SIMP_TAC[RIGHT_IMP_FORALL_THM; IMP_IMP; GSYM CONJ_ASSOC]);;
resulting in:
‘∀u1 u2 a b.
~(u1 = u2) ∧
plane {z | z$3 = &0} ∧
dist(u1,u2) <= a + b ∧
abs(a - b) < dist(u1,u2) ∧
&0 <= a ∧
&0 <= b
⇒ (∃d1 d2.
&1 / &2 % (d1 + d2) IN affine hull {u1, u2} ∧
dist(d1,u1) = a ∧
dist(d1,u2) = b ∧
dist(d2,u1) = a ∧
dist(d2,u2) = b)‘
Although HOL Light does not by default show the types, all the vector variables
are now in ℝ² instead of ℝ³ (except for the bound variable z in the residual planarity
hypothesis, which is no longer useful anyway). Having collapsed the problem from 3
dimensions to 2 in this way, we finally choose u1 as the origin:
# e(GEOM_ORIGIN_TAC ‘u1:real^2‘);;
val it : goalstack = 1 subgoal (1 total)
‘∀u2 a b.
~(vec 0 = u2) ∧
plane {z | z$3 = &0} ∧
dist(vec 0,u2) <= a + b ∧
abs(a - b) < dist(vec 0,u2) ∧
&0 <= a ∧
&0 <= b
⇒ (∃d1 d2.
&1 / &2 % (d1 + d2) IN affine hull {vec 0, u2} ∧
dist(d1,vec 0) = a ∧
dist(d1,u2) = b ∧
dist(d2,vec 0) = a ∧
dist(d2,u2) = b)‘
# e(GEOM_BASIS_MULTIPLE_TAC 1 ‘u2:real^2‘);;
val it : goalstack = 1 subgoal (1 total)
We have thus reduced the original problem to a nicely oriented situation where the
points we consider live in 2-dimensional space and are of the form (0, 0) and (0, u2).
The final coordinate geometry is now relatively straightforward.
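To illustrate why the reduced problem is easy, here is a floating-point sketch of the remaining coordinate geometry: with u1 at the origin and u2 on a coordinate axis, the points d1 and d2 are the two intersections of a circle of radius a about u1 with a circle of radius b about u2. Placing u2 on the first axis and the function name `intersection_points` are assumptions of this sketch, not taken from the formal proof.

```python
import math

def intersection_points(u, a, b):
    """Points d with |d| = a and |d - (u, 0)| = b.

    Assumes u > 0, u <= a + b and |a - b| < u, as in the goal's hypotheses.
    """
    x = (a * a - b * b + u * u) / (2 * u)
    # Clamp at zero: rounding can make the radicand slightly negative
    # in the tangent case u = a + b.
    y = math.sqrt(max(0.0, a * a - x * x))
    return (x, y), (x, -y)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

u1, u2 = (0.0, 0.0), (5.0, 0.0)
d1, d2 = intersection_points(5.0, 4.0, 3.0)
midpoint = ((d1[0] + d2[0]) / 2, (d1[1] + d2[1]) / 2)
print(d1, d2, midpoint)
```

The midpoint of d1 and d2 has second coordinate zero, so it lies on the line through u1 and u2, which is the affine hull required by the goal.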
7 Future Work
Our battery of tactics so far is already a great help in proving geometric theorems. There
are several possible avenues for improvement and further development.
One is to make use of still broader classes of transformations when handling theo-
rems about correspondingly narrower classes of concepts. For example, some geometric
properties, e.g. those involving collinearity and incidence but not distances and angles,
are invariant under still broader classes of transformations, such as shearing, and this
can be of use in choosing an even more convenient coordinate system — see for exam-
ple the proof of Pappus’s theorem given by Chou [1]. Other classes of theorems behave
nicely under scaling, so we can freely turn some point (0, a) ≠ (0, 0) into just (0, 1) and
so eliminate another variable. Indeed, for still more restricted propositions, e.g. those
involving only topological properties, we can consider continuous maps that may not
be linear.
It would also be potentially interesting to extend the process to additional ‘higher-
order’ properties. To some extent, we already do this with our support for sets of vectors,
but we could take it much further, e.g. considering properties of sequences and series
and their limits. A nice example where we would like to exploit a higher-order invari-
ance arises in proving that every polygon has a triangulation. The proof given in [4]
says: ‘Pick the coordinate axis so that no two vertices have the same y coordinate’. It
should not be difficult to extend the methods here to prove invariance of notions like
‘triangulation of’, and we could then pick a suitable orthogonal transformation to force
the required property (there are only finitely many vertices but uncountably many angles
of rotation to choose).
Another interesting idea would be to reformulate the process in a more ‘metalogical’
or ‘reflective’ fashion, by formalizing the class of problems for which our transforma-
tions suffice once and for all, instead of rewriting with the current selection of theorems
and then either succeeding or failing. From a practical point of view, we think our
current approach is usually better. It is actually appealing not to delimit the class of
permissible geometric properties, but have that class expand automatically as new in-
variance theorems are added. Moreover, to use the reflective approach we would need to
map into some formal syntax, which needs similar transformations anyway. However,
there may be some situations where it would be easier to prove general properties in a
metatheoretic fashion. For example, a first-order assertion over vectors with M vector
variables, even if the pattern of quantification is involved, can be reduced to spaces of
dimension ≤ M [9]. It should be feasible to handle important special cases (e.g. purely
universal formulas) within our existing framework, but exploiting the full result might
be a good use for metatheory.
Acknowledgements
The author is grateful to Truong Nguyen, whose stimulating questions on the Flyspeck
project mailing list were the inspiration for most of this work.
References
1. Chou, S.-C.: Proving elementary geometry theorems using Wu’s algorithm. In: Bledsoe, W.W.,
Loveland, D.W. (eds.) Automated Theorem Proving: After 25 Years. Contemporary Mathe-
matics, vol. 29, pp. 243–286. American Mathematical Society, Providence (1984)
2. Gordon, M.J.C., Melham, T.F.: Introduction to HOL: a theorem proving environment for
higher order logic. Cambridge University Press, Cambridge (1993)
3. Gordon, M., Wadsworth, C.P., Milner, R.: Edinburgh LCF. LNCS, vol. 78. Springer,
Heidelberg (1979)
4. Hales, T.C.: Easy pieces in geometry (2007),
http://www.math.pitt.edu/~thales/papers/
5. Hales, T.C.: The Jordan curve theorem, formally and informally. The American Mathematical
Monthly 114, 882–894 (2007)
6. Harrison, J.: A HOL theory of Euclidean space. In: Hurd, J., Melham, T. (eds.) TPHOLs 2005.
LNCS, vol. 3603, pp. 114–129. Springer, Heidelberg (2005)
7. Klein, F.: Vergleichende Betrachtungen über neuere geometrische Forschungen. Mathematische
Annalen 43, 63–100 (1893); Based on the speech given on admission to the faculty of the
University of Erlangen in 1872. English translation “A comparative review of recent researches
in geometry” in Bulletin of the New York Mathematical Society 2, 460–497 (1892-1893)
8. Noether, E.: Invariante Variationsprobleme. Nachrichten von der Königlichen Gesellschaft der
Wissenschaften zu Göttingen: Mathematisch-physikalische Klasse, 235–257 (1918); English
translation “Invariant variation problems” by M.A. Tavel in ‘Transport Theory and Statistical
Physics’, 1, 183–207 (1971)
9. Solovay, R.M., Arthan, R., Harrison, J.: Some new results on decidability for elementary algebra
and geometry. arXiv preprint 0904.3482 (2009); submitted to Annals of Pure and Applied
Logic, http://arxiv.org/PS_cache/arxiv/pdf/0904/0904.3482v1.pdf
HOL Light: An Overview
John Harrison
The original Edinburgh LCF was a theorem prover for Scott’s Logic of Com-
putable Functions [16], hence the name LCF. But as emphasized by Gordon [4],
the basic LCF approach is applicable to any logic, and now there are descen-
dents implementing a variety of higher order logics, set theories and constructive
type theories. In particular, members of the HOL family [5] implement a ver-
sion of classical higher order logic, hence the name HOL. They take the LCF
approach a step further in that all theory developments are pursued ‘definition-
ally’. New mathematical structures, such as the real numbers, may be defined
only by exhibiting a model for them in the existing theories (say as Dedekind
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 60–66, 2009.
© Springer-Verlag Berlin Heidelberg 2009
HOL Light’s logic is simple type theory [1,2] with polymorphic type variables.
The terms of the logic are those of simply typed lambda calculus, with formulas
being terms of boolean type, rather than a separate category. Every term has a
single well-defined type, but each constant with polymorphic type gives rise to
an infinite family of constant terms. There are just two primitive types: bool
(boolean) and ind (individuals), and given any two types σ and τ one can form
the function type σ → τ.¹
For the core HOL logic, there is essentially only one predefined logical constant,
equality (=) with polymorphic type α → α → bool. However, to state one
of the mathematical axioms we also include another constant ε : (α → bool) →
α, explained further below. For equations, we use the conventional concrete syn-
tax s = t, but this is just surface syntax for the λ-calculus term ((=)s)t, where
juxtaposition represents function application. For equations between boolean
terms we often use s ⇔ t, but this again is just surface syntax.
The HOL Light deductive system governs the deducibility of one-sided sequents
Γ ⊢ p where p is a term of boolean type and Γ is a set (possibly empty)
of terms of boolean type. There are ten primitive rules of inference, rather similar
to those for the internal logic of a topos [14].
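      ------------------------ REFL
              ⊢ t = t

      Γ ⊢ s = t    Δ ⊢ t = u
      ------------------------ TRANS
           Γ ∪ Δ ⊢ s = u

      Γ ⊢ s = t    Δ ⊢ u = v
      ------------------------ MK_COMB
        Γ ∪ Δ ⊢ s(u) = t(v)

              Γ ⊢ s = t
      ------------------------ ABS
       Γ ⊢ (λx. s) = (λx. t)

      ------------------------ BETA
          ⊢ (λx. t) x = t

      ------------------------ ASSUME
              {p} ⊢ p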
¹ In Church’s original notation, also used by Andrews, these are written o, ι and τσ
respectively. Of course the particular concrete syntax has no logical significance.
      Γ ⊢ p ⇔ q    Δ ⊢ p
      ------------------------ EQ_MP
            Γ ∪ Δ ⊢ q

         Γ ⊢ p    Δ ⊢ q
      ---------------------------------- DEDUCT_ANTISYM_RULE
      (Γ − {q}) ∪ (Δ − {p}) ⊢ p ⇔ q

      Γ[x1, ..., xn] ⊢ p[x1, ..., xn]
      ---------------------------------- INST
      Γ[t1, ..., tn] ⊢ p[t1, ..., tn]

      Γ[α1, ..., αn] ⊢ p[α1, ..., αn]
      ---------------------------------- INST_TYPE
      Γ[γ1, ..., γn] ⊢ p[γ1, ..., γn]
In MK_COMB it is necessary for the types to agree so that the composite terms
are well-typed, and in ABS it is required that the variable x not be free in any of
the assumptions Γ , while our notation for term and type instantiation assumes
capture-avoiding substitution. All the usual logical constants are defined in terms
of equality. The conventional syntax ∀x. P [x] for quantifiers is surface syntax for
(∀)(λx. P [x]), and we also use this ‘binder’ notation for the ε operator.
These definitions allow us to derive all the usual (intuitionistic) natural de-
duction rules for the connectives in terms of the primitive rules above. All of the
core ‘logic’ is derived in this way. But then we add three mathematical axioms:
In addition, HOL Light includes two principles of definition, which allow one
to extend the set of constants and the set of types in a way guaranteed to
preserve consistency. The rule of constant definition allows one to introduce
a new constant c and an axiom c = t, subject to some conditions on free
variables and polymorphic types in t, and provided no previous definition for
c has been introduced. All the definitions of the logical connectives above are
introduced in this way. Note that this is ‘object-level’ definition: the constant
and its defining axiom exist in the object logic. Nevertheless, the definitional
principles are designed so that they always give a conservative (in particular
consistency-preserving) extension of the logic.
One doesn’t normally use such low-level rules much, but instead interacts with
HOL via a series of higher-level derived rules, using built-in parsers and printers
to read and write terms in a more natural syntax. For example, if one wants to
bind the name th6 to the theorem of real arithmetic that when |c − a| < e and
|b| ≤ d then |(a + b) − c| < d + e, one simply does:
If the purported fact in quotations turns out not to be true, then the rule
will fail by raising an exception. Similarly, any bug in the derived rule (which
represents several dozen pages of code written by the present author) would lead
to an exception.² But we can be rather confident in the truth of any theorem
that is returned, since it must have been created via applications of primitive
rules, even though the precise choreographing of these rules is automatic and of
no concern to the user. What’s more, users can write their own special-purpose
proof rules in the same style when the standard ones seem inadequate — HOL
is fully programmable, yet retains its logical trustworthiness when extended by
ordinary users.
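The trust argument above can be illustrated with a toy kernel. The sketch below is in Python rather than OCaml, so the protection of the theorem type is only a convention enforced by a private token, not a real abstract type. It implements three of the primitive rules and one derived rule, `TRANS_CHAIN`, which is our own example of a user-written rule. Terms are plain strings and tuples, which is a drastic simplification of HOL terms.

```python
from functools import reduce

_KERNEL = object()  # capability intended to be used only by the rules below

class Thm:
    """A sequent hyps |- concl that only inference rules may construct."""
    def __init__(self, hyps, concl, token=None):
        if token is not _KERNEL:
            raise PermissionError("theorems are created only by inference rules")
        self.hyps = frozenset(hyps)
        self.concl = concl

def mk_eq(s, t):
    return ("=", s, t)

def dest_eq(tm):
    if isinstance(tm, tuple) and len(tm) == 3 and tm[0] == "=":
        return tm[1], tm[2]
    raise ValueError("not an equation")

def REFL(t):
    """|- t = t"""
    return Thm((), mk_eq(t, t), _KERNEL)

def ASSUME(p):
    """{p} |- p"""
    return Thm((p,), p, _KERNEL)

def TRANS(th1, th2):
    """From G |- s = t and D |- t = u infer G u D |- s = u."""
    s, t1 = dest_eq(th1.concl)
    t2, u = dest_eq(th2.concl)
    if t1 != t2:
        raise ValueError("TRANS: middle terms differ")
    return Thm(th1.hyps | th2.hyps, mk_eq(s, u), _KERNEL)

def TRANS_CHAIN(ths):
    """Derived rule: chain a non-empty list of equations using TRANS only."""
    return reduce(TRANS, ths)

ab = ASSUME(mk_eq("a", "b"))
bc = ASSUME(mk_eq("b", "c"))
th = TRANS_CHAIN([ab, bc])
print(sorted(th.hyps), th.concl)
```

A bug in `TRANS_CHAIN` could make it fail or return an unintended theorem, but whatever it returns was still produced by the primitive rules, which is the point made in the text.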
Among the facilities provided by HOL is the ability to organize proofs in a
mixture of forward and backward steps, which users often find more congenial.
The user invokes so-called tactics to break down the goal into more manageable
subgoals. For example, in HOL’s inbuilt foundations of number theory, the proof
that addition of natural numbers is commutative is written as follows (the symbol
∀ means ‘for all’):
The tactic INDUCT_TAC uses mathematical induction to break the original
goal down into two separate goals, one for m = 0 and one for m + 1 on the
assumption that the goal holds for m. Both of these are disposed of quickly
simply by repeated rewriting with the current assumptions and a previous, even
more elementary, theorem about the addition operator. The identifier THEN is
a so-called tactical, i.e. a function that takes two tactics and produces another
tactic, which applies the first tactic then applies the second to any resulting
subgoals (there are two in this case).
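The behaviour of THEN described here can be modelled in a few lines. In this toy sketch a tactic is just a function from a goal to a list of subgoals. Real HOL tactics also return a justification function that rebuilds the theorem from proofs of the subgoals, which is omitted here. The tactics `induct_tac`, `rewrite_tac` and `all_tac` are stand-ins written for this example and do not do any real reasoning.

```python
def THEN(tac1, tac2):
    """Apply tac1, then apply tac2 to every resulting subgoal."""
    def combined(goal):
        return [sub2 for sub1 in tac1(goal) for sub2 in tac2(sub1)]
    return combined

def induct_tac(goal):
    """Toy induction: split a goal about m into a base case and a step case."""
    return [f"{goal}  [m := 0]",
            f"{goal}  [m := m + 1, assuming the case for m]"]

def rewrite_tac(goal):
    """Toy rewriting: pretend every goal is solved, leaving no subgoals."""
    return []

def all_tac(goal):
    """Identity tactic: leave the goal unchanged."""
    return [goal]

goal = "m + n = n + m"
print(THEN(induct_tac, all_tac)(goal))      # two subgoals remain
print(THEN(induct_tac, rewrite_tac)(goal))  # no subgoals remain
```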
For another example, we can prove that there is a unique x such that x =
f(g(x)) if and only if there is a unique y with y = g(f(y)) using a single standard
tactic MESON_TAC, which performs model elimination [15] to prove theorems
about first order logic with equality. As usual, the actual proof under the surface
happens by the standard primitive inference rules.
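The statement itself can be checked by brute force on small finite sets, which makes a useful contrast with the logical proof. The sketch below enumerates every pair of functions between a 3-element and a 4-element set (sizes chosen arbitrarily) and confirms that unique fixed points of the two compositions always go together. This is a finite sanity check, not a proof of the general theorem.

```python
from itertools import product

A = range(3)
B = range(4)

def exists_unique(domain, pred):
    return sum(1 for v in domain if pred(v)) == 1

def claim_holds(f, g):
    """f : B -> A and g : A -> B, each given as a tuple of values."""
    unique_x = exists_unique(A, lambda x: x == f[g[x]])
    unique_y = exists_unique(B, lambda y: y == g[f[y]])
    return unique_x == unique_y

holds_everywhere = all(
    claim_holds(f, g)
    for f in product(A, repeat=len(B))   # all 3**4 functions B -> A
    for g in product(B, repeat=len(A))   # all 4**3 functions A -> B
)
print(holds_everywhere)
```

The underlying reason is that g maps fixed points of f∘g to fixed points of g∘f, and f maps them back, so the two sets of fixed points are in bijection.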
These and similar higher-level rules certainly make the construction of proofs
manageable whereas it would be almost unbearable in terms of the primitive rules
alone. Nevertheless, we want to dispel any false impression given by the simple ex-
amples above: proofs often require long and complicated sequences of rules. The
² Or possibly to a true but different theorem being returned, but this is easily guarded
against by inserting sanity checks in the rules.
Over the years, HOL Light has been used for a wide range of applications, and in
concert with this its library of pre-proved formalized mathematics and its stock
of more powerful derived inference rules have both been expanded. As well as
the usual battery of automated techniques like first-order reasoning and linear
arithmetic, HOL Light has been used to explore and apply unusual and novel
decision procedures [12,17].
In verification, HOL Light has been used at Intel to verify a number of com-
plex floating-point algorithms including division, square root and transcendental
functions [11]. HOL Light seems well-suited to applications like this. It has a
substantial library of formalized real analysis, which is used incessantly when
justifying the correctness of such algorithms. The flexibility and programmabil-
ity that the LCF approach affords are also important here since one can write
custom derived rules for special tasks like accumulating bounds on rounding
errors or enumerating the solutions to Diophantine equations of special kinds.
As for the formalization of mathematics, HOL Light has from the very be-
ginning had a useful formalization of real analysis [10]. More recently this has
been substantially developed to cover multivariate analysis in Euclidean space
and complex analysis. As well as the miscellany of theorems noted in the list
at http://www.cs.ru.nl/~freek/100/, HOL Light has been used to formalize
some particularly significant results such as the Jordan Curve Theorem [8] and
the Prime Number Theorem [13]. HOL Light is also heavily used in the Fly-
speck Project [7] to formalize the proof of the Kepler sphere-packing conjecture,
possibly the most ambitious formalization project to date.
References
1. Andrews, P.B.: An Introduction to Mathematical Logic and Type Theory: To Truth
Through Proof. Academic Press, London (1986)
2. Church, A.: A formulation of the Simple Theory of Types. Journal of Symbolic
Logic 5, 56–68 (1940)
3. Diaconescu, R.: Axiom of choice and complementation. Proceedings of the Ameri-
can Mathematical Society 51, 176–178 (1975)
4. Gordon, M.J.C.: Representing a logic in the LCF metalanguage. In: Néel, D. (ed.)
Tools and notions for program construction: an advanced course, pp. 163–185.
Cambridge University Press, Cambridge (1982)
Institute of Informatics
University of Bialystok, Poland
{adamn,arturk}@math.uwb.edu.pl
1 Introduction
The original goal of Mizar [8], as conceived by its inventor, Andrzej Trybulec in
the early 1970s, was to construct a formal language close to the mathematical jar-
gon used in publications, but at the same time simple enough to enable computer-
ized processing, in particular verification of full logical correctness. The historical
description of the first 30 years of Mizar presented in [7] outlines the evolution
of the project, from its relatively modest initial implementations constrained by
the capabilities of computers available at that time, to the current proof assistant
system successfully used for practical formalization of mathematics.
In the late 1980s Mizar developers started to systematically collect formalizations,
which gave rise to the Mizar Mathematical Library (MML). Since then
the development of MML has been the central activity in the Mizar project,
as it has been believed that only substantial experience may help in improv-
ing the system. When in 1993 there emerged the QED initiative [12] to devise a
computer-based database of all mathematical knowledge, strictly formalized and
with all proofs having been checked automatically, Mizar was ready to actively
implement that ideology. Although the QED project has not been continued, to
some extent the development of Mizar is still driven in the spirit of its main
goals.
Nowadays, when it has been demonstrated by Mizar and numerous other sys-
tems that the computer mechanization of mathematics can be done in practice,
the important field for research is how to do it in a relatively easy and com-
fortable way. Therefore useful constructs that occur in informal mathematics
are still being incorporated into the linguistic layer to extend the expressiveness
of the Mizar language, and at the same time the efforts of Mizar developers
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 67–72, 2009.
© Springer-Verlag Berlin Heidelberg 2009
Apart from the long-term goal of developing MML into a database for math-
ematics, the most important applications of Mizar today are playing the role
of a proof assistant to support creating rigorous mathematics, in mathematics
education and in software and hardware verification.
To facilitate the whole process of writing formal mathematics, several exter-
nal systems have been developed that complement the Mizar proof checker. For
example, effective semantic-based information retrieval, i.e., searching, browsing
and presentation of MML can be done with the MML Query system developed
by G. Bancerek [1]. Several sites provide an on-line Mizar processor, and writing
proofs may also be assisted by the systems MoMM (a matching and interreduction
tool) and the Mizar Proof Advisor developed by J. Urban. The contents
of MML as well as newly created documents can be presented in various user-friendly
formats, including semantically linked XML-based web pages [15] or
an automatically generated translation into English in the form of an electronic
and printed journal, Formalized Mathematics.
For several decades Mizar has been used for educational purposes on various
levels: from secondary school to doctoral studies. Usually the teaching was orga-
nized as Mizar-aided courses, most typically on introduction to logic, topology,
lattice theory, general and universal algebra, category theory, etc. Recent appli-
cations in regular university-level courses being part of the obligatory curriculum
for CS students at the University of Bialystok are presented in [2,9].
Mizar has been used to define mathematical models of computers and prove
properties of their programs. One approach which is well-developed in MML
is based on the theory of random access Turing machines. There are also other
formalized attempts to model and analyze standalone algorithms. Numerous
MML articles are also devoted to the construction and analysis of gates and
digital circuits.
6 Current Development
Despite its origins and initial implementations in the 1970s, Mizar is still being
actively developed. The development concerns both the language and the
proof-checking software. The evolution of the Mizar language aims at the best
possible expressiveness, and new useful language constructs are still being
identified in mathematical texts and transformed into the formal setting of Mizar.
Much work in this area has been concentrated on the processing of attributes,
which in the most recent implementation can be expressed with their own visi-
ble arguments (e.g. n-dimensional, X-valued, etc.) in much the same way types
have been constructed. As the Mizar type checking mechanism uses quite pow-
erful automation techniques based on adjectives, the change makes it possible
to formalize many concepts in a more natural and, what is maybe even more
important, automatic way.
The capabilities of the proof-checker have recently been strengthened by
providing means for more complete adjective processing and the use of global
choice (selecting unique representatives of types) to enable eliminating the
so-called ‘permissive’ definitions. The system has also been equipped with an
efficient method of identifying semantically equivalent operations defined in
different contexts, e.g. the addition of numbers and the corresponding operation in the
field of real numbers. The system has also been extended with more powerful
automation of numerical computations.
Among the planned and currently considered future enhancements are
several forms of ellipsis (the ubiquitous ‘...’ notation) and a syntactic extension
to support binding operators like the sum, product or integral.
7 Miscellanea
More information on Mizar can be found on the project’s web page [8] or its
several mirrors. The site contains information on the Mizar language (e.g. the
formal syntax, available manuals and other bibliographic links) and provides
downloading of the system and its library. There are also pointers to other
References
1. Bancerek, G., Rudnicki, P.: Information retrieval in MML. In: Asperti, A., Buch-
berger, B., Davenport, J.H. (eds.) MKM 2003. LNCS, vol. 2594, pp. 119–132.
Springer, Heidelberg (2003)
2. Borak, E., Zalewska, A.: Mizar course in logic and set theory. In: Kauers, M.,
Kerber, M., Miner, R., Windsteiger, W. (eds.) MKM/CALCULEMUS 2007. LNCS,
vol. 4573, pp. 191–204. Springer, Heidelberg (2007)
3. Corbineau, P.: A declarative language for the Coq proof assistant. In: Miculan,
M., Scagnetto, I., Honsell, F. (eds.) TYPES 2007. LNCS, vol. 4941, pp. 69–84.
Springer, Heidelberg (2008)
4. Fitch, F.B.: Symbolic Logic. An Introduction. The Ronald Press Company (1952)
5. Harrison, J.: A Mizar Mode for HOL. In: von Wright, J., Harrison, J., Grundy, J.
(eds.) TPHOLs 1996. LNCS, vol. 1125, pp. 203–220. Springer, Heidelberg (1996)
6. Jaśkowski, S.: On the rules of supposition in formal logic. Studia Logica 1 (1934)
7. Matuszewski, R., Rudnicki, P.: Mizar: the first 30 years. Mechanized Mathematics
and Its Applications 4(1), 3–24 (2005)
8. Mizar home page: http://mizar.org
9. Naumowicz, A.: Teaching How to Write a Proof. In: Formed 2008: Formal Methods
in Computer Science Education, pp. 91–100 (2008)
10. Naumowicz, A., Byliński, C.: Improving Mizar texts with properties and require-
ments. In: Asperti, A., Bancerek, G., Trybulec, A. (eds.) MKM 2004. LNCS,
vol. 3119, pp. 290–301. Springer, Heidelberg (2004)
11. Ono, K.: On a practical way of describing formal deductions. Nagoya Mathematical
Journal 21 (1962)
12. QED Manifesto: http://www.rbjones.com/rbjpub/logic/qedres00.htm
13. Syme, D.: Three tactic theorem proving. In: Bertot, Y., Dowek, G., Hirschowitz,
A., Paulin, C., Théry, L. (eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 203–220.
Springer, Heidelberg (1999)
14. Trybulec, A.: Tarski Grothendieck set theory. Formalized Mathematics 1(1), 9–11
(1990)
15. Urban, J.: XML-izing Mizar: Making Semantic Processing and Presentation of
MML Easy. In: Kohlhase, M. (ed.) MKM 2005. LNCS, vol. 3863, pp. 346–360.
Springer, Heidelberg (2006)
16. Wenzel, M., Wiedijk, F.: A comparison of Mizar and Isar. Journal of Automated
Reasoning 29(3-4), 389–411 (2002)
17. Wiedijk, F.: Formal Proof Sketches. In: Berardi, S., Coppo, M., Damiani, F. (eds.)
TYPES 2003. LNCS, vol. 3085, pp. 378–393. Springer, Heidelberg (2004)
18. Wiedijk, F.: Mizar Light for HOL Light. In: Boulton, R.J., Jackson, P.B. (eds.)
TPHOLs 2001. LNCS, vol. 2152, pp. 378–393. Springer, Heidelberg (2001)
A Brief Overview of Agda –
A Functional Language with Dependent Types
1 Introduction
A dependently typed programming language and proof assistant. Agda is a func-
tional programming language with dependent types. It is an extension of Martin-
Löf’s intuitionistic type theory [12,13] with numerous features which are useful
for practical programming. Agda is also a proof assistant. By the Curry-Howard
identification, we can represent logical propositions by types. A proposition is
proved by writing a program of the corresponding type. However, Agda is pri-
marily being developed as a programming language and not as a proof assistant.
Agda is the latest in a series of implementations of intensional type theory
which have been developed in Gothenburg (beginning with the ALF-system)
since 1990. The current version (Agda 2) has been designed and implemented by
Ulf Norell and is a complete redesign of the original Agda system. Like its prede-
cessors, the current Agda supports a wide range of inductive data types, pattern
matching, termination checking, and comes with an interface for programming
and proving by direct manipulation of proof terms. On the other hand, the new
Agda goes beyond the earlier systems in several respects: flexibility of pattern-
matching, more powerful module system, flexible and attractive concrete syntax
(using unicode), etc.
A system for functional programmers. A programmer familiar with a standard
functional language such as Haskell or OCaml will find it easy to get started
with Agda. As in ordinary functional languages, programming (and proving)
consists of defining data types and recursive functions. Moreover, users familiar
with Haskell’s generalised algebraic data types (GADTs) will find it easy to use
Agda’s inductive families [5].
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 73–78, 2009.
74 A. Bove, P. Dybjer, and U. Norell
The Agda wiki. More information about Agda can be found on the Agda wiki
[1]. There are tutorials [3,15], a guide to editing, type checking, and compiling
Agda code, a link to the standard library, and much else. There is also a link to
Norell’s PhD thesis [14] with a language definition and detailed discussions of
the features of Agda.
2 Agda Features
Data type definitions. Agda supports a rich family of strictly positive inductive
and inductive-recursive data types and families. Agda checks that the data type
definitions are well-formed according to a discipline similar to that in [6,7].
Recursive function definitions. One of Agda’s main features is its flexible pattern
matching for inductive families. A coverage checker makes sure the patterns cover
all possible cases. As in Martin-Löf type theory, all functions definable in Agda
must terminate, which is ensured by the termination checker.
Codata. The current version of Agda also provides coinductive data types. This
feature is however somewhat experimental and not yet stable.
Concrete syntax. The concrete syntax of Agda is much inspired by Haskell, but
also contains a few distinctive features such as mixfix operators and full support
for unicode identifiers and keywords.
Implicit arguments. The mechanism for implicit arguments allows the omission
of parts of the programs that can be inferred by the typechecker.
Module system. Agda’s module system supports separate compilation and allows
parametrised modules. Together with Agda’s record types, the module system
provides a powerful mechanism for structuring larger developments.
The above is a valid type in Agda syntax. To prove it in Agda we create a file
Example.agda with the following content:
module Example where
prf : ∀ n m → (n + m) + n ≡ m + (n + n)
prf n m = ?
Natural numbers and propositional equality are imported from the standard
library and opened to make their content available. Finally, we declare a proof
object prf, the type of which represents the proposition to be proved; here
∀ x → B is an abbreviation of (x : A) → B which does not explicitly mention
the argument type. The final line is the incomplete definition of prf: it is a
function of two arguments, but we do not yet know how to build a proof of the
equation so we leave a “?” in the right hand side. The “?” is a placeholder that
can be stepwise refined to obtain a complete proof.
In this way we can manually build a proof of the equation from associativity
and commutativity of +, and basic properties of equality which can be found in
the standard library. Manual equational reasoning however can become tedious
for complex equations. We shall therefore write a general procedure for equa-
tional reasoning in commutative monoids, and show how to use it for proving
the equation above.
Decision procedure for commutative monoids. First we define monoid expressions
as an inductive family indexed by the number of variables:
data Expr n : Set where
var : Fin n → Expr n
_⊕_ : Expr n → Expr n → Expr n
zero : Expr n
Fin n is a finite set with n elements; there are at most n variables. Note that
infix (and mixfix) operators are declared by using underscores to indicate where
the arguments should go.
To decide whether two monoid expressions are equal we normalise them and
compare the results. The normalisation function is
norm : ∀ {n} → Expr n → Expr n
Note that the first argument (the number of variables) is enclosed in braces,
which signifies that it is implicit. To define this function we employ normali-
sation by evaluation, that is, we first interpret the expressions in a domain of
“values”, and then reify these values into normal expressions. Below, we omit
the definitions of eval and reify and give only their types:
norm = reify ◦ eval
where we have used an auxiliary function build which builds an equation in Eqn
n from an n-place curried function by applying it to variables.
Equations will be proved by normalising both sides:
simpl : ∀ {n} → Eqn n → Eqn n
simpl (e1 == e2 ) = norm e1 == norm e2
We are now ready to define a general decision procedure for arbitrary commu-
tative monoids (the complete definition is given later):
prove : ∀ {n} (eqn : Eqn n) ρ → Prf (simpl eqn) ρ → Prf eqn ρ
The function takes an equation and an environment in which to interpret it, and
builds a proof of the equation given a proof of its normal form. The definition
of Prf will be given below.
We can instantiate this procedure to the commutative monoid of natural
numbers and apply it to our equation, an environment with the two variables,
and a proof of the normalised equation. Since the two sides of the equation will
be equal after normalisation we prove it by reflexivity:
prf : ∀ n m → (n + m) + n ≡ m + (n + n)
prf n m = prove eqn1 (n :: m :: []) ≡-refl
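The idea of deciding an equation by normalising both sides can be illustrated outside Agda. The following Python sketch is only an analogy: it carries no proofs, and since the definitions of eval and reify are omitted here, the choice of multisets of variable occurrences as the domain of values is an assumption made for illustration. The names Var, Plus, Zero, evaluate and same_normal_form are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


# Monoid expressions over numbered variables (mirrors var, _⊕_ and zero).
@dataclass(frozen=True)
class Var:
    index: int


@dataclass(frozen=True)
class Plus:
    left: object
    right: object


@dataclass(frozen=True)
class Zero:
    pass


def evaluate(e):
    """Interpret an expression as a multiset of variable occurrences.

    Assumption: this value domain is chosen for illustration only.
    """
    if isinstance(e, Var):
        return Counter({e.index: 1})
    if isinstance(e, Plus):
        return evaluate(e.left) + evaluate(e.right)
    return Counter()  # Zero contributes no occurrences


def reify(value):
    """Turn a multiset back into a right-nested sum, variables in order."""
    expr = Zero()
    for index in sorted(value, reverse=True):
        for _ in range(value[index]):
            expr = Plus(Var(index), expr)
    return expr


def norm(e):
    # Normalisation by evaluation: evaluate, then reify.
    return reify(evaluate(e))


def same_normal_form(lhs, rhs):
    # Both sides are compared after normalisation.
    return norm(lhs) == norm(rhs)


n, m = Var(0), Var(1)
lhs = Plus(Plus(n, m), n)   # (n + m) + n
rhs = Plus(m, Plus(n, n))   # m + (n + n)
```

In this sketch, same_normal_form(lhs, rhs) plays the role of checking that the simplified equation holds by reflexivity; the Agda development additionally produces a proof object.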
Opening the CommutativeMonoid module brings into scope the carrier C with
its equality relation _≈_ and the monoid operations _•_ and ε. A monoid ex-
pression is interpreted as a function from an environment containing values for
the variables to an element of C.
Env : ℕ → Set
Env n = Vec C n
References
1. Agda wiki page, http://wiki.portal.chalmers.se/agda/
2. Bertot, Y., Castéran, P.: Interactive Theorem Proving and Program Development.
In: Coq’Art: The Calculus of Inductive Constructions. Springer, Heidelberg (2004)
3. Bove, A., Dybjer, P.: Dependent types at work. In: Barbosa, L., Bove, A., Pardo,
A., Pinto, J.S. (eds.) LerNet ALFA Summer School 2008. LNCS, vol. 5520, pp.
57–99. Springer, Heidelberg (to appear, 2009)
4. Coquand, T., Huet, G.: The calculus of constructions. Information and Computa-
tion 76, 95–120 (1988)
5. Dybjer, P.: Inductive families. Formal Aspects of Computing 6, 440–465 (1994)
6. Dybjer, P.: A general formulation of simultaneous inductive-recursive definitions
in type theory. Journal of Symbolic Logic 65(2) (June 2000)
7. Dybjer, P., Setzer, A.: Indexed induction-recursion. Journal of Logic and Algebraic
Programming 66(1), 1–49 (2006)
8. Epigram homepage, http://www.e-pig.org
9. Gonthier, G.: The four colour theorem: Engineering of a formal proof. In: Kapur,
D. (ed.) ASCM 2007. LNCS, vol. 5081, p. 333. Springer, Heidelberg (2008)
10. Gordon, M., Milner, R., Wadsworth, C.: Edinburgh LCF. In: Kahn, G. (ed.) Se-
mantics of Concurrent Computation. LNCS, vol. 70. Springer, Heidelberg (1979)
11. Martin-Löf, P.: Constructive mathematics and computer programming. In: Logic,
Methodology and Philosophy of Science, VI, 1979, pp. 153–175. North-Holland,
Amsterdam (1982)
12. Martin-Löf, P.: Intuitionistic Type Theory. Bibliopolis, Napoli (1984)
13. Nordström, B., Petersson, K., Smith, J.M.: Programming in Martin-Löf’s Type
Theory. An Introduction. Oxford University Press, Oxford (1990)
14. Norell, U.: Towards a practical programming language based on dependent type
theory. PhD thesis, Chalmers University of Technology (2007)
15. Norell, U.: Dependently typed programming in Agda. In: Lecture Notes from the
Summer School in Advanced Functional Programming (2008) (to appear)
The Twelf Proof Assistant
Carsten Schürmann
IT University of Copenhagen
[email protected]
This work is supported in part by NABITT grant 2106-07-0019 of the Danish Strate-
gic Research Council.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 79–83, 2009.
1 Representation
%sig IL = {
o : type. %name o A.
|- : o -> type. %prefix 9 |-.
=> : o -> o -> o. %infix right 10 =>.
~ : o -> o.
=>I : (|- A -> |- B) -> |- A => B.
=>E : |- A => B -> |- A -> |- B.
~I : ({p:o} |- A -> |- p) -> |- ~ A.
~E : |- ~ A -> |- A -> |- B.
n = [p:o] ~ (~ p).
nI : |- A -> |- n A = [D] ~I [p:o] [u: |- ~ A] ~E u D.
}.
2 Reasoning
Besides being able to reconstruct and check LF types, Twelf is designed as a proof
assistant that allows Twelf users to reason about the meta-theoretic properties
of their encodings. In Twelf we separate cleanly the logical framework LF for
representation from a meta-logic Mω for reasoning. It is well-known that in LF
every term reduces to a unique β-normal η-long form that is also called canonical
form. These forms are inductively defined and give rise to induction principles
that are built into the meta-logic Mω . These principles allow the Twelf user to
reason about LF encodings even though they might be defined using higher-order
abstract syntax.
If we restrict ourselves to the Π2 -fragment of Mω , meta proofs can thankfully
be encoded as relations in LF — with the only caveat being that we need to check
that those relations behave as total functions (when executed on the Twelf logic
programming engine). For every well-typed input, those functions must compute
well-typed outputs, which means that computation must be terminating and
may not get stuck. As an illustrative example of such a meta proof, consider
the following signature that defines the Hilbert calculus and gives a proof of the
deduction theorem.
%sig HILBERT = {
o : type. %name o A.
|- : o -> type. %prefix 9 |-.
=> : o -> o -> o. %infix right 10 =>.
K : |- A => B => A.
S : |- (A => B => C) => (A => B) => A => C.
MP : |- A => B -> |- A -> |- B.
ded :
(|- A -> |- B) -> |- A => B -> type. %mode ded +D -E.
aK :
ded ([x] K) (MP K K).
aS :
ded ([x] S) (MP K S).
aID :
ded ([x] x) (MP (MP S (K : |- A => (A => A) => A)) K).
aMP :
ded ([x] MP (D x) (E x)) (MP (MP S D’) E’)
<- ded ([x] D x) D’ <- ded ([x] E x) E’.
%worlds () (ded _ _).
%total D (ded D _).
}.
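Read as a program, the four cases of ded describe a recursive translation that removes the hypothesis x from a derivation. The following Python sketch mirrors the cases aK, aS, aID and aMP on untyped proof terms. It is an illustration under stated assumptions: formulas and types are not tracked, so it shows only the shape of the computation and none of the guarantees established by %mode, %worlds and %total. The tuple encoding and the name uses_hypothesis are hypothetical.

```python
# Untyped proof terms: the axioms "K" and "S", the hypothesis "x",
# and applications of modus ponens encoded as ("MP", d, e).
K, S, HYP = "K", "S", "x"


def mp(d, e):
    return ("MP", d, e)


def ded(d):
    """Translate a derivation that may use HYP into one that does not."""
    if d == K:                      # case aK
        return mp(K, K)
    if d == S:                      # case aS
        return mp(K, S)
    if d == HYP:                    # case aID
        return mp(mp(S, K), K)
    tag, left, right = d            # case aMP
    assert tag == "MP"
    return mp(mp(S, ded(left)), ded(right))


def uses_hypothesis(d):
    """Check whether the hypothesis still occurs in a derivation."""
    if d == HYP:
        return True
    if isinstance(d, tuple):
        return uses_hypothesis(d[1]) or uses_hypothesis(d[2])
    return False
```

The recursion is structural on the input derivation, which is the informal counterpart of the termination argument that the totality declaration asks Twelf to check.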
The first six lines in this signature define the syntax of formulas, and the
Hilbert calculus for the implicational fragment of propositional logic. The
3 Organization
Last but not least, Twelf offers a deceptively simple but useful module system.
The module system provides the user with the ability to manage name spaces
but does not extend LF. In fact, every Twelf development that contains module
system features can be elaborated into an equivalent and pure LF signature.
Using structures, one may embed one signature into another, and using views
one may define maps from one signature to another.
Recall the definition of intuitionistic logic from Section 1. The following Twelf
development defines classical logic as an extension of intuitionistic logic by the
law of the excluded middle.
%sig CL = {
%struct IL : IL %open o |- => =>I =>E ~ n ~I ~E nI.
exm : |- ~ (~ A) => A.
}.
%struct imports IL into CL, and the %open allows the user to refer unqualified
to the subsequent list of constant names.
Next, we give the Kolmogorov translation from classical logic into intuitionistic
logic. To get this to work, we need to think of the usual turnstile ⊢ A as
⊢ ¬¬A. This is possible by defining a view in two steps. First, we define a view
from IL to IL.
%view KOLMIL : IL -> IL = {
o := o.
|- := [x] |- n x.
=> := [x][y] (n x) => (n y).
~ := [x] ~ x.
1 A term is ground if it doesn’t contain free logic variables.
2 Recall that terms may be open when using higher-order abstract syntax.
In this view, the substructure IL is mapped to the view KOLMIL, and the law of
the excluded middle to a term representing a derivation of ¬¬(¬¬A ⊃ A).
The Twelf system and documentation can be accessed from our homepage at
http://www.twelf.org. More information about the module system is available
from http://www.twelf.org/~mod.
Hints in Unification
Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, and Enrico Tassi
1 Introduction
Mathematical objects commonly have multiple, isomorphic representations or
can be seen at different levels of an algebraic hierarchy, according to the kind
or amount of information we wish to expose or emphasise. This richness is a
major tool in mathematics, allowing one to pass implicitly from one representation
to another depending on the user's needs. This operation is much more difficult
for machines, and many works have been devoted to the problem of adding
syntactic facilities to mimic the abus de notation so typical of the mathematical
language. The point is not only to free the user from the need to type redundant
information, but to switch to a more flexible linkage model, by combining, for
instance, resolution of overloaded methods, or supporting multiple views of the
same component.
All these operations, in systems based on type theories, are traditionally per-
formed during type-inference, by a module that we call “refiner”. The refiner is
not only responsible for inferring types that have not been explicitly declared:
it must synthesise or constrain terms omitted by the user; it must adjust the
formula, for instance by inserting functions to pass from one representation to
another one; it may help the user in identifying the minimal algebraic structure
providing a meaning to the formula.
From the user's point of view, the refiner is the primary source of “intelligence”
of the system: the more effective it is, the easier communication with the
system becomes. Thus, a natural trend in the development of proof assistants
consists in constantly improving the functionalities of this component, and in
particular in moving towards a tighter integration between the refiner and the
modules in charge of proof automation.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 84–98, 2009.
Among the mechanisms which have been recently introduced in the literature
with the aim of improving the power and flexibility of the refiner, we recall,
in Section 2, Canonical Structures [14], Type Classes [13], and Pullbacks [10].
Our claim is that all these mechanisms are particular instances of a simpler and
more general technique presented in Section 3, which consists in providing suitable
hints to the unification procedure underlying the type inference algorithm.
This simple observation paves the way to a light, modular and non-intrusive
implementation of all the above mentioned techniques, and lends itself to
interesting generalisations as discussed in Section 4.
In the rest of the paper we shall use the notation ≡ to express the type
equivalence relation of the given calculus. A unification problem will be expressed
as $A \stackrel{?}{\equiv} B$, resulting in a substitution σ such that Aσ ≡ Bσ. Metavariables will
be denoted with ?i, and substitutions are described as lists of assignments of the
form ?i := t.
In this section, we recall some heuristics for type refinement already described
in the literature and implemented in interactive provers like Coq, Isabelle and
Matita.
$$0 + x = x \tag{1}$$
Suppose that the notation (x + y) is associated with gop ? x y where gop is the
projection of the group operation with type:
gop ?1 0 x = x
where ?1 is a metavariable. For (1) to be well typed the arguments of gop have
to be of type gcarr g for some group g. In particular, the first user provided
argument 0 is of type Z, generating the following unification problem:
$$\mathit{gcarr}\ ?_1 \stackrel{?}{\equiv} Z$$
If the user declared Z as the canonical group structure over Z, the system finds
the solution ?1 := Z. This heuristic is triggered only when the unification prob-
lem involves a record projection πi applied to a metavariable versus a constant
c. Canonical structures S := {c1; ...; cn} can be easily indexed using as keys all
the pairs of the form ⟨πi, ci⟩.
This device was introduced by A. Saibi in the Coq system [14] and is extensively
used in the formalisation of finite group theory by Gonthier et al. [2,6].
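The indexing scheme can be sketched in Python as a table keyed by pairs of a projection and a constant. This is only an illustration under assumed names: the structure name Z_group, the field values and the string encoding of projections and constants are hypothetical and do not reflect the data structures actually used in Coq or Matita.

```python
# Table of canonical structures, keyed by (projection, constant).
canonical = {}


def declare_canonical(structure_name, fields):
    """Register a structure under every (projection, field value) pair."""
    for projection, constant in fields.items():
        canonical[(projection, constant)] = structure_name


def solve_projection(projection, constant):
    """Solve `projection ?S = constant` by lookup; None if nothing applies."""
    return canonical.get((projection, constant))


# Hypothetical declaration of the integers as a canonical group.
declare_canonical("Z_group", {"gcarr": "Z", "gop": "Zplus", "unit": "0"})
```

With this table, the problem of unifying gcarr applied to a metavariable with the constant Z is answered by solve_projection("gcarr", "Z"), while a constant with no declared structure yields no solution and unification fails as usual.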
Class Group (A : Type) := { unit : A; gop : A → A → A; . . .}
Instance Z : Group Z := { unit := 0; gop := Zplus; . . .}
Instance × (A,B: Type) (G: Group A) (H: Group B) : Group (A × B) := {
unit := ⟨unit G, unit H⟩;
gop ⟨x1,x2⟩ ⟨y1,y2⟩ := ⟨gop G x1 y1, gop H x2 y2⟩;
...
}
With this device a slightly more complicated formula than (1) can be accepted
by the system, such as:
⟨0, 0⟩ + x = x
Unfolding the + notation we obtain
gop ?1 ?2 ⟨0, 0⟩ x = x
where the type of gop and the type of ?2 are:
gop : ∀T : Type.∀g : Group T.T → T → T
?2 : Group ?1
After ?1 is instantiated with Z × Z proof automation is used to inhabit ?2 whose
type has become Group (Z × Z). Automation is limited to a Prolog-like search
whose clauses are the user declared instances. Notice that the user has not defined
a type class instance (i.e. a canonical structure) over the group Z × Z.
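The Prolog-like search over declared instances can be sketched as follows. This Python fragment is a minimal illustration, not the resolution procedure of any particular system: types are encoded as the string "Z" or as tuples ("prod", A, B), and the instance names Z_instance and prod_instance are hypothetical.

```python
INSTANCES = []  # clauses are tried in declaration order


def instance(clause):
    """Register a clause; a clause returns an instance term or None."""
    INSTANCES.append(clause)
    return clause


@instance
def group_Z(ty, resolve):
    # Clause without premises: Group Z.
    return "Z_instance" if ty == "Z" else None


@instance
def group_prod(ty, resolve):
    # Clause with two premises: Group A and Group B give Group (A x B).
    if isinstance(ty, tuple) and ty[0] == "prod":
        g, h = resolve(ty[1]), resolve(ty[2])
        if g is not None and h is not None:
            return ("prod_instance", g, h)
    return None


def resolve(ty):
    """Search for an inhabitant of Group ty among the declared clauses."""
    for clause in INSTANCES:
        found = clause(ty, resolve)
        if found is not None:
            return found
    return None
```

Calling resolve(("prod", "Z", "Z")) builds the product instance from the two declared clauses, although no instance for that product type was declared directly.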
The coercions pullback device was introduced as part of the manifesting coercions
technique by Sacerdoti Coen and Tassi in [10] to ease the encoding of algebraic
structures in type theory (see [11] for a formalisation explicating that technique).
This device comes into play in a setting with a hierarchy of structures, some
of which are built by combining simpler structures. The carrier projection
is very frequently declared as a coercion [8], allowing the user to type formulas
like ∀g : Group.∀x : g.P (x), omitting the application of gcarr to g (i.e. the system
is able to insert the application of coercions when needed [12]).
[Figure: coercion graph. The ring structure has the coercions r_group to group and r_monoid to monoid; group and monoid have the coercions gcarr and mcarr, respectively, to Type.]
x ∗ (y + z) = x ∗ y + x ∗ z
Expanding the notation we obtain as the left hand side the following
mop ?1 x (gop ?2 y z)
The second argument of mop has type gcarr ?2 but is expected to have type
mcarr ?1 , corresponding to the unification problem:
$$\mathit{gcarr}\ ?_2 \stackrel{?}{\equiv} \mathit{mcarr}\ ?_1$$
The system should infer the minimal algebraic structure in which the formula
can be interpreted, and the coercions pullback device amounts to the calculation
of the pullback (in the categorical sense) of the coercions graph for the arrows gcarr
and mcarr. The solution, in our example, is the following substitution:
?2 := r_group ?3    ?1 := r_monoid ?3
The solution is correct since the carriers of the structures composing the ring
structure are compatible w.r.t. equivalence (i.e. the two paths in the coercions
graph commute), that corresponds to the following property: for every ring r
In higher order logic, or also in first order logic modulo sufficiently powerful
rewriting, unification U is undecidable. To avoid divergence and to manage the
complexity of the problem, theorem provers usually implement a simplified,
decidable unification algorithm Uo, essentially based on first order logic, sometimes
extended to cope with reduction (two terms t1 and t2 are unifiable if they have
reducts t1′ and t2′, usually computed w.r.t. a given reduction strategy, which
are first order unifiable). Unification hints provide a way to easily extend the
system’s unification algorithm Uo (towards U) with heuristics to choose solutions
which can be less than most general, but nevertheless constitute a sensible
default instantiation according to the user.
The general structure of a hint is
$$\frac{\vec{?x} := \vec{H}}{P \equiv Q}\;\textit{myhint}$$
where $P \equiv Q$ is a linear pattern with free variables $FV(P, Q) = \vec{?v}$, $\vec{?x} \subseteq \vec{?v}$, all
variables in $\vec{?x}$ are distinct and $H_i$ cannot depend on $?x_i, \ldots, ?x_n$. A hint is
acceptable if $P[\vec{H}/\vec{?x}] \equiv Q[\vec{H}/\vec{?x}]$, i.e. if the two terms obtained by telescopic
substitution are convertible. Since convertibility is (typically) a decidable
relation, the system is able to discriminate acceptable hints.
Hints are supposed to be declared by the user, or automatically generated by
the system in particular situations. Formally, a unification hint induces a schematic
unification rule over the schematic variables $\vec{?v}$ to reduce unification problems
to simpler ones:
$$\frac{\vec{?x} \stackrel{?}{\equiv} \vec{H}}{P \stackrel{?}{\equiv} Q}\;\textit{myhint}$$
Since $\vec{?x}$ are schematic variables, when the rule is instantiated, the unification
problems $\vec{?x} \stackrel{?}{\equiv} \vec{H}$ become non trivial.
When a hint is acceptable, the corresponding schematic rule for unification is
sound (proof: a solution to $\vec{?x} \stackrel{?}{\equiv} \vec{H}$ is a substitution σ such that $\vec{?x}\sigma \equiv \vec{H}\sigma$ and
thus $P\sigma \equiv P[\vec{H}/\vec{?x}]\sigma \equiv Q[\vec{H}/\vec{?x}]\sigma \equiv Q\sigma$; hence σ is also a solution to $P \stackrel{?}{\equiv} Q$).
From the user perspective, the intuitive reading is that, having a unification
problem of the kind $P \stackrel{?}{\equiv} Q$, then the “hinted” solution is $\vec{?x} := \vec{H}$.
The intended use of hints is upon failure of the basic unification algorithm
Uo : the recursive definition unif that implements Uo
The function try_hints simply matches the two terms m and n against the hint
patterns (in a fixed order decided by the user) and returns the first solution
found:
and try_hints m n =
match m,n with
| ...
| P,Q when unif(x,H) as sigma -> sigma (* myhint *)
| ...
$$\frac{?S := T}{\pi_i\ ?S \equiv t}$$
Intuitively, the hint says that, if the carrier of a group ?0 is a product ?1 × ?2,
where ?1 is the carrier of a group ?3 and ?2 is the carrier of a group ?4, then
we may guess that ?0 is the group product of ?3 and ?4. This is not the only
possible solution but, in the absence of alternatives, it is a case worth exploring.
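Table 1
Problem                                                                                          Solution
$\mathit{gcarr}\ ?_1 \stackrel{?}{\equiv} \mathit{mcarr}\ ?_2$                                   ?1 := r_group ?3, ?2 := r_monoid ?3
$\mathit{gcarr}\ ?_1 \stackrel{?}{\equiv} \mathit{mcarr}\ (\mathit{r\_monoid}\ ?_2)$             ?1 := r_group ?2
$\mathit{gcarr}\ (\mathit{r\_group}\ ?_1) \stackrel{?}{\equiv} \mathit{mcarr}\ ?_2$              ?2 := r_monoid ?1
$\mathit{gcarr}\ (\mathit{r\_group}\ ?_1) \stackrel{?}{\equiv} \mathit{mcarr}\ (\mathit{r\_monoid}\ ?_2)$   ?2 := ?1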
In a coherent dag, any pair of cofinal coercions defines a hint pattern, and the
corresponding pullback projections (if they exist) are the hinted solution.
Consider again the example given in Sect. 2.3. The generated hint is
$$\frac{?_1 := \mathit{r\_group}\ ?_3 \qquad ?_2 := \mathit{r\_monoid}\ ?_3}{\mathit{gcarr}\ ?_1 \equiv \mathit{mcarr}\ ?_2}$$
This hint is enough to solve all the unification problems listed in Table 1, which
occur often when formalising algebraic structures (e.g. in [11]).
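How a single hint accounts for all the problems of Table 1 can be illustrated with a small first order unifier that falls back on hints upon failure. The Python sketch below is written under several assumptions: terms are encoded as strings (metavariables start with "?") and tuples (head followed by arguments), hints are tried in both orientations, fresh metavariables are named ?r0, ?r1 and so on, and acceptability of hints is not checked. It is not the implementation found in Matita.

```python
import itertools

fresh_ids = itertools.count()


def is_meta(t):
    return isinstance(t, str) and t.startswith("?")


def walk(t, subst):
    # Follow metavariable assignments until an unassigned term is reached.
    while is_meta(t) and t in subst:
        t = subst[t]
    return t


def resolve(t, subst):
    # Apply the substitution everywhere inside a term.
    t = walk(t, subst)
    if isinstance(t, tuple):
        return tuple(resolve(a, subst) for a in t)
    return t


def occurs(v, t, subst):
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t)


def unify(m, n, subst, hints):
    """Return an extended substitution, or None if unification fails."""
    m, n = walk(m, subst), walk(n, subst)
    if m == n:
        return subst
    if is_meta(m):
        return None if occurs(m, n, subst) else {**subst, m: n}
    if is_meta(n):
        return None if occurs(n, m, subst) else {**subst, n: m}
    if isinstance(m, tuple) and isinstance(n, tuple) and len(m) == len(n):
        s = subst
        for a, b in zip(m, n):
            s = unify(a, b, s, hints)
            if s is None:
                break
        if s is not None:
            return s
    return try_hints(m, n, subst, hints)  # first order unification failed


def try_hints(m, n, subst, hints):
    # Try the hints in order; the first one whose premises unify wins.
    for hint in hints:
        for a, b in ((m, n), (n, m)):
            premises = hint(a, b)
            if premises is None:
                continue
            s = subst
            for lhs, rhs in premises:
                s = unify(lhs, rhs, s, hints)
                if s is None:
                    break
            if s is not None:
                return s
    return None


def pullback_hint(m, n):
    # Pattern gcarr ?1 = mcarr ?2 with premises
    # ?1 = r_group ?3 and ?2 = r_monoid ?3, where ?3 is fresh.
    if (isinstance(m, tuple) and isinstance(n, tuple)
            and len(m) == 2 and len(n) == 2
            and m[0] == "gcarr" and n[0] == "mcarr"):
        r = f"?r{next(fresh_ids)}"
        return [(m[1], ("r_group", r)), (n[1], ("r_monoid", r))]
    return None
```

Because the premises of the hint are themselves solved by unification, the same hint covers the cases in which one or both arguments are already partially instantiated, which is what distinguishes the four rows of the table.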
4 Extensions
All the previous examples are essentially based on simple conversions involv-
ing records and projections. A natural idea is to extend the approach to more
complex cases involving arbitrary, possibly recursive functions.
As we already observed, the natural use of hints is in the presence of invertible
reductions, where we may infer part of the structure of a term from its reduct.
A couple of typical situations borrowed from arithmetic could be the following,
where plus and times are defined by recursion on the first argument, in the
obvious way:
$$\frac{?_1 := 0 \qquad ?_2 := 0}{?_1 + ?_2 \equiv 0}\;\textit{plus0} \qquad\qquad \frac{?_1 := 1 \qquad ?_2 := 1}{?_1 * ?_2 \equiv 1}\;\textit{times1}$$
To understand the possible use of these hints, suppose for instance that we have
the goal
1 ≤ a ∗ b
under the assumptions 1 ≤ a and 1 ≤ b; we may directly apply the monotonicity
of times
∀x, y, w, z. x ≤ w → y ≤ z → x ∗ y ≤ w ∗ z
which will succeed, unifying (by means of the hint) both x and y with 1, w with a
and z with b.
Even when patterns do not admit a unique solution we may nevertheless
identify an “intended” hint.
Consider for instance the unification problem
$$?_n + ?_m \stackrel{?}{\equiv} S\ ?_p$$
In this case there are two possible solutions:
1) ?n := 0 and ?m := S ?p
2) ?n := S ?q and ?p :=?q +?m
however, the first one can be considered as somewhat degenerate, suggesting to
keep the second one as a possible hint.
$$\frac{?_n := S\ ?_q \qquad ?_p :=\ ?_q + ?_m}{?_n + ?_m \equiv S\ ?_p}\;\textit{plus-S}$$
This would for instance allow us to apply the lemma le_plus : ∀x, y : N. x ≤ y + x
to prove that m ≤ S(n + m).
The hint can also be used recursively: the unification problem
$$?_j + ?_i \stackrel{?}{\equiv} S(S(n + m))$$
We call sgcarr the projection extracting the carrier of the semi-group structure,
and semigroup the record type representing the algebraic structure under anal-
ysis. Associated to that abstract syntax there is an interpretation function [[·]]S
mapping an abstract term of type Expr S to a concrete one of type sgcarr S.
let rec [[e : Expr S]](S:semigroup) : sgcarr S :=
match e with
[ EVar x ⇒ x
| Eop x y ⇒ sgop S [[x]]S [[y]]S
].
The normalisation function simpl is given the following type and is proved
sound:
let rec simpl (e: Expr S) : Expr S := . . .
lemma soundness:
∀ S:semigroup.∀ P:sgcarr S → Prop.∀ x:Expr S. P [[simpl x]]S →P [[x]]S
Given the following sample goal, imagine the user applies the soundness lemma
(where P is instantiated with λx.x = d).
a + (b + c) = d
yielding the unification problem
[[?1]]g ≡? a + (b + c)        (2)
This is exactly what the extra-logical initial phase of every reflexive tactic has
to do: interpret a given concrete term into an abstract syntax.
We now show how the unification problem is solved by declaring the following two
hints, where h-add is declared with higher precedence.
  ?a := Eop ?S ?x ?y    ?m := [[?x]]?S    ?n := [[?y]]?S
  ------------------------------------------------------ h-add
                   [[?a]]?S ≡ ?m + ?n

  ?a := EVar ?S ?z
  ------------------ h-base
  [[?a]]?S ≡ ?z
Hint h-add can be applied to problem (2), yielding three new recursive unification
problems. Hint h-base is the only hint that can be applied to the second problem,
while the third one is matched by h-add, yielding three more problems whose
last two can be solved by h-base:
[[?1]]g ≡? a + (b + c)                      by h-add
    ?1 ≡? Eop g ?x ?y
    a ≡? [[?x]]g                            by h-base
        ?x ≡? EVar g a
    b + c ≡? [[?y]]g                        by h-add
        ?y ≡? Eop g ?x ?y
        b ≡? [[?x]]g                        ⋮
        c ≡? [[?y]]g                        ⋮
The leaves of the tree are all trivial instantiations of metavariables that together
form a substitution that instantiates ?1 with the following expected term:
Eop g (EVar g a) (Eop g (EVar g b) (EVar g c))
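The effect of the two hints can be mimicked by an ordinary recursive function. The following Python sketch is an illustration under stated assumptions, not the mechanism of the paper: concrete terms are modelled as nested ("+", l, r) tuples, the semigroup argument is dropped, and the names reify and interp are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EVar:
    value: object

@dataclass(frozen=True)
class Eop:
    left: object
    right: object

def interp(e, op):
    """[[e]]: map abstract syntax back to a concrete value using `op`."""
    if isinstance(e, EVar):
        return e.value
    return op(interp(e.left, op), interp(e.right, op))

def reify(term):
    """Invert interp on terms written as nested ('+', l, r) tuples."""
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "+":
        # plays the role of hint h-add, which is tried first
        return Eop(reify(term[1]), reify(term[2]))
    # plays the role of hint h-base
    return EVar(term)
```

Here reify(("+", "a", ("+", "b", "c"))) builds the counterpart of the term displayed above, and interpreting the result gives back the original concrete term.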
[[?1 ; ?2]]?3 ≡? x ∗ (x⁻¹ ∗ y)
and admits multiple solutions (corresponding to permutations of elements in the
heap).
To be able to interpret the whole concrete syntax of groups in the abstract
syntax described by the Expr type, we need the following hints:
  ?a := Emult ?x ?y    ?m := [[?x ; ?Γ]]?g    ?n := [[?y ; ?Γ]]?g
  ---------------------------------------------------------------- h-times
                    [[?a ; ?Γ]]?g ≡ ?m ∗ ?n
To identify equal variables, and give them the same abstract representation,
we need two hints, implementing the lookup operation in the heap (or better,
the generation of a duplicate free heap by means of explicit sharing).
  ?a := Evar 0    ?Γ := ?r :: ?Θ
  ------------------------------- h-var-base
       [[?a ; ?Γ]]?g ≡ ?r
The second recursive unification problem can be solved applying hint h-var-base:
x ≡? [[?x ; ?2]]                            by h-var-base
    ?x ≡? Evar 0
    ?2 ≡? x :: ?Θ
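The sharing obtained through the heap can be pictured procedurally. The Python sketch below is only an approximation under assumptions: the paper builds the heap by unification against an open-ended list, whereas here a list is extended in place; only the product case is covered; and lookup_or_add and reify are hypothetical names.

```python
def lookup_or_add(x, heap):
    """Return the index of x in heap, appending x if it is not there yet."""
    for i, y in enumerate(heap):
        if y == x:           # the variable is already in the heap: share it
            return i
    heap.append(x)           # assumed behaviour for a variable not seen before
    return len(heap) - 1

def reify(term, heap):
    """Reify nested ('*', l, r) tuples; leaves become ('Evar', index)."""
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "*":
        return ("Emult", reify(term[1], heap), reify(term[2], heap))
    return ("Evar", lookup_or_add(term, heap))
```

Two occurrences of the same concrete variable receive the same index, so the resulting heap is duplicate free.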
5 Conclusions
In a higher order setting, unification problems of the kind f ?i ≡? o and ?f i ≡? o
are extremely complex. In the latter case, one can do little better than us-
ing generate-and-test techniques; in the first case, the search can be partially
driven by the structure of the function, but still the operation is very expensive.
Moreover, higher order unification does not admit most general unifiers, so both
problems above usually have several different solutions, and it is hard to guide
the procedure towards the intended solution.
On the other hand, it is simple to hint solutions to the unification algorithm,
since the system merely has to check their correctness. By adding suitable hints
in a controlled way, we can restrict to a first order setting while keeping interesting
higher-order inferences. In particular, we proved that hints are expressive enough
to mimic some interesting ad-hoc unification heuristics like canonical structures,
type classes and coercion pullbacks. It also seems that unification errors reported
by the system on error-free formulae can be used to suggest to the user the
need for a missing hint, in the spirit of “productive use of failure” [4].
Unification hints can be efficiently indexed using data structures for first order
terms like discrimination trees. Their integration with the general flow of the
unification algorithm is less intrusive than the previously cited ad-hoc techniques.
We have also shown an interesting example of application of unification hints
to the implementation of reflexive tactics. In particular, we instruct the unifica-
tion procedure to automatically infer a syntactic representation S of a term t
such that [[S]] ≡ t, introducing sharing in the process. This operation previously
had to be done by writing a small extra-logical program in the programming
language used to write the system, or in some ad-hoc language for customisa-
tion, like L-tac [5]. Our proposal is superior since the refiner itself becomes able
to solve such unification problems, that can be triggered in situations where the
external language is not accessible, like during semantic analysis of formulae.
A possible extension consists in adding backtracking to the management of
hints. This would require a more intrusive reimplementation of the unification
algorithm; moreover it is not clear that this is the right development direction
since the point is not to just add expressive power to the unification algorithm,
but to get the right balance between expressiveness and effectiveness, especially
in case of failure.
Another possible extension is to relax the linearity constraint on patterns with
the aim of capturing more invertible rules, as in the following cases:
  ?x := 0                      ?x := S ?z
  ------------- plus-0         ----------------------- plus-S
  ?x + ?y ≡ ?y                 ?x + ?y ≡ S (?z + ?y)
It seems natural to enlarge the matching relation allowing the recursive use of
hints, at least when they are invertible. For instance, to solve the unification
problem ?1 + (?2 + x) ≡? x we need to apply hint plus-0 but matching the hint
pattern requires a recursive application of hint plus-0 (hence it is not matching
in the usual sense, since ?2 has to be instantiated with 0). The properties of this
“matching” relation need a proper investigation that we leave for future work.
References
1. Barthe, G., Ruys, M., Barendregt, H.: A two-level approach towards lean proof-
checking. In: Berardi, S., Coppo, M. (eds.) TYPES 1995. LNCS, vol. 1158, pp.
16–35. Springer, Heidelberg (1996)
2. Bertot, Y., Gonthier, G., Ould Biha, S., Pasca, I.: Canonical big operators. In:
Mohamed, O.A., Muñoz, C., Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp.
86–101. Springer, Heidelberg (2008)
3. Boutin, S.: Using reflection to build efficient and certified decision procedures. In:
Ito, T., Abadi, M. (eds.) TACS 1997. LNCS, vol. 1281, pp. 515–529. Springer,
Heidelberg (1997)
4. Bundy, A., Basin, D., Hutter, D., Ireland, A.: Rippling: meta-level guidance for
mathematical reasoning. Cambridge University Press, New York (2005)
5. Delahaye, D.: A Tactic Language for the System Coq. In: Parigot, M., Voronkov,
A. (eds.) LPAR 2000. LNCS, vol. 1955, pp. 85–95. Springer, Heidelberg (2000)
6. Gonthier, G., Mahboubi, A., Rideau, L., Tassi, E., Thery, L.: A Modular Formali-
sation of Finite Group Theory. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007.
LNCS, vol. 4732, pp. 86–101. Springer, Heidelberg (2007)
7. Hall, C., Hammond, K., Jones, S.P., Wadler, P.: Type classes in Haskell. ACM
Transactions on Programming Languages and Systems 18, 241–256 (1996)
8. Luo, Z.: Coercive subtyping. J. Logic and Computation 9(1), 105–130 (1999)
9. Luo, Z.: Manifest fields and module mechanisms in intensional type theory. In:
Miculan, M., Scagnetto, I., Honsell, F. (eds.) TYPES 2007. LNCS, vol. 4941.
Springer, Heidelberg (2008)
10. Sacerdoti Coen, C., Tassi, E.: Working with mathematical structures in type theory.
In: Miculan, M., Scagnetto, I., Honsell, F. (eds.) TYPES 2007. LNCS, vol. 4941,
pp. 157–172. Springer, Heidelberg (2008)
11. Sacerdoti Coen, C., Tassi, E.: A constructive and formal proof of Lebesgue’s dom-
inated convergence theorem in the interactive theorem prover Matita. Journal of
Formalized Reasoning 1, 51–89 (2008)
12. Saibi, A.: Typing algorithm in type theory with inheritance. In: The 24th Annual
ACM SIGPLAN - SIGACT Symposium on Principle of Programming Language
(POPL) (1997)
13. Sozeau, M., Oury, N.: First-class type classes. In: Mohamed, O.A., Muñoz, C.,
Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp. 278–293. Springer, Heidelberg
(2008)
14. The Coq Development Team. The Coq proof assistant reference manual (2005),
http://coq.inria.fr/doc/main.html
15. Wadler, P., Blott, S.: How to make ad-hoc polymorphism less ad hoc. In: POPL
1989: Proceedings of the 16th ACM SIGPLAN-SIGACT symposium on Principles
of programming languages, pp. 60–76. ACM, New York (1989)
16. Wenzel, M.: Type classes and overloading in higher-order logic. In: Gunter,
E.L., Felty, A.P. (eds.) TPHOLs 1997. LNCS, vol. 1275, pp. 307–322. Springer,
Heidelberg (1997)
Psi-calculi in Isabelle
1 Introduction
There are today several formalisms to describe the behaviour of computer sys-
tems. Some of them, like the lambda-calculus and the pi-calculus, are intended
to explore fundamental principles of computing and consequently contain as few
and basic primitives as possible. Others are more tailored to application areas
and include many constructions for modeling convenience. Such formalisms are
now being developed en masse. While this is not necessarily a bad thing there
is a danger in developing complicated theories too quickly. The proofs (for ex-
ample of compositionality properties) become gruesome with very many cases
to check and the temptation to resort to formulations such as “by analogy with
. . . ” or “is easily seen. . . ” can be overwhelming. As cases in point, both the
applied pi-calculus [1] and the concurrent constraint pi-calculus [8] have recently
been discovered to have flaws or incompletenesses in the sense that the claimed
compositionality results do not hold [5].
Since such proofs often require stamina and attention to detail rather than
ingenuity and complicated new constructions they should be amenable to proof
mechanisation. Our contribution in this paper is to implement a family of
application-oriented calculi in Isabelle [12]. The calculi we consider are the
so-called psi-calculi [5], obtained by extending the basic untyped pi-calculus with
the following parameters: (1) a set of data terms, which can function as both
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 99–114, 2009.
© Springer-Verlag Berlin Heidelberg 2009
100 J. Bengtson and J. Parrow
2 Psi-calculi
This section is a brief recapitulation of psi-calculi and nominal data types; for a
more extensive treatment including motivations and examples see [5].
also represent symbols acting as variables in the sense that they can be subject to
substitution. A nominal set [13] is a set equipped with name swapping functions
written (a b), for any names a, b. An intuition is that (a b)·X is X with a replaced
by b and b replaced by a. A sequence of swappings is called a permutation, often
denoted p, where p · X means the term X with the permutation p applied to it.
We write p− for the reverse of p. The support of X, written n(X), is the least
set of names A such that (a b) · X = X for all a, b not in A. We write a#X,
pronounced “a is fresh for X”, for a ∉ n(X). If A is a set of names we write
A#X to mean ∀a ∈ A . a#X. We require all elements to have finite support, i.e.,
n(X) is finite for all X. A function f is equivariant if (a b) · f (X) = f ((a b) · X)
holds for all X, and similarly for functions and relations of any arity. Intuitively,
this means that all names are treated equally.
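The notions of swapping, permutation, support and freshness can be made concrete on a toy model. The Python sketch below is an illustration only: it assumes that names are strings and that elements are binder-free nested tuples of names, so that the support is just the set of names that occur; all function names are hypothetical.

```python
def swap(a, b, x):
    """(a b) · x on names (strings) and nested tuples of names."""
    if isinstance(x, tuple):
        return tuple(swap(a, b, y) for y in x)
    if x == a:
        return b
    if x == b:
        return a
    return x

def apply_perm(p, x):
    """Apply a permutation given as a list of swappings; the last pair acts first."""
    for a, b in reversed(p):
        x = swap(a, b, x)
    return x

def support(x):
    """Names occurring in a binder-free element: its support n(x)."""
    if isinstance(x, tuple):
        return set().union(*(support(y) for y in x))
    return {x}

def fresh(a, x):
    """a # x holds iff a is not in the support of x."""
    return a not in support(x)
```

In this model a function such as duplicating its argument is equivariant, since swapping before or after applying it gives the same result.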
2.2 Agents
A psi-calculus is defined by instantiating three nominal data types and four
operators:
Definition 1 (Psi-calculus parameters). A psi-calculus requires the three
(not necessarily disjoint) nominal data types:
T the (data) terms, ranged over by M, N
C the conditions, ranged over by ϕ
A the assertions, ranged over by Ψ
and the four equivariant operators:
↔̇ : T × T → C        Channel Equivalence
⊗ : A × A → A         Composition
1 : A                 Unit
⊢ ⊆ A × C             Entailment
We require the existence of a substitution function for T, C and A. When X
is a term, condition or assertion we write X[ã := T̃] to mean the simultaneous
substitution of the names ã for the terms T̃ in X. The exact requisites of this
function will be covered in Section 4.
The binary functions above will be written in infix. Thus, if M and N are
terms then M ↔̇ N is a condition, pronounced “M and N are channel equivalent”
and if Ψ and Ψ′ are assertions then so is Ψ ⊗ Ψ′. Also we write Ψ ⊢ ϕ, “Ψ
entails ϕ”, for (Ψ, ϕ) ∈ ⊢.
We say that two assertions are equivalent if they entail the same conditions:
Definition 2 (assertion equivalence). Two assertions are equivalent, written
Ψ ≃ Ψ′, if for all ϕ we have that Ψ ⊢ ϕ ⇔ Ψ′ ⊢ ϕ.
Channel equivalence must be symmetric and transitive, ⊗ must be compositional
with regard to ≃, and the assertions with (⊗, 1) form an abelian monoid.
In the following ã means a finite (possibly empty) sequence of names, a1 , . . . , an .
The empty sequence is written ε and the concatenation of ã and b̃ is written ãb̃.
In the Input M̲(λx̃)N.P we require that x̃ ⊆ n(N) is a sequence without duplicates,
and here any name in x̃ binds its occurrences in both N and P. Restriction
binds a in P. An assertion is guarded if it is a subterm of an Input or
Output. In a replication !P there may be no unguarded assertions in P, and in
case ϕ1 : P1 [] · · · [] ϕn : Pn there may be no unguarded assertion in any Pi.
Table 1. Structured operational semantics. Symmetric versions of Com and Par are
elided. In the rule Com we assume that F(P) = ⟨BP, ΨP⟩ and F(Q) = ⟨BQ, ΨQ⟩ where
BP is fresh for all of Ψ, BQ, Q, M and P, and that BQ is similarly fresh. In the rule
Par we assume that F(Q) = ⟨BQ, ΨQ⟩ where BQ is fresh for Ψ, P and α. In Open the
expression νã ∪ {b} means the sequence ã with b inserted anywhere.
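In      Ψ ⊢ M ↔̇ K
        ------------------------------------------------
        Ψ ▷ M̲(λỹ)N.P  --K̲ N[ỹ:=L̃]-->  P[ỹ := L̃]

Out     Ψ ⊢ M ↔̇ K
        ------------------------------
        Ψ ▷ M̄ N.P  --K̄ N-->  P

Case    Ψ ▷ Pi --α--> P′      Ψ ⊢ ϕi
        ------------------------------
        Ψ ▷ case ϕ̃ : P̃  --α-->  P′

Com     ΨQ ⊗ Ψ ▷ P --M̄(νã)N--> P′    ΨP ⊗ Ψ ▷ Q --K̲ N--> Q′    Ψ ⊗ ΨP ⊗ ΨQ ⊢ M ↔̇ K
        ------------------------------------------------------------------------------  ã#Q
        Ψ ▷ P | Q  --τ-->  (νã)(P′ | Q′)

Par     ΨQ ⊗ Ψ ▷ P --α--> P′
        ------------------------------  bn(α)#Q
        Ψ ▷ P | Q  --α-->  P′ | Q

Scope   Ψ ▷ P --α--> P′
        ------------------------------  b#α, Ψ
        Ψ ▷ (νb)P  --α-->  (νb)P′

Open    Ψ ▷ P --M̄(νã)N--> P′
        --------------------------------------  b#ã, Ψ, M    b ∈ n(N)
        Ψ ▷ (νb)P  --M̄(νã ∪ {b})N-->  P′

Rep     Ψ ▷ P | !P --α--> P′
        ------------------------------
        Ψ ▷ !P  --α-->  P′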
3 Binding Sequences
The main difficulty when formalising any calculus with binders is to handle
alpha-equivalence. The techniques that have been used thus far by theorem
provers share the trait that they only reason about single binders. This works well
for many calculi, but psi-calculi require binding sequences of arbitrary length.
For our psi-calculus datatype (Def. 5), a binding sequence is needed in the
Input-case where the term M̲(λx̃)N.P has the sequence x̃ binding into N and
P. The second place sequences are needed is when defining frames (Def. 3).
Frames are derived from processes (Def. 6) and as agents can have an arbitrary
number of binders, so can the frames. The third occurrence of binding
sequences can be found in the operational semantics (Table 1). In the transition
Ψ ▷ P --M̄(νã)N--> P′, the sequence ã represents the bound names in P which
occur in the object N.
In order to formalise these types of calculi efficiently in a theorem prover,
libraries with support for sequences of binders have to be added. In the next
sections we will discuss two approaches that have been made in this area, first
one by us, which we call explicit binding sequences, and then one by Berghofer
and Urban which we in this paper will call implicit binding sequences. They
both build on the existing nominal representation of alpha-equivalence classes
where a binding occurrence of the name a in the term T is written [a].T , and
the support of [a].T is the support of T with a removed. From this definition,
creating a term with the binding sequence ã in the term T , written [ã].T , can
easily be done by recursion over ã. The proof that the support of [ã].T is equal
to the support of T with the names of ã removed is trivial. Similarly, the notion
of freshness needs to be expanded to handle sequences. The expression ã#T is
defined as: ∀x ∈ set ã. x#T . This expression is overloaded for when ã is either
a list or a set.
Our approach is to scale the existing single binder setting to sequences. Isabelle
has native support for generating fresh names, i.e. given any finite context of
names C, Isabelle can generate a name fresh for that context. There is also
a distinctness predicate, written distinct ã, which states that ã contains no
duplicates. From these we can generate a finite sequence ã of arbitrary length n
where length ã = n, ã#C and distinct ã by induction on n.
The term [a].T can be alpha-converted into the term [b].(a b)·T if b#T , where
we call (a b) an alpha-converting swapping. In order to mimic this behaviour with
sequences, we lift name swapping to sequence swapping by pairwise composing
the elements of two sequences to create an alpha-converting permutation. We
will write (ã b̃) for such a composition, defined in the following manner:
Definition 8
([] []) = []
((x :: xs) (y :: ys)) = (x, y) :: (xs ys)
All theories that construct permutations using this function will ensure that the
lengths of the sequences are equal.
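Definition 8 amounts to zipping the two sequences. A minimal Python sketch follows, assuming names are strings and terms are binder-free nested tuples; mk_perm, swap and apply_perm are hypothetical names, and the length check makes explicit the obligation mentioned above.

```python
def mk_perm(xs, ys):
    """(xs ys): pair the two sequences elementwise into a list of swappings."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    return list(zip(xs, ys))

def swap(a, b, x):
    """(a b) · x on names and nested tuples of names."""
    if isinstance(x, tuple):
        return tuple(swap(a, b, y) for y in x)
    return b if x == a else a if x == b else x

def apply_perm(p, x):
    """p · x, where the last swapping of the list acts first."""
    for a, b in reversed(p):
        x = swap(a, b, x)
    return x
```

Applying mk_perm(ã, b̃) to the body of a term then renames each binder of ã to the corresponding name of b̃, which is how the alpha-converting permutation is used.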
We can now lift alpha-equivalence to support sequences.
This method has the problem that when cancelling alpha-converting permuta-
tions as in section 3.1, the freshness conditions we use to cancel the permutation
from the remaining terms are lost since (p · x )#U does not imply (p− · x )#U .
We define the following predicate to fix this.
Definition 9. distinctPerm p ≡ distinct((map fst p)@(map snd p))
Intuitively, the distinctPerm predicate ensures that all names in a permutation
are distinct.
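The predicate can be transcribed directly. The Python sketch below also illustrates one consequence which we assume is the kind of property being exploited: a permutation whose names are all distinct is its own inverse, whereas a permutation with repeated names need not be. The function names are hypothetical and permutations act on plain names only.

```python
def distinct(xs):
    """True iff the list xs contains no duplicates."""
    return len(set(xs)) == len(xs)

def distinct_perm(p):
    """distinctPerm p ≡ distinct((map fst p) @ (map snd p))."""
    return distinct([a for a, _ in p] + [b for _, b in p])

def apply_perm(p, name):
    """Apply the list of swappings p to a single name; the last pair acts first."""
    for a, b in reversed(p):
        name = b if name == a else a if name == b else name
    return name
```

For p = [("a", "c"), ("b", "d")] applying p twice is the identity on every name, while for [("a", "b"), ("b", "c")] it is not.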
Proof. By induction on p.
Proof. Since each name in ã can only bind once in T we can construct b̃ by
replacing any duplicate name in ã with a sufficiently fresh name.
The advantage of implicit alpha-conversions is that facts about length and dis-
tinctness of sequences do not need to be maintained through the proofs. The
freshness conditions are the ones needed for the single binder case and the dis-
tinctness properties are only needed when cancelling permutations. For most
cases, this method is more convenient to work with. There are disadvantages re-
garding inversion rules, and alpha-equivalence properties that will be discussed
in the next section.
3.3 Alpha-Equivalence
When reasoning with single binders, the nominal approach to alpha-equivalence
is quite straightforward. Two terms [a].T and [b].U are equal if and only if either
a = b and T = U or a ≠ b, a#U and U = (a b) · T. Reasoning about binding
sequences is more difficult. Exactly what does it mean for two terms [ã].T and
[b̃].U to be equal? As long as T and U cannot themselves have binding sequences
on a top level we know that length ã = length b̃, but the problem with the
general case is what happens when ã and b̃ partially share names. As it turns
out, this case is not important in order to reason about these types of equalities,
but special heuristics are required.
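The single-binder criterion can be turned into a decision procedure on a small term language. The Python sketch below is an illustration under assumptions: terms are names (strings), tuples of terms, or Abs(name, body) for [name].body, and equality of bodies is itself taken up to alpha-equivalence; none of these names come from the formalisation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Abs:
    name: str
    body: object

def swap(a, b, t):
    """(a b) · t, also swapping binding occurrences."""
    if isinstance(t, Abs):
        return Abs(swap(a, b, t.name), swap(a, b, t.body))
    if isinstance(t, tuple):
        return tuple(swap(a, b, u) for u in t)
    return b if t == a else a if t == b else t

def support(t):
    """Free names of t: the support of [a].T is the support of T minus a."""
    if isinstance(t, Abs):
        return support(t.body) - {t.name}
    if isinstance(t, tuple):
        return set().union(*(support(u) for u in t))
    return {t}

def alpha_eq(t, u):
    """[a].T = [b].U iff (a = b and T = U) or (a ≠ b, a # U and U = (a b) · T)."""
    if isinstance(t, Abs) and isinstance(u, Abs):
        a, b = t.name, u.name
        if a == b:
            return alpha_eq(t.body, u.body)
        return a not in support(u.body) and alpha_eq(u.body, swap(a, b, t.body))
    if isinstance(t, tuple) and isinstance(u, tuple):
        return len(t) == len(u) and all(alpha_eq(x, y) for x, y in zip(t, u))
    if isinstance(t, (Abs, tuple)) or isinstance(u, (Abs, tuple)):
        return False
    return t == u
```

The freshness condition a#U is what rules out identifying [a].(a, b) with [b].(b, a), whose free names differ.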
The times where we actually get assumptions such as [ã].T = [b̃].U in our
proofs are when we do induction or inversion over a term with binders. Typically,
[b̃].U is the term we start with, and [ã].T is the term that appears in the induction
or inversion rule. These rules are designed in such a way that any bound names
The problem with this approach is that we do not know how ã and b̃ are related.
If we know that they are both distinct then we can construct p such that ã = p · b̃
but generally we do not know this. The problematic cases are the ones dealing
with inversion, in which case we resort to explicit binding sequences, but for the
majority of our proofs Lemma 3 is enough.
4 Formalisation
Psi-calculi are parametric calculi. A specific instance is created by instantiating
the framework with dataterms for the terms, assertions and conditions of the
calculus. We also require an entailment relation, a notion of channel equality
and composition of assertions. Isabelle has good support for reasoning about
parametric systems through the use of locales [3].
Nominal Isabelle does not support datatypes with binding sequences or nested
datatypes. The two cases that are problematic when formalising psi-calculi are
the Input case, which requires a binding sequence, and the Case case which
requires a list of assertions and processes. The required datatype can be encoded
using mutual recursion in the following way.
Definition 11. The psi-calculi datatype has three type variables for terms, as-
sertions and conditions respectively. In the Res and the Bind cases, name is a
binding occurrence.
4.3 Frames
The four nominal morphisms from Def. 1 are also encoded using locales along
with their equivariance properties. From this definition, implementing Def. 2
and a locale for our requirements on assertion equivalence is straightforward.
To implement frames, the following nominal datatype is created:
Definition 12
nominal_datatype β frame = Assertion β
                         | FStep «name» (β frame)
In order to overload the ⊗ operator to work on frames as described in Def. 3
we create the following two nominal functions.
Definition 13
insertAssertion (Assertion Ψ′) Ψ = Assertion (Ψ ⊗ Ψ′)
x#Ψ ⇒ insertAssertion (FStep x F) Ψ = FStep x (insertAssertion F Ψ)
(Assertion Ψ) ⊗ G = insertAssertion G Ψ
x#G ⇒ (FStep x F) ⊗ G = FStep x (F ⊗ G)
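The two functions can be read as ordinary recursion over the frame. The Python sketch below is an illustration under assumptions that are not part of the formalisation: assertions are modelled as sorted tuples of names with ⊗ as merging, frames are tagged tuples, and the freshness side conditions are checked with assertions instead of being discharged by alpha-conversion. The check for x # G is deliberately stronger than needed, since it also rejects names bound in G.

```python
def compose(psi1, psi2):
    """Assumed toy ⊗ on assertions: merge and sort, so that ⊗ is commutative."""
    return tuple(sorted(psi1 + psi2))

def frame_names(frame):
    """All names of a frame, bound or free (an over-approximation of its support)."""
    if frame[0] == "Assertion":
        return set(frame[1])
    return {frame[1]} | frame_names(frame[2])

def insert_assertion(frame, psi):
    """insertAssertion: push psi under the binders of `frame`."""
    if frame[0] == "Assertion":
        return ("Assertion", compose(psi, frame[1]))
    _, x, rest = frame
    assert x not in psi, "side condition x # Ψ"
    return ("FStep", x, insert_assertion(rest, psi))

def frame_compose(f, g):
    """F ⊗ G on frames, following the last two equations of Definition 13."""
    if f[0] == "Assertion":
        return insert_assertion(g, f[1])
    _, x, rest = f
    assert x not in frame_names(g), "side condition x # G"
    return ("FStep", x, frame_compose(rest, g))
```

Composing two frames thus collects the binders of both in front of the composed assertion.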
We will use the notation (νã)N ≺ P′ for a term of type boundOutput which has
the binding sequence ã into N and P′. We can also write Ψ ▷ P −→ M̄(νã)N ≺ P′
for Ψ ▷ P --M̄(νã)N--> P′ and similarly for input and tau transitions.
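Par     ΨQ ⊗ Ψ ▷ P --α--> P′      F(Q) = ⟨BQ, ΨQ⟩
        BQ # Ψ, P, α, P′, Q        distinct BQ
        ------------------------------------------
        Ψ ▷ P | Q  --α-->  P′ | Q

Open    Ψ ▷ P --M̄(νãc̃)N--> P′     b ∈ n(N)
        b # ã, c̃, Ψ, M     ã # Ψ, P, M, c̃     c̃ # Ψ, P, M
        --------------------------------------------------
        Ψ ▷ (νb)P  --M̄(νã b c̃)N-->  P′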
At the core of any nominal formalisation is the need to create custom induction
rules which allow the introduced bound names to be fresh for any given context.
Without these, the user is forced to do manual alpha-conversions throughout
the proofs and such proofs will differ significantly from their pen and paper
counterparts, where freshness is just assumed. An in-depth description can be
found in [16]. Very recent additions to the nominal package generate induction
rules where the user is allowed to choose a set of names which can be arbitrarily
fresh for each inductive case. In most cases, this set will be the set of binders
present in the rule.
Standard induction. Isabelle will automatically create a rule for doing
induction on transitions of the form Ψ ▷ P −→ Rs, where Rs is a residual. In
nominal induction the predicate to be proven has the extra argument C, such
that all bound names introduced by the induction rule are fresh for C. Thus, the
predicate has the form Prop C Ψ P Rs. This induction rule is useful for very
general proofs about transitions, but we often need proofs which are specialised
for input, output, or tau transitions. We create the following custom induction
rules:
Lemma 5.
Ψ ▷ P --M̲ N--> P′          Ψ ▷ P --M̄(νã)N--> P′               Ψ ▷ P --τ--> P′
        ⋮                            ⋮                                ⋮
---------------------      ---------------------------------      ----------------
Prop C Ψ P M N P′          Prop C Ψ P M ((νã)(N ≺ P′))            Prop C Ψ P P′
The inductive steps for each rule have been left out as they are instances of the
ones from the automatically generated induction rule, but with the predicates
changed to match the corresponding transition.
These induction rules work well only as long as the predicate to be proven
does not depend on anything under the scope of a binder. Trying to prove the
following lemma illustrates the problem.
The problem is that none of the induction rules we have will prove this lemma
in a satisfactory way. Every applicable case in the induction rule will introduce
its own bound output term (νb̃)N′ ≺ P″ where we know that (νb̃)N′ ≺ P″ =
(νã)N ≺ P′. What we need to prove relates to the term P′; what the inductive
hypotheses will give us is something related to the term P″, where all we know
is that they are part of alpha-equivalent terms.
Proving this lemma on its own is not too difficult but in every step of ev-
ery proof of this type, manual alpha-conversions and equivariance properties are
needed. The following induction rule solves this problem.
Ψ ▷ P --M̄(νã)N--> P′
∀Ψ P M ã N P′ b̃ p C.
    ( ã # b̃, Ψ, P, M, C  ∧  b̃ # N, P′  ∧
      set p ⊆ set ã × set b̃  ∧
      Prop C Ψ P M ã N P′ )
    −→ Prop C Ψ P M b̃ (p · N) (p · P′)
        ⋮
------------------------------------------
Prop C Ψ P M ã N P′
The difference between this rule and the output rule in Lemma 5 is that the
predicate in Lemma 5 takes a residual (νã)N ≺ P′ as one argument and the
predicate in this rule takes ã, N and P′ as three separate ones. By disassociating
the binding sequence from the residual in this manner we have lost the
ability to alpha-convert the residual, but we have gained the ability to reason
about terms under the binding sequence. The extra added case in the induction
rule above (beginning with ∀Ψ P M . . .) is designed to allow the predicate to
mimic the alpha-conversion abilities we have lost. When proving this induction
rule, Lemma 3 is used in each step to generate the alpha-converting permuta-
tion, Prop is proven in the standard way and then alpha-converted using the
new inductive case.
With this lemma, we must prove that the predicate we are trying to prove can
respect alpha-conversions. The advantage is that it only has to be done once for
each proof. Moreover, the case is very general and does not require the processes
or actions to be of a specific form.
Using this induction rule will not allow us to prove lemmas which reason
directly about the binding sequence ã. The new inductive case swaps a sequence
ã for b̃ but as in Lemma 3, we do not know exactly how these sequences relate
to each other.
This lemma suffers from the same problem as Lemma 6: every inductive step
will generate a frame alpha-equivalent to ⟨BP, ΨP⟩ and many tedious
alpha-conversions have to be done to prove the lemma. Moreover, some of our lemmas
need to directly reason about the binding sequence of the frame. An induction
rule similar to the one for output transitions can be created to solve the problem.
Ψ ▷ P --M̲ N--> P′      F(P) = ⟨BP, ΨP⟩      distinct BP
∀Ψ P M N P′ BP ΨP p C.
    ( (p · BP) # Ψ, P, M, C, N, P′, BP  ∧  BP # ΨP  ∧
      set p ⊆ set BP × set(p · BP)  ∧
      Prop C Ψ P M N P′ BP ΨP )
    −→ Prop C Ψ P M N P′ (p · BP) (p · ΨP)
        ⋮
------------------------------------------
Prop C Ψ P M N P′ BP ΨP
This lemma requires that the binding sequence BP is distinct. This added
requirement allows the alpha-converting case to relate the sequence BP to p · BP,
allowing for a larger class of lemmas to be proven. Our semantics require all
frames to have distinct binding sequences making this added requirement un-
problematic.
A corresponding lemma has to be created for output transitions as well, but
since frames only affect subjects as far as input and output transitions are con-
cerned, this induction rule does not have to use the same mechanism for the
bound names in the residual as for the ones in the frame.
After introducing these custom induction rules, we were able to remove thou-
sands of lines of code which were only dealing with alpha-conversions.
References
1. Abadi, M., Fournet, C.: Mobile values, new names, and secure communication. In:
Proceedings of POPL 2001, pp. 104–115. ACM, New York (2001)
2. Aydemir, B., Charguéraud, A., Pierce, B.C., Pollack, R., Weirich, S.: Engineer-
ing formal metatheory. In: POPL 2008: Proceedings of the 35th annual ACM
SIGPLAN-SIGACT symposium on Principles of programming languages, pp. 3–15.
ACM, New York (2008)
3. Ballarin, C.: Locales and locale expressions in Isabelle/Isar. In: Berardi, S., Coppo,
M., Damiani, F. (eds.) TYPES 2003. LNCS, vol. 3085, pp. 34–50. Springer,
Heidelberg (2004)
4. Barendregt, H.P.: The Lambda Calculus – Its Syntax and Semantics. Studies in
Logic and the Foundations of Mathematics, vol. 103. North-Holland, Amsterdam
(1984)
5. Bengtson, J., Johansson, M., Parrow, J., Victor, B.: Psi-calculi: Mobile processes,
nominal data, and logic. Technical report, Uppsala University (2009); (submitted),
http://user.it.uu.se/~joachim/psi.pdf
6. Bengtson, J., Parrow, J.: Formalising the pi-calculus using nominal logic. In: Seidl,
H. (ed.) FOSSACS 2007. LNCS, vol. 4423, pp. 63–77. Springer, Heidelberg (2007)
7. Berghofer, S., Urban, C.: Nominal Inversion Principles. In: Mohamed, O.A.,
Muñoz, C., Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp. 71–85. Springer,
Heidelberg (2008)
8. Buscemi, M.G., Montanari, U.: Open bisimulation for the concurrent constraint
π-calculus. In: Drossopoulou, S. (ed.) ESOP 2008. LNCS, vol. 4960, pp. 254–268.
Springer, Heidelberg (2008)
9. de Bruijn, N.G.: Lambda calculus notation with nameless dummies. a tool for
automatic formula manipulation with application to the church-rosser theorem.
Indagationes Mathematicae 34, 381–392 (1972)
10. Hirschkoff, D.: A full formalisation of π-calculus theory in the calculus of construc-
tions. In: Gunter, E.L., Felty, A.P. (eds.) TPHOLs 1997. LNCS, vol. 1275, pp.
153–169. Springer, Heidelberg (1997)
11. Honsell, F., Miculan, M., Scagnetto, I.: π-calculus in (co)inductive type theory.
Theoretical Comput. Sci. 253(2), 239–285 (2001)
12. Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL. LNCS, vol. 2283. Springer,
Heidelberg (2002)
13. Pitts, A.M.: Nominal logic, a first order theory of names and binding. Information
and Computation 186, 165–193 (2003)
14. Röckl, C., Hirschkoff, D.: A fully adequate shallow embedding of the π-calculus in
Isabelle/HOL with mechanized syntax analysis. J. Funct. Program. 13(2), 415–451
(2003)
15. Urban, C.: Nominal techniques in Isabelle/HOL. Journal of Automated Reason-
ing 40(4), 327–356 (2008)
16. Urban, C., Berghofer, S., Norrish, M.: Barendregt’s variable convention in rule in-
ductions. In: Pfenning, F. (ed.) CADE 2007. LNCS, vol. 4603, pp. 35–50. Springer,
Heidelberg (2007)
Some Domain Theory and Denotational
Semantics in Coq
1 Introduction
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 115–130, 2009.
© Springer-Verlag Berlin Heidelberg 2009
116 N. Benton, A. Kennedy, and C. Varming
the usual product of the underlying types of their underlying orders with
the pointwise ordering yields a product order. Equipping that order with a
pointwise least upper bound operation ⨆ c = (⨆(fst ∘ c), ⨆(snd ∘ c)) for c : natO →m
D1 × D2 yields a product cpo D1 × D2 with continuous πi : D1 × D2 →c Di.
We write ⟨f, g⟩ for the unique (up to ==) continuous function such that
f == π1 ∘ ⟨f, g⟩ and g == π2 ∘ ⟨f, g⟩.
Closed structure. We can define operations curry : (D × E →c F) → (D →c
E ⇒c F) and ev : (E ⇒c D) × E →c D such that for any f : D × E →c F,
curry(f) is the unique continuous map such that f == ev ∘ ⟨curry f ∘ π1, π2⟩.
We define uncurry : (D ⇒c E ⇒c F) →c D × E ⇒c F by uncurry =
curry(ev ∘ ⟨ev ∘ ⟨π1, π1 ∘ π2⟩, π2 ∘ π2⟩) and we check that uncurry(curry(f)) == f
and curry(uncurry(h)) == h for all f and h.
So our internal category CPO of cpos and continuous maps is Cartesian closed.
We elide the details of other constructions, including finite coproducts, strict
function spaces and general indexed products, that are in the formalization.
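For intuition only, the closed structure can be mirrored on plain Python functions, ignoring order and continuity altogether; this is an assumption-laden sketch, and curry, ev and uncurry here are ordinary functions rather than morphisms of CPO.

```python
def curry(f):
    """Turn a function on pairs into a function returning a function."""
    return lambda d: lambda e: f((d, e))

def ev(pair):
    """Evaluation: apply the first component to the second."""
    g, e = pair
    return g(e)

def uncurry(h):
    """uncurry h (d, e) = ev (ev (h, d), e), as in the definition above."""
    return lambda pair: ev((ev((h, pair[0])), pair[1]))
```

The two round trips uncurry(curry(f)) and curry(uncurry(h)) then agree pointwise with f and h, which is the set-level shadow of the two equations checked in the formalization.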
Although our cpos are not required to have least elements, those that do are of
special interest. We use Coq’s typeclass mechanism to capture them:
Class Pointed (D : cpo) := { ⊥ : D; Pleast : ∀ d : D, ⊥ ⊑ d }.
Instance DOne_pointed : Pointed 1.
Instance prod_pointed A B {pa : Pointed A} {pb : Pointed B} : Pointed (A × B).
Instance fun_pointed A B {pb : Pointed B} : Pointed (A ⇒c B).
Now if D is Pointed, and f : D →c D then we can define fixp f , the least fixed
point of f in the usual way, as the least upper bound of the chain of iterates
of f starting at ⊥. We define FIXP : (D ⇒c D) →c D to be the ‘internalised’
version of fixp.
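The construction of fixp as the limit of the iterates of f starting at ⊥ can be illustrated on a finite order, where the chain stabilises after finitely many steps. The Python sketch below makes that assumption explicit (in a general cpo the least upper bound is of an infinite chain and cannot be computed this way); the parameter max_steps and the reachability example are hypothetical.

```python
def fixp(f, bottom, max_steps=1000):
    """Least fixed point of a monotone f, found by iterating from bottom.

    Assumes the chain bottom, f(bottom), f(f(bottom)), ... becomes stationary.
    """
    x = bottom
    for _ in range(max_steps):
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt
    raise RuntimeError("no fixed point reached within max_steps")

# Example: nodes reachable from 0, on the powerset order with bottom = empty set.
EDGES = {0: [1], 1: [2], 2: [], 3: [0]}

def step(s):
    """Monotone map: the start node together with all successors of s."""
    return frozenset({0}) | frozenset(m for n in s for m in EDGES[n])
```

Here fixp(step, frozenset()) runs through ∅, {0}, {0, 1}, {0, 1, 2} and stops at the last set.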
If D : cpo and P : D → Prop, then P is admissible if for all chains c :
natO →m D such that (∀n. P (cn)), one has P (⨆ c). In such a case, the subset
type {d : D | P (d)} with the order and lubs inherited from D is a cpo. We can
also prove the standard fixed point induction principle:
Definition fixp_ind D {pd : Pointed D} : ∀ (F : D →m D) (P : D → Prop),
admissible P → P ⊥ → (∀ x, P x → P (F x)) → P (fixp F).
The main technical complexity in this part of the formalization is simply the
layering of definitions, with (for example) cpos being built on ords, and D ⇒c E
being built on D →c E, which is built on D →m E, which is built on D → E.
Definitions have to be built up in multiple staged versions and there are many
implicit coercions and hints for Coq’s auto tactic, which are tricky to get right.
There is also much boilerplate associated with morphism declarations supporting
setoid rewriting, and there is some tension between the elementwise and ‘point-
free’ styles of working.
[Figure: interleaved search through the elements c0, c1, c2, . . . of a chain of streams built from Eps steps.]
1
In reality, the output stream ‘ticks’ less frequently than the picture would suggest.
120 N. Benton, A. Kennedy, and C. Varming
The output we are trying to produce is an element of DL ord. Each time our
interleaving search finds an Eps , we produce an Eps on the output. So if every
element of the chain is Ω, we will end up producing Ω on the output. But should
we find a Val d after outputting some finite number of Eps s, then we know all
later elements of the chain are also non-Ω, so we go ahead and build the chain
in D that they form and compute its least upper bound using the lub operation
of D. The details of this construction, and the proof that it does indeed yield
the least upper bound of the chain c, involve interesting bits of constructive
reasoning: going from knowing that there is a chain in D to actually having that
chain in one’s hand so as to take its lub uses (a provable form of) constructive
indefinite description, for example. But at the end of the day, we end up with a
constructive definition of D⊥ : cpo, which is clearly Pointed.
Lifting gives a strong monad [17] on CPO. The unit η : D →c D⊥ applies the
Val constructor. If f : D →c E⊥ define kleisli f : D⊥ →c E⊥ to be the map
cofix kl (d : D⊥ ) : E⊥ := match d with Eps dl ⇒ Eps (kl dl) | Val d′ ⇒ f d′ end
Thinking operationally, the way in which kleisli sequences computations is very
intuitive. To run kleisli f d, we start by running d. Every time d takes an Eps
step, we do too, so if d diverges so does kleisli f d. Should d yield a value d′,
however, the remaining steps are those of f d′. We prove that kleisli f actually
is a continuous function and, amongst other things, satisfies all the equations
making (−⊥ , η, kleisli(−)) a Kleisli triple on CPO. It is also convenient to have
‘parameterized’ versions of the Kleisli operators Kleislir D E : (D × E →c
F⊥ ) → (D × E⊥ →c F⊥ ) defined by composing kleisli with the evident strength
τ : D × E⊥ →c (D × E)⊥ .
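The operational reading of kleisli can be made concrete with a small model of lifted computations as lazy streams. The following Python sketch is only an illustration under stated assumptions: Val and Eps mirror the two constructors, the tail of an Eps is a thunk so that divergent streams can be represented, and run, delay and omega are hypothetical helpers that are not part of the Coq development.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Val:
    value: Any


@dataclass
class Eps:
    rest: Callable[[], Any]  # thunk producing the remainder of the stream


def kleisli(f):
    """Lift f : D -> stream to streams: copy Eps steps, then continue with f."""
    def kl(d):
        if isinstance(d, Val):
            return f(d.value)
        return Eps(lambda: kl(d.rest()))
    return kl


def delay(n, v):
    """A computation that takes n Eps steps and then yields v."""
    return Val(v) if n == 0 else Eps(lambda: delay(n - 1, v))


def omega():
    """The divergent computation: Eps steps forever."""
    return Eps(omega)


def run(d, fuel):
    """Take at most fuel Eps steps; return (steps, value), or None if out of fuel."""
    steps = 0
    while isinstance(d, Eps):
        if steps == fuel:
            return None
        d = d.rest()
        steps += 1
    return steps, d.value


result = run(kleisli(lambda x: delay(2, x + 1))(delay(3, 10)), fuel=100)
```

The three steps of the argument are followed by the two steps of the continuation, so result records five steps in total. Applying kleisli to a divergent argument never reaches a Val, whatever the fuel.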
The major drawback of the above is that typing judgments contain proof
objects: simple equalities between types, as in TFIX, and the existence of a
variable in the environment, as in TVAR. It’s necessary to prove (at some length)
that any two typings of the same term are equal, whilst definitions and theorems
are hedged with well-formedness side-conditions.
We recently switched to a strongly-typed term representation in which
variable and term types are indexed by Ty and Env, ensuring that terms are
well-typed by construction. Definitions and theorems become more natural and
much more concise, and the problems with equality proofs go away.2 Here is the
complete definition of well-typed terms:
Inductive Var : Env → Ty → Type :=
| ZVAR : ∀ Γ τ , Var (τ :: Γ ) τ | SVAR : ∀ Γ τ τ′ , Var Γ τ → Var (τ′ :: Γ ) τ .
Inductive Value : Env → Ty → Type :=
| TINT : ∀ Γ , nat → Value Γ Int | TBOOL : ∀ Γ , bool → Value Γ Bool
| TVAR : ∀ Γ τ , Var Γ τ → Value Γ τ
| TFIX : ∀ Γ τ1 τ2 , Exp (τ1 :: τ1 -> τ2 :: Γ ) τ2 → Value Γ (τ1 -> τ2 )
| TPAIR : ∀ Γ τ1 τ2 , Value Γ τ1 → Value Γ τ2 → Value Γ (τ1 * τ2 )
with Exp : Env → Ty → Type :=
| TFST : ∀ Γ τ1 τ2 , Value Γ (τ1 * τ2 ) → Exp Γ τ1
| TSND : ∀ Γ τ1 τ2 , Value Γ (τ1 * τ2 ) → Exp Γ τ2
| TOP : ∀ Γ , (nat → nat → nat) → Value Γ Int → Value Γ Int → Exp Γ Int
| TGT : ∀ Γ , Value Γ Int → Value Γ Int → Exp Γ Bool
| TVAL : ∀ Γ τ , Value Γ τ → Exp Γ τ
| TLET : ∀ Γ τ1 τ2 , Exp Γ τ1 → Exp (τ1 :: Γ ) τ2 → Exp Γ τ2
| TAPP : ∀ Γ τ1 τ2 , Value Γ (τ1 -> τ2 ) → Value Γ τ1 → Exp Γ τ2
| TIF : ∀ Γ τ , Value Γ Bool → Exp Γ τ → Exp Γ τ → Exp Γ τ .
Definition CExp τ := Exp nil τ . Definition CValue τ := Value nil τ .
Variables of type Var Γ τ are represented by a “typed” de Bruijn index that
is in essence a proof that τ lives at that index in Γ . The typing rule associated
with each term constructor can be read directly off its definition: for example,
TLET takes an expression typed as τ1 under Γ , and another expression typed
as τ2 under Γ extended with a new variable of type τ1 ; its whole type is then τ2
under Γ . The abbreviations CExp and CValue define closed terms.
Now the operational semantics can be presented very directly:
Inductive Ev : ∀ τ , CExp τ → CValue τ → Prop :=
| e_Val : ∀ τ (v : CValue τ ), TVAL v ⇓ v
| e_Op : ∀ op n1 n2 , TOP op (TINT n1 ) (TINT n2 ) ⇓ TINT (op n1 n2 )
| e_Gt : ∀ n1 n2 , TGT (TINT n1 ) (TINT n2 ) ⇓ TBOOL (ble_nat n2 n1 )
| e_Fst : ∀ τ1 τ2 (v1 : CValue τ1 ) (v2 : CValue τ2 ), TFST (TPAIR v1 v2 ) ⇓ v1
| e_Snd : ∀ τ1 τ2 (v1 : CValue τ1 ) (v2 : CValue τ2 ), TSND (TPAIR v1 v2 ) ⇓ v2
| e_App : ∀ τ1 τ2 e (v1 : CValue τ1 ) (v2 : CValue τ2 ),
    substExp (doubleSubst v1 (TFIX e)) e ⇓ v2 → TAPP (TFIX e) v1 ⇓ v2
| e_Let : ∀ τ1 τ2 e1 e2 (v1 : CValue τ1 ) (v2 : CValue τ2 ),
    e1 ⇓ v1 → substExp (singleSubst v1 ) e2 ⇓ v2 → TLET e1 e2 ⇓ v2
| e_IfTrue : ∀ τ (e1 e2 : CExp τ ) v, e1 ⇓ v → TIF (TBOOL true) e1 e2 ⇓ v
2
The new Program and dependent destruction tactics in Coq 8.2 are invaluable for
working with this kind of strongly dependent representation.
e0 = ⊥ : D0 →c D1
p0 = ⊥ : D1 →c D0
en+1 = mor F (pn , en ) : Dn+1 →c Dn+2
pn+1 = mor F (en , pn ) : Dn+2 →c Dn+1 .
Let πi : Πj Dj →c Di be the projections from the product of all the Dj s.
The predicate P : Πj Dj → Prop defined by P d := ∀i, πi d == pi (πi+1 d) is
admissible, so we can define the sub-cpo D∞ to be {d | P d} with order and lubs
inherited from the indexed product. D∞ will be the cpo we seek, so we now need
to construct the required isomorphism.
Define tn : Dn → D∞ to be the map that for i < n projects Dn to Di
via pi ◦ · · · ◦ pn−1 and for i > n embeds Dn in Di via en ◦ · · · ◦ ei−1 . Then
mor F (ti , πi ) : ob F (D∞ , D∞ ) →c ob F (Di , Di ) = Di+1 , so ti+1 ◦ mor F (ti , πi ) :
ob F (D∞ , D∞ ) →c D∞ , and mor F (πi , ti ) ◦ πi+1 : D∞ →c ob F (D∞ , D∞ ). We
then define
UNROLL := ⨆i (mor F (πi , ti ) ◦ πi+1 ) : D∞ →c ob F (D∞ , D∞ )
ROLL := ⨆i (ti+1 ◦ mor F (ti , πi )) : ob F (D∞ , D∞ ) →c D∞
We interpret the unityped language in a solution for the recursive domain equation
D ≅ (nat + (D →c D))⊥ , following the intuition that a computation either
diverges or produces a value which is a number or a function. This is not the
‘tightest’ domain equation one could use for CBV: one could make function space
strict, or equivalently make the argument of the function space be a domain of
values rather than computations. But this equation still gives an adequate model.
The construction in Coq is an instantiation of results from the previous section.
First we build the strict bifunctor F (D, E) = (nat + (D →c E))⊥ :
Definition FS := BiLift_strict (BiSum (BiConst (Discrete nat)) BiArrow ).
And then we construct the solution, defining domains D∞ for computations
and V∞ for values:
Definition D∞ := D∞ FS.
Definition V∞ := Dsum (Discrete nat) (D∞ →c D∞ ).
Definition Roll : (V∞ )⊥ →c D∞ := ROLL FS.
Definition Unroll : D∞ →c (V∞ )⊥ := UNROLL FS.
Definition UR_iso : Unroll ◦ Roll == ID := DIso_ur FS.
Definition RU_iso : Roll ◦ Unroll == ID := DIso_ru FS.
For environments we define the n-ary product of V∞ and projection function.
Fixpoint SemEnv n : cpo := match n with O ⇒ 1| S n ⇒ SemEnv n × V∞ end.
Fixpoint projenv (m n : nat) : (m < n) → SemEnv n →c V∞ :=
match m, n with
| m, O ⇒ fun inconsistent ⇒ match (lt_n_O m inconsistent) with end
| O, S n ⇒ fun _ ⇒ π2
| S m, S n ⇒ fun h ⇒ projenv (lt_S_n _ _ h) ◦ π1 end.
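The shape of SemEnv and projenv can be mimicked with nested pairs. The sketch below is a Python illustration, not the Coq code: the unit cpo becomes the empty tuple, the product becomes a two-element tuple, and the proof argument m < n, which rules out the impossible case statically in Coq, is replaced by a run-time check. The helper sem_env is hypothetical.

```python
def sem_env(values):
    """Build the left-nested product ((((), v1), v2), ...) from a list of values."""
    env = ()
    for v in values:
        env = (env, v)
    return env


def projenv(m, n, env):
    """Project variable m out of an n-ary environment; index 0 is the last value."""
    if not 0 <= m < n:
        # Coq discharges this case with the proof of m < n; here we raise.
        raise IndexError("variable index out of range")
    rest, last = env
    if m == 0:
        return last  # corresponds to the second projection
    return projenv(m - 1, n - 1, rest)  # corresponds to composing with the first projection


env = sem_env(["a", "b", "c"])
```

With this encoding, index 0 refers to the most recently added value, which matches the de Bruijn reading of variables.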
3
We induce over evaluations to construct well-formedness derivations when showing
well-formedness preservation, and well-formedness derivations are themselves in Type
so that we can use them to inductively define the denotational semantics.
Some Domain Theory and Denotational Semantics in Coq 127
show that this is closed under intersection, so admissible relations form a com-
plete lattice.
We then define a relational action corresponding to the bifunctor used in
defining our recursive domain. This action, RelV , maps a pair of relations R, S
on (V∞ )⊥ × Value to a new relation that relates (inl m) to (NUM m) for all
m : nat, and relates (inr f ) to (LAMBDA e) just when f : D∞ →c D∞ is strict
and satisfies the ‘logical’ property
6 Discussion
as a dependently typed map from syntax to semantics, rather than only being
able to do shallow embeddings – this is clearly necessary if one wishes to prove
theorems like adequacy or do compiler correctness. Secondly, it seems one really
needs dependent types to work conveniently with monads and logical relations,
or to formalize the inverse limit construction.4
The constructive nature of our formalization and the coinductive treatment of
lifting have both benefits and drawbacks. On the minus side, some of the proofs
and constructions are much more complex than they would be classically and
one does sometimes have to pay attention to which of two classically-equivalent
forms of definition one works with. Worse, some constructions do not seem to be
possible, such as the smash product of pointed domains; not being able to define
⊗ was one motivation for moving from Paulin-Mohring’s pointed cpos to our
unpointed ones. One benefit that we have not yet seriously investigated, however,
is that it is possible to extract actual executable code from the denotational
semantics. Indeed, the lift monad can be seen as a kind of syntax-free operational
semantics, not entirely unlike game semantics; this perspective, and possible
connections with step-indexing, seem to merit further study.
The Coq development is of a reasonable size. The domain theory library,
including the theory of recursive domain equations, is around 7000 lines. The
formalization of the typed language and its soundness and adequacy proofs are
around 1700 lines and the untyped language takes around 2500. Although all the
theorems go through (with no axioms), we have to admit that the development
is currently rather ‘rough’. Nevertheless, we have already used it as the basis
of a non-trivial formalization of some new research [8] and our intention is to
develop the formalization into something that is more widely useful. Apart from
general polishing, we plan to abstract some of the structure of our category of
domains to make it convenient to work simultaneously with different categories,
including categories of algebras. We would also like to provide better support for
‘diagrammatic’ rewriting in monoidal (multi)categories. It is convenient to use
Setoid rewriting for pointfree equational reasoning, directly translating the normal
categorical commuting diagrams. But dealing with all the structural morphisms
is still awkward, and it should be possible to support something more like the
diagrammatic proofs one can do with ‘string diagrams’ [13].
References
1. Adams, R.: Formalized metatheory with terms represented by an indexed family
of types. In: Filliâtre, J.-C., Paulin-Mohring, C., Werner, B. (eds.) TYPES 2004.
LNCS, vol. 3839, pp. 1–16. Springer, Heidelberg (2006)
2. Agerholm, S.: Domain theory in HOL. In: Joyce, J.J., Seger, C.-J.H. (eds.) HUG
1993. LNCS, vol. 780. Springer, Heidelberg (1994)
4
Agerholm [3] formalized the construction of a model of the untyped lambda calculus
using HOL-ST, a version of HOL that supports ZF-like set theory; this is elegant
but HOL-ST is not widely used and no denotational semantics seems to have been
done with the model. Petersen [21] formalized a reflexive cpo based on P ω in HOL,
though this also appears not to have been developed far enough to be useful.
1 Introduction
Inductively defined predicates (for short, (inductive) predicates) are a popu-
lar specification device in the theorem proving community. Major theory devel-
opments in the proof assistant Isabelle/HOL [8] make pervasive use of them,
e.g. formal semantics of realistic programming language fragments [11]. Such
large applications naturally give rise to the desire to generate executable
prototypes from the abstract specifications. It is well-known how systems of predicates
can be transformed to functional programs using mode analysis. The approach
described in [1] for Isabelle/HOL works but has turned out to be unsatisfactory:
– The applied transformations are not trivial but are carried out outside the
LCF inference kernel, thus relying on a large code base to be trusted.
– Recently a lot of code generation facilities in Isabelle/HOL have been gen-
eralized to cover type classes and more languages than ML, but this has not
yet been undertaken for predicates.
– The transformation is carried out inside the logic; thus the transformation
is guarded by LCF inferences and does not increase the trusted code base.
Supported by BMBF in the VerisoftXT project under grant 01 IS 07008 F.
Supported by DFG project NI 491/10-1.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 131–146, 2009.
© Springer-Verlag Berlin Heidelberg 2009
132 S. Berghofer, L. Bulwahn, and F. Haftmann
– The code generator itself can be fed with the function-like equations and
does not need to be extended; also other tools involving equational reasoning
could benefit from the transformation.
– Proposed extensions can also work inside the logic and do not endanger
trustability.
The role of our transformation in this scenario is shown in the following picture:
2 Related Work
From the technical point of view, the execution of predicates has been extensively
studied in the context of the programming languages Curry [4] and Mercury [10].
The central concept for executing predicates is that of modes, which describe dataflow
by partitioning arguments into input and output.
We already mentioned the state-of-the-art implementation of code generation
for predicates in Isabelle/HOL [1] which turns inductive predicates into ML
programs extralogically using mode analysis.
Delahaye et al. provide a similar direct extraction for the Coq proof assistant [2];
however, at most one solution is computed, and multiple solutions are not
enumerated.
For each of these approaches, correctness is ensured by pen-and-paper proofs.
Our approach instead animates the correctness proof by applying it to each
single predicate using the proof assistant itself; thus correctness is guaranteed
by construction.
3 Preliminaries
3.1 Inductive Predicates
An inductive predicate is characterized by a collection of introduction rules (or
clauses), each of which has a conclusion and an arbitrary number of premises.
It corresponds to the smallest set closed under these clauses. As an example,
consider the following predicate describing the concatenation of two lists, which
can be defined in Isabelle/HOL using the inductive command:
Turning Inductive into Equational Specifications 133
For each predicate, an elimination (or case analysis) rule is provided, which for
append has the form
append Xs Ys Zs =⇒
(⋀ys. Xs = [] =⇒ Ys = ys =⇒ Zs = ys =⇒ P ) =⇒
(⋀xs ys zs x . Xs = x · xs =⇒ Ys = ys =⇒ Zs = x · zs =⇒ append xs ys zs =⇒ P ) =⇒
P
There is also an induction rule, which however is not relevant in our scenario.
In introduction rules, we distinguish between premises of the form Q u1 . . . uk ,
where Q is an inductive predicate, and premises of other shapes, which we call
side conditions. Without loss of generality, we only consider clauses without side
conditions in most parts of our presentation. The general form of a clause is
We use ki,j and l to denote the arities of the predicates Qi,j and P , i.e. the
length of the argument lists ui,j and ti , respectively.
The code generator turns a set of equational theorems into a program inducing
the same equational rewrite system. This means that any sequence of reduction
steps the generated program performs on a term can be simulated in the logic:
is a list of modes M, Mi,1 , . . . Mi,ni for the predicates P, Qi,1 , . . . , Qi,ni , where
1 ≤ i ≤ m, M ⊆ {1, . . . , l} and Mi,j ⊆ {1, . . . , ki,j }. Let FV (t) denote the set
of free variables in a term t. Given a vector of arguments t and a mode M , the
projection expression t⟨M⟩ denotes the list of all arguments in t (in the order of
their occurrence) whose index is in M .
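The projection is a plain index filter. As a small Python sketch (the names project and complement are hypothetical, and argument positions are counted from 1 as in the text):

```python
def project(args, mode):
    """Return the arguments whose 1-based position is in mode, in original order."""
    return [a for i, a in enumerate(args, start=1) if i in mode]


def complement(mode, arity):
    """The output positions: all positions 1..arity that are not inputs."""
    return set(range(1, arity + 1)) - set(mode)


inputs = project(["xs", "ys", "zs"], {1, 2})
outputs = project(["xs", "ys", "zs"], complement({1, 2}, 3))
```

For the three-argument append predicate and mode {1, 2}, the first two arguments are inputs and the third is the only output.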
1. v0 = FV (ti⟨M⟩)
2. vj = vj−1 ∪ FV (ui,j )
such that
Without loss of generality we can examine clauses under mode inference modulo
reordering of premises. For side conditions R, condition 3 has to be replaced by
FV (R) ⊆ vj−1 , i.e. all variables in R must be known when evaluating it. This
definition yields a check whether a given clause is consistent with a particular
mode assignment.
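A consistency check of this kind can be sketched in Python. This is an illustration under explicit assumptions, not the implementation described in the paper: each argument term is represented only by its set of free variables, premises are processed in the given order, and the two subset checks in the code are an assumed reading of the conditions numbered 3 and 4 (the inputs of each premise must already be known, and every variable of the conclusion must be known at the end). Only conditions 1 and 2 and the rule for side conditions appear explicitly in the text. All function names are hypothetical.

```python
def project(args, mode):
    """Arguments whose 1-based position is in mode."""
    return [a for i, a in enumerate(args, start=1) if i in mode]


def fv(terms):
    """Union of the free-variable sets of a list of terms."""
    out = set()
    for t in terms:
        out |= t
    return out


def consistent(head_args, head_mode, premises, premise_modes):
    """Check one clause against one mode assignment."""
    known = fv(project(head_args, head_mode))      # v0: variables of the head inputs
    for args, mode in zip(premises, premise_modes):
        if not fv(project(args, mode)) <= known:   # assumed condition 3
            return False
        known |= fv(args)                          # vj = vj-1 plus FV of the premise
    return fv(head_args) <= known                  # assumed condition 4


# Clause: append xs ys zs ==> append (x . xs) ys (x . zs)
head = [{"x", "xs"}, {"ys"}, {"x", "zs"}]
prem = [[{"xs"}, {"ys"}, {"zs"}]]

ok_12 = consistent(head, {1, 2}, prem, [{1, 2}])
bad_3 = consistent(head, {3}, prem, [{1, 2}])
ok_3 = consistent(head, {3}, prem, [{3}])
```

With the head in mode {3}, the recursive call cannot use mode {1, 2}, because xs and ys are not yet known when the premise is evaluated; mode {3} for the premise works.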
eval (pred f ) = f
From the point of view of the logic, this characterization of the α pred algebra
in terms of unit abstractions might seem odd; their purpose comes to surface
when translating these equations to executable code, e.g. in ML:
(* A predicate is a suspended sequence of solutions. *)
datatype 'a pred = Seq of (unit -> 'a seq)
and 'a seq = Empty | Insert of 'a * 'a pred | Union of 'a pred list;

(* The empty predicate and the singleton predicate. *)
val bot_pred : 'a pred = Seq (fn u => Empty);
fun single x = Seq (fn u => Insert (x, bot_pred));

(* Monadic bind: apply f to every solution of the first predicate. *)
fun bind (Seq g) f =
  Seq (fn u =>
    (case g () of Empty => Empty
     | Insert (x, xq) => Union [f x, bind xq f]
     | Union xqs => Union (map (fn x => bind x f) xqs)));

(* Union of two predicates. *)
fun sup_pred (Seq f) (Seq g) =
  Seq (fn u =>
    (case f () of Empty => g ()
     | Insert (x, xq) => Insert (x, sup_pred xq (Seq g))
     | Union xqs => Union (append xqs [Seq g])));

(* Membership test; A_ is the equality dictionary. *)
fun eval A_ (Seq f) = member A_ (f ())
and member A_ Empty x = false
  | member A_ (Insert (y, yq)) x = eq A_ x y orelse eval A_ yq x
  | member A_ (Union xqs) x = list_ex (fn xq => eval A_ xq x) xqs;
In the function definitions for eval and member, the expression A_ is the dictio-
nary for the eq class allowing for explicit equality checks using the overloaded
constant eq.
In shape this follows a well-known ML technique for lazy lists: each inspection
of a lazy list by means of an application f () is protected by a constructor Seq.
Thus we enforce a lazy evaluation strategy for predicate enumerations even for
eager languages.
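The same idea carries over to other eager languages. The following Python sketch is a loose analogue, not a translation of the generated ML: a predicate is modelled as a zero-argument function that returns a fresh iterator, so nothing is computed until a solution is demanded. The names bot, single, bind, sup, eval_pred and nats are chosen for this example.

```python
from itertools import islice


def bot():
    """The empty predicate."""
    return lambda: iter(())


def single(x):
    """The predicate with exactly one solution x."""
    return lambda: iter((x,))


def bind(p, f):
    """For every solution x of p, enumerate the solutions of f(x)."""
    def run():
        for x in p():
            yield from f(x)()
    return run


def sup(p, q):
    """Solutions of p followed by solutions of q; q is forced only when needed."""
    def run():
        yield from p()
        yield from q()
    return run


def eval_pred(p, x):
    """Membership test; may not terminate if p is infinite and x is absent."""
    return any(y == x for y in p())


def nats(n):
    """An infinite predicate n, n+1, n+2, ...; usable only because sup is lazy."""
    return sup(single(n), lambda: nats(n + 1)())


first_four = list(islice(nats(0)(), 4))
pairs = list(bind(single(2), lambda x: sup(single(x), single(x * 10)))())
```

Taking a finite prefix of nats(0) terminates because the second argument of sup is forced only after the first is exhausted.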
languages like Prolog, the execution of the functional program generated from
the clauses uses pattern matching instead of unification. A precondition for the
applicability of pattern matching is that the input arguments in the conclusions
of the clauses, as well as the output arguments in the premises of the clauses are
built up using only datatype constructors and variables. In the following descrip-
tion of the translation scheme, we will treat the pattern matching mechanism as
a black box. However, our implementation uses a pattern translation algorithm
due to Slind [9, §3.3], which closely resembles the techniques used in compilers
for functional programming languages. The following notation will be used in
our description of the translation mechanism:
x = x1 . . . xl (x) = (x1 , . . . , xl )
τ = τ1 . . . τl τ ⇒ σ = τ1 ⇒ · · · ⇒ τl ⇒ σ
τ = τ1 × · · · × τl M − = {1, . . . , l}\M
The recursion equation for P^M can be obtained from the clauses characterizing
P in a canonical way:
Intuitively, this means that the set of output values generated by P^M is the
union of the output values generated by the clauses Ci . In order for pattern
matching to work, all patterns occurring in the program must be linear, i.e.
no variable may occur more than once. This can be achieved by renaming the
free variables occurring in the terms ti , ui,1 , . . ., ui,ni , and by adding suitable
equality checks to the generated program. Let t′i , u′i,1 , . . ., u′i,ni denote the
linear terms obtained by renaming the aforementioned ones, and let θi = {yi →
z i }, θi,1 = {yi,1 → z i,1 }, . . ., θi,ni = {yi,ni → z i,ni } be substitutions such that
θi (t′i ) = ti , θi,1 (u′i,1 ) = ui,1 , . . ., θi,ni (u′i,ni ) = ui,ni , and (dom(θi ) ∪ dom(θi,1 ) ∪
append^{1,2} xs ys =
  single (xs, ys) >>= (λa. case a of
      ([], zs) ⇒ single zs
    | (z · zs, ws) ⇒ ⊥) ⊔
  single (xs, ys) >>= (λb. case b of
      ([], zs) ⇒ ⊥
    | (z · zs, ws) ⇒ append^{1,2} zs ws >>= (λvs. single (z · vs)))
append^{3} xs =
  single xs >>= (λys. single ([], ys)) ⊔
  single xs >>= (λa. case a of
      [] ⇒ ⊥
    | z · zs ⇒ append^{3} zs >>= (λb. case b of
        (ws, vs) ⇒ single (z · ws, vs)))
Side conditions can be embedded into this translation scheme using the function
ifpred :: bool ⇒ unit pred
ifpred b = (if b then single () else ⊥)
that maps False and True to the empty sequence and the singleton sequence
containing only the unit element, respectively.
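The two modes of append and the ifpred function have a direct reading as enumerations. The Python sketch below uses generators in place of the pred type and ordinary branching in place of the case expressions, so it illustrates what the equations compute rather than how the generated code is structured. The names append_12, append_3 and ifpred are chosen for this example.

```python
def append_12(xs, ys):
    """Mode {1,2}: both lists are inputs, the concatenation is the output."""
    if not xs:
        yield ys
    else:
        z, zs = xs[0], xs[1:]
        for vs in append_12(zs, ys):
            yield [z] + vs


def append_3(xs):
    """Mode {3}: the result is the input; enumerate all ways of splitting it."""
    yield ([], xs)
    if xs:
        z, zs = xs[0], xs[1:]
        for ws, vs in append_3(zs):
            yield ([z] + ws, vs)


def ifpred(b):
    """No solution if b is false, the single solution () if b is true."""
    if b:
        yield ()


joined = list(append_12([1, 2], [3]))
splits = list(append_3([1, 2]))
```

Mode {1, 2} yields exactly one solution, whereas mode {3} yields one solution for each split point of the input list.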
elimination rules for P . We will also need introduction and elimination rules for
the operators on type pred , which we show in Table 2. From the definition of
P^M , we can easily derive the introduction rule
P x =⇒ eval (P^M x⟨M⟩) (x⟨M⁻⟩)
and the elimination rule
eval (P^M x⟨M⟩) (x⟨M⁻⟩) =⇒ P x
By extensionality (rule =I ), proving
P^M x⟨M⟩ = C1 x⟨M⟩ ⊔ · · · ⊔ Cm x⟨M⟩
amounts to showing that
(1) ⋀x . eval (P^M x⟨M⟩) x =⇒ eval (C1 x⟨M⟩ ⊔ · · · ⊔ Cm x⟨M⟩) x
(2) ⋀x . eval (C1 x⟨M⟩ ⊔ · · · ⊔ Cm x⟨M⟩) x =⇒ eval (P^M x⟨M⟩) x
where x :: τ⟨M⁻⟩. The variable x can be expanded to a tuple of variables:
(1) ⋀x⟨M⁻⟩. eval (P^M x⟨M⟩) (x⟨M⁻⟩) =⇒
      eval (C1 x⟨M⟩ ⊔ · · · ⊔ Cm x⟨M⟩) (x⟨M⁻⟩)
(2) ⋀x⟨M⁻⟩. eval (C1 x⟨M⟩ ⊔ · · · ⊔ Cm x⟨M⟩) (x⟨M⁻⟩) =⇒
      eval (P^M x⟨M⟩) (x⟨M⁻⟩)
Proof of (1). From eval (P^M x⟨M⟩) (x⟨M⁻⟩), we get P x using the elimination
rule for P^M . Applying the elimination rule for P
P x =⇒ E1 x =⇒ · · · =⇒ Em x =⇒ R
Ei x ≡ ⋀bi . x = ti =⇒ Qi,1 ui,1 =⇒ · · · =⇒ Qi,ni ui,ni =⇒ R
yields m proof obligations, each of which corresponds to an introduction rule.
Note that bi consists of the free variables of ui,j and ti . For the ith introduction
Table 2. Introduction and elimination rules for the operators on type pred
⊥E         eval ⊥ x =⇒ R
single I   eval (single x ) x
single E   eval (single x ) y =⇒ (y = x =⇒ R) =⇒ R
>>=I       eval P x =⇒ eval (Q x ) y =⇒ eval (P >>= Q) y
>>=E       eval (P >>= Q) y =⇒ (⋀x . eval P x =⇒ eval (Q x ) y =⇒ R) =⇒ R
⊔I1        eval A x =⇒ eval (A ⊔ B ) x
⊔I2        eval B x =⇒ eval (A ⊔ B ) x
⊔E         eval (A ⊔ B ) x =⇒ (eval A x =⇒ R) =⇒ (eval B x =⇒ R) =⇒ R
ifpred I   P =⇒ eval (ifpred P ) ()
ifpred E   eval (ifpred b) x =⇒ (b =⇒ x = () =⇒ R) =⇒ R
=I         (⋀x . eval A x =⇒ eval B x ) =⇒ (⋀x . eval B x =⇒ eval A x ) =⇒ A = B
rule, we have to prove eval (C1 x⟨M⟩ ⊔ · · · ⊔ Cm x⟨M⟩) (x⟨M⁻⟩) from the
assumptions x = ti and Qi,1 ui,1 , . . ., Qi,ni ui,ni . By applying the rules ⊔I1 and
⊔I2 in a suitable order, we select the Ci corresponding to the ith introduction
rule, which leaves us with the proof obligation eval (Ci ti⟨M⟩) (ti⟨M⁻⟩). By the
definition of Ci and the rule >>=I , this gives rise to the two proof obligations
(1.i) eval (single (ti⟨M⟩)) (ti⟨M⟩)
(1.ii) eval (case ti⟨M⟩ of
         (t′i⟨M⟩) ⇒ if yi ≠ z i then ⊥ else
           Qi,1^{Mi,1} (ui,1⟨Mi,1⟩) >>= (λa1 . case a1 of . . .)
       | _ ⇒ ⊥) (ti⟨M⁻⟩)
Goal (1.i) is easily proved using single I . Concerning goal (1.ii), note that (ti⟨M⟩)
matches (t′i⟨M⟩), so we have to consider the first branch of the case expression.
Due to the definition of t′i , we also know that yi = z i , which means that we have
to consider the else branch of the if clause. This leads to the new goal
eval (Qi,1^{Mi,1} (ui,1⟨Mi,1⟩) >>= (λa1 . case a1 of . . .)) (ti⟨M⁻⟩)
that, by applying rule >>=I , can be split up into the two goals
(1.iii) eval (Qi,1^{Mi,1} (ui,1⟨Mi,1⟩)) (ui,1⟨Mi,1⁻⟩)
(1.iv) eval (case ui,1⟨Mi,1⁻⟩ of
         (u′i,1⟨Mi,1⁻⟩) ⇒ if y i,1 ≠ z i,1 then ⊥ else . . .
       | _ ⇒ ⊥) (ti⟨M⁻⟩)
Goal (1.iii) follows from the assumption Qi,1 ui,1 using the introduction rule for
Qi,1^{Mi,1} , while goal (1.iv) can be solved in a similar way as goal (1.ii). Repeating
this proof scheme for Qi,2^{Mi,2} , . . ., Qi,ni^{Mi,ni} finally leads us to a goal of the form
eval (single (ti⟨M⁻⟩)) (ti⟨M⁻⟩)
which is trivially solvable using single I .
Proof of (2). The proof of this direction is dual to the previous one: rather
than splitting up the conclusion into simpler formulae, we now perform forward
inferences that transform complex premises into simpler ones. Eliminating
eval (C1 x⟨M⟩ ⊔ · · · ⊔ Cm x⟨M⟩) (x⟨M⁻⟩) using rule ⊔E leaves us with m proof
obligations of the form
eval (Ci x⟨M⟩) (x⟨M⁻⟩) =⇒ eval (P^M x⟨M⟩) (x⟨M⁻⟩)
By unfolding the definition of Ci and applying rule >>=E to the premise of the
above implication, we obtain a0 such that
(2.i) eval (single (x⟨M⟩)) a0
(2.ii) eval (case a0 of
         (t′i⟨M⟩) ⇒ if yi ≠ z i then ⊥ else
           Qi,1^{Mi,1} (ui,1⟨Mi,1⟩) >>= (λa1 . case a1 of . . .)
       | _ ⇒ ⊥) (x⟨M⁻⟩)
Membership tests. The type constructor pred can be stripped using explicit
membership tests. For example, we could define a suffix predicate using append :
using introduction and elimination rules for op = and single. This equation
then is directly executable.
In addition to its two arguments of type α, rtc also has a parameter r that stays
fixed throughout the definition. The general form of a mode for a higher-order
predicate P with k arguments and parameters r1 , . . . , rρ with arities k1 , . . . , kρ
is (M1 , . . . , Mρ , M ), where Mi ⊆ {1, . . . , ki } (for 1 ≤ i ≤ ρ) and M ⊆ {1, . . . , k}.
Intuitively, this mode means that P r1 · · · rρ has mode M , provided that ri
has mode Mi . The possible modes for rtc are ({}, {1}), ({}, {2}), ({}, {1, 2}),
({1}, {1}), ({2}, {2}), ({1}, {1, 2}), and ({2}, {1, 2}). The general definition of
the function corresponding to the mode (M1 , . . . , Mρ , M ) of a predicate P is
rtc^({2},{2}) r y =
  single y >>= (λx . single x ) ⊔
  single y >>= (λz . rtc^({2},{2}) r z >>= (λy. r y >>= (λx . single x )))
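The equation above can be read as an enumeration that first returns y itself and then, for every element already reachable, the results of one further step of r. The Python sketch below follows that reading with generators; it is an illustration only. The parameter r_2 stands for the relation r used in mode {2}, that is, a function that enumerates first arguments for a given second argument, and pred_step is a hypothetical example relation.

```python
from itertools import islice


def rtc_2_2(r_2, y):
    """Enumerate all x with rtc r x y, given r in the mode 'second argument in'."""
    yield y                          # first clause: rtc r y y
    for mid in rtc_2_2(r_2, y):      # second clause: recurse, then one step of r
        yield from r_2(mid)


def pred_step(n):
    """Example relation r x n that holds iff x + 1 = n; enumerates x for given n."""
    if n > 0:
        yield n - 1


# Only a finite prefix is demanded: in this rendering the recursion is on the
# same argument, so asking for more solutions than exist does not terminate.
first = list(islice(rtc_2_2(pred_step, 3), 4))
```

Because the recursive call is on the same input, laziness is essential here: each further solution is produced by unfolding the recursion one level deeper.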
inductive
S :: alfa list ⇒ bool and
A :: alfa list ⇒ bool and B :: alfa list ⇒ bool
where
S []
| A w =⇒ S (b · w )
| B w =⇒ S (a · w )
| S w =⇒ A (a · w )
| A v =⇒ A w =⇒ A (b · v @ w )
| S w =⇒ B (b · w )
| B v =⇒ B w =⇒ B (a · v @ w )
By choosing mode {} for the above predicates (i.e. their arguments are all out-
put), we can enumerate all elements of the set S containing equally many as and
bs. However, the above predicates cannot easily be used with mode {1}, i.e. for
checking whether a given word is generated by the grammar. This is because of
the rules with the conclusions A (b · v @ w ) and B (a · v @ w ). Since the append
function (denoted by @) is not a constructor, we cannot do pattern matching
on the argument. However, the problematic rules can be rephrased as
append v w vw =⇒ A v =⇒ A w =⇒ A (b · vw )
append v w vw =⇒ B v =⇒ B w =⇒ B (a · vw )
The problematic expression v @ w in the conclusion has been replaced by a new
variable vw. The fact that vw is the result of appending the two lists v and w is
now expressed using the append predicate from §3. In order to check whether a
given word can be generated using these rules, append first enumerates all ways
of decomposing the given list vw into two sublists v and w, and then recursively
checks whether these words can be generated by the grammar.
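The checking procedure described above can be sketched directly in Python. This is an illustration of the rephrased rules, not generated code: words are strings over the letters a and b, and the helper splits plays the role of append in mode {3} by enumerating all decompositions of a word. All function names are chosen for this example.

```python
def splits(vw):
    """All ways of writing vw as v + w (append used with its result as input)."""
    for i in range(len(vw) + 1):
        yield vw[:i], vw[i:]


def S(w):
    if w == "":
        return True                      # S []
    head, rest = w[0], w[1:]
    return (head == "b" and A(rest)) or (head == "a" and B(rest))


def A(w):
    if w == "":
        return False
    head, rest = w[0], w[1:]
    if head == "a":
        return S(rest)                   # S w ==> A (a . w)
    if head == "b":                      # append v w vw ==> A v ==> A w ==> A (b . vw)
        return any(A(v) and A(u) for v, u in splits(rest))
    return False


def B(w):
    if w == "":
        return False
    head, rest = w[0], w[1:]
    if head == "b":
        return S(rest)                   # S w ==> B (b . w)
    if head == "a":                      # append v w vw ==> B v ==> B w ==> B (a . vw)
        return any(B(v) and B(u) for v, u in splits(rest))
    return False


accepted = [w for w in ["", "ab", "ba", "abba", "bbaa", "aabb"] if S(w)]
```

Every recursive call is on a strictly shorter word, so the check terminates; it does not memoise intermediate results and is therefore not meant to be efficient.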
References
1. Berghofer, S., Nipkow, T.: Executing higher order logic. In: Callaghan, P., Luo, Z.,
McKinna, J., Pollack, R. (eds.) TYPES 2000. LNCS, vol. 2277, p. 24. Springer,
Heidelberg (2002)
2. Delahaye, D., Dubois, C., Étienne, J.F.: Extracting purely functional contents from
logical inductive types. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007. LNCS,
vol. 4732, pp. 70–85. Springer, Heidelberg (2007)
3. Haftmann, F., Nipkow, T.: A code generator framework for Isabelle/HOL. Tech.
Rep. 364/07, Department of Computer Science, University of Kaiserslautern (2007)
4. Hanus, M.: A unified computation model for functional and logic programming.
In: Proc. 24th ACM Symposium on Principles of Programming Languages (POPL
1997), pp. 80–93 (1997)
5. Henrio, L., Kammüller, F.: A mechanized model of the theory of objects. In: Bon-
sangue, M.M., Johnsen, E.B. (eds.) FMOODS 2007. LNCS, vol. 4468, pp. 190–205.
Springer, Heidelberg (2007)
6. Mellish, C.S.: The automatic generation of mode declarations for prolog programs.
Tech. Rep. 163, Department of Artificial Intelligence (1981)
7. Nipkow, T., von Oheimb, D., Pusch, C.: μJava: Embedding a programming lan-
guage in a theorem prover. In: Bauer, F., Steinbrüggen, R. (eds.) Foundations of
Secure Computation. Proc. Int. Summer School Marktoberdorf 1999, pp. 117–144.
IOS Press, Amsterdam (2000)
8. Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL. LNCS, vol. 2283. Springer,
Heidelberg (2002)
9. Slind, K.: Reasoning about terminating functional programs. Ph.D. thesis, Institut
für Informatik, TU München (1999)
10. Somogyi, Z., Henderson, F.J., Conway, T.C.: Mercury: an efficient purely declar-
ative logic programming language. In: Proceedings of the Australian Computer
Science Conference, pp. 499–512 (1995)
11. Wasserrab, D., Nipkow, T., Snelting, G., Tip, F.: An operational semantics and
type safety proof for multiple inheritance in C++. In: OOPSLA 2006: Proceedings
of the 21st annual ACM SIGPLAN conference on Object-oriented programming
languages, systems, and applications, pp. 345–362. ACM Press, New York (2006)
Formalizing the Logic-Automaton Connection
1 Introduction
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 147–163, 2009.
© Springer-Verlag Berlin Heidelberg 2009
148 S. Berghofer and M. Reiter
2 Basic Definitions
2.1 Presburger Arithmetic
∀ x ≥8. ∃ y z . 3 ∗ y + 5 ∗ z = x
can be encoded by
Forall (Imp (Le [−1] −8) (Exist (Exist (Eq [5, 3, −1] 0))))
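To see how the de Bruijn indices in this encoding line up with the variables, the formula can be evaluated by brute force over a bounded range. The Python sketch below is purely illustrative and makes several assumptions: formulas are nested tuples, quantifiers range over 0..bound only (so this is not a decision procedure), and the most recently bound variable is placed at the front of the environment. The names eval_dioph and holds are chosen for this example and need not agree with the definitions in the paper.

```python
def eval_dioph(ks, xs):
    """Value of the linear term: sum of k_i * x_i over the paired positions."""
    return sum(k * x for k, x in zip(ks, xs))


def holds(f, env, bound):
    """Evaluate formula f; quantifiers are restricted to 0..bound (an assumption)."""
    tag = f[0]
    if tag == "Eq":
        return eval_dioph(f[1], env) == f[2]
    if tag == "Le":
        return eval_dioph(f[1], env) <= f[2]
    if tag == "Imp":
        return (not holds(f[1], env, bound)) or holds(f[2], env, bound)
    if tag == "Exist":
        return any(holds(f[1], [n] + env, bound) for n in range(bound + 1))
    if tag == "Forall":
        return all(holds(f[1], [n] + env, bound) for n in range(bound + 1))
    raise ValueError("unknown formula constructor: " + str(tag))


def coin_formula(lower):
    """Forall x >= lower. Exists y z. 3*y + 5*z = x, in the encoding of the text."""
    return ("Forall", ("Imp", ("Le", [-1], -lower),
                       ("Exist", ("Exist", ("Eq", [5, 3, -1], 0)))))


from_8 = holds(coin_formula(8), [], bound=30)
from_7 = holds(coin_formula(7), [], bound=30)
```

Inside the equation the environment is [z, y, x], so the coefficient list [5, 3, −1] reads as 5z + 3y − x. Lowering the threshold from 8 to 7 makes the bounded check fail, since 7 cannot be written as 3y + 5z.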
Like Boudet and Comon [3], we only consider variables ranging over the natural
numbers. The left-hand side of a Diophantine (in)equation can be evaluated
using the function
The abstract framework for automata used in this paper is quite similar to the
one used by Nipkow [11]. The purpose of this framework is to factor out all
properties that deterministic and nondeterministic automata have in common.
Automata are characterized by a transition function tr of type σ ⇒ α ⇒ σ, where
σ and α denote the types of states and input symbols, respectively. Transition
functions can be extended to words, i.e. lists of symbols, in a canonical way:
steps :: (σ ⇒ α ⇒ σ) ⇒ σ ⇒ α list ⇒ σ
steps tr q [] = q
steps tr q (a · as) = steps tr (tr q a) as
The reachability of a state q from a state p via a word as is defined by
reach :: (σ ⇒ α ⇒ σ) ⇒ σ ⇒ α list ⇒ σ ⇒ bool
reach tr p as q ≡ q = steps tr p as
Another characteristic property of an automaton is its set of accepting states.
Given a predicate P denoting the accepting states, an automaton is said to
accept a word as iff from a starting state s we reach an accepting state via as:
accepts :: (σ ⇒ α ⇒ σ) ⇒ (σ ⇒ bool ) ⇒ σ ⇒ α list ⇒ bool
accepts tr P s as ≡ P (steps tr s as)
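These three definitions translate almost literally into a left fold. The Python sketch below mirrors them; the parity automaton used as an example is hypothetical and not taken from the paper.

```python
from functools import reduce


def steps(tr, q, word):
    """Extend the transition function tr from single symbols to words."""
    return reduce(tr, word, q)


def reach(tr, p, word, q):
    """State q is reached from state p via word."""
    return q == steps(tr, p, word)


def accepts(tr, is_accepting, s, word):
    """The word leads from the start state s to an accepting state."""
    return is_accepting(steps(tr, s, word))


# Example automaton: the state is the parity of the number of 1 symbols read.
def parity_tr(q, a):
    return q ^ a


even = accepts(parity_tr, lambda q: q == 0, 0, [1, 0, 1])
odd = accepts(parity_tr, lambda q: q == 0, 0, [1, 1, 1])
```

Since the automaton is deterministic, reach is a function of its first three arguments, which is why it can be defined by an equation rather than inductively.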
The rows in the above word are interpreted as natural numbers, where the
leftmost column, i.e. the first symbol in the list, corresponds to the least significant
bit. Therefore, the value of variable xi is Σ_{j=0}^{m−1} bi,j · 2^j . The list of values of n
variables denoted by a word can be computed recursively as follows:
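The recursion described here can be sketched in Python. This is an assumed reconstruction from the prose, not the Isabelle definition: a word is a list of columns, each column holds one bit per variable, and the first column carries the least significant bits. The function name follows the nats-of-boolss mentioned in the text; its exact argument order in the paper is not known from this excerpt.

```python
def nats_of_boolss(n, columns):
    """Values of n variables encoded column-wise, least significant column first."""
    if not columns:
        return [0] * n
    first, rest = columns[0], columns[1:]
    # Each bit of the first column is added to twice the value of the remainder.
    return [int(b) + 2 * v for b, v in zip(first, nats_of_boolss(n, rest))]


# Two variables: x0 = 5 (bits 1, 0, 1) and x1 = 3 (bits 1, 1, 0), LSB first.
word = [[True, True], [False, True], [True, False]]
values = nats_of_boolss(2, word)
```

Reading the word column by column, each variable's row accumulates b0 + 2·(b1 + 2·(b2 + ...)), which equals the sum of bi,j · 2^j.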
Here, is-alph n xs means that xs is a valid symbol, i.e. the length of xs is equal
to the number of variables n, |bs| and |bss| denote the lengths of bs and bss,
respectively, and bs ∈ bss means that bs is a member of the list bss. Moreover,
nat-of-bools is similar to nats-of-boolss, with the difference that it works on a
single row vector instead of a list of column vectors:
Since the input symbols of our automata are bit vectors, it would be rather inef-
ficient to just represent the transition function for a given state as an association
list relating bit vectors to successor states. For such a list, the lookup operation
would be exponential in the number of variables. When implementing the Mona
tool, Klarlund [8] already observed that representing the transition function as
a BDD is more efficient. BDDs are represented by the datatype²
This operation only returns meaningful results if the height of the BDD is less
than or equal to the length of the bit vector. We write bddh n bdd to mean that
the height of bdd is less than or equal to n. Two BDDs can be combined using a
binary operator f as follows:
If the two BDDs have different heights, the shorter one is expanded on the fly.
The following theorem states that bdd-binop yields a BDD corresponding to the
pointwise application of f to the functions represented by the argument BDDs:
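The pointwise property can be illustrated with a small Python sketch. The classes Leaf and Branch and the functions bdd_lookup and bdd_binop are assumed stand-ins modelled on the description above, not the Isabelle definitions; in particular, the convention that the low subtree is taken for a False bit is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Sequence, Union


@dataclass(frozen=True)
class Leaf:
    value: Any


@dataclass(frozen=True)
class Branch:
    low: "BDD"   # subtree taken when the current bit is False (assumed)
    high: "BDD"  # subtree taken when the current bit is True


BDD = Union[Leaf, Branch]


def bdd_lookup(bdd: BDD, bits: Sequence[bool]) -> Any:
    """Follow one bit per Branch; a Leaf ignores any remaining bits."""
    for b in bits:
        if isinstance(bdd, Leaf):
            break
        bdd = bdd.high if b else bdd.low
    if isinstance(bdd, Branch):
        raise ValueError("bit vector shorter than the height of the BDD")
    return bdd.value


def bdd_binop(f, x: BDD, y: BDD) -> BDD:
    """Combine two BDDs pointwise, expanding the shorter one on the fly."""
    if isinstance(x, Leaf) and isinstance(y, Leaf):
        return Leaf(f(x.value, y.value))
    # A Leaf stands for a constant function, so it is reused on both sides.
    xl, xh = (x, x) if isinstance(x, Leaf) else (x.low, x.high)
    yl, yh = (y, y) if isinstance(y, Leaf) else (y.low, y.high)
    return Branch(bdd_binop(f, xl, yl), bdd_binop(f, xh, yh))
```

Looking up any sufficiently long bit vector in bdd_binop(f, x, y) then gives the same result as applying f to the two individual lookups.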
where xs[i] denotes the ith element of list xs. Finally, using the generic functions
from §2.2, we can produce variants of these functions tailored to DFAs:
The definition of wf-nfa can be obtained from the one of wf-dfa by just replacing
dfa-is-node by nfa-is-node. Due to its “asymmetric” type, a transition function
of type nat ⇒ bool list ⇒ bool list would be incompatible with the abstract
functions from §2.2. We therefore lift the function to work on finite sets of natural
numbers rather than just single natural numbers. This is accomplished by
subsetbdd :: bool list bdd list ⇒ bool list ⇒ bool list bdd ⇒ bool list bdd
subsetbdd [] [] bdd = bdd
subsetbdd (bdd′ · bdds) (b · bs) bdd =
  (if b then subsetbdd bdds bs (bdd-binop bv-or bdd bdd′)
   else subsetbdd bdds bs bdd)
where bv-or is the bit-wise or operation on bit vectors, i.e. the union of two
finite sets. Using this operation, subsetbdd combines all BDDs in the first list,
for which the corresponding bit in the second list is True. The third argument
of subsetbdd serves as an accumulator and is initialized with a BDD consisting
of only one Leaf containing the empty set, which is the neutral element of bv-or :
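To make the role of the accumulator concrete, here is a Python sketch of the same idea. The BDD representation (Leaf, Branch, bdd_binop) is an assumed stand-in for the Isabelle datatype, and the three-state NFA used in the accompanying check is purely hypothetical.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Leaf:
    value: Any


@dataclass(frozen=True)
class Branch:
    low: Any
    high: Any


def bdd_binop(f, x, y):
    """Pointwise combination of two BDDs (Leaves act as constant functions)."""
    if isinstance(x, Leaf) and isinstance(y, Leaf):
        return Leaf(f(x.value, y.value))
    xl, xh = (x, x) if isinstance(x, Leaf) else (x.low, x.high)
    yl, yh = (y, y) if isinstance(y, Leaf) else (y.low, y.high)
    return Branch(bdd_binop(f, xl, yl), bdd_binop(f, xh, yh))


def bv_or(xs: List[bool], ys: List[bool]) -> List[bool]:
    """Bit-wise or, i.e. union of two finite sets given as bit vectors."""
    return [a or b for a, b in zip(xs, ys)]


def subsetbdd(bdds, bs, acc):
    """Or together all BDDs whose corresponding bit in bs is True."""
    for bdd, b in zip(bdds, bs):
        if b:
            acc = bdd_binop(bv_or, acc, bdd)
    return acc
```

Starting from a Leaf holding the all-False vector, the result maps each input symbol to the union of the successor sets of all selected states.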
Using subsetbdd, the transition function for NFAs can now be defined as follows:
A set of states is accepting iff at least one of the states in the set is accepting:
As in the case of DFAs, we can now instantiate the generic functions from §2.2.
In order to check whether we can reach an accepting state from the start state,
we apply accepts to the finite set containing only the state 0.
where succs returns the list of successors of a node, and the predicate is-node
describes the (finite) set of nodes. Moreover, ins x S, memb x S and empt cor-
respond to {x } ∪ S, x ∈ S and ∅ on sets. The node store must also satisfy an
additional invariant. Using Isabelle’s infrastructure for the definition of functions
by well-founded recursion [9], the DFS function can be defined as follows³:
dfs :: β ⇒ α list ⇒ β
dfs S [] = S
dfs S (x · xs) = (if memb x S then dfs S xs else dfs (ins x S ) (succs x @ xs))
Note that this function is partial, since it may loop when instantiated with ins,
memb and empt operators that do not behave like their counterparts on sets, or
when applied to a list of start values that are not valid nodes. However, since dfs is tail
recursive, Isabelle’s function definition package can derive the above equations
without preconditions, which is crucial for the executability of dfs. The central
property of dfs is that it computes the transitive closure of the successor relation:
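The equations for dfs can be mirrored by a loop over an explicit worklist. The Python sketch below is an illustration only: succs, ins and memb are passed as parameters, and the instantiation with frozensets and a dictionary-based graph in the accompanying check is a hypothetical example.

```python
def dfs(succs, ins, memb, store, worklist):
    """Iterative counterpart of the tail-recursive dfs equations."""
    pending = list(worklist)
    while pending:
        x = pending.pop(0)                     # head of the list, as in x · xs
        if not memb(x, store):
            store = ins(x, store)
            pending = list(succs(x)) + pending  # succs x @ xs
    return store
```

With a well-behaved store and a finite graph, every node is inserted at most once, so the loop terminates and the store ends up containing exactly the nodes reachable from the start values.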
3 Automata Construction
In this section, we will describe all automata constructions that are used to
recursively build automata from formulae in Presburger arithmetic. The simplest
one is the complement, which we describe in §3.1. It will be used to model
negation. The product automaton construction described in §3.2 corresponds to
binary operators such as ∨, ∧, and −→, whereas the more intricate projection
construction shown in §3.3 is used to deal with existential quantifiers. Finally,
§3.4 illustrates the construction of automata corresponding to atomic formulae.
3.1 Complement
A well-formed DFA A will accept a word bss iff it is not accepted by the DFA
produced by negate-dfa:
Given a binary logical operator f :: bool ⇒ bool ⇒ bool, the product automaton
construction is used to build a DFA corresponding to the formula f P Q from
DFAs A and B corresponding to the formulae P and Q, respectively. As sug-
gested by its name, the state space of the product automaton corresponds to the
cartesian product of the state spaces of the DFAs A and B. However, as already
mentioned in §2.6, not all of the elements of the cartesian product constitute
reachable states. We therefore need an algorithm for computing the reachable
states of the resulting DFA. Moreover, since the automata framework described
in §2.4–2.5 relies on states being encoded as natural numbers, we also need
to produce a mapping from nat × nat to nat. All of this can be achieved just by
instantiating the abstract DFS framework with suitable functions, as shown in
Fig. 1. In this construction, the store containing the visited states is a pair nat
option list list × (nat × nat ) list, where the first component is a matrix denoting
156 S. Berghofer and M. Reiter
a partial map from nat × nat to nat. The second component of the store is a list
containing all visited states (i, j ). It can be viewed as a map from nat to nat ×
nat, which is the inverse of the aforementioned map. In order to compute the list
of successor states of a state (i, j ), prod-succs combines the BDDs representing
the transition tables of state i of A, and of state j of B using the Pair oper-
ator, and then collects all leaves of the resulting BDD. The operation prod-ins
for inserting a state into the store updates the entry at position (i, j ) of the
matrix tab with the number of visited states, and appends (i, j ) to the list ps of
visited states. By definition of DFS, this operation is guaranteed to be applied
only if the state (i, j ) has not been visited yet, i.e. the corresponding entry in
the matrix is None and (i, j ) is not contained in the list ps. We now produce
a specific version of DFS called prod-dfs by instantiating the generic function
from §2.6, and using the list containing just one pair of states as a start value.
By induction on gen-dfs, we can prove that the matrix and the list computed by
prod-dfs encode a bijection between the reachable states (i, j ) of the product
automaton, and natural numbers k corresponding to the states of the resulting
DFA, where k is smaller than the number of reachable states:
If prod-is-node A B x then
((fst (prod-dfs A B x ))[i][j ] = Some k ∧ dfa-is-node A i ∧ dfa-is-node B j ) =
(k < |snd (prod-dfs A B x )| ∧ (snd (prod-dfs A B x ))[k ] = (i, j )).
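The numbering of reachable pairs can be sketched in Python as follows. This is a simplification under stated assumptions: transition tables are nested dictionaries rather than BDDs, a dictionary stands in for the matrix, and the name prod_dfs is only borrowed from the text.

```python
def prod_dfs(tr_a, tr_b, alphabet, start=(0, 0)):
    """Number the reachable pairs of states in order of discovery."""
    number = {}   # partial map (i, j) -> k, standing in for the matrix
    pairs = []    # inverse map k -> (i, j)
    pending = [start]
    while pending:
        i, j = pending.pop()
        if (i, j) in number:
            continue                       # already visited
        number[(i, j)] = len(pairs)        # next free state number
        pairs.append((i, j))
        for a in alphabet:
            pending.append((tr_a[i][a], tr_b[j][a]))
    return number, pairs
```

By construction, number[(i, j)] == k holds exactly when k < len(pairs) and pairs[k] == (i, j), which is the bijection stated above for this simplified setting.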
3.3 Projection
Using the terminology from §2.3, the automaton for ∃ x . P can be obtained from
the one for P by projecting away the row corresponding to the variable x. Since
Formalizing the Logic-Automaton Connection 157
this operation yields an NFA, it is advantageous to first translate the DFA for P
into an NFA, which can easily be done by replacing all the leaves in the transition
table by singleton sets, and leaving the set of accepting states unchanged. The
correctness of this operation called nfa-of-dfa is expressed by
To produce the NFA corresponding to the quantified formula, we just map this
operation over the transition table:
Due to its type, we could apply this function repeatedly to quantify over several
variables in one go. The correctness of this construction is summarized by
This means that the new NFA accepts a list bss of column vectors iff the original
NFA accepts the list obtained from bss by inserting a suitable row vector bs
representing the existential witness. Matters are complicated by the additional
requirement that the word accepted by the new NFA must have the same length
as the witness. This requirement can be satisfied by appending zero vectors to the
end of bss, which does not change its interpretation. Since the other constructions
(in particular the complement) only work on DFAs, we turn the obtained NFA
into a DFA by applying the usual subset construction. The central idea is that
each set of states produced by nfa-steps can be viewed as a state of a new DFA.
As mentioned in §2.6, not all of these sets are reachable from the initial state
of the NFA. Similar to the product construction, the algorithm for computing
the reachable sets shown in Fig. 2 is an instance of the general DFS framework.
The node store is now a pair of type nat option bdd × bool list list, where the
first component is a BDD representing a partial map from finite sets (encoded as
bit vectors) to natural numbers, and the second component is the list of visited
states representing the inverse map. To insert new entries into a BDD, we use
Recall that the automaton produced by quantify-nfa will only accept words
with a sufficient number of trailing zero column vectors. To get a DFA that also
accepts words without trailing zeros, we mark all states as accepting from which
an accepting state can be reached by reading only zeros. This construction, which
is sometimes referred to as the right quotient, can be characterized as follows:
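The marking step can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the formalized rquot: the DFA is reduced to a list tr_zero giving, for each state, its successor on the all-zero column vector, together with a list of accepting flags.

```python
def rquot(tr_zero, accepting):
    """Mark every state from which an accepting state is reachable via zeros."""
    result = list(accepting)
    for q in range(len(accepting)):
        seen, p = set(), q
        while p not in seen:          # stop once the zero-path starts to cycle
            seen.add(p)
            if accepting[p]:
                result[q] = True
                break
            p = tr_zero[p]
    return result
```

Because the automaton is deterministic, each state has exactly one zero-successor, so following that single path until it repeats is sufficient.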
We now come to the construction of DFAs for atomic formulae, namely Dio-
phantine (in)equations. For this purpose, we use a method due to Boudet and
Comon [3]. The key observation is that xs is a solution of a Diophantine equation
iff it is a solution modulo 2 and the quotient of xs and 2 is a solution of another
equation with the same coefficients, but with a different right-hand side:
(eval-dioph ks xs = l ) =
(eval-dioph ks (map (λx . x mod 2) xs) mod 2 = l mod 2 ∧
eval-dioph ks (map (λx . x div 2) xs) =
(l − eval-dioph ks (map (λx . x mod 2) xs)) div 2)
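This equivalence is easy to test on concrete numbers. The Python sketch below only checks instances and does not replace the proof; eval_dioph is assumed to be the sum of the products of coefficients and values. Python's // and % round towards negative infinity, which agrees with div and mod on integers in Isabelle.

```python
def eval_dioph(ks, xs):
    """Left-hand side of the equation: sum of coefficient * value."""
    return sum(k * x for k, x in zip(ks, xs))


def split_holds(ks, xs, l):
    """Right-hand side of the equivalence: solve modulo 2, then halve."""
    bits = [x % 2 for x in xs]
    halves = [x // 2 for x in xs]
    return (eval_dioph(ks, bits) % 2 == l % 2
            and eval_dioph(ks, halves) == (l - eval_dioph(ks, bits)) // 2)
```

The check works because eval_dioph ks xs equals eval_dioph ks bits plus twice eval_dioph ks halves, and because the division by 2 is exact whenever the parities agree.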
In other words, the states of the DFA accepting the solutions of the equation
correspond to the right-hand sides reachable from the initial right-hand side l,
which will again be computed using the DFS algorithm. To ensure termination
of DFS, it is crucial to prove that the reachable right-hand sides m are bounded:
If |m| ≤ max |l| (∑ k←ks. |k|) then
|(m − eval-dioph ks (map (λx. x mod 2) xs)) div 2| ≤ max |l| (∑ k←ks. |k|).
dioph-dfs :: nat ⇒ int list ⇒ int ⇒ nat option list × int list
that, given the number of variables, the coefficients, and the right-hand side,
computes a bijection between reachable right-hand sides and natural numbers:
The first component of the pair returned by dioph-dfs can be viewed as a partial
map from integers to natural numbers, where int-to-nat-bij maps negative and
non-negative integers to odd and even list indices, respectively. As shown in
Fig. 3, the transition table of the DFA is constructed by eq-dfa as follows: if
the current state corresponds to the right-hand side j, and the DFA reads a
bit vector xs satisfying the equation modulo 2, then the DFA goes to the state
corresponding to the new right-hand side (j − eval-dioph ks xs) div 2, otherwise
it goes to an error state, which is the last state in the table. To produce a BDD
containing the successor states for all bit vectors of length n, we use the function
The key property of eq-dfa states that for every right-hand side m reachable
from l, the state reachable from m via a word bss is accepting iff the list of
natural numbers denoted by bss satisfies the equation with right-hand side m:
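The construction can be imitated in a few lines of Python. The sketch below is a simplification under stated assumptions: states are the right-hand sides themselves rather than list indices, the transition table is a dictionary instead of a BDD, None plays the role of the error state, and a state is taken to be accepting iff its right-hand side is 0 (the empty word denotes the all-zero assignment).

```python
from itertools import product


def eval_dioph(ks, xs):
    return sum(k * x for k, x in zip(ks, xs))


def eq_dfa(ks, l):
    """Explore all right-hand sides reachable from l; None is the error state."""
    table = {}
    pending = [l]
    while pending:
        m = pending.pop()
        if m in table:
            continue
        table[m] = {}
        for bits in product((0, 1), repeat=len(ks)):
            v = eval_dioph(ks, bits)
            if v % 2 == m % 2:               # equation holds modulo 2
                succ = (m - v) // 2          # new right-hand side
                table[m][bits] = succ
                pending.append(succ)
            else:
                table[m][bits] = None
    return table


def accepts_eq(table, l, word):
    """Run the DFA on a list of column vectors, least significant bits first."""
    m = l
    for column in word:
        if m is None:
            return False
        m = table[m][tuple(column)]
    return m == 0
```

The exploration terminates because the reachable right-hand sides are bounded, as stated above; for the equation x + y − z = 0 only the right-hand sides 0 and −1 are reachable.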
We now have all the machinery in place to write a decision procedure for Pres-
burger arithmetic. A formula can be transformed into a DFA by the following
function:
Note that a closed formula is valid iff the start state of the resulting DFA is
accepting, which can easily be seen by letting n = 0 and bss = []. Most cases of
the induction can be proved by a straightforward application of the correctness
results from §3. Unsurprisingly, the only complicated case is the one for the
existential quantifier, which we will now examine in more detail. In this case,
the left-hand side of the correctness theorem is
dfa-accepts
(rquot (det-nfa (quantify-nfa 0 (nfa-of-dfa (dfa-of-pf (Suc n) p)))) n) bss
5 Conclusion
First experiments with the algorithm presented in §4 show that it can compete
quite well with the standard decision procedure for Presburger arithmetic avail-
able in Isabelle. Even without minimization, the DFA for the stamp problem
from §2.1 has only 6 states, and can be constructed in less than a second. The
following table shows the size of the DFAs (i.e. the number of states) for all sub-
formulae of the stamp problem. Thanks to the DFS algorithm, they are much
smaller than the DFAs that one would have obtained using a naive construction:
The next step is to formalize a minimization algorithm, e.g. along the lines of
Constable et al. [6]. We also intend to explore other ways of constructing DFAs
for Diophantine equations, such as the approach by Wolper and Boigelot [15],
which is more complicated than the one shown in §3.4, but can directly deal
with variables over the integers rather than just natural numbers. To improve
the performance of the decision procedure on large formulae, we would also like
to investigate possible optimizations of the simple representation of BDDs pre-
sented in §2.3. Verma [14] describes a formalization of reduced ordered BDDs
with sharing in Coq. To model sharing, Verma’s formalization is based on a
memory for storing BDDs. Due to their dependence on the memory, algorithms
using this kind of BDDs are no longer purely functional, which makes reason-
ing about them substantially more challenging. Finally, we also plan to extend
our decision procedure to cover WS1S, and use it to tackle some of the circuit
verification problems described by Basin and Friedrich [1].
References
1. Basin, D., Friedrich, S.: Combining WS1S and HOL. In: Gabbay, D., de Rijke, M.
(eds.) Frontiers of Combining Systems 2. Studies in Logic and Computation, vol. 7,
pp. 39–56. Research Studies Press/Wiley (2000)
2. Berghofer, S., Nipkow, T.: Executing higher order logic. In: Callaghan, P., Luo, Z.,
McKinna, J., Pollack, R. (eds.) TYPES 2000. LNCS, vol. 2277, p. 24. Springer,
Heidelberg (2002)
3. Boudet, A., Comon, H.: Diophantine equations, Presburger arithmetic and finite
automata. In: Kirchner, H. (ed.) CAAP 1996. LNCS, vol. 1059, pp. 30–43. Springer,
Heidelberg (1996)
4. Boutin, S.: Using reflection to build efficient and certified decision procedures. In:
Ito, T., Abadi, M. (eds.) TACS 1997. LNCS, vol. 1281, pp. 515–529. Springer,
Heidelberg (1997)
5. Chaieb, A., Nipkow, T.: Proof synthesis and reflection for linear arithmetic. Journal
of Automated Reasoning 41, 33–59 (2008)
6. Constable, R.L., Jackson, P.B., Naumov, P., Uribe, J.: Constructively formalizing
automata theory. In: Plotkin, G., Stirling, C., Tofte, M. (eds.) Proof, Language,
and Interaction: Essays in Honor of Robin Milner. MIT Press, Cambridge (2000)
7. Harrison, J.: Metatheory and reflection in theorem proving: A survey and critique.
Technical Report CRC-053, SRI Cambridge (1995),
http://www.cl.cam.ac.uk/users/jrh/papers/reflect.dvi.gz
8. Klarlund, N.: Mona & Fido: The logic-automaton connection in practice. In:
Nielsen, M. (ed.) CSL 1997. LNCS, vol. 1414, pp. 311–326. Springer, Heidelberg
(1998)
9. Krauss, A.: Partial recursive functions in higher-order logic. In: Furbach, U.,
Shankar, N. (eds.) IJCAR 2006. LNCS, vol. 4130, pp. 589–603. Springer,
Heidelberg (2006)
10. Minamide, Y.: Verified decision procedures on context-free grammars. In:
Schneider, K., Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732, pp. 173–188.
Springer, Heidelberg (2007)
11. Nipkow, T.: Verified lexical analysis. In: Grundy, J., Newey, M. (eds.) TPHOLs
1998. LNCS, vol. 1479, pp. 1–15. Springer, Heidelberg (1998)
12. Nipkow, T.: Linear quantifier elimination. In: Armando, A., Baumgartner, P.,
Dowek, G. (eds.) IJCAR 2008. LNCS, vol. 5195, pp. 18–33. Springer, Heidelberg
(2008)
13. Nishihara, T., Minamide, Y.: Depth first search. In: Klein, G., Nipkow, T.,
Paulson, L. (eds.) The Archive of Formal Proofs,
http://afp.sf.net/entries/Depth-First-Search.shtml (June 2004); Formal
proof development
14. Verma, K.N., Goubault-Larrecq, J., Prasad, S., Arun-Kumar, S.: Reflecting BDDs
in Coq. In: He, J., Sato, M. (eds.) ASIAN 2000. LNCS, vol. 1961, pp. 162–181.
Springer, Heidelberg (2000)
15. Wolper, P., Boigelot, B.: On the construction of automata from linear arithmetic
constraints. In: Schwartzbach, M.I., Graf, S. (eds.) TACAS 2000. LNCS, vol. 1785,
pp. 1–19. Springer, Heidelberg (2000)
Extended First-Order Logic
1 Introduction
First-order logic can be considered as a natural fragment of Church’s type the-
ory [1]. In this paper we exhibit a larger fragment of type theory, called EFO,
that still enjoys the characteristic properties of first-order logic: complete proof
systems, compactness, and countable models. EFO restricts quantification and
equality to base types but retains lambda abstractions and higher-order vari-
ables. Like type theory, EFO has a type o of truth values and admits functions
that take truth values to individuals. Such functions are not available in first-
order logic. A typical example is a conditional C : oιιι taking a truth value and
two individuals as arguments and returning one of the individuals. Here is a
valid EFO formula that specifies the conditional and states one of its properties:
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 164–179, 2009.
© Springer-Verlag Berlin Heidelberg 2009
in [7] to obtain standard models. We generalize the model existence theorem such
that we can obtain countable models using the abstract consistency technique.
In a preceding paper [7], we develop a tableau-based decision procedure for
the quantifier- and lambda-free fragment of EFO and introduce the possible-
values-based construction of standard models. In this paper we extend the model
construction to first-order quantification and lambda abstraction. We introduce
a novel subterm restriction for the universal quantifier and employ an abstract
normalization operator, both essential for proof search and decision procedures.
Due to space limitations we have to omit some proofs. They can be found in
the full paper at www.ps.uni-sb.de/Papers.
2 Basic Definitions
Types (σ, τ , μ) are obtained with the grammar τ ::= o | ι | τ τ . The elements
of o are the two truth values, ι is interpreted as a nonempty set, and a function
type στ is interpreted as the set of all total functions from σ to τ . For simplicity,
we provide only one sort ι. Everything generalizes to countably many sorts.
We distinguish between two kinds of names, called constants and variables.
Every name comes with a type. We assume that there are only countably many
names, and that for every type there are infinitely many variables of this type.
Unless stated otherwise, the letter a ranges over names, c over constants, and x
and y over variables.
Terms (s, t, u, v) are obtained with the grammar t ::= a | tt | λx.t where
an application st is only admitted if s : τ μ and t : τ for some types τ and μ.
Terms of type o are called formulas. A term is lambda-free if it does not contain a
subterm that is a lambda abstraction. We use N s to denote the set of all names
that have a free occurrence in the term s.
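The typing discipline for terms can be made concrete with a small Python sketch. The representation is an assumption of this illustration: base types are the strings "o" and "ι", a function type στ is the pair (σ, τ), and terms are tagged tuples; none of these names come from the paper.

```python
O, I = "o", "ι"   # the two base types


def type_of(term, env):
    """Compute the type of a term; env assigns a type to every name."""
    kind = term[0]
    if kind == "name":
        return env[term[1]]
    if kind == "app":
        s_type = type_of(term[1], env)
        t_type = type_of(term[2], env)
        # st is only admitted if s : τμ and t : τ
        if not isinstance(s_type, tuple) or s_type[0] != t_type:
            raise TypeError("ill-typed application")
        return s_type[1]
    if kind == "lam":
        _, x, x_type, body = term
        return (x_type, type_of(body, {**env, x: x_type}))
    raise ValueError("unknown term constructor")


def well_typed(term, env) -> bool:
    try:
        type_of(term, env)
        return True
    except TypeError:
        return False


def is_formula(term, env) -> bool:
    """Terms of type o are called formulas."""
    return well_typed(term, env) and type_of(term, env) == O
```

With C : oιιι, b : o and x, y : ι, the application C b x y receives type ι, whereas C x is rejected because x does not have type o.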
We assume that ⊥ : o, ¬ : oo, ∧ : ooo, =σ : σσo, and ∀σ : (σo)o are constants
for all types σ. We write ∀x.s for ∀σ (λx.s). An interpretation is a function I
that is defined on all types and all names and satisfies the following conditions:
– Io = {0, 1}
– I(στ ) is the set of all total functions from Iσ to Iτ
– I⊥ = 0
– I(¬), I(∧), I(=σ ), and I(∀σ ) are the standard interpretations of the respec-
tive logical constants.
We write Îs for the value the term s evaluates to under the interpretation I.
We say that an interpretation I is countable [finite] if Iι is countable [finite].
An interpretation I is a model of a set A of formulas if Îs = 1 for every formula
s ∈ A. A set of formulas is satisfiable if it has a model.
The constants ⊥, ¬, ∧, =ι , and ∀ι are called EFO constants. An EFO term is
a term that contains no other constants but EFO constants. We write EFOσ for
the set of all EFO terms of type σ. For simplicity, we work with a restricted set of
EFO constants. Everything generalizes to the remaining propositional constants,
the identity =o , and the existential quantifier ∃ι .
166 C.E. Brown and G. Smolka
3 Normalization
We assume a normalization operator [] that provides for lambda conversion. The
normalization operator [] must be a type preserving total function from terms
to terms. We call [s] the normal form of s and say that s is normal if [s] = s.
There are several possibilities for the normalization operator []: β-, long β-, or
βη-normal form, all possibly with standardized bound variables [8]. We will not
commit to a particular operator but state explicitly the properties we require
for our results. To start, we require the following properties:
N1 [[s]] = [s]
N2 [[s]t] = [st]
N3 [as1 . . . sn ] = a[s1 ] . . . [sn ] if the type of as1 . . . sn is o or ι
N4 Î[s] = Îs
Note that a ranges over names and I ranges over interpretations. N3 also applies
for n = 0.
We need further properties of the normalization operator that can only be
expressed with substitutions. A substitution is a type preserving partial function
from variables to terms. If θ is a substitution, x is a variable, and s is a term
that has the same type as x, we use $\theta^x_s$ to denote the substitution that
agrees everywhere with θ except possibly on x, where it yields s. We assume that every
substitution θ can be extended to a type preserving total function θ̂ from terms
to terms such that the following conditions hold:
S1 θ̂a = if a ∈ Dom θ then θa else a
S2 θ̂(st) = (θ̂s)(θ̂t)
S3 $[(\hat{\theta}(\lambda x.s))t] = [\widehat{\theta^x_t}\, s]$
S4 $[\hat{\emptyset} s] = [s]$
S5 N [s] ⊆ N s and N (θ̂s) ⊆ ⋃{ N (θ̂a) | a ∈ N s }
Note that a ranges over names and that ∅ (the empty set) is the substitution
that is undefined on every variable.
4 Tableau System
The results of this paper originate with the tableau system T shown in Figure 1.
The rules in the first two lines of Figure 1 are the familiar rules from first-order
logic. The rules in the third and fourth line deal with embedded formulas. The
mating rule Tmat decomposes complementary atomic formulas by introducing
disequations that confront corresponding subterms. Disequations can be further
decomposed with Tdec . Embedded formulas are eventually raised to the top level
by Rule Tbe , which incorporates Boolean extensionality. Rule Tfe incorporates
functional extensionality. It reduces disequations at functional types to disequa-
tions at lower types. The confrontation rule Tcon deals with positive equations
at type ι. A discussion of the confrontation rule can be found in [7]. The tableau
rules are such that they add normal formulas if they are applied to normal
formulas.
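Fig. 1. Tableau system T (rules recoverable from this text; premises are
written before the slash, conclusions after it, and | separates branches)
T∀    ∀ι s  /  [st]                                  t : ι
T¬∀   ¬∀ι s  /  ¬[sx]                                x : ι fresh
T≠    s ≠ s  /  ⊥
Tbe   s ≠o t  /  s, ¬t | ¬s, t
Tfe   s ≠στ t  /  [sx] ≠ [tx]                        x : σ fresh
Tcon  s =ι t, u ≠ι v  /  s ≠ u, t ≠ u | s ≠ v, t ≠ v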
Example 4.1. The following tableau refutes the formula pf ∧ ¬p(λx.¬¬f x) where
p : (ιo)o and f : ιo.
          pf ∧ ¬p(λx.¬¬f x)
          pf, ¬p(λx.¬¬f x)
          f ≠ (λx.¬¬f x)
          f x ≠ ¬¬f x
f x, ¬¬¬f x     |     ¬f x, ¬¬f x
¬f x            |     ⊥
⊥               |
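The refuted formula can also be checked semantically on small standard interpretations. The following Python sketch is only a sanity check for domains of up to three individuals, not a proof of unsatisfiability; the encoding of f as a tuple of truth values and of p as a dictionary is an assumption of this illustration.

```python
from itertools import product


def formula_holds(p, f, domain):
    """Truth value of pf ∧ ¬p(λx.¬¬f x) for given denotations of p and f."""
    g = tuple(not (not f[x]) for x in domain)   # denotation of λx.¬¬f x
    return p[f] and not p[g]


def satisfiable(size):
    """Search all p : (ιo)o and f : ιo over a domain with `size` individuals."""
    domain = range(size)
    functions = list(product((False, True), repeat=size))     # all f : ιo
    for outputs in product((False, True), repeat=len(functions)):
        p = dict(zip(functions, outputs))                     # one p : (ιo)o
        if any(formula_holds(p, f, domain) for f in functions):
            return True
    return False
```

The search finds no model because λx.¬¬f x and f denote the same function, which is the fact the tableau establishes through Tmat, Tfe and Tbe.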
5 Evidence
E⊥ ⊥ is not in E.
E¬ If ¬x is in E, then x is not in E.
E¬¬ If ¬¬s is in E, then s is in E.
E∧ If s ∧ t is in E, then s and t are in E.
E¬∧ If ¬(s ∧ t) is in E, then ¬s or ¬t is in E.
E∀ If ∀ι s is in E, then [st] is in E for all t ∈ DE,
and [st] is in E for some t ∈ EFOι .
E¬∀ If ¬∀ι s is in E, then ¬[st] is in E for some t ∈ EFOι .
Emat If xs1 . . . sn and ¬xt1 . . . tn are in E where n ≥ 1,
then si ≠ ti is in E for some i ∈ {1, . . . , n}.
Edec If xs1 . . . sn ≠ι xt1 . . . tn is in E where n ≥ 1,
then si ≠ ti is in E for some i ∈ {1, . . . , n}.
E≠ If s ≠ι t is in E, then s and t are different.
Ebe If s ≠o t is in E, then either s and ¬t are in E or ¬s and t are in E.
Efe If s ≠στ t is in E, then [sx] ≠ [tx] is in E for some variable x.
Econ If s =ι t and u ≠ι v are in E,
then either s ≠ u and t ≠ u are in E or s ≠ v and t ≠ v are in E.
1. E¬ is restricted to variables.
2. E∀ requires fewer instances than T∀ admits.
3. E¬∀ admits all EFO terms as witnesses.
4. E≠ is restricted to type ι.
6 Carriers
We assume that some evident branch E is given. We say that a set T ⊆ EFOι
is compatible if there are no terms s, t ∈ T such that ([s]≠[t]) ∈ E. We write s ♯ t
if E contains the disequation s≠t or t≠s.
Let a non-empty set D and a relation ▷ι ⊆ EFOι × D be given. For T ⊆ EFOι
and a ∈ D we write T ▷ι a if t ▷ι a for every t ∈ T . For all terms s, t ∈ EFOι , all
values a, b ∈ D, and every set T ⊆ EFOι we require the following properties:
B1 s ▷ι a iff [s] ▷ι a.
B2 T compatible iff T ▷ι a for some a ∈ D.
B3 If (s=ι t) ∈ E and s ▷ι a and t ▷ι b, then a = b.
B4 For every a ∈ D either t ▷ι a for some t ∈ DE or t ▷ι a for every t ∈ EFOι .
Given an evident branch E, a carrier for E is a pair (D, ▷ι ) as specified above.
D := { s̃ | s ∈ EFOι }
s ▷ι t̃ :⇐⇒ s ∼ t
We will show that (D, ▷ι ) is a carrier for E. Note that ▷ι is well-defined since ∼
is an equivalence relation. D is countable since EFOι is countable.
B1. We have to show that s ∼ t iff [s] ∼ t. This follows with N3 and N1 since
s ∼ t iff [s=t] ∈ E and [s] ∼ t iff [[s]=t] ∈ E.
B2. If T is empty, B2 holds vacuously. Otherwise, let t ∈ T . Then T is compatible
iff s ∼ t for all s ∈ T by Propositions 6.3 and 6.2. Hence T is compatible iff s ▷ι t̃
for all s ∈ T . The claim follows.
B3. Let s=ι t in E and s ▷ι ũ and t ▷ι ṽ. Since s=t is normal, we have s ∼ t. By
definition of ▷ι we have s ∼ u and t ∼ v. Hence ũ = ṽ since ∼ is an equivalence
relation.
B4. If DE is empty, then s ▷ι t̃ for all s, t ∈ EFOι and hence the claim holds.
Otherwise, let DE be nonempty. We show the claim by contradiction. Suppose
there is a term t ∈ EFOι such that ¬(s ▷ι t̃) for all s ∈ DE. Then [s≠t] ∈ E for
all s ∈ DE by Proposition 6.1. Since DE is nonempty, we have [t] ∈ DE by N3.
Thus ([t]≠[t]) ∈ E by N3. Contradiction by E≠ .
We will now show that every evident branch has a carrier. Let an evident
branch E be given. We will call a term discriminating if it is discriminating
in E. A discriminant is a maximal set a of discriminating terms such that there
is no disequation (s≠t) ∈ E such that s, t ∈ a. We will construct a carrier for E
whose values are the discriminants.
Example 6.5. Suppose E = {x≠y, x≠z, y≠z} and x, y, z : ι. Then there are 3
discriminants: {x}, {y}, {z}.
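For a finite set of disequations at type ι, the discriminants can be enumerated by brute force. The Python sketch below is an illustration only: terms are represented by strings, each disequation by a pair of strings, and the exponential enumeration of subsets is chosen for clarity rather than efficiency.

```python
from itertools import combinations


def discriminants(disequations):
    """All maximal sets of discriminating terms containing no disequated pair."""
    terms = sorted({t for pair in disequations for t in pair})
    clash = {frozenset(pair) for pair in disequations}

    def compatible(ts):
        return all(frozenset(pair) not in clash for pair in combinations(ts, 2))

    candidates = [frozenset(c)
                  for r in range(len(terms) + 1)
                  for c in combinations(terms, r)
                  if compatible(c)]
    # keep only the candidates that are not strictly contained in another one
    return {c for c in candidates if not any(c < d for d in candidates)}
```

Applied to the three disequations of Example 6.5, the function returns the three singleton sets; applied to an empty list of disequations, it returns only the empty set.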
Proof. The first claim follows by contradiction. Suppose there are no terms s ∈ a
and t ∈ b such that s ♯ t. Let s ∈ a. Then s ∈ b since b is a maximal compatible
set of discriminating terms. Thus a ⊆ b and hence a = b since a is maximal.
Contradiction.
The second claim also follows by contradiction. Suppose there is an equation
(s1 =s2 ) ∈ E such that s1 ∈ a and s2 ∈ b. By the first claim we have terms s ∈ a
and t ∈ b such that s ♯ t. By Econ we have s1 ♯ s or s2 ♯ t. Contradiction since a
and b are discriminants.
We will show that (D, ▷ι ) is a carrier for E. By Proposition 6.7 we know that D
is finite if E is finite.
B1. Holds by N1.
For the remaining carrier conditions we distinguish two cases. If DE = ∅, then ∅
is the only discriminant and B2, B3, and B4 are easily verified. Otherwise, let
DE ≠ ∅.
B2⇒. Let T be compatible. Then there exists a discriminant a that contains all
the discriminating terms in { [t] | t ∈ T }. The claim follows since T ▷ a.
B2⇐. By contradiction. Suppose T ▷ a and T is not compatible. Then there are
terms s, t ∈ T such that ([s]≠[t]) ∈ E. Thus [s] and [t] cannot be both in a. This
contradicts s, t ∈ T ▷ a since [s] and [t] are discriminating.
B3. Let (s=t) ∈ E and s ▷ι a and t ▷ι b. We show a = b. Since there are
discriminating terms, E contains at least one disequation at type ι, and hence
s and t are discriminating by Econ . By N3 s and t are normal and hence s ∈ a
and t ∈ b. Now a = b by Proposition 6.8 (2).
B4. Since there are discriminating terms, we know by E≠ that every discriminant
contains at least one discriminating term. Since discriminating terms are normal,
we have the claim.
7 Model Existence
We will now show that every evident branch has a model.
Lemma 7.1 (Model Existence). Let (D, ▷ι ) be a carrier for an evident
branch E. Then E has a model I such that Iι = D.
We start the proof of Lemma 7.1. Let (D, ▷ι ) be a carrier for an evident branch E.
For the rest of the proof we only consider interpretations I such that Iι = D.
s ▷o 0 :⇐⇒ [s] ∉ E
s ▷o 1 :⇐⇒ ¬[s] ∉ E
s ▷στ f :⇐⇒ st ▷τ f a whenever t ▷σ a
Note that we already have a possible-values relation for ι and that the definition
of the possible values relations for functional types is by induction on types. Also
note that if s is an EFO formula such that [s] ∉ E and ¬[s] ∉ E, then both 0
and 1 are possible values for s. We will show that every EFO term has a possible
value and that we obtain a model of E if we define Ix as a possible value for x
for every variable x.
Proof. By induction on σ. For o the claim follows with N1. For ι the claim follows
with B1. Let σ = τ μ.
Suppose s ▷σ a. Let t ▷τ b. Then st ▷μ ab. By inductive hypothesis [st] ▷μ ab.
Thus [[s]t] ▷μ ab by N2. By inductive hypothesis [s]t ▷μ ab. Hence [s] ▷σ a.
Suppose [s] ▷σ a. Let t ▷τ b. Then [s]t ▷μ ab. By inductive hypothesis [[s]t] ▷μ ab.
Thus [st] ▷μ ab by N2. By inductive hypothesis st ▷μ ab. Hence s ▷σ a.
7.2 Compatibility
It remains to show that there is an admissible interpretation and that every
admissible interpretation is a model of E. For this purpose we define compatibility
relations ∥σ ⊆ EFOσ × EFOσ for all types:
Note that the definition of the compatibility relations for functional types is by
induction on types. We say that s and t are compatible if s ∥ t. A set T of
equi-typed terms is compatible if s ∥ t for all terms s, t ∈ T . If T ⊆ EFOσ , we write
T ▷ a if a is a common possible value for all terms s ∈ T . We will show that a
set of equi-typed terms is compatible if and only if all its terms have a common
possible value.
The compatibility relations are reflexive. We first show x ∥ x for all
variables x. For the induction to go through we strengthen the hypothesis.
Lemma 7.5 (Reflexivity). For every type σ and all EFO terms s, t, xs1 . . . sn ,
xt1 . . . tn of type σ with n ≥ 0:
1. Not both s ∥σ t and [s] ♯ [t].
2. Either xs1 . . . sn ∥σ xt1 . . . tn or [si ] ♯ [ti ] for some i ∈ {1, . . . , n}.
We can now prove Lemma 7.1. By Lemma 7.5 (2) we know x ∥ x for every
variable x. Hence there exists an admissible interpretation I by Lemma 7.6. By
Lemma 7.7 we know that I is a model of E. This finishes the proof of Lemma 7.1.
8 Abstract Consistency
To obtain our main results, we boost the model existence lemma with the ab-
stract consistency technique. Everything works out smoothly.
An abstract consistency class is a set Γ of branches such that every branch
A ∈ Γ satisfies the conditions in Figure 3. An abstract consistency class Γ is
complete if for every A ∈ Γ and all s, t ∈ EFOι either A ∪ {[s=t]} is in Γ or
A ∪ {[s≠t]} is in Γ .
C⊥ ⊥ is not in A.
C¬ If ¬x is in A, then x is not in A.
C¬¬ If ¬¬s is in A, then A ∪ {s} is in Γ .
C∧ If s ∧ t is in A, then A ∪ {s, t} is in Γ .
C¬∧ If ¬(s ∧ t) is in A, then A ∪ {¬s} or A ∪ {¬t} is in Γ .
C∀ If ∀ι s is in A, then A ∪ {[st]} is in Γ for all t ∈ DA,
and A ∪ {[st]} is in Γ for some t ∈ EFOι .
C¬∀ If ¬∀ι s is in A, then A ∪ {¬[st]} is in Γ for some t ∈ EFOι .
Cmat If xs1 . . . sn is in A and ¬xt1 . . . tn is in A where n ≥ 1,
then A ∪ {si ≠ ti } is in Γ for some i ∈ {1, . . . , n}.
Cdec If xs1 . . . sn ≠ι xt1 . . . tn is in A where n ≥ 1,
then A ∪ {si ≠ ti } is in Γ for some i ∈ {1, . . . , n}.
C≠ If s ≠ι t is in A, then s and t are different.
Cbe If s ≠o t is in A, then either A ∪ {s, ¬t} or A ∪ {¬s, t} is in Γ .
Cfe If s ≠στ t is in A, then A ∪ {[sx] ≠ [tx]} is in Γ for some variable x.
Ccon If s =ι t and u ≠ι v are in A,
then either A ∪ {s ≠ u, t ≠ u} or A ∪ {s ≠ v, t ≠ v} is in Γ .
9 Completeness
We will now show that the tableau system T is complete. In fact, we will show
the completeness of a tableau system R that is obtained from T by restricting
the applicability of some of the rules. We consider R since it provides for more
focused proof search and also yields a decision procedure for three substantial
176 C.E. Brown and G. Smolka
11 Decidability
The tableau system R defined in § 9 yields a procedure that decides the satisfi-
ability of three substantial fragments of EFO. Starting with the initial branch,
the procedure applies tableau rules until it reaches a branch that contains ⊥ or
cannot be extended with the tableau rules. The procedure returns “satisfiable”
if it arrives at a terminal branch that does not contain ⊥, and “unsatisfiable”
if it finds a refutation. There are branches on which the procedure does not
terminate (e.g., {∀ι x. f x=x}). We first establish the partial correctness of the
procedure.
Proposition 11.1 (Verification Soundness). Let A be a finite branch that
does not contain ⊥ and cannot be extended with R. Then A is evident and has
a finite model.
Proposition 11.2 (Refutation Soundness)
Every refutable branch is unsatisfiable.
For the termination of the procedure we consider the relation A → A′ that holds
if A and A′ are branches such that ⊥ ∉ A ⊊ A′ and A′ can be obtained from
A by applying a rule of R. We say that R terminates on a set Δ of branches if
there is no infinite derivation A → A′ → A″ → · · · such that A ∈ Δ.
Proposition 11.3. Let R terminate on a set Δ of finite branches. Then satis-
fiability of the branches in Δ is decidable and every satisfiable branch in Δ has
a finite model.
Proof. Follows with Propositions 11.2 and 11.1 and Theorem 7.8.
The decision procedure depends on the normalization operator employed with R.
A normalization operator that yields β-normal forms provides for all termination
results proven in this section. Note that the tableau system applies the normaliza-
tion operator only to applications st where s and t are both normal and t has type
ι if it is not a variable. Hence at most one β-reduction is needed for normalization
if s and t are β-normal. Moreover, no α-renaming is needed if the bound variables
are chosen differently from the free variables. For clarity, we continue to work with
an abstract normalization operator and state a further condition:
A type is pure if it does not contain o. A term is pure if the type of every name
occurring in it (bound or unbound) is pure. An equation s = t or disequation
s ≠ t is pure if s and t are pure terms.
We now know that the validity of pure equations is decidable, and that the inva-
lidity of pure equations can be demonstrated with finite interpretations (Propo-
sition 11.1). Both results are well-known [9,10], but it is remarkable that we
obtain them with different proofs and as a byproduct.
It is well-known that satisfiability of Bernays-Schönfinkel-Ramsey formulas
(first-order ∃∗ ∀∗ -prenex formulas without functions) is decidable and the frag-
ment has the finite model property [3]. We reobtain this result by showing that
R terminates for the respective fragment. We call a type BSR if it is ι or o or
has the form ι . . . ιo. We call an EFO formula s BSR if it satisfies two conditions:
For simplicity, our BSR formulas don’t provide for outer existential quantifica-
tion. We need one more condition for the normalization operator:
In [7] we study lambda- and quantifier-free EFO and show that the concomitant
subsystem of R terminates on finite branches. The result extends to lambda-free
branches containing quantifiers (e.g., {∀ι f }).
Extended First-Order Logic 179
12 Conclusion
In this paper we have shown that the EFO fragment of Church’s type theory en-
joys the characteristic properties of first-order logic. We have devised a complete
tableau system that comes with a new treatment of equality (confrontation) and
a novel subterm restriction for the universal quantifier (discriminating terms).
The tableau system decides lambda-free formulas, Bernays-Schönfinkel-Ramsey
formulas, and equations between pure lambda terms.
References
1. Andrews, P.B.: Classical type theory. In: Robinson, A., Voronkov, A. (eds.) Hand-
book of Automated Reasoning, vol. 2, pp. 965–1007. Elsevier Science, Amsterdam
(2001)
2. Brown, C.E.: Automated Reasoning in Higher-Order Logic: Set Comprehension
and Extensionality in Church’s Type Theory. College Publications (2007)
3. Börger, E., Grädel, E., Gurevich, Y.: The Classical Decision Problem. Springer,
Heidelberg (1997)
4. Smullyan, R.M.: First-Order Logic. Springer, Heidelberg (1968)
5. Fitting, M.: First-Order Logic and Automated Theorem Proving. Springer, Hei-
delberg (1996)
6. Prawitz, D.: Hauptsatz for higher order logic. J. Symb. Log. 33, 452–457 (1968)
7. Brown, C.E., Smolka, G.: Terminating tableaux for the basic fragment of simple
type theory. In: Giese, M., Waaler, A. (eds.) TABLEAUX 2009. LNCS (LNAI),
vol. 5607, pp. 138–151. Springer, Heidelberg (2009)
8. Hindley, J.R.: Basic Simple Type Theory. Cambridge Tracts in Theoretical Com-
puter Science, vol. 42. Cambridge University Press, Cambridge (1997)
9. Friedman, H.: Equality between functionals. In: Parikh, R. (ed.) Proc. Logic Colloquium
1972-73. Lecture Notes in Mathematics, vol. 453, pp. 22–37. Springer,
Heidelberg (1975)
10. Statman, R.: Completeness, invariance and lambda-definability. J. Symb.
Log. 47(1), 17–26 (1982)
Formalising Observer Theory for
Environment-Sensitive Bisimulation
1 Introduction
In most symbolic techniques for reasoning about security protocols, certain as-
sumptions are often made concerning the capability of an intruder that tries
to compromise the protocols. A well-known intruder model is the so-called
Dolev-Yao model [10], which assumes perfect cryptography. We consider here a
formal account of the Dolev-Yao intruder model, formalised as a deduction
system. This deductive formulation is used in formalisations of various
“environment-sensitive” bisimulations (see e.g., [6]) for process calculi designed
for modeling security protocols, such as the spi-calculus [3]. An environment-
sensitive bisimulation is a bisimulation relation which is indexed by a structure
representing the intruder’s knowledge, which we call an observer theory.
An important line of work related to the spi-calculus, or process calculi in
general, is that of automating bisimulation checking. The transition semantics of
these calculi often involve processes with infinite branching (e.g., transitions for
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 180–195, 2009.
© Springer-Verlag Berlin Heidelberg 2009
namely, those that concern decidability of consistency checking for (symbolic) ob-
server theories. In Section 3 we consider formalisation of a notion of theory reduc-
tion and decidability of consistency checking for observer theories. In Section 4
we discuss a symbolic representation of observer theories using pairs of symbolic
traces [5], called bi-traces, their consistency requirements and a notion of respectful
substitutions. We prove a key lemma which relates a symbolic technique for trace
refinement [5] to bi-traces, and discuss how this may lead to a decision procedure
for testing bi-trace consistency. Section 5 concludes.
Isabelle notation. The Isabelle code for the results of this paper can be found
at http://users.rsise.anu.edu.au/~jeremy/isabelle/2005/spi/. In the
statement of a lemma or theorem, a name given in typewriter font indicates
the name of the relevant theorem in our Isabelle development. We show selected
theorems and definitions in the text, and more in the Appendix. A version of
the paper, including the Appendix, is at
http://users.rsise.anu.edu.au/~jeremy/pubs/spi/fotesb/.
We now indicate some key points of the Isabelle notation.
– A name preceded by ? indicates a variable: other names are entities which
have been defined as part of the theory
– Conclusion β depending on assumptions αi is [| α1 ;α2 ; . . . ;αn |] ==> β
– ∀, ∃ are written as ALL, EX
– ⊆, ⊇, ∈ are written as <=, >=, :
2 Observer Theory
An observer theory describes the knowledge accumulated by an observer in its
interaction with a process (in the form of messages sent over networks), and
its capability in analyzing and synthesizing messages. Since messages can be en-
crypted, and the encryption key may be unknown to the observer, it is not always
the case that the observer can decompose all messages sent over the networks.
In the presence of an active intruder, the traditional notion of bisimulation is
not fine grained enough to prove interesting equivalence of protocols. A notion
of bisimulation in which the knowledge and capability of the intruder is taken
into account is often called an environment-sensitive bisimulation.
Messages are expressions formed from names, the pairing constructor, e.g.,
⟨M, N⟩, and symmetric encryption, e.g., {M}_K, where K is the encryption key
and M is the message being encrypted. Note that we restrict to pairing and
encryption to simplify discussion; there is no difficulty in extending the set of
messages to include other constructors, including asymmetric encryption, natu-
ral numbers, etc. For technical reasons, we shall distinguish two kinds of names:
flexible names and rigid names. We shall refer to flexible names as simply names.
Names will be denoted with lower-case letters, e.g., a, x, y, etc., and rigid names
will be denoted with bold letters, e.g., a, b, etc. We let N denote the set of
names and N = denote the set of pairs (x, x) of the same name. A name is really
just a variable, i.e., a site for substitutions, and rigid names are just constants.
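Fig. 1 (rules for message synthesis), each rule written as premises ⟹ conclusion:
(var) x ∈ N ⟹ Σ ⊢ x
(id) Σ, M ⊢ M
(pr) Σ ⊢ M and Σ ⊢ N ⟹ Σ ⊢ ⟨M, N⟩
(er) Σ ⊢ M and Σ ⊢ N ⟹ Σ ⊢ {M}_N
(pl) Σ, M, N ⊢ R ⟹ Σ, ⟨M, N⟩ ⊢ R
(el) Σ ⊢ N and Σ, M, N ⊢ R ⟹ Σ, {M}_N ⊢ R

Fig. 2 (proof system for message equivalence), each rule written as premises ⟹ conclusion:
(var) x ∈ N ⟹ Γ ⊢ x ↔ x
(id) (M, N) ∈ Γ ⟹ Γ ⊢ M ↔ N
(pr) Γ ⊢ Ma ↔ Na and Γ ⊢ Mb ↔ Nb ⟹ Γ ⊢ ⟨Ma, Mb⟩ ↔ ⟨Na, Nb⟩
(er) Γ ⊢ Mp ↔ Np and Γ ⊢ Mk ↔ Nk ⟹ Γ ⊢ {Mp}_Mk ↔ {Np}_Nk
(pl) Γ, (Ma, Na), (Mb, Nb) ⊢ M ↔ N ⟹ Γ, (⟨Ma, Mb⟩, ⟨Na, Nb⟩) ⊢ M ↔ N
(el) Γ ⊢ Mk ↔ Nk and Γ, (Mp, Np), (Mk, Nk) ⊢ M ↔ N ⟹ Γ, ({Mp}_Mk, {Np}_Nk) ⊢ M ↔ N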
{a}b using the second message b, but the same operation cannot be done on
the second projection. The formal definition of consistency involves checking all
message pairs (M, N ) such that Γ ⊢ M ↔ N is derivable for certain similarity
of observations. The first part of this paper is about verifying that this infinite
quantification is not necessary. This involves showing that for every theory Γ ,
there is a corresponding reduced theory that is equivalent, but for which consis-
tency checking requires only checking finitely many message pairs.
Symbolic observer theory: The definition of open bisimulation for name-
passing calculi, such as the π-calculus, typically includes closure under a certain
notion of respectful substitutions [13]. In the π-calculus, this notion of respectfulness
is defined w.r.t. a notion of distinction among names, i.e., an irreflexive
relation on names which forbids identification of certain names. In the case of
the spi-calculus, things get more complicated because the bisimulation relation
is indexed by an observer theory, not just a simple distinction on names. We
need to define a symbolic representation of observer theories, and an appropri-
ate notion of consistency for the symbolic theories. These are addressed in [14]
via a structure called bi-traces. A bi-trace is essentially a list of pairs of messages.
It can be seen as a pair of symbolic traces, in the sense of [5]. The order of the
message pairs in the list indicates the order of their creation (i.e., by the intruder
or by the processes themselves). Names in a bi-trace indicate undetermined mes-
sages, which are open to instantiations. Therefore the notion of consistency of
bi-traces needs to take into account these possible instantiations. Consider the
following sequence of message pairs: (a, d), ({a}_b, {d}_k), ({c}_{{x}_b}, {k}_l).
Considered as a theory, it is consistent, since none of the encryption keys are known
to the observer. However, if we allow x to be instantiated to a, then the resulting
theory {(a, d), ({a}_b, {d}_k), ({c}_{{a}_b}, {k}_l)} is inconsistent, since on the first
projection, {a}_b can be used as a key to decrypt {c}_{{a}_b}, while in the second
projection, no decryption is possible. Therefore to check consistency of a
bi-trace, one needs to consider potentially infinitely many instances of the bi-trace.
Section 4 shows some key steps to simplify consistency checking for bi-traces.
We now discuss our formalisation of observer theory and its consistency proper-
ties in Isabelle/HOL.
The datatype for messages is represented in Isabelle/HOL as follows.
datatype msg = Name nat | Rigid nat | Mpair msg msg | Enc msg msg
An observer theory, as already noted, is a finite set of pairs of messages. In Isabelle,
we just use a set of pairs, so the finiteness condition appears in the Isabelle
statements of many theorems. The judgment Γ ⊢ M ↔ N is represented by
(Γ, (M, N )), or, equivalently in Isabelle, (Γ, M, N ).
In Isabelle we define, inductively, a set of sequents indist which is the set of
sequents derivable in the proof system for message equivalence (Figure 2). Sub-
sequently we found it helpful to define the corresponding set of rules explicitly,
calling them indpsc. The rules for message synthesis, given in Figure 1, are just
a projection to one component of the rule set indpsc; we call this projection
smpsc. It is straightforward to extend the notion of a projection on rule sets,
so we can define the rules for message synthesis as simply smpsc = π1 (indpsc).
The formal expression in Isabelle is more complex: see Appendix A.3. Likewise,
we write pair (X) to turn each message M into the pair (M, M ) in a theory,
sequent, rule or bi-trace X.
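As an illustration of the rules for message synthesis, the following Python sketch decides Σ ⊢ M by first saturating Σ under the left rules (projection and decryption) and then checking the right rules. It is not the Isabelle development: the class names mirror the msg datatype, while `synth`, `analyse` and `derivable` are hypothetical helpers, and the saturation strategy is a standard Dolev-Yao style procedure assumed here to agree with the sequent rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:          # flexible name, i.e. a variable
    n: int


@dataclass(frozen=True)
class Rigid:         # rigid name, i.e. a constant
    n: int


@dataclass(frozen=True)
class Mpair:         # pairing of two messages
    left: object
    right: object


@dataclass(frozen=True)
class Enc:           # symmetric encryption of body under key
    body: object
    key: object


def synth(known, m):
    """Right rules only: can m be built from the messages in `known`?"""
    if m in known or isinstance(m, Name):        # (id), (var)
        return True
    if isinstance(m, Mpair):                     # (pr)
        return synth(known, m.left) and synth(known, m.right)
    if isinstance(m, Enc):                       # (er)
        return synth(known, m.body) and synth(known, m.key)
    return False


def analyse(sigma):
    """Left rules applied to a fixpoint: split pairs, decrypt with derivable keys."""
    known = set(sigma)
    changed = True
    while changed:
        changed = False
        for m in list(known):
            if isinstance(m, Mpair):                                   # (pl)
                new = {m.left, m.right}
            elif isinstance(m, Enc) and synth(known - {m}, m.key):     # (el)
                new = {m.body, m.key}
            else:
                continue
            if not new <= known:
                known |= new
                changed = True
    return known


def derivable(sigma, m):
    """Sketch of a decision procedure for the judgment sigma |- m."""
    return synth(analyse(sigma), m)
```

Applied to the two projections of the instantiated theory discussed earlier (reading a, b, c, d, k, l as rigid names, which is an assumption of this sketch), the first projection yields c while the second yields nothing new, which is the asymmetry that makes the theory inconsistent.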
The following lemma relates message synthesis and message equivalence.
Lemma 1(d) depends on theory consistency, to be introduced later.
any proof of the sequent. Some results relevant to this argument for decidability
are presented in Appendix A.4. Here we present an alternative proof for the
decidability of ⊢ and termination of theory reduction.
Tiu [14, Definition 4] defines a reduction relation of observer theories:
We assume that Γ does not contain (⟨Ma, Mb⟩, ⟨Na, Nb⟩) and ({Mp}_Mk, {Np}_Nk)
respectively (otherwise reduction would not terminate). This reduction relation
is terminating and confluent, and so every theory Γ reduces to a unique normal
form Γ⇓. It also preserves the entailment ⊢.
It is easy to show that −→ is well-founded, since the sum of the sizes of [the
first member of each of] the message pairs reduces each time. Confluence is
reasonably easy to see since the side condition for the second rule is of the
form Γ ⊢ Mk ↔ Nk where Γ is exactly the theory being reduced, and, from
Lemma 2, this condition (for a particular Mk , Nk ) will continue to hold, or not,
when other reductions have changed Γ . Actually, proving confluence in Isabelle
was not so easy, and we describe the difficulty and our proof in Appendix A.6.
Then it is a standard result, and easy in Isabelle, that confluence and termination
give normal forms.
This definition does not give the same relation, but we are able to show that
the two relations have the same normal forms. Using this reduction relation,
the procedure to decide whether Γ ⊢ M ↔ N is: calculate Γ⇓ and determine
whether Γ⇓ ⊢ M ↔ N . Calculating Γ⇓ requires deciding questions of the form
Γ′ ⊢ Mk ↔ Nk , where Γ′ is smaller than Γ (because a pair ({Mp}_Mk, {Np}_Nk)
is omitted). Thus this procedure terminates.
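The procedure just described can be sketched in Python as follows. This is an illustrative reading, not the Isabelle function reduce: the names `reduce_theory`, `entails` and `right_derivable` are hypothetical, the two reduction steps are inferred from the surrounding text (split a pair of pairs; open a pair of encryptions when the keys are entailed by the theory without that pair), and the sketch assumes that entailment over a reduced theory needs only the (var), (id), (pr) and (er) rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:          # flexible name, i.e. a variable
    n: int


@dataclass(frozen=True)
class Rigid:         # rigid name, i.e. a constant
    n: int


@dataclass(frozen=True)
class Mpair:
    left: object
    right: object


@dataclass(frozen=True)
class Enc:
    body: object
    key: object


def right_derivable(gamma, m, n):
    """Entailment using only the right rules; intended for reduced theories."""
    if (m, n) in gamma:                                        # (id)
        return True
    if isinstance(m, Name) and m == n:                         # (var)
        return True
    if isinstance(m, Mpair) and isinstance(n, Mpair):          # (pr)
        return (right_derivable(gamma, m.left, n.left)
                and right_derivable(gamma, m.right, n.right))
    if isinstance(m, Enc) and isinstance(n, Enc):              # (er)
        return (right_derivable(gamma, m.body, n.body)
                and right_derivable(gamma, m.key, n.key))
    return False


def reduce_theory(gamma):
    """Apply reduction steps until none is available (assumed reading of the rules)."""
    gamma = frozenset(gamma)
    for pair in gamma:
        m, n = pair
        rest = gamma - {pair}          # the strictly smaller theory
        if isinstance(m, Mpair) and isinstance(n, Mpair):
            return reduce_theory(rest | {(m.left, n.left), (m.right, n.right)})
        if (isinstance(m, Enc) and isinstance(n, Enc)
                and entails(rest, m.key, n.key)):
            return reduce_theory(rest | {(m.body, n.body), (m.key, n.key)})
    return gamma


def entails(gamma, m, n):
    """Decide gamma |- m <-> n by reducing gamma first."""
    return right_derivable(reduce_theory(gamma), m, n)
```

The side condition is checked against `rest`, the theory with the encrypted pair removed, so each recursive call to `entails` works on a smaller theory; this mirrors the termination argument given in the text.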
Note that Lemma 2 also holds for −→′ since −→′ ⊆ −→.
To show the two relations have the same normal forms, we first show (in
Theorem 4(b)) that if Γ is −→-reducible, then it is −→′-reducible, even though
the same reduction may not be available.
Theorem 4. (a) (red_alt_lem) If Γ ⊢ Mk ↔ Nk then either
Γ \ {({Mp}_Mk , {Np}_Nk )} ⊢ Mk ↔ Nk or there exists Γ′ such that Γ −→′ Γ′
(b) (oth_red_alt_lem) If Γ −→ Δ then there exists Δ′ such that Γ −→′ Δ′
(c) (rsmin_or_alt) If Γ is −→′-minimal (i.e., cannot be reduced further) then
Γ is −→-minimal
(d) (nf_acc_alt) Γ −→′* Γ⇓ (where Γ⇓ is the −→-normal form of Γ )
(e) (nf_alt, nf_same) Γ⇓ is also the −→′-normal form of Γ
We can now define a function reduce which computes a −→′-normal form.
reductions available. However we find that the condition is that theories entail
the same pairs iff their normal forms are equal, modulo N = .
We could further change −→ by deleting the (Mk , Nk ) from the second rule.
Lemma 2(a) holds for this new relation. For further discussion see Appendix A.7.
Definition 9. A theory Γ satisfies the predicate thy_cons if for every M and
N , if Γ ⊢ M ↔ N then the following hold:
(a) M and N are of the same type of expressions, i.e., as in Definition 8(a)
(b) for every M′, N′, Mp , Np , if Γ ⊢ M′ ↔ N′ or Γ ⊢ {Mp}_M′ ↔ {Np}_N′, then
M = M′ iff N = N′
Definition 15. [15, Definition 35] The set of consistent bi-traces is defined
inductively (on the length of bi-traces) as follows:
192 J.E. Dawson and A. Tiu
The definition of match_rc1 (Appendix A.24) follows that of is_der_virt_red
(Appendix A.13), so Theorem 21(a) holds whether or not Γ is actually reduced.
It will be seen that it involves testing for membership of a finite set, and
corresponding uses of the ⊢ operator (as in the case of reduce, as discussed
earlier). Therefore we assert that match_rc1 is finitely computable.
The return type of match_rc1 is message option, which is Some res if the
result res is successfully found, or None to indicate failure.
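To make the option-valued interface concrete, here is a small Python sketch of a matcher in the same spirit. It is not the definition from Appendix A.24: `match_second` is a hypothetical name, the theory is assumed to be reduced and consistent and is represented as a dictionary from first components to second components, and Python's `None` plays the role of Isabelle's None while any returned message plays the role of Some res.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Name:          # flexible name, i.e. a variable
    n: int


@dataclass(frozen=True)
class Rigid:         # rigid name, i.e. a constant
    n: int


@dataclass(frozen=True)
class Mpair:
    left: object
    right: object


@dataclass(frozen=True)
class Enc:
    body: object
    key: object


def match_second(gamma: dict, m) -> Optional[object]:
    """Return the message n paired with m under the right rules, or None on failure."""
    if m in gamma:                                   # (id): m occurs as a first component
        return gamma[m]
    if isinstance(m, Name):                          # (var): a name matches itself
        return m
    if isinstance(m, Mpair):                         # (pr): match both components
        left = match_second(gamma, m.left)
        right = match_second(gamma, m.right)
        if left is None or right is None:
            return None
        return Mpair(left, right)
    if isinstance(m, Enc):                           # (er): match body and key
        body = match_second(gamma, m.body)
        key = match_second(gamma, m.key)
        if body is None or key is None:
            return None
        return Enc(body, key)
    return None                                      # unknown rigid name
```

Failure propagates upwards: if any component has no counterpart, the whole message has none, which is the behaviour one would expect from a function returning a message option.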
Theorem 21. (a) (match_rc1_iff_idvr) If Γ satisfies thy_cons_red, then
is_der_virt_red (Γ, M, N ) iff match_rc1 Γ M = Some N
(b) (match_rc1_indist) If Γ is consistent, then
Γ ⊢ M ↔ N iff match_rc1 Γ⇓ M = Some N
Then we defined a function second_sub which uses match_rc1 to find the appropriate
value of xθ2 for each new x which appears in the bi-trace, and we proved
that second_sub does in fact compute the θ2 of Theorem 20. See Appendix A.26
for the definition of second_sub and this result. The function second_sub tests
membership of a finite set, and uses reduce and match_rc1, so we assert that
second_sub is also finitely computable.
We have modelled observer theories and bi-traces in the Isabelle theorem prover,
and have confirmed, by proofs in Isabelle, the results of a considerable part of
[14]. This work constitutes a significant step towards formalising open bisimulation for
the spi-calculus in Isabelle/HOL, and ultimately towards a logical framework for
proving process equivalence.
We discussed the issue of showing finite computability in Isabelle/HOL, using
a mixed formal/informal argument, and building upon the discussion in Urban
et al [17]. We defined a function reduce in Isabelle, and showed that it computes
Γ⇓. Isabelle required us to show that the function terminates. We asserted, with
relevant discussion, that inspection shows that the definition does not introduce
any infinite aspect into the computation, and therefore that the function
is finitely computable. Similarly, we provided a finitely computable function
is_der_virt and proved that it tests Γ ⊢ M ↔ N for a reduced theory Γ .
We then considered bi-traces and bi-trace consistency. The problem here is
that, to test bi-trace consistency, it is necessary to test whether Γ θ is consistent
for all θ satisfying certain conditions. We proved a number of lemmas which
simplify this task, and appear to lead to a finitely computable algorithm for this.
In particular, our result on the unique completion of respectful substitutions, which
relates symbolic traces and bi-traces, opens up the possibility of using the symbolic
trace refinement algorithm [5] to compute a notion of bi-trace refinement, which will
be useful for bi-trace consistency checking.
Another approach to representing observer theories is to use equational
theories, instead of deduction rules, e.g., as in the applied-pi calculus [1]. In this
setting, the notion of consistency of a theory is replaced by the notion of static
equivalence between knowledge of observers [1]. Baudet has shown that static
equivalence between two symbolic theories is decidable [4], for a class of theories
called subterm-convergent theories (which subsumes the Dolev-Yao model of
intruder). It will be interesting to work out the precise correspondence between
static equivalence and our notion of bi-trace consistency, as such correspondence
may transfer proof techniques from one approach to the other.
Acknowledgment. We thank the anonymous referees for their comments on an
earlier draft. This work is supported by the Australian Research Council through
the Discovery Projects funding scheme (project number DP0880549).
References
1. Abadi, M., Fournet, C.: Mobile values, new names, and secure communication. In:
POPL, pp. 104–115 (2001)
2. Abadi, M., Gordon, A.D.: A bisimulation method for cryptographic protocols.
Nord. J. Comput. 5(4), 267–303 (1998)
3. Abadi, M., Gordon, A.D.: A calculus for cryptographic protocols: The spi calculus.
Information and Computation 148(1), 1–70 (1999)
4. Baudet, M.: Sécurité des protocoles cryptographiques: aspects logiques et calcula-
toires. PhD thesis, École Normale Supérieure de Cachan, France (2007)
5. Boreale, M.: Symbolic trace analysis of cryptographic protocols. In: Orejas, F.,
Spirakis, P.G., van Leeuwen, J. (eds.) ICALP 2001. LNCS, vol. 2076, pp. 667–681.
Springer, Heidelberg (2001)
6. Boreale, M., De Nicola, R., Pugliese, R.: Proof techniques for cryptographic pro-
cesses. SIAM J. Comput. 31(3), 947–986 (2001)
7. Borgström, J., Briais, S., Nestmann, U.: Symbolic bisimulation in the spi calculus.
In: Gardner, P., Yoshida, N. (eds.) CONCUR 2004. LNCS, vol. 3170, pp. 161–176.
Springer, Heidelberg (2004)
8. Borgström, J., Nestmann, U.: On bisimulations for the spi calculus. Mathematical
Structures in Computer Science 15(3), 487–552 (2005)
9. Dawson, J.E., Goré, R.: Formalising cut-admissibility for provability logic (submit-
ted, 2009)
10. Dolev, D., Yao, A.: On the security of public-key protocols. IEEE Transactions on
Information Theory 2(29) (1983)
11. Kahsai, T., Miculan, M.: Implementing spi calculus using nominal techniques. In:
Beckmann, A., Dimitracopoulos, C., Löwe, B. (eds.) CiE 2008. LNCS, vol. 5028,
pp. 294–305. Springer, Heidelberg (2008)
12. Milner, R., Parrow, J., Walker, D.: A calculus of mobile processes, Part II. Infor-
mation and Computation, 41–77 (1992)
13. Sangiorgi, D.: A theory of bisimulation for the pi-calculus. Acta Inf. 33(1), 69–97
(1996)
14. Tiu, A.: A trace based bisimulation for the spi calculus: An extended abstract. In:
Shao, Z. (ed.) APLAS 2007. LNCS, vol. 4807, pp. 367–382. Springer, Heidelberg
(2007)
15. Tiu, A.: A trace based bisimulation for the spi calculus. Preprint (2009),
http://arxiv.org/pdf/0901.2166v1
16. Tiu, A., Goré, R.: A proof theoretic analysis of intruder theories. In: Proceedings
of RTA 2009 (to appear, 2009)
17. Urban, C., Cheney, J., Berghofer, S.: Mechanizing the metatheory of LF. In: LICS,
pp. 45–56. IEEE Computer Society, Los Alamitos (2008)
Formal Certification of a Resource-Aware
Language Implementation
Abstract. The paper presents the development, by using the proof as-
sistant Isabelle/HOL, of a compiler back-end translating from a func-
tional source language to the bytecode language of an abstract machine.
The Haskell code of the compiler is extracted from the Isabelle/HOL
specification and this tool is also used for proving the correctness of the
implementation. The main correctness theorem not only ensures functional
semantics preservation but also resource consumption preservation:
the heap and stack figures predicted by the semantics are confirmed
in the translation to the abstract machine.
The language and the development belong to a wider Proof Carrying
Code framework in which formal compiler-generated certificates about
memory consumption are sought.
1 Introduction
The first-order functional language Safe has been developed in the last few years
as a research platform for analysing and formally certifying two properties of
programs related to memory management: absence of dangling pointers and
having an upper bound to memory consumption.
Two features make Safe different from conventional functional languages:
(a) the memory management system does not need a garbage collector; and
(b) the programmer may ask for explicit destruction of memory cells, so that
they could be reused by the program. These characteristics, together with the
above certified properties, make Safe useful for programming small devices where
memory requirements are rather strict and where garbage collectors are a burden
both in space and in service availability.
The Safe compiler is equipped with a battery of static analyses which infer
such properties [15,16,17,22]. These analyses are carried out on an intermediate
language called Core-Safe (explained in Sec. 2.1), obtained after type-checking
and desugaring the source language called Full-Safe. The back-end comprises
two more phases:
1. A translation from Core-Safe to the bytecode language of an imperative
abstract machine of our own, called the Safe Virtual Machine (SVM). We
call this bytecode language Safe-Imp and it is explained in Sec. 2.4.
Work partially funded by the projects TIN2008-06622-C03-01/TIN (STAMP), and
S-0505/ TIC/ 0407 (PROMESAS).
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 196–211, 2009.
© Springer-Verlag Berlin Heidelberg 2009
In this paper we present the certification of the first pass explained above
(Core-Safe to Safe-Imp). The second pass (Safe-Imp to JVM bytecode) is cur-
rently being completed. The reader can find a preliminary version of it in [21].
The main improvement of this work with respect to previous efforts in compiler
certification is that we prove not only the preservation of functional
semantics but also the preservation of the resource consumption properties. As
asserted in [11], this property can be lost as a consequence of some compiler
optimisations. For instance, some auxiliary variables not present in the
source may appear during the translation. In our framework, it is essential that
memory consumption is preserved during the translation, since we are trying
to certify exactly this property. To this aim, we introduce at Core-Safe level a
resource-aware semantics and then prove that this semantics is preserved in the
translation to the abstract machine.
With the aim of facilitating the understanding of the paper, and also avoiding
descending to many low level details, we have made the Isabelle/HOL
scripts available at http://dalila.sip.ucm.es/safe/theories. We recommend that the reader
consult this site while reading in order to match the concepts described here
with their definitions in Isabelle/HOL. The paper is structured as follows: after
this introduction, in Sec. 2 we motivate our Safe language and then present the
syntax and semantics of the source and target languages. Then, Sec. 3 explains
the translation and gives a small example of the generated code. Sections 2
and 3 contain large portions of material already published in [14,16]. We felt
that this material was needed in order to understand the certification process.
Sec. 4 is devoted to explaining the main correctness theorem and a number of
auxiliary predicates and relations needed in order to state it. Sec. 5 summarises
the lessons learnt, and finally a Related Work section closes the paper.
Fig. 1. Core-Safe syntax (a superscript as in xi^n abbreviates a sequence x1 · · · xn):
prog → datai^n ; decj^m ; e {Core-Safe program}
data → data T αi^n @ ρj^m = (Ck tks^nk @ ρm)^l {recursive, polymorphic data type}
dec → f xi^n @ rj^l = e {recursive, polymorphic function}
e → a {atom: literal c or variable x}
  | x @ r {copy data structure x into region r}
  | x! {reuse data structure x}
  | a1 ⊕ a2 {primitive operator application}
  | f ai^n @ rj^l {function application}
  | let x1 = be in e {non-recursive, monomorphic}
  | case x of alti^n {read-only case}
  | case! x of alti^n {destructive case}
alt → C xi^n → e {case alternative}
be → C ai^n @ r {constructor application}
  | e

This appending needs constant (in fact, zero) additional heap space, while the
usual version needs linear additional heap space. The fact that the first list is lost
is reflected by the symbol ! in the type inferred for the function appendD ::
∀a ρ1 ρ2 . [a]!@ρ1 → [a]@ρ2 → ρ2 → [a]@ρ2 , where ρ1 and ρ2 are polymorphic types
denoting the regions where the input and output lists should live. In this case,
due to the sharing between the second list and the result, these latter lists should
live in the same region. Another possibility is to destroy part of a data structure
and to reuse the rest in the result, as in the following destructive split function:
splitD 0 zs! = ([], zs!)
splitD n []! = ([], [])
splitD n (y:ys)! = (y:ys1, ys2) where (ys1, ys2) = splitD (n-1) ys
The righthand side zs! expresses reusing the remaining list. The inferred type is:
splitD :: ∀aρ1 ρ2 ρ3 . Int → [a]!@ρ2 → ρ1 → ρ2 → ρ3 → ([a]@ρ1 , [a]@ρ2 )@ρ3
Notice that the regions used to build the result appear as additional arguments.
The data structures which are not part of the function’s result are inferred to
be built in the local working region, which we call self, and they die at function
termination. As an example, the tuples produced by the internal calls to splitD
are allocated in their respective self regions and do not consume memory in the
caller regions. The type of these internal calls is Int → [a]!@ρ2 → ρ1 → ρ2 →
ρself → ([a]@ρ1 , [a]@ρ2 )@ρself , which is different from the external type because
we allow polymorphic recursion on region types. More information about Safe
and its type system can be found in [16].
The Safe front-end desugars Full-Safe and produces a bare-bones functional
language called Core-Safe. The transformation starts with region inference and
follows with Hindley-Milner type inference, desugaring pattern matching into
case expressions, where clauses into let expressions, collapsing several function-
defining equations into a single one, and some other transformations.
In Fig. 1 we show Core-Safe’s syntax, which is defined in Isabelle/HOL as a
collection of datatypes. A program prog is a sequence of possibly recursive poly-
morphic data and function definitions followed by a main expression e whose
value is the program result. The abbreviation xi^n stands for x1 · · · xn . Destructive
pattern matching is desugared into case! expressions. Constructor applications
are only allowed in let bindings. Only atoms are used in applications, and
200 J. de Dios and R. Peña
only variables are used in case/case! discriminants, copy and reuse expressions.
Region arguments are explicit in constructor and function applications and in
the copy expression. Function definitions have additional region arguments rj^l
in which the function is allowed to build data structures. In the function’s body
only the rj and its working region self may be used.
Big-step semantic rules, each written as premises ⟹ conclusion; judgements have the form E ⊢ h, k, td, e ⇓ h′, k, v, (δ, m, s):
[Lit] E ⊢ h, k, td, c ⇓ h, k, c, ([ ]k, 0, 1)
[Var] E[x ↦ v] ⊢ h, k, td, x ⇓ h, k, v, ([ ]k, 0, 1)
[Var2] j ≤ k, (h′, p′) = copy(h, p, j), m = size(h, p)
  ⟹ E[x ↦ p, r ↦ j] ⊢ h, k, td, x@r ⇓ h′, k, p′, ([j ↦ m], m, 2)
[Var3] fresh(q)
  ⟹ E[x ↦ p] ⊢ h ⊎ [p ↦ w], k, td, x! ⇓ h ⊎ [q ↦ w], k, q, ([ ]k, 0, 1)
[Primop] c = c1 ⊕ c2
  ⟹ E[a1 ↦ c1, a2 ↦ c2] ⊢ h, k, td, a1 ⊕ a2 ⇓ h, k, c, ([ ]k, 0, 2)
[App] (f xi^n @ rj^l = e) ∈ Σ,
  [xi ↦ E(ai)^n, rj ↦ E(rj)^l, self ↦ k + 1] ⊢ h, k + 1, n + l, e ⇓ h′, k + 1, v, (δ, m, s)
  ⟹ E ⊢ h, k, td, f ai^n @ rj^l ⇓ h′|k, k, v, (δ|k, m, max{n + l, s + n + l − td})
[Let1] E ⊢ h, k, 0, e1 ⇓ h′, k, v1, (δ1, m1, s1),
  E ∪ [x1 ↦ v1] ⊢ h′, k, td + 1, e2 ⇓ h″, k, v, (δ2, m2, s2)
  ⟹ E ⊢ h, k, td, let x1 = e1 in e2 ⇓ h″, k, v, (δ1 + δ2, max{m1, |δ1| + m2}, max{2 + s1, 1 + s2})
[Let2] j ≤ k, fresh(p),
  E ∪ [x1 ↦ p] ⊢ h ⊎ [p ↦ (j, C vi^n)], k, td + 1, e2 ⇓ h′, k, v, (δ, m, s)
  ⟹ E[ai ↦ vi^n, r ↦ j] ⊢ h, k, td, let x1 = C ai^n @ r in e2 ⇓ h′, k, v, (δ + [j ↦ 1], m + 1, s + 1)
[Case] E[x ↦ p], h[p ↦ (j, Cr vi^n)],
  E ∪ [xri ↦ vi^nr] ⊢ h, k, td + nr, er ⇓ h′, k, v, (δ, m, s)
  ⟹ E ⊢ h, k, td, case x of (Ci xij^ni → ei)^n ⇓ h′, k, v, (δ, m, s + nr)
[Case!] E[x ↦ p], h+ = h ⊎ [p ↦ (j, Cr vi^n)],
  E ∪ [xri ↦ vi^nr] ⊢ h, k, td + nr, er ⇓ h′, k, v, (δ, m, s)
  ⟹ E ⊢ h+, k, td, case! x of (Ci xij^ni → ei)^n ⇓ h′, k, v, (δ + [j ↦ −1], max{0, m − 1}, s + nr)
Function size in rule Var2 gives the size of the recursive spine of a data
structure:
\[ \mathit{size}(h[p \mapsto (j, C\ \overline{v_i}^{\,n})], p) = 1 + \sum_{i \in \mathit{RecPos}(C)} \mathit{size}(h, v_i) \]
3 The Translation
The translation splits the runtime environment (E1 , E2 ) of the semantics into
two: a compile-time one ρ mapping program variables to stack offsets, and the
actual runtime environment contained in the stack. As this grows dynamically,
numbers are assigned to the variables from the bottom of the environment.
In this way, if the environment occupies the top m positions of the stack and
ρ[x → 1], then S!(m − 1) will contain the runtime value of x.
An expression let x1 = e1 in e2 will be translated by pushing to the stack
a continuation for e2 , and then executing the translation of e1 . A continuation
consists of a pair (k0 , p) where p points to the translation of e2 and k0 is the
lower watermark associated to e2 . It is saved in the stack because the lower
watermark of e1 is different (see the semantics of PUSHCONT ). As e1 and e2
share most of their runtime environments, the continuation is treated as a barrier
below which the environment must not be deleted while e2 has not reached its
normal form. So, the whole compile-time environment ρ consists of a list of
smaller environments [δ1 , . . . , δn ], mimicking the stack layout. Each individual
block i consists of a triple (δi , li , ni ) with an environment δi mapping variables
to numbers in the range (1 . . . mi ), a block length li = mi + ni , and an indicator
ni = 2 for all the blocks except for the first one, whose value is n1 = 0. We
are assuming that a continuation needs two words in the stack and that the
remaining items need one word.
The offset with respect to the top of the stack of a variable x defined in the
block k, denoted ρ x, is computed as follows:
\[ \rho\ x \stackrel{\mathrm{def}}{=} \Big(\sum_{i=1}^{k} l_i\Big) - \delta_k\ x \]
Only the top environment may be extended with new bindings. There are three
operations on compile-time environments:
1. $((\delta, m, 0) : \rho) + \{\overline{x_i \mapsto j_i}^{\,n}\} \stackrel{\mathrm{def}}{=} (\delta \cup \{\overline{x_i \mapsto m + j_i}^{\,n}\}, m + n, 0) : \rho$.
2. $((\delta, m, 0) : \rho)^{++} \stackrel{\mathrm{def}}{=} (\{\}, 0, 0) : (\delta, m + 2, 2) : \rho$.
3. $\mathit{topDepth}\ ((\delta, m, 0) : \rho) \stackrel{\mathrm{def}}{=} m$. Undefined otherwise.
The first one extends the top environment with n new bindings, while the second
closes the top environment with a 2-indicator and then opens a new one.
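As an illustration, the three operations and the offset computation can be sketched in Python. The representation (a list of (delta, length, indicator) triples with the top block first) and the function names extend, close_and_open, top_depth and offset are assumptions of this sketch, not the names used in the Isabelle/HOL theories. The lookup searches blocks from the top, which is one possible reading of "the block k in which x is defined".

```python
def extend(rho, bindings):
    """Operation 1: add bindings x_i -> j_i (j_i in 1..n) to the open top block."""
    (delta, m, indicator), rest = rho[0], rho[1:]
    if indicator != 0:
        raise ValueError("only an open top block may be extended")
    new_delta = {**delta, **{x: m + j for x, j in bindings.items()}}
    return [(new_delta, m + len(bindings), 0)] + rest


def close_and_open(rho):
    """Operation 2 (++): close the top block with a 2-indicator, open a new one."""
    (delta, m, indicator), rest = rho[0], rho[1:]
    if indicator != 0:
        raise ValueError("top block is already closed")
    return [({}, 0, 0), (delta, m + 2, 2)] + rest


def top_depth(rho):
    """Operation 3: length of the open top block."""
    _delta, m, indicator = rho[0]
    if indicator != 0:
        raise ValueError("topDepth is undefined for a closed top block")
    return m


def offset(rho, x):
    """rho x = (l_1 + ... + l_k) - delta_k x, where block k binds x."""
    total = 0
    for delta, length, _indicator in rho:
        total += length
        if x in delta:
            return total - delta[x]
    raise KeyError(x)


rho1 = extend([({}, 0, 0)], {"x": 1, "y": 2})
rho2 = close_and_open(rho1)          # a continuation (two words) is pushed
rho3 = extend(rho2, {"z": 1})
```

In rho1 the environment occupies the top m = 2 positions and x is bound to 1, so its offset is m - 1 = 1, matching the remark on S!(m - 1) made earlier in this section.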
Using these conventions, in Figure 4 we show an idealised version of the trans-
lation function trE taking a Core-Safe expression and a compile-time environ-
ment, and giving as a result a list of SVM instructions and a code store. There,
NormalForm ρ is the following list:
\[ \mathit{NormalForm}\ \rho \stackrel{\mathrm{def}}{=} [\mathit{SLIDE}\ 1\ (\mathit{topDepth}\ \rho), \mathit{DECREGION}, \mathit{POPCONT}] \]
where cs is the code store resulting from the compilation, and mapAccumL is
a higher-order function, combining map and foldl, defined in Isabelle/HOL by
copying its definition from the Haskell library (see http://dalila.sip.ucm.es/safe/theories
for more details).
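For readers unfamiliar with mapAccumL, the following Python sketch mirrors the behaviour of the Haskell library function: it threads an accumulator from left to right through a list while producing one output per element. The Python name map_accum_l and the label-numbering example are illustrative assumptions; they are not part of the translation itself.

```python
def map_accum_l(f, acc, xs):
    """Thread acc left-to-right through xs; f returns (new_acc, output)."""
    ys = []
    for x in xs:
        acc, y = f(acc, x)
        ys.append(y)
    return acc, ys


def number(next_label, item):
    """Example step function: tag each item with a fresh label."""
    return next_label + 1, (next_label, item)


final_label, labelled = map_accum_l(number, 0, ["a", "b", "c"])
```

The accumulator plays the role of the state carried by foldl, and the list of outputs plays the role of the result of map.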
In Figure 5 we show the code store generated for the following Core-Safe
program with the appendD function of Sec. 2.1:
appendD xs ys @ r = case! xs of
[] → ys
x : xx → let yy = appendD xx ys @ r in
let zz = x : yy @ r in zz ;
let l = [ ] @ self in append l l @ self
4 Formal Verification
The above infrastructure allows us to state and prove the main theorem express-
ing that the pair translation-abstract machine is sound and complete with respect
to the resource-aware semantics. First, we note that both the semantics
and the SVM machine rules are syntax driven, and that their computations are
deterministic (up to the generation of fresh names for the heap). So, we only need to
prove that everything done by the semantics can be emulated by the machine,
and that termination of the machine implies termination of the semantics (for
the corresponding expression).
Definition 1. We say that the environment E = (E1, E2) and the pair (ρ, S)
are equivalent, denoted (E1, E2) ⋈ (ρ, S), if dom E − {self} = dom ρ, and
∀x ∈ dom E1 . E1(x) = S!(ρ x), and ∀r ∈ dom E2 − {self} . E2(r) = S!(ρ r).
Then we define an inductive relation expressing the evolution of the SVM ma-
chine up to some intermediate points corresponding to the end of the evaluation
of sub-expressions:
inductive
  execSVMBalanced :: [SafeImpProg,SVMState,nat list,SVMState list,nat list] ⇒ bool
  (_ ⊢ _ , _ -svm→ _ , _)
where
  init: P ⊢ s, n#ns -svm→ [s], n#ns
| step: ⟦ P ⊢ s, n#ns -svm→ s’#ss, m#ms;
          execSVM P s’ = Right s’’;
          m’ = nat (diffStack s’’ s’ m);
          m’ ≥ 0;
          ms’ = (if pushcont (instrSVM P s’) then 0#m#ms
                 else if popcont (instrSVM P s’) ∧ ms=m’’#ms’’ then (Suc m’’)#ms’’
                 else m’#ms) ⟧ =⇒
        P ⊢ s, n#ns -svm→ s’’#s’#ss, ms’
∧ P = ((cs, contm),p,ct,st)
∧ finite (dom h)
−→ ( ∀ rho S S’ k0 s0 p’ q ls is is’ cs1 j.
(q, ls, is, cs1) = trE p’ funm fname rho e
∧ (append cs1 [(q,is’,fname)]) cs
∧ drop j is’ = is
∧ E ⋈ (rho,S)
∧ td = topDepth rho
∧ k0 ≤ k
∧ S’ = drop td S
∧ s0 = ((h, k), k0, (q, j), S)
−→ ( ∃ s ss q’ i δ m w.
P ⊢ s0 , td#tds -svm→ s # ss , 1#tds
∧ s = ((h’, k) ↓ k0, k0, (q’, i), Val v # S’)
∧ fst (the (map of cs q’))!i = POPCONT
∧ r = ( δ,m,w)
∧ δ = diff k (h,k) (h’,k)
∧ m = maxFreshCells (rev (s#ss))
∧ w = maxFreshWords (rev (s#ss)))))
The premises state that the arbitrary expression e is evaluated to a value
v according to the Core-Safe semantics, that it is translated in the context of
a closed Core-Safe program defs having a definition for every function reached
from e, and that the instruction sequence is and the partial code store cs1 are
the result of the translation. Then, the execution of this sequence by the SVM
starting at an appropriate state s0 in the context of the translated program P,
will reach a stopping state s having the same heap (h', k) as the one obtained in
the semantics, and the same value v on top of the stack. Moreover, the memory
(δ, m, w) consumed by the machine, both in the heap and in the stack, is as
predicted by the semantics.
The proof is done by induction on the ⇓ relation, and with the help of a
number of auxiliary lemmas, some of them stating properties of the translation
and some others stating properties of the evaluation. We classify them into the
following groups:
Lemmas on the evolution of the SVM. This group takes care of the first three
conclusions, i.e. P ⊢ s0, td#tds -svm→ s # ss, 1#tds and the next two, and
there are one or more lemmas for every syntactic form of e.
Lemmas on cells charged to the heap. This group takes care of the third-to-last
conclusion δ = diff k (h,k) (h’,k), and there are one or more lemmas for every
where (h, k), (h'|k, k), and (h'', k) are respectively the initial heap, and the heaps
after the evaluation of e1 and e2.
Lemmas on fresh cells needed in the heap. This group takes care of the second-to-last
conclusion m = maxFreshCells (rev (s#ss)). If e ≡ let x1 = e1 in e2, then the
main lemma has essentially this form:
5 Discussion
On the use of Isabelle/HOL. The complete specification in Isabelle/HOL of
the syntax and semantics of our languages, of the translation functions, the
theorems and the proofs, represents almost one person-year of effort. Including
comments, about 7000 lines of Isabelle/HOL scripts have been written, and
about 200 lemmas proved.
Isabelle/HOL gives enough facilities for defining recursive and higher-order
functions. These are written in much the same way as a programmer would do
in ML or Haskell. We have not found special restrictions in this respect. The
only ‘difficulty’ is that it is not possible to write potentially non-terminating
functions. One must provide a termination proof when Isabelle/HOL cannot find
one. Providing such a proof is not always easy because the argument depends
on some other properties such as ‘there are no cycles in the heap’, which are not
so easy to prove. Fortunately in these cases we have expressed the same ideas
using inductive relations.
Isabelle/HOL also provides inductive n-relations, transitive closures as well as
ordinary first-order logic. This has made it easy to express our properties with
Formal Certification of a Resource-Aware Language Implementation 209
almost the same concepts one would use in hand-written proofs. Partial functions
have also been very useful in modelling programming language structures such
as environments, heaps, and the like. Being able to quantify these objects in
Higher-Order Logic has been essential for stating and proving the theorems.
Assessing how ‘easy’ it has been to conduct the proofs is another question. Part
of the difficulties were related to our lack of experience in using Isabelle/HOL.
The learning process was rather slow at the beginning. A second inconvenience
is that proof assistants (as they must) do not take anything for granted. Trivial
facts that nobody cares to formalise in a hand-written proof must be painfully
stated and proved before they can be used. We have sparingly used the automatic
proving commands such as simp_all, auto, etc., in part because they do
‘too many’ things, and frequently one does not recognise a lemma after using
them. Also, we wanted to relate the proof to our hand-written
version. As a consequence, it is very possible that our scripts are longer than
needed. Finally, having programs and predicates ‘living’ together in a theorem
has been an experience not always easy to deal with.
On the quality of the extracted code. The Haskell code extracted from the
Isabelle/HOL definitions reaches 700 lines, and has undergone some changes before
becoming operative in the compiler. One of these changes has been a trivial
coercion between the Isabelle/HOL types nat and int and the Haskell type Int.
The most important one has been the replacement of the Isabelle/HOL type
representing a partial function, heavily used for specifying our compile-time
environments, by a highly trusted table type of the Haskell library. The code
generated for the partial-function type was just a λ-abstraction needing linear time
in order to find the value associated to a key. This would lead to a quadratic
compile time. Our table is implemented as a balanced tree and has also been used
in other phases of the compiler. With this, the efficiency of the code generation
phase is in O(n log n) for a single Core-Safe function of size n, and roughly linear
in the number of functions of the input.
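The cost of representing a table as nested λ-abstractions can be seen in a small Python sketch. It is only an analogy to the extracted Haskell code, under the assumption that each functional update wraps the previous function in a new closure; the names empty, update and build, and the comparison counter, are introduced purely for this illustration.

```python
def empty(_key):
    """The everywhere-undefined partial function."""
    return None


def update(f, key, value, counter):
    """Functional update f(key := value) as a closure; counts key comparisons."""
    def g(k):
        counter[0] += 1
        return value if k == key else f(k)
    return g


def build(n):
    """Bind keys 0..n-1 by n successive functional updates."""
    counter = [0]
    f = empty
    for k in range(n):
        f = update(f, k, str(k), counter)
    return f, counter


table, comparisons = build(100)
oldest = table(0)            # walks through all 100 closures
cost_oldest = comparisons[0]
comparisons[0] = 0
newest = table(99)           # found in the outermost closure
cost_newest = comparisons[0]
```

A lookup of the oldest binding traverses every update, so n lookups over n bindings cost a quadratic number of comparisons, whereas a balanced tree needs a logarithmic number of comparisons per lookup.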
6 Related Work
Using some form of formal verification to ensure the correctness of compilers has
been a hot topic for many years. An annotated bibliography covering up to 2003
can be found at [6]. Most of the papers reflected there propose techniques whose
validity is established by formal proofs made and read by humans.
Using machine-assisted proofs for compilers started around the seventies, with
an intensification at the end of the nineties. For instance, [19] uses a constraint
solver to assess the validity of the GNU C compiler translations. They do not try
to prove the compiler correct but instead validate its output by comparing
it with the corresponding input. This technique was originally proposed in [23].
A more recent experiment in compiler validation is [12]. In this case the source
is the term language of HOL and the target is the assembly language of the ARM
processor. For each source, the compiler generates the object file and a proof
showing that the semantics of the source is preserved. The last two stages of the
compilation are in fact formally verified, while validation of the output is used
in the previous phases.
More closely related to our work are [1] which certifies the translation of a Lisp
subset to a stack language by using PVS, and [25] which uses Isabelle/HOL to
formalise the translation from a small subset of Java (called μ-Java) to a stripped
version of the Java Virtual Machine (17 bytecode instructions). Both specify the
translation functions, and prove correctness theorems similar to ours. The latter
work can be considered as a first attempt on Java, and it was considerably
extended by Klein, Nipkow, Berghofer, and Strecker himself in [8,9,3]. Only [3]
claims that the extraction facilities of Isabelle/HOL have been used to produce
an actually running Java compiler. The main emphasis is on formalisation of
Java and JVM features and on creating an infrastructure on which other authors
could verify properties of Java or Java bytecode programs.
A realistic C compiler for programming embedded systems has been built and
verified in [5,10,11]. The source is a small C subset called Cminor to which C is
informally translated, and the target is PowerPC assembly language. The compiler
runs through six intermediate languages for which the semantics are defined
and the translation pass verified. The authors use the Coq proof assistant and
its extraction facilities to produce Caml code. They provide figures witnessing
that the compile times obtained are competitive with those of gcc running with
level-2 optimisations activated. This is perhaps the biggest project on
machine-assisted compiler verification done up to now.
Less closely related are [7] and the MRG project [24], where certificates in
Isabelle/HOL about heap consumption, based on special types inferred by the
compiler, are produced. Two EU projects, EmBounded (http://www.embounded.org)
and Mobius (http://mobius.inria.fr), have continued this work on certification
and proof-carrying code, the first one for the functional language Hume, and the
second one for Java and the JVM.
As we have said in Sec. 1, the motivation for verifying the Safe back-end arises
in a different context. We have approached this development because we found
it shorter than translating the Core-Safe properties to certificates at the level
of the JVM. Also, we expected the size of our certificates to be considerably
smaller than the ones obtained with the other approach. We have improved on
previous work by complementing functional correctness with a proof of resource
consumption preservation.
References
1. Dold, A., Vialard, V.: A Mechanically Verified Compiling Specification for a Lisp
Compiler. In: Hariharan, R., Mukund, M., Vinay, V. (eds.) FSTTCS 2001. LNCS,
vol. 2245, pp. 144–155. Springer, Heidelberg (2001)
2. Barthe, G., Grégoire, B., Kunz, C., Rezk, T.: Certificate Translation for Optimizing
Compilers. In: Yi, K. (ed.) SAS 2006. LNCS, vol. 4134, pp. 301–317. Springer,
Heidelberg (2006)
3. Berghofer, S., Strecker, M.: Extracting a formally verified, fully executable compiler
from a proof assistant. In: Proc. Compiler Optimization Meets Compiler Verifica-
tion, COCV 2003. ENTCS, pp. 33–50 (2003)
4. Bertot, Y., Casteran, P.: Interactive Theorem Proving and Program Development
Coq’Art: The Calculus of Inductive Constructions. Texts in Theoretical Computer
Science. An EATCS Series. Springer, Heidelberg (2004)
5. Blazy, S., Dargaye, Z., Leroy, X.: Formal verification of a C compiler front-end.
In: Misra, J., Nipkow, T., Sekerinski, E. (eds.) FM 2006. LNCS, vol. 4085, pp.
460–475. Springer, Heidelberg (2006)
6. Dave, M.A.: Compiler verification: a bibliography. SIGSOFT Software Engineering
Notes 28(6), 2 (2003)
7. Hofmann, M., Jost, S.: Static prediction of heap space usage for first-order func-
tional programs. In: Proc. 30th ACM Symp. on Principles of Programming Lan-
guages, POPL 2003, pp. 185–197. ACM Press, New York (2003)
8. Klein, G., Nipkow, T.: Verified Bytecode Verifiers. Theoretical Computer Sci-
ence 298, 583–626 (2003)
9. Klein, G., Nipkow, T.: A Machine-Checked Model for a Java-Like Language, Vir-
tual Machine and Compiler. ACM Transactions on Programming Languages and
Systems 28(4), 619–695 (2006)
10. Leroy, X.: Formal certification of a compiler back-end, or: programming a compiler
with a proof assistant. In: Principles of Programming Languages, POPL 2006, pp.
42–54. ACM Press, New York (2006)
11. Leroy, X.: A formally verified compiler back-end, July 2008, p. 79 (submitted, 2008)
12. Li, G., Owens, S., Slind, K.: Structure of a Proof-Producing Compiler for a Subset
of Higher Order Logic. In: De Nicola, R. (ed.) ESOP 2007. LNCS, vol. 4421, pp.
205–219. Springer, Heidelberg (2007)
13. Lindholm, T., Yellin, F.: The Java Virtual Machine Specification, 2nd edn. The
Java Series. Addison-Wesley, Reading (1999)
14. Montenegro, M., Peña, R., Segura, C.: A Resource-Aware Semantics and Abstract
Machine for a Functional Language with Explicit Deallocation. In: Workshop on
Functional and (Constraint) Logic Programming, WFLP 2008, Siena, Italy, July
2008, pp. 47–61 (2008) (to appear in ENTCS)
15. Montenegro, M., Peña, R., Segura, C.: A Simple Region Inference Algorithm for
a First-Order Functional Language. In: Trends in Functional Programming, TFP
2008, Nijmegen (The Netherlands), May 2008, pp. 194–208 (2008)
16. Montenegro, M., Peña, R., Segura, C.: A Type System for Safe Memory Man-
agement and its Proof of Correctness. In: Nadathur, G. (ed.) PPDP 1999. LNCS,
vol. 1702, pp. 152–162. Springer, Heidelberg (1999)
17. Montenegro, M., Peña, R., Segura, C.: An Inference Algorithm for Guaranteeing
Safe Destruction. In: LOPSTR 2008. LNCS, vol. 5438, pp. 135–151. Springer, Hei-
delberg (2009)
18. Necula, G.C.: Proof-Carrying Code. In: ACM SIGPLAN-SIGACT Principles of
Programming Languages, POPL 1997, pp. 106–119. ACM Press, New York (1997)
19. Necula, G.C.: Translation validation for an optimizing compiler. SIGPLAN No-
tices 35(5), 83–94 (2000)
20. Nipkow, T., Paulson, L., Wenzel, M.: Isabelle/HOL. A Proof Assistant for Higher-
Order Logic. LNCS, vol. 2283. Springer, Heidelberg (2002)
21. Peña, R., Rupérez, D.: A Certified Implementation of a Functional Virtual Machine
on top of the Java Virtual Machine. In: Jornadas sobre Programación y Lenguajes,
PROLE 2008, Gijón, Spain, October 2008, pp. 131–140 (2008)
22. Peña, R., Segura, C., Montenegro, M.: A Sharing Analysis for SAFE. In: Selected
Papers of the 7th Symp. on Trends in Functional Programming, TFP 2006, pp.
109–128 (2007) (Intellect)
23. Pnueli, A., Siegel, M., Singerman, E.: Translation Validation. In: Steffen, B. (ed.)
TACAS 1998. LNCS, vol. 1384, pp. 151–166. Springer, Heidelberg (1998)
24. Sannella, D., Hofmann, M.: Mobile Resources Guarantees. EU Open FET project,
IST 2001-33149 2001-2005, http://www.dcs.ed.ac.uk/home/mrg
25. Strecker, M.: Formal Verification of a Java Compiler in Isabelle. In: Voronkov, A.
(ed.) CADE 2002. LNCS, vol. 2392, pp. 63–77. Springer, Heidelberg (2002)
26. Wildmoser, M.: Verified Proof Carrying Code. Ph.D. thesis, Institut für Informatik,
Technische Universität München (2005)
A Certified Data Race Analysis for a Java-like
Language
1 Introduction
A fundamental issue in multithreaded programming is data races, i.e., the situation
where two threads access a memory location, and at least one of them changes its value,
without proper synchronisation. Such situations can lead to unexpected behaviours,
sometimes with damaging consequences [14, 20]. The semantics of programs with mul-
tiple threads of control is described by architecture-dependent memory models [1, 10]
which define admissible executions, taking into account optimisations such as caching
and code reordering. Unfortunately, these models are generally not sequentially consis-
tent, i.e., it might not be possible to describe every execution of a program as the se-
rialization, or interleaving, of the actions performed by its threads. Although common
memory models impose restrictions on admissible executions, these are still beyond in-
tuition: writes can be seen out of order and reads can be speculative and return values
from the future.
Reasoning directly on memory models is possible but hard, counter-intuitive and
probably infeasible for the average programmer. As a matter of fact, the interleaving
semantics is generally assumed in most formal developments in compilation, static
analysis and so on. Fortunately, under certain conditions, the interleaving semantics can
be turned into a correct approximation of admissible behaviors. Here, we focus on
programs expressed in JAVA, which comes with its own memory model, free of
architecture-specific details. Although the JAVA memory model [15, 21] does not
guarantee sequential consistency for all programs, race free programs are guaranteed to be
sequentially consistent. Moreover, it enjoys a major property, the so-called data race
free guarantee. This property states that a program all of whose sequentially consistent
Work partially supported by EU project MOBIUS, and by the ANR-SETI-06-010 grant.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 212–227, 2009.
© Springer-Verlag Berlin Heidelberg 2009
executions are race free admits only sequentially consistent executions. In other words,
proving that a program is race free can be done on a simple interleaving semantics; and
doing so guarantees the correctness of the interleaving semantics for that program. It
is worth noting that data race freedom is important, not only because it guarantees
semantic correctness, but also because it is the basis of a higher-level property called
atomicity. The possibility to reason sequentially about atomic sections is a key feature
in analysing multithreaded programs. Designing tools, either static or dynamic, that aim
to prove data race freedom is thus a fundamental matter.
This paper takes root in the European MOBIUS project¹ where several program
verification techniques have been machine checked with respect to a formal semantics of the
sequential JAVA bytecode language. The project has also investigated several
verification techniques for multithreaded JAVA, but we need a formal guarantee that reasoning
on interleaving semantics is safe. While a formalisation of the JAVA memory model has been
done in COQ [9] and a machine-checked proof of the data race free guarantee has been
given in [2], we try to complete the picture by formally proving data race freeness. We study
how such a machine-checked formalisation can be done for the race detection analysis
recently proposed by Naik and Aiken [16–18].
The general architecture of our development is sketched in Figure 1. We formalise
four static analyses: a context-sensitive points-to analysis, a must-lock analysis, a
conditional must-not alias analysis based on disjoint reachability and a must-not thread
escape analysis. In order to ensure the data-race freeness of the program, these analyses
are used to refine, in several stages, an initial over-approximation of the set of
potential races of a program, with the objective of obtaining an empty set at the very last stage.
Each analysis is mechanically proved correct with respect to an operational semantics.
However, we consider three variants of semantics. While the first one is a standard
small-step semantics, the second one attaches context information to each reference
and frame. This instrumentation makes the soundness proof of the points-to analysis
easier. The last semantics adds further instrumentation in order to count method calls
and loop iterations. Each instrumentation is proved correct with respect to the
semantics just above it. The notion of safe instrumentation is formalised through a standard
simulation diagram.
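The staged refinement described above can be pictured with a short Python sketch. It only illustrates the overall shape of the pipeline: the function prune, the toy triples and the predicates attached to each stage are hypothetical stand-ins, and none of them reproduces the certified analyses.

```python
def prune(potential_races, stages):
    """Apply each stage in turn, dropping the triples that the stage proves safe."""
    remaining = set(potential_races)
    history = []
    for name, proves_safe in stages:
        remaining = {race for race in remaining if not proves_safe(race)}
        history.append((name, len(remaining)))
    return remaining, history


# Toy over-approximation: triples (ppt1, field, ppt2). All data is made up.
initial = {(1, "f", 2), (3, "g", 4), (5, "f", 6), (8, "f", 8)}

# Stand-in predicates, one per analysis named in the text (assumptions only).
stages = [
    ("points-to", lambda race: race[0] == 1),
    ("must-lock", lambda race: race[1] == "g"),
    ("must-not thread escape", lambda race: race[0] == 5),
    ("conditional must-not alias", lambda race: race == (8, "f", 8)),
]

remaining, history = prune(initial, stages)
```

The program is reported race free only when the set left after the last stage is empty; any triple that no stage can discharge stays in the result as a potential race.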
The main contributions of our work are as follows.
– Naik and Aiken have proposed one of the most powerful data race analyses in the
area. Their analyser relies on several stages that remove pairs of potential races.
Most of these layers have been described informally. The most technical one has
been partially proved correct with pencil and paper for a sequential While
language [17]. We formalise their work in COQ for a realistic bytecode language with
unstructured control flow, operand stack, objects, virtual method calls and lock and
unlock operations for thread synchronization.
– Our formalisation is an open framework with three layers of semantics. We
formalise and prove correct four static analyses on top of these semantics. We expect
our framework to be sufficiently flexible to allow easy integration of new certified
blocks for potential race pruning.
¹ http://mobius.inria.fr
214 F. Dabrowski and D. Pichardie
is local to a thread at the current point (Escaping pairs). The last potential race (8, f, 8)
requires the most attention since several threads of class T are updating fields f in par-
allel. These writes are safe because they are guarded by a synchronization on an object
which is the only ancestor of the write target in the heap. Such reasoning relies on the
fact that if locks guarding two accesses are different then so are the targeted memory lo-
cations. The main difficulty comes when several objects allocated at the same program
point, e.g. within a loop, may point to the same object. This last triplet is removed by
the conditional must-not alias analysis presented in Section 5.
3 Standard Semantics
The previous example can be compiled into a bytecode language whose syntax is
given below. The instruction set allows manipulating objects, calling virtual methods,
starting threads and locking (or unlocking) objects for thread synchronization.
Compared to real JAVA, we discard all numerical manipulations because they are
not relevant to our purpose. Static fields, static methods and arrays are not managed
here but they are nevertheless a source of several potential data races in JAVA programs.
Naik’s approach [18] for these layers is similar to the technique developed for objects.
We estimate that adding these language features would not bring difficulties that we
have not already covered in the current work. Finally, as Naik and Aiken did before us, we
only cover synchronization by locks without join, wait and interruption mechanisms.
Our approach is sound in the presence of such statements, but does not take into account
the potential races they could prevent. The last missing feature is JAVA’s exception
mechanism. Exceptions complicate the control flow of a JAVA program. We expect that
handling this mechanism would increase the amount of formal proof but would not require
new proof techniques. This is left for further work.
Program syntax. A program is a set of classes, coming with a Lookup function
matching signatures and program points (allocation sites denoting class names) to methods.
\[ C_{id} \supseteq \{c_{id}, \ldots\} \qquad F \supseteq \{f, g, h, \ldots\} \qquad M_{id} \supseteq \{m_{id}, \ldots\} \]
\[ V \supseteq \{x, y, z, \ldots\} \qquad M_{sig} = M_{id} \times C_{id}^{\,n} \times (C_{id} \cup \{\mathit{void}\}) \]
Semantics Domain. The dynamic semantics of our language is defined over states as a
labelled transition system. States and labels, or events, are defined in Figure 4, where
→ stands for total functions and ⇀ stands for partial functions. We distinguish location
and memory location sets. The set of locations 𝕃 is kept abstract in this presentation. In
this section, a memory location is itself a location (𝕃 = 𝕆). This redundancy will be
useful when defining new instrumented semantics where memory locations will carry
more information (Sections 4 and 5). In a state (L, σ, μ), L maps memory locations (that
identify threads) to call stacks, σ denotes the heap that associates memory locations to
objects (cid, map) with cid a class name and map a map from fields to values. We note
class(σ, ℓ) for fst(σ(ℓ)) when ℓ ∈ dom(σ). A locking state μ associates with every
location ℓ a pair (ℓ', n) if ℓ is locked n times by ℓ' and the constant free if ℓ is not held
by any thread. An event $(\ell, ?^{ppt}_f, \ell')$ (resp. $(\ell, !^{ppt}_f, \ell')$) denotes a read (resp. a write) of a
field f, performed by the thread ℓ over the memory location ℓ', at a program point ppt.
An event τ denotes a silent action.
Transition system. Labelled transitions have the form $st \xrightarrow{e} st'$ (when e is τ we simply
omit it). They rely on the usual interleaving semantics, as expressed in the rule below.
\[ \frac{L\ \ell = cs \qquad L; \ell \vdash (cs, \sigma, \mu) \xrightarrow{e} (L', \sigma', \mu')}
        {(L, \sigma, \mu) \xrightarrow{e} (L', \sigma', \mu')} \]
Reductions of the shape $L; \ell \vdash (cs, \sigma, \mu) \xrightarrow{e} (L', \sigma', \mu')$ are defined in Figure 5.
Intuitively, such a reduction expresses that in state (L, σ, μ), reducing the thread defined
𝕃 ∋ ℓ                                            (location)
𝕆 = 𝕃                                            (memory location)
𝕆⊥ ∋ v ::= ℓ | Null                              (value)
s ::= v :: s | ε                                 (operand stack)
Var → 𝕆⊥ ∋ ρ                                     (local variables)
𝕆 ⇀ Cid × (F → 𝕆⊥) ∋ σ                           (heap)
PPT = M × ℕ ∋ ppt ::= (m, i)                     (program point)
CS ∋ cs ::= (m, i, s, ρ) :: cs | ε               (call stack)
𝕆 ⇀ CS ∋ L                                       (thread call stacks)
𝕆 → ((𝕆 × ℕ*) ∪ {free}) ∋ μ                      (locking state)
st ::= (L, σ, μ)                                 (state)
e ::= τ | (ℓ, ?^ppt_f, ℓ') | (ℓ, !^ppt_f, ℓ')    (event)
(a) Notations
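\[ \frac{(m.\mathit{body})\ i = \mathtt{monitorenter} \qquad \mu\ \ell' \in \{\mathit{free}, (\ell, n)\} \qquad \mu' = \mathit{acquire}\ \ell\ \ell'\ \mu \qquad L' = L[\ell \mapsto (m, i + 1, s, \rho) :: cs]}
        {L; \ell \vdash ((m, i, \ell' :: s, \rho) :: cs, \sigma, \mu) \to (L', \sigma, \mu')} \]

\[ \frac{(m.\mathit{body})\ i = \mathtt{monitorexit} \qquad \mu = \mathit{acquire}\ \ell\ \ell'\ \mu' \qquad L' = L[\ell \mapsto (m, i + 1, s, \rho) :: cs]}
        {L; \ell \vdash ((m, i, \ell' :: s, \rho) :: cs, \sigma, \mu) \to (L', \sigma, \mu')} \]
(b) Reduction rules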
by the memory location ℓ and the call stack cs, by a non-deterministic choice, produces
the new state (L', σ', μ'). For the sake of readability, we rely on an auxiliary relation of
the shape $\mathit{instr}; \ell; ppt \vdash (i, s, \rho, \sigma) \xrightarrow{e}_1 (i', s', \rho', \sigma')$ for reduction of intra-procedural
instructions. In Figure 5, we consider only putfield, getfield and new. Reductions
for the other instructions are standard and produce a τ event. The notation σ[ℓ.f ← v] for field
update, where ℓ ∈ dom(σ), is defined in Figure 5(a). It does not change the class of
an object. The reduction of a new instruction pushes a fresh address onto the operand
stack and allocates a new object in the heap. The notation σ[ℓ ← new(cid)], where
¬(ℓ ∈ dom(σ)), denotes the heap σ with a new object, at location ℓ, of class cid and
with all fields equal to Null. The auxiliary relation is embedded into the semantics
by rule (1). Method invocation relies on the lookup function for method resolution and
generates a new frame. Thread spawning is similar to method invocation. However, the
new frame is put on top of an empty call stack. We omit the reduction rules for return
and areturn; those rules are standard and produce a τ event. For monitorenter and
monitorexit we use a partial function acquire defined in Figure 5(a). Intuitively,
acquire ℓ ℓ' μ results from thread ℓ locking object ℓ' in μ.
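The intuition behind acquire can be sketched in Python. The definition below is an assumption reconstructed from the prose description of locking states (a location is either free or held n times by one thread); it is not copied from Figure 5(a). The helper release, which models the monitorexit direction by undoing one acquire, is likewise hypothetical.

```python
FREE = "free"


def acquire(thread, obj, mu):
    """Locking state after `thread` locks `obj`; None where acquire is undefined."""
    state = mu.get(obj, FREE)      # locations absent from the dict are free
    if state == FREE:
        return {**mu, obj: (thread, 1)}
    owner, count = state
    if owner == thread:            # re-entrant locking increments the counter
        return {**mu, obj: (thread, count + 1)}
    return None                    # held by another thread: undefined


def release(thread, obj, mu):
    """A state mu2 such that acquire(thread, obj, mu2) == mu, if one exists."""
    state = mu.get(obj, FREE)
    if state == FREE or state[0] != thread:
        return None
    _owner, count = state
    return {**mu, obj: FREE if count == 1 else (thread, count - 1)}


mu1 = acquire("t1", "o", {})
mu2 = acquire("t1", "o", mu1)
blocked = acquire("t2", "o", mu1)
```

Modelling acquire as a partial function means that a thread attempting to lock an object held by another thread simply has no transition, which is how blocking appears in an interleaving semantics.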
We write RState(P) for the set of states that contains the initial state of a program P
(not described here for the sake of conciseness) and that is closed under reduction.
A data race is a tuple (ppt1, f, ppt2) such that Race(P, ppt1, f, ppt2) holds.
\[ \frac{st \in \mathit{RState}(P) \qquad st \xrightarrow{(\ell_1, !^{ppt_1}_f, \ell_0)} st_1 \qquad st \xrightarrow{(\ell_2, R, \ell_0)} st_2 \qquad R \in \{?^{ppt_2}_f, !^{ppt_2}_f\} \qquad \ell_1 \ne \ell_2}
        {\mathit{Race}(P, ppt_1, f, ppt_2)} \]
The ultimate goal of our certified analyser is to guarantee Data Race Freeness, i.e.
for all ppt1, ppt2 ∈ PPT and f ∈ F, ¬Race(P, ppt1, f, ppt2).
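The race condition can be read operationally, as in the following Python sketch, which inspects the events enabled in one given state. The event encoding (thread, kind, field, program point, location) and the function name races_in_state are assumptions of this illustration; the sketch does not compute RState(P), so it checks the premises of the rule for a single state only.

```python
def races_in_state(enabled_events):
    """Triples (ppt1, f, ppt2) witnessed by two events enabled in the same state."""
    found = set()
    for thread1, kind1, field1, ppt1, loc1 in enabled_events:
        if kind1 != "write":
            continue                       # the first access must be a write
        for thread2, _kind2, field2, ppt2, loc2 in enabled_events:
            # second access: read or write, other thread, same field and location
            if thread1 != thread2 and field1 == field2 and loc1 == loc2:
                found.add((ppt1, field1, ppt2))
    return found


# Two threads write field f of the same object at program point ("run", 8),
# while a third thread reads an unrelated field.
events = [
    ("t1", "write", "f", ("run", 8), "o"),
    ("t2", "write", "f", ("run", 8), "o"),
    ("t3", "read", "g", ("main", 2), "o"),
]
witnessed = races_in_state(events)
```

Data race freeness corresponds to this set being empty for every reachable state, which is what the static analyses are meant to establish without enumerating states.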
4 Points-to Semantics
Naik and Aiken make intensive use of points-to analysis in their work. Points-to
analysis computes a finite abstraction of the memory where locations are abstracted by their
allocation site. The analysis can be made context sensitive if allocation sites are
distinguished with respect to the calling context of the method where the allocation occurs.
Many static analyses use this kind of information to obtain a conservative
approximation of the call graph and the heap of a program. Such analyses implicitly reason
on an instrumented semantics that directly manipulates information on allocation sites,
while a standard semantics only keeps track of the class given to a reference during its
allocation. In this section we formalise such an intermediate semantics.
This points-to semantics takes the form of a COQ module functor
Module PointsToSem (C:CONTEXT). ... End PointsToSem.
Parameter make_new_context :
method → line → classId → mcontext → pcontext.
Parameter make_call_context :
method → line → mcontext → pcontext → mcontext.
Parameter get_class : program → pcontext → option classId.
End CONTEXT.
In order to reason on this semantics and its different instantiations, we give this
module a module type POINTSTO_SEM such that, for all modules C of type CONTEXT,
PointsToSem(C) : POINTSTO_SEM.
Several invariants are proved on this semantics, for example that if any memory
locations (ℓ, p1) and (ℓ, p2) are in the domain of a heap reachable from an initial state,
then p1 = p2.
Safe Instrumentation. The analyses we formalise on top of this points-to semantics are
meant for proving the absence of races. To transfer such a semantic statement in terms of
the standard semantics, we prove simulation diagrams between the transition systems
of the standard and the points-to semantics. Such diagrams then allow us to prove that
each standard race corresponds to a points-to race.
Must-Lock Analysis. Fine lock analysis requires statically understanding which locks
are definitely held when a given program point is reached. For this purpose we specify
and prove correct a flow-sensitive must-lock analysis that computes the following
information:
Locks: method → line → mcontext → (var → Prop).
Symbolic: method → line → list expr.
At each flow position (m, i, c), Locks computes an under-approximation of the local
variables that are currently held as locks by the thread reaching this position. The specification
of Locks depends on the points-to information PtL computed before. This information
is useful for the monitorexit instruction because the unlocking of a variable
x can only cancel the lock information of the variables that may alias with x.
Symbolic is a flow-sensitive abstraction of the operand stack that manipulates symbolic
expressions. Such expressions are path expressions of the form x, x.f, etc. Lock analysis
only requires variable expressions but more complex expressions are useful for the
conditional must lock analysis given in Section 5.
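The role of the points-to information in the treatment of monitorexit can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the points-to information is given as a dictionary from variables to sets of abstract locations, and the join by intersection is the usual choice for a must analysis rather than a transcription of the COQ specification.

```python
from typing import Dict, FrozenSet, Set

PointsTo = Dict[str, FrozenSet[str]]


def transfer_monitorenter(locks: Set[str], x: str) -> Set[str]:
    """After locking x, x is definitely held."""
    return locks | {x}


def transfer_monitorexit(locks: Set[str], x: str, points_to: PointsTo) -> Set[str]:
    """After unlocking x, drop x and every variable that may alias with x.

    Two variables may alias when their points-to sets intersect; variables
    whose points-to sets are disjoint from that of x keep their lock status.
    """
    targets_of_x = points_to[x]
    return {y for y in locks
            if y != x and not (points_to[y] & targets_of_x)}


def join(locks_a: Set[str], locks_b: Set[str]) -> Set[str]:
    """At a control-flow merge, keep only the locks held on both paths."""
    return locks_a & locks_b
```

Removing more variables than necessary keeps the result an under-approximation, which is the direction in which a must-lock analysis is allowed to lose precision.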
Removing False Potential Races. The previous points-to analysis supports the first two
stages of the race analyser of Naik et al. [18]. The first stage computes the so-called
ReachablePairs. It only keeps in OriginalsPairs the accesses that may be reachable from a
start() call site that is itself reachable from the main method, according to the
points-to information. Moreover, it discards pairs where both accesses are performed
by the main thread because there is only one thread of this kind.
The next stage keeps only the so-called AliasingPairs using the fact that a conflicting
access can only occur on references that may alias. In the example of Figure 2, the
potential race (5, f, 8) is cancelled because the points-to sets of t and m.val
are disjoint.
For each stage we formally prove that all these sets over-approximate the set of real
races with respect to the points-to semantics.
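The aliasing stage described above can be sketched as a simple filter. The function name `aliasing_pairs`, the representation of a pair as a triple (ppt1, f, ppt2), and the dictionary that maps each program point to the abstract locations its accessed reference may point to are all assumptions of this sketch.

```python
from typing import Dict, FrozenSet, List, Tuple

Pair = Tuple[int, str, int]


def aliasing_pairs(pairs: List[Pair],
                   targets: Dict[int, FrozenSet[str]]) -> List[Pair]:
    """Keep only the pairs whose two accessed references may alias, i.e.
    whose points-to sets share at least one abstract location."""
    return [(ppt1, f, ppt2) for (ppt1, f, ppt2) in pairs
            if targets[ppt1] & targets[ppt2]]
```

With hypothetical abstract locations `"h_t"` for the reference accessed at point 5 and `"h_val"` for the one accessed at point 8, the pair (5, f, 8) is dropped because the two sets are disjoint.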
5 Counting Semantics
The next two stages of our analysis require a deeper instrumentation. We introduce a
new semantics with instrumentation for counting method calls and loop iterations. This
semantics builds on top of the points-to semantics and uses k-contexts. All developments
of this section were formalized in COQ. However, for the sake of conciseness we
present them in a paper style. In addition to the allocation site (m, i) and the calling
context c of an allocation, this semantics captures counting information. More precisely,
it records that the allocation occurred after the nth iteration of flow edge L(m, i) in the
kth call to m in context c. Given a program P, the function L ∈ M × N → Flow, for
Flow = N × N, must satisfy Safe_P(L) as defined below:
A frame holding the code pointer (m, i, c, ω, π) is the ω(m, c)th call to method m in
context c (a k-context) since the execution began and, so far, it has performed π(m, c, φ)
steps through edge φ of its control flow graph. In a state (L, σ, μ, ωg), ωg is a global
method vector used as a shared call counter by all threads.
Below, we sketch the extended transition system by giving rules for allocation and
method invocation.
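$$\frac{\begin{array}{c}(m.\mathit{body})\ i = \mathtt{new}\ cid \qquad \forall cp.\ \neg((\ell', cp) \in dom(\sigma)) \\ L' = L[\ell \mapsto (m, i+1, c, \omega, \pi', \ell'' :: s, \rho) :: cs] \\ \pi' = \pi[(m, c, (i, i+1)) \mapsto \pi(m, c, (i, i+1)) + 1] \qquad \ell'' = (\ell', m, i, c, \omega, \pi)\end{array}}{L;\ \ell \vdash ((m, i, c, \omega, \pi, s, \rho) :: cs, \sigma, \mu, \omega_g) \to (L', \sigma[\ell'' \mapsto \mathit{new}(cid)], \mu, \omega_g)}$$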
For allocation, we simply annotate the new memory location with the current code
pointer and record the current move. For method invocation, the caller records the
current move. The new frame receives a copy of the vectors of the caller (after the call)
where the current call is recorded and the iteration vector corresponding to this call
is reset. Except for thread spawning, omitted rules simply record the current move.
A Certified Data Race Analysis for a Java-like Language 223
Thread spawning is similar to method invocation except that the new frame receives
fresh vectors rather than copies of the caller’s vectors.
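The allocation step of the counting semantics can be approximated by the following Python sketch. It is an illustration only: the tuple layout of a frame, the `freeze` helper and the name `alloc_step` are hypothetical, field initialisation is omitted, and the new location is tagged with the iteration vector as it was before the current move is recorded, which is how we read the allocation rule.

```python
from typing import Dict, List, Tuple


def freeze(vector: Dict) -> frozenset:
    """Make a counting vector hashable so it can be part of a location."""
    return frozenset(vector.items())


def alloc_step(frame: Tuple, heap: Dict, fresh_addr: int, cid: str):
    """One 'new cid' step for a frame (m, i, c, omega, pi, stack, rho)."""
    m, i, c, omega, pi, stack, rho = frame
    # The address must not occur in any location already in the heap.
    if any(loc[0] == fresh_addr for loc in heap):
        raise ValueError("address already used in the heap")
    # Record the current move: one more step through edge (i, i + 1).
    edge = (m, c, (i, i + 1))
    new_pi = dict(pi)
    new_pi[edge] = pi.get(edge, 0) + 1
    # Annotate the new location with the current code pointer.
    loc = (fresh_addr, m, i, c, freeze(omega), freeze(pi))
    new_frame = (m, i + 1, c, omega, new_pi, [loc] + stack, rho)
    new_heap = dict(heap)
    new_heap[loc] = (cid, {})  # fields omitted in this sketch
    return new_frame, new_heap
```

Two allocations at the same site in different loop iterations receive different annotations, because the iteration vector has changed in between.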
Safe Instrumentation. As we did between the standard and the points-to semantics, we
prove a simulation diagram between the points-to semantics and the counting semantics.
It ensures that all points-to races correspond to a counting race. However, in order
to use the soundness theorem of the must-lock analysis we also need to prove a
bisimulation diagram. It ensures that all states that are reachable in the counting semantics
correspond to a reachable state in the points-to semantics. It allows us to transfer the
soundness result of the must-lock analysis in terms of the counting semantics.
where localCoherency (L, (ℓ, m0, i0, c0, ω0, π0), m, i, c, ω, π) stands for
Type And Effect System. We have formalized a type and effect system which captures
the fact that some components of the vectors of a memory location are equal to the same
components of the vectors of: (1) the current frame when the memory location is in the local
variables or in the stack of the frame, or (2) another memory location pointing to
224 F. Dabrowski and D. Pichardie
it in the heap. For lack of space, we cannot describe the type and effect system here.
Intuitively, we perform a points-to analysis where allocation sites are decorated with
masks which tell us which components of the vectors of the abstracted memory location
match the same components in the vectors of a given code pointer (depending on whether
we consider (1) or (2)). Formally, an abstract location τ ∈ T in our extended points-to
analysis is a pair (A, F) where A is a set of allocation sites and F maps every element
of A to a pair (Ω, Π) of abstract vectors. Abstract vectors are defined by Ω ∈ MVect =
M × Context → {1, ⊤} and Π ∈ LVect = M × Context × Flow → {1, ⊤}.
Our analysis computes a pair (A, Σ) where A provides flow-sensitive points-to
information with respect to local variables and Σ provides flow-insensitive points-to
information with respect to the heap. The decoration of a points-to information acts as
a mask. For a memory location held by a local variable, it tells us which components
of its vectors (those set to 1) match those of the current frame. When a memory location
points to another one in the heap, it tells us which components of their respective
vectors are equal.
Below we present the last stages we use for potential race pruning. For each stage
the result stated by Proposition 1 is crucial. Indeed, given an abstract location with
allocation site (m, i, c), these stages rely on a property stating that whenever the decoration states
that Ω(m, c) = Π(m, c, L(m, i)) = 1, the abstraction describes a unique concrete
location. This property results from the combination of the abstraction relation defined in
our type system and of Proposition 1.
Must Not Escape Analysis. We use the flow-sensitive element A of the previous type
and effect system to check that, at some program point, an object allocated by a thread
is still local to that thread (or has not escaped yet, i.e. it is not reachable from others).
More precisely, the type and effect system is used to guarantee that, at some program
point, the last object allocated by a thread at a given allocation site is still local to that
thread. In particular, our analysis proves that an access performed at point 4 in our
running example is on the last object of type T allocated by the main thread (which
is local, although at each loop iteration, the new object eventually escapes the main
thread). In contrast, the pair (5, f, 8) cannot be removed by this analysis since the
location has already escaped the main thread at point 5. This pair is removed by the
aliasing analysis. Our escape analysis improves on that of Naik and Aiken, which does
not distinguish among several allocations performed at the same site.
Conditional Must Not Alias Analysis. The flow-insensitive element Σ of the previous
type and effect system is used to define an under-approximation DR_Σ of the notion of
disjoint reachability. Given a finite set of heaps {σ1, . . . , σn} and a set of allocation
sites H, the disjoint reachability set DR_{σ1,...,σn}(H) is the set of allocation sites h
such that whenever an object o allocated at site h is reachable by one or more
field dereferences, for some heap in {σ1, . . . , σn}, from objects o1 and o2 allocated at
any sites in H, then o1 = o2. It allows us to remove the last potential race of our running
example. For each potential conflict between two program points i1 and i2, we first
compute the sets May1 and May2 of sites that the corresponding targeted objects may
point to, using the previous points-to analysis. Then we use the must-lock analysis
to compute the sets Must1 and Must2 of allocation sites such that for any h in Must1
(resp. Must2), there must exist a lock l currently held at point i1 (resp. i2) and allocated
at h. The current targeted object must furthermore be reachable from l with respect to
the heap history that leads to the current point. This last property is ensured by the
path expressions that are computed with a symbolic operand stack during the must-lock
analysis. Finally, we remove the potential race if and only if Must1 ≠ ∅, Must2 ≠ ∅
and
May1 ∩ May2 ⊆ DR_Σ(Must1 ∪ Must2)
We formally prove that any potential race that passes this last check is not a real race.
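The final check can be written down directly as a set inclusion. In the Python sketch below, `can_discard` is a hypothetical name and `dr` stands for an under-approximation of disjoint reachability supplied by the caller; how `dr` is computed from Σ is not shown here.

```python
from typing import Callable, FrozenSet

Sites = FrozenSet[str]


def can_discard(may1: Sites, may2: Sites, must1: Sites, must2: Sites,
                dr: Callable[[Sites], Sites]) -> bool:
    """Discard a potential race when both accesses are guarded by some lock
    and every common target site is disjointly reachable from the lock sites."""
    if not must1 or not must2:
        return False  # at least one access has no lock that is surely held
    return (may1 & may2) <= dr(must1 | must2)
```

The two non-emptiness tests matter: with an empty Must set there is no lock that is certainly held, so nothing can be concluded from disjoint reachability.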
6 Related Work
Static race detection. Most works on static race detection follow the lock based
approach, as opposed to event ordering based approaches. This approach imposes that
every pair of concurrent accesses to the same memory location is guarded by a common
lock and is usually enforced by means of a type and effect discipline.
Early work [5] proposes an analysis for a λ-calculus extended with support for shared
memory and multiple threads. Each allocation comes in the text of the program with an
annotation specifying which lock protects the new memory location and the type and effect
system checks that this lock is held whenever it is accessed. More precisely, the
annotation refers to a lexically scoped lock definition, thus ensuring uniqueness. To overcome
the limitation imposed by the lexical scope of locks, existential types are proposed as
a solution to encapsulate an expression with the locks required for its evaluation. This
approach was limited in that it was only able to consider programs where all accesses
are guarded, even when no concurrent access is possible. Moreover, it imposed the use
of specific constructions to manage existential types.
A step toward treatment of realistic languages was made in [7] which considers the
JAVA language and supports various common synchronization patterns, classes with
internal synchronization, classes that require client-side synchronization and thread-
local classes. Aside from additional synchronization patterns, the approach is similar to
the previous one and requires annotations on fields (the lock protecting the field) and
method declarations (locks that must be held at invocation time). However, the
object-oriented nature of the JAVA language is used as a more natural means for encapsulation.
Fields of an object must be protected by a lock (an object in JAVA) accessible from
this object. For example, x.f may be protected by x.g.h where g and h are final fields
(otherwise, two concurrent accesses to x.f guarded by x.g.h could use different locks).
Client-side synchronization and thread-local classes are respectively handled by classes
parametrized by locks and a simple form of escape analysis. A similar approach, using
ownership types to ensure encapsulation, was taken in [3, 4].
The analysis we consider here is that of [17, 18]. Thanks to the disjoint reachability
property and to a heavy use of points-to analysis, it is more precise and captures more
idioms than those above. Points-to analysis also makes it more costly but it has been
shown that such analyses are tractable thanks to BDD based resolution techniques [22].
Acknowledgment. We thank Thomas Jensen and the anonymous TPHOLs reviewers for
their helpful comments.
References
1. AMD. Amd64 architecture programmer’s manual volume 2: System programming. Techni-
cal Report 24593 (2007)
2. Aspinall, D., Ševčík, J.: Formalising Java’s data race free guarantee. In: Schneider, K.,
Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732, pp. 22–37. Springer, Heidelberg (2007)
3. Boyapati, C., Lee, R., Rinard, M.: Ownership types for safe programming: preventing data
races and deadlocks. In: ACM Press (ed.) Proc. of OOPSLA 2002, New York, NY, USA, pp.
211–230 (2002)
4. Boyapati, C., Rinard, M.: A parameterized type system for race-free Java programs. In: ACM
Press (ed.) Proc. of OOPSLA 2001, New York, NY, USA, pp. 56–69 (2001)
5. Flanagan, C., Abadi, M.: Types for safe locking. In: Swierstra, S.D. (ed.) ESOP 1999. LNCS,
vol. 1576, pp. 91–108. Springer, Heidelberg (1999)
6. Cachera, D., Jensen, T., Pichardie, D., Rusu, V.: Extracting a Data Flow Analyser in Con-
structive Logic. Theoretical Computer Science 342(1), 56–78 (2005)
7. Flanagan, C., Freund, S.N.: Type-based race detection for Java. In: Proc. of PLDI 2000, pp.
219–232. ACM Press, New York (2000)
8. Hobor, A., Appel, A.W., Zappa Nardelli, F.: Oracle semantics for concurrent separation logic.
In: Drossopoulou, S. (ed.) ESOP 2008. LNCS, vol. 4960, pp. 353–367. Springer, Heidelberg
(2008)
9. Huisman, M., Petri, G.: The Java memory model: a formal explanation. In: Verification and
Analysis of Multi-threaded Java-like Programs, VAMP (2007) (to appear)
10. Intel. Intel 64 architecture memory ordering white paper. Technical Report SKU 318147-001
(2007)
11. Klein, G., Nipkow, T.: A machine-checked model for a Java-like language, virtual machine
and compiler. ACM Transactions on Programming Languages and Systems 28(4), 619–695
(2006)
12. Lammich, P., Müller-Olm, M.: Formalization of conflict analysis of programs with proce-
dures, thread creation, and monitors. In: The Archive of Formal Proofs (2007)
13. Leroy, X.: Formal certification of a compiler back-end, or: programming a compiler with a
proof assistant. In: Proc. of POPL 2006, pp. 42–54. ACM Press, New York (2006)
14. Leveson, N.G.: Safeware: system safety and computers. ACM, NY (1995)
15. Manson, J., Pugh, W., Adve, S.V.: The Java Memory Model. In: Proc. of POPL 2005, pp.
378–391. ACM Press, New York (2005)
16. Naik, M.: Effective Static Data Race Detection For Java. PhD thesis, Stanford University
(2008)
17. Naik, M., Aiken, A.: Conditional must not aliasing for static race detection. In: Proc. of
POPL 2007, pp. 327–338. ACM Press, New York (2007)
18. Naik, M., Aiken, A., Whaley, J.: Effective static race detection for Java. In: Proc. of PLDI
2006, pp. 308–319. ACM Press, New York (2006)
19. Petri, G., Huisman, M.: BicolanoMT: a formalization of multi-threaded Java at bytecode
level. In: Bytecode 2008. Electronic Notes in Theoretical Computer Science (2008)
20. Poulsen, K.: Tracking the blackout bug (2004)
21. Sun Microsystems, Inc. JSR 133 Expert Group, Java Memory Model and Thread Specifica-
tion Revision (2004)
22. Whaley, J., Lam, M.S.: Cloning-based context-sensitive pointer alias analysis using binary
decision diagrams. In: Proc. of PLDI 2004, pp. 131–144. ACM, New York (2004)
Formal Analysis of Optical Waveguides in HOL
1 Introduction
Optical systems are increasingly being used these days, mainly because of their
ability to provide high capacity communication links, in applications ranging
from ubiquitous internet and mobile communications, to not so commonly used
but more advanced scientific domains, such as optical integrated circuits, bio-
photonics and laser material processing. The correctness of operation for these
optical systems is usually very important due to the financial or safety critical
nature of their applications. Therefore, quite a significant portion of the design
time of an optical system is spent on analyzing the designs so that functional
errors can be caught prior to the production of the actual devices. Calculus plays
a significant role in such analysis. Nonlinear differential equations with transcendental
components are used to model the electric and magnetic field components
of the electromagnetic light waves. The optical components are characterized by
their refractive indices and then the effects of passing electromagnetic waves of
visible and infrared frequencies through these media are analyzed to ensure
that the desired reflection and refraction patterns are obtained.
The analysis of optical systems has so far been mainly conducted by using
paper-and-pencil based proof methods [18]. Such traditional techniques are usu-
ally very tedious and always have some risk of an erroneous analysis due to the
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 228–243, 2009.
© Springer-Verlag Berlin Heidelberg 2009
complex nature of present-day optical systems coupled with the human-error
factor. The advent of fast and inexpensive computational power in the last two
decades opened up avenues for using computers in the domain of optical system
analysis. Nowadays, computer based simulation approaches and computer
algebra systems are quite frequently used to validate the optical system analysis
results obtained earlier via paper-and-pencil proof methods. In computer simulation,
complex electromagnetic wave models can be constructed and then their
behaviors in an optical medium of known refractive index can be analyzed. However,
computer simulation cannot provide 100% precise results since the fundamental
idea in this approach is to approximately answer a query by analyzing a large
number of samples. Similarly, computer algebra systems, even though they are
considered to be semi-formal and are very efficient in mathematical computations,
also fail to guarantee correctness of results because they are constructed
using extremely complicated algorithms, which are quite likely to contain bugs.
Thus, these traditional techniques should not be relied upon for the analysis of
optical systems, especially when they are used in safety critical areas, such as
medicine, transportation and military, where inaccuracies in the analysis may
even result in the loss of human lives.
In the past couple of decades, formal methods have been successfully used for
the precise analysis of a variety of hardware and software systems. The rigorous
exercise of developing a mathematical model for the given system and analyzing
this model using mathematical reasoning usually increases the chances of
catching subtle but critical design errors that are often missed by traditional
techniques like simulation. Given the sophistication of present-day optical
systems and their extensive usage in safety critical applications, there is a dire
need for using formal methods in this domain. However, due to the continuous
nature of the analysis and the involvement of transcendental functions, automatic
state-based approaches, like model checking, cannot be used in this domain.
On the other hand, we believe that higher-order-logic theorem proving offers a
promising solution for conducting formal analysis of optical systems. The main
reason is the highly expressive nature of higher-order logic, which can be
leveraged to model essentially any system that can be expressed in a closed
mathematical form. In fact, most of the classical mathematical theories behind
elementary calculus, such as differentiation, limit, etc., and transcendental
functions, which are the most fundamental tools for analyzing optical systems, have
been formalized in higher-order logic [6]. However, to the best of our knowledge,
formal analysis of optical devices is a novelty that has not been presented in the
open literature so far using any technique, including theorem proving.
In this paper, as a first step towards using a higher-order-logic theorem prover
for analyzing optical systems, we present the formal analysis of planar optical
waveguides operating in the transverse electric (TE) mode, i.e., a mode in which the
electric field is transverse to the plane of incidence. A waveguide can be defined
as an optical structure that allows the confinement of electromagnetic light waves
within its boundaries by total internal reflection (TIR). It is considered to be one
of the most fundamental components of any optical system. Some of the optical
230 O. Hasan, S.K. Afshar, and S. Tahar
2 Related Work
is the first one of its kind. In this section, we present a brief overview of the
state-of-the-art informal techniques used for optical system analysis.
The most commonly used computer based techniques for optical system anal-
ysis are based on simulation and numerical methods. Some examples include
the analysis of integrated optical devices [20], optical switches [16] and biosen-
sors [23]. Optical systems are continuous systems and thus the first step in
their simulation based analysis is to construct a discrete model of the given sys-
tem [5]. Once the system is discretized, the electromagnetic wave equations are
solved by numerical methods. Finite difference methods are the most commonly
used numerical approaches applied on wave equations. Finite difference meth-
ods applied to the time domain discretized wave equations are referred to as
the Finite Difference Time Domain (FDTD) methods [21] and to the frequency
domain discretized wave equations as the Finite Difference Frequency Domain
(FDFD) methods [19]. Solving equations with numerical methods itself introduces
an additional form of error into the solutions of the problem. Besides inaccuracies,
another major disadvantage associated with numerical methods and simulation
based approaches is the tremendous amount of CPU time and memory
required for attaining reasonable analysis results [10]. In [9,13], the authors
discuss different methodologies for breaking the structure into smaller components
to improve the memory consumption and speed of the FDTD methods. Similarly,
some enhancements for the FDFD method are proposed in [22,12]. Despite
extensive effort on this subject and some improvements, the inherent nature
of numerical and simulation based methods prevents all these
efforts from bringing 100% accuracy to the analysis, which can be achieved by the
proposed higher-order-logic theorem proving based approach.
Computer algebra systems incorporate a wide variety of symbolic techniques
for the manipulation of calculus problems. Based on these capabilities, they have
also been tried in the area of optical system analysis. For example, the analysis
of planar waveguides using Mathematica [14], which is a widely used computer
algebra system, is presented in [3]. With the growing interest in optical system
analysis, a dedicated optical analysis package Optica [17] has been very recently
released for Mathematica. Optica performs symbolic modeling of optical systems,
diffraction, interference, and Gaussian beam propagation calculations and is gen-
eral enough to handle many complex optical systems in a semi-formal manner.
Computer algebra systems have also been found to be very useful for evaluating
eigenvalues for transcendental equations. This feature has been extensively used
along with the paper-and-pencil based analytical approaches. The idea here is to
verify the eigenvalue equation by hand and then feed that equation to a computer
algebra system to get the desired eigenvalues [18]. Despite all these advantages,
the analysis results from computer algebra systems cannot be termed as 100%
precise due to the many approximations and heuristics used for automation and
for reducing memory requirements. Another source of inaccuracy is the presence of
huge unverified symbolic manipulation algorithms in their core, which are quite
likely to contain bugs. The proposed theorem proving based approach overcomes
these limitations but at the cost of significant user interaction.
3 Planar Waveguides
The most important concept in optical waveguides is that of total internal reflec-
tion (TIR). When a wave crosses a boundary between materials with different
refractive indices, it is usually partially refracted at the boundary surface, and
partially reflected. TIR happens when there is no refraction. Since the objective
of waveguides is to guide waves with minimum loss, ideally we want to ensure TIR
for the waves that we want the waveguide to guide. TIR is ensured only when the
following two conditions are satisfied. Firstly, the refractive index of the
transmitting medium must be greater than that of its surroundings, n_medium > n_surrounding,
and secondly, the angle of incidence of the wave at the medium must be greater than
a particular angle, which is usually referred to as the critical angle. The value of
the critical angle also depends on the relative refractive index of the two materials
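The two TIR conditions can be made concrete with a small Python sketch. It uses the standard consequence of Snell's law that the critical angle equals sin⁻¹(n_surrounding / n_medium); the function names are hypothetical and angles are measured in radians from the normal to the boundary.

```python
import math


def critical_angle(n_medium: float, n_surrounding: float) -> float:
    """Critical angle in radians; defined only when n_medium > n_surrounding."""
    if n_medium <= n_surrounding:
        raise ValueError("TIR requires n_medium > n_surrounding")
    return math.asin(n_surrounding / n_medium)


def is_totally_reflected(theta: float, n_medium: float,
                         n_surrounding: float) -> bool:
    """Check both conditions: denser guiding medium and a steep enough angle."""
    if n_medium <= n_surrounding:
        return False
    return theta > critical_angle(n_medium, n_surrounding)
```

For a glass-like medium with index 1.5 surrounded by air with index 1.0, the critical angle is about 41.8 degrees.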
$$H_z = \frac{j}{\omega\mu_0}\,\frac{\partial E_y}{\partial x} \qquad (3)$$
where A, B, C, and D are amplitude coefficients, γc and γs are attenuation
coefficients of the cover and substrate, respectively, κf is the transverse component
of the wavevector k = 2π/λ in the guiding film, ω is the angular frequency of
light and μ is the permeability of the medium. Some of these parameters can be
further defined as follows:
$$\gamma_c = \sqrt{\beta^2 - k_0^2 n_c^2} \qquad (4)$$
$$\gamma_s = \sqrt{\beta^2 - k_0^2 n_s^2} \qquad (5)$$
$$\kappa_f = \sqrt{k_0^2 n_f^2 - \beta^2} \qquad (6)$$
where k0 is the vacuum wavevector, such that k0 = k/n with n being the refractive
index of the medium, and β and κ are the longitudinal and transverse components
of the wavevector k, respectively, inside the film, as depicted in Figure 2.
The angle θ is the required angle of incidence of the wave.
This completes the mathematical model of the light wave in a planar waveguide,
which leads us back to the original question of finding the angle of incidence
θ of the wave to ensure TIR. β is the most interesting quantity in this regard. It
summarizes two very important characteristics of a wave in a medium.
Firstly, because it is the longitudinal component of the wavevector, β contains
the information about the wavelength of the wave. Secondly, it contains the
propagation direction of the wave within the medium, which consequently gives
us the angle of incidence θ. Now, in order to ensure the second condition for TIR,
we need to find the corresponding values of β. These specific values of β are called
the eigenvalues of the waveguide since they contain all the information that is
required to describe the behavior of the wave and the waveguide.
The electric and magnetic field equations (2) and (3) can be utilized along
with their well-known continuity [18] to verify the following useful relationship,
which is usually termed the eigenvalue equation for β.
$$\tan(h\kappa_f) = \frac{\gamma_c + \gamma_s}{\kappa_f\left(1 - \dfrac{\gamma_c \gamma_s}{\kappa_f^2}\right)} \qquad (7)$$
An advantage of this relationship is that it contains β along with all the
physical characteristics of the planar waveguide, such as refractive indices and
height. Thus, it can be used to evaluate the value of β in terms of the planar
waveguide parameters. This way, we can tune these parameters in such a way
that an appropriate value of β is attained that satisfies the second condition for
TIR, i.e., sin⁻¹(λβ/(2π)) < critical angle. All the values of β that satisfy the above
conditions are usually termed the TE modes in the planar waveguide.
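To see how Equation (7) determines β from the waveguide parameters, the following Python sketch searches for a root numerically. This is the kind of approximate computation that the formal approach is meant to complement, so it is given only as an illustration. It relies on the tangent addition formula: with a = tan⁻¹(γc/κf) and b = tan⁻¹(γs/κf), the right-hand side of (7) equals tan(a + b), so any β with hκf = a + b + mπ for an integer m satisfies (7). The function names, the bisection strategy and the sample parameters are assumptions of this sketch.

```python
import math


def coefficients(beta, k0, n_c, n_s, n_f):
    """gamma_c, gamma_s and kappa_f as in Equations (4), (5) and (6)."""
    gamma_c = math.sqrt(beta ** 2 - (k0 * n_c) ** 2)
    gamma_s = math.sqrt(beta ** 2 - (k0 * n_s) ** 2)
    kappa_f = math.sqrt((k0 * n_f) ** 2 - beta ** 2)
    return gamma_c, gamma_s, kappa_f


def phase_residual(beta, k0, h, n_c, n_s, n_f, m=0):
    """h*kappa_f - atan(gamma_c/kappa_f) - atan(gamma_s/kappa_f) - m*pi."""
    gamma_c, gamma_s, kappa_f = coefficients(beta, k0, n_c, n_s, n_f)
    return (h * kappa_f - math.atan(gamma_c / kappa_f)
            - math.atan(gamma_s / kappa_f) - m * math.pi)


def solve_te_mode(k0, h, n_c, n_s, n_f, m=0, iterations=200):
    """Bisection on beta in (k0*max(n_c, n_s), k0*n_f); None if no such mode."""
    lo = k0 * max(n_c, n_s) * (1 + 1e-9)
    hi = k0 * n_f * (1 - 1e-9)
    # The residual decreases with beta and is negative near k0*n_f,
    # so a root exists exactly when it is positive at the lower end.
    if phase_residual(lo, k0, h, n_c, n_s, n_f, m) <= 0:
        return None
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if phase_residual(mid, k0, h, n_c, n_s, n_f, m) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, with a wavelength of 1 (so k0 = 2π), height h = 2 in the same length unit, and hypothetical indices n_c = 1.0, n_s = 1.45 and n_f = 1.5, `solve_te_mode(2 * math.pi, 2.0, 1.0, 1.45, 1.5)` returns a β between k0·n_s and k0·n_f, and both sides of Equation (7) agree at that value up to rounding.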
In this paper, we present the higher-order-logic formalization of the electric
and magnetic field equations for the planar waveguide, given in Equations (2)
and (3), respectively. Then, based on these formal definitions, we present the
formal verification of the eigenvalue equation, given in Equation (7). As outlined
above, it is one of the most important relationships used for the analysis
of planar waveguides, which makes its formal verification in a higher-order-logic
theorem prover a significant step towards using theorem provers for conducting formal
optical systems analysis.
Next, we formally verify that the derivative of the h_step function for all values of
its argument x, except 0, is equal to 0.
where the HOL function deriv represents the derivative function [6] that accepts
a real-valued function f and a differentiating variable x and returns df /dx. The
proof of the above theorem is based on the classical definitions of differentiation
and limit along with some simple arithmetic reasoning.
Now, the electric field of a planar waveguide, given in Equation (2), can be
expressed in higher-order logic as the following function.
The function H_field accepts the frequency omega and the permeability of the
medium mu besides the same parameters that have been used for defining the
electric field of the planar waveguide in Definition 2. For simplicity, we have removed the
imaginary unit from the original definition, given in Equation (3), in the above
definition, as our analysis is based on the amplitudes or absolute
values of electric and magnetic fields and thus requires the real portion of the
corresponding complex numbers only. However, if the need arises, the imaginary
part can be included in the analysis as well by utilizing the higher-order-logic
formalization of complex numbers [7].
Definitions 2 and 3 can now be used to formally verify a relation for the
magnetic field in a planar waveguide as follows:
This theorem can be verified by proving the derivatives of the three expressions
found in the definition of the electric field and the derivative of the Heaviside
step function, given in Theorem 2, along with basic differentiation properties of
a product and sum of functions, formally verified in [6].
The abs function is the HOL function for the absolute value of a real number.
According to the above definition, the limit of a real valued function f (x), as x
tends to x0 from the right is y0, if for all strictly positive values e, there exists a
number d such that for all x satisfying x0 < x < x0 + d, we have |f (x) − y0| < e.
Similarly, the left hand limit can be formalized as follows:
Definition 5: Limit from the Left
∀ f y0 x0. left lim f y0 x0 =
∀ e. 0 < e ⇒ ∃d. 0 < d ∧
∀x. -d < x - x0 ∧ x - x0 < 0 ⇒ abs(f x - y0) < e
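The one-sided limit conditions can be illustrated numerically. The sketch below is not part of the HOL development: it samples points in the open interval required by each definition and checks |f x − y0| < e there, so a True result is evidence rather than a proof. The function h_step and its value 0.5 at x = 0 are assumptions made for this illustration.

```python
def h_step(x: float) -> float:
    """Heaviside-style step; the value at 0 is an assumption of this sketch."""
    if x > 0:
        return 1.0
    if x < 0:
        return 0.0
    return 0.5


def holds_right(f, y0, x0, e, d, samples=1000):
    """Check |f x - y0| < e on sampled x with x0 < x < x0 + d."""
    xs = (x0 + d * i / (samples + 1) for i in range(1, samples + 1))
    return all(abs(f(x) - y0) < e for x in xs)


def holds_left(f, y0, x0, e, d, samples=1000):
    """Check |f x - y0| < e on sampled x with -d < x - x0 < 0."""
    xs = (x0 - d * i / (samples + 1) for i in range(1, samples + 1))
    return all(abs(f(x) - y0) < e for x in xs)
```

For the step function, the right-hand candidate 1 and the left-hand candidate 0 pass at x0 = 0 for any positive e, while the candidate 0 fails from the right once e ≤ 1.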
If the normal limit of a function exists at a point and is equal to y0 then both
the right and left limits for that function are also well-defined for the same point
and are both equal to y0. This is an important result for our analysis and thus
we formally verify it in the HOL theorem prover as the following theorem.
Theorem 4: Limit Implies Limit from the Right and Left
∀ f y0 x0. (f→y0)x0 ⇒ right lim f y0 x0 ∧ left lim f y0 x0
The assumption of the above theorem (f → y0)x0 represents the formalization of
the normal limit of a function [6] and is True only if the function f approaches
y0 at point x = x0. The proof of Theorem 4 is basically a re-writing of the
definitions involved along with the properties of the absolute function. We also
verified the uniqueness of both right and left hand limits as follows.
Theorem 5: Limit from the Right is Unique
∀f y1 y2 x0. right lim f y1 x0 ∧ right lim f y2 x0 ⇒(y1=y2)
Theorem 6: Limit from the Left is Unique
∀f y1 y2 x0. left lim f y1 x0 ∧ left lim f y2 x0 ⇒(y1=y2)
The proof of Theorem 5 is by contradiction, as it is not possible that a real-valued
function gets as near as possible to two unequal points in its range for the same
argument. We proceed with the proof by first assuming that ¬(y1 = y2) and
then rewriting the statement of Theorem 5 with the definition of the function
right lim. Next, the two assumptions are specialized for the case e = |y1 − y2|/2. Now,
the same x is chosen for both the assumptions in such a way that the conditions
on x, i.e., x0 < x < x0 + d, for both of the assumptions are satisfied. One
such x is (min d1 d2)/2 + x0, where d1 and d2 are the values of d for the two assumptions,
respectively, and the function min returns the minimum value out of its two real
number arguments. Thus, for such an x, the two given assumptions imply that
|f x − y1| < |y1 − y2|/2 and |f x − y2| < |y1 − y2|/2, which leads to a contradiction in
both of the cases when y1 < y2 and y2 < y1. Hence, our assumption ¬(y1 =
y2) cannot be True and y1 must be equal to y2, which concludes the proof of
Theorem 5. Theorem 6 is also verified using similar reasoning.
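The contradiction can also be summarized in one chain of inequalities. The derivation below is a paper-and-pencil sketch that uses the triangle inequality instead of the case split on y1 < y2 and y2 < y1; it assumes the same choice of e and x as in the proof, and it is not taken from the HOL proof script.

```latex
% Assumes y_1 \neq y_2, e = |y_1 - y_2|/2 > 0, and x chosen so that both
% right-limit assumptions apply at x.
\begin{align*}
|y_1 - y_2| &= |(f\,x - y_2) - (f\,x - y_1)| \\
            &\le |f\,x - y_1| + |f\,x - y_2| \\
            &<  \frac{|y_1 - y_2|}{2} + \frac{|y_1 - y_2|}{2} = |y_1 - y_2| ,
\end{align*}
% which is impossible, so y_1 = y_2.
```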
The above infrastructure can now be utilized to formally verify the mathemat-
ical relationships between the amplitude coefficients. The relationship between
the amplitude coefficients B and A can be formally stated as follows:
Theorem 7: B = A
∀ A B C D n c n s n f k 0 b h x. 0 < h ∧
(∀x. (λx. E field A B C D n c n s n f k 0 b h x) contl x)
⇒ (B = A)
The first assumption ensures that h is always greater than 0 and is valid since h
represents the height of the waveguide. The HOL predicate (f contl
x) [6], used in the above theorem, represents the relational form of the definition of a
continuous function; it is True when the limit of the real-valued function f
at the point x exists and is equal to f (x). Since the corresponding
assumption in the above theorem is quantified over all x, it ensures that the function E field
is continuous on the x-axis, so its limit at the boundary points x = 0 and
x = −h is equal to the value of the function E field at x = 0 and x = −h.
In order to verify Theorem 7, consider the boundary point x = 0, for which
the value of the function E field becomes (A + B)/2, according to Definition 2. Now,
based on Theorem 4, the limit from the right at x = 0 for the function E field
is also going to be (A + B)/2. Next, we verified, using Definition 4 along with the
properties of the exponential function [6], that the limit from the right for the
function E field at point x = 0 is in fact equal to A. The uniqueness of the
right limit, verified in Theorem 5, can now be used to verify that A
must be equal to (A + B)/2, as they both represent the limit from the right for the
same function at the same point. This result can be easily used to discharge our
proof goal B = A, which concludes the proof for Theorem 7.
Next, we apply similar reasoning as above with the magnetic field relation
for the planar waveguide, verified in Theorem 2, at point x = 0 to verify the
following relationship between the amplitude coefficients C and A.
Theorem 8: C = −A γc/κf
∀ omega mu A B C D n c n s n f k 0 b h x. (0 < h) ∧ (0 < mu) ∧
(0 < omega) ∧ (b < k 0 n f) ∧ (k 0 n s < b) ∧ (0 < n s) ∧ (0 < k 0) ∧
(∀x. (λx. H field omega mu A B C D n c n s n f k 0 b h x) contl x)
⇒ (C = −A (gamma b k 0 n c) / (kappa b k 0 n f))
The additional assumptions, besides 0 < h, used in the above theorem ensure
that the values of the functions gamma and kappa are positive real numbers and
do not attain an imaginary value, according to their definitions
given in Section 3. Again, based on the assumed continuity of the magnetic field H field,
we know that its limit at point 0 is equal to the value of H field at
x = 0, say H0. It is important to note that the value of H0 cannot be obtained
from the expression for the H field, given in Theorem 2. Therefore, we cannot
reason about its precise value but, based on the continuity of H field, we do know
that it exists. This implies that the limits from the right and left for this function
are also equal to H0, according to Theorem 4. Next, we verified that the limits
from the right and left for the magnetic field function H field, given in Theorem 2,
at point x = 0 are −A (gamma b k 0 n c)/(om mu) and C (kappa b k 0 n f)/(om mu), using Definitions 4
and 5, respectively. This leads to the verification of Theorem 8, since we already
know that these two limit values are equal to H0, using the uniqueness of limits
from right and left, verified in Theorems 5 and 6.
Now, using similar reasoning as above and applying continuity of E field
and H field at x = −h, we verified the following two relations to express the
amplitude coefficient D in terms of the amplitude coefficients B and C.
Due to the inherent soundness of the theorem proving approach, our verifica-
tion results exactly matched the paper-and-pencil analysis counterparts for the
eigenvalue equation, as conducted in [18], and thus can be termed as 100%
precise. Interestingly, the assumption ¬((kappa b k 0 n f)^2 = (gamma b k 0
n c) (gamma b k 0 n s)), without which the eigenvalues are undefined, was
found to be missing in [18]. This fact clearly demonstrates the strength of
formal-methods-based analysis, as it allowed us to highlight this corner case, which,
if ignored, could lead to the invalidation of the whole eigenvalue analysis.
The verification results, given in this section, heavily relied upon real analysis
and thus the useful theorems available in the HOL real analysis theories [6]
proved to be a great asset in this exercise. The verification task took around
2500 lines of HOL code and approximately 100 man-hours.
All the quantities in the conclusion of the above theorem are known except k f,
since k 0 can be expressed in terms of the wavelength that is used to excite
the waveguide, as outlined in Section 3. However, a closed-form solution
for k f cannot be obtained from the above equation. Therefore, we propose to use a
computer algebra system to solve for the value of k f. Using Mathematica, the
first four eigenvalues of k f were found to be 5497.16, 10963.2, 16351 and 21545
cm⁻¹. These values can then be used to calculate the desired eigenvalues for b
according to the relationship b = √((k 0)^2 (n f)^2 − (k f)^2), and were
found to be 94087, 93608, 92819 and 91752 cm⁻¹.
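The reported numbers can be cross-checked against each other. Reading the relationship as b = √((k 0 n f)² − (k f)²), each reported pair (k f, b) determines the product k 0 · n f, and the four pairs should agree on it. The sketch below performs that check; it uses only the values quoted in the text, and the function names are hypothetical.

```python
import math

# Values reported in the text, in cm^-1.
K_F = [5497.16, 10963.2, 16351.0, 21545.0]
B = [94087.0, 93608.0, 92819.0, 91752.0]


def implied_k0_nf(k_f: float, b: float) -> float:
    """Invert b = sqrt((k_0 n_f)^2 - k_f^2) to recover the product k_0 * n_f."""
    return math.sqrt(b * b + k_f * k_f)


def propagation_constant(k0_nf: float, k_f: float) -> float:
    """b = sqrt((k_0 n_f)^2 - k_f^2), the relationship read from the text."""
    return math.sqrt(k0_nf * k0_nf - k_f * k_f)


# One implied product per reported eigenvalue pair.
products = [implied_k0_nf(k, b) for k, b in zip(K_F, B)]
```

The four implied products differ by less than 1 cm⁻¹, which is the agreement expected from eigenvalues rounded to about five significant digits.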
Conceptually, the above analysis can be divided into two parts. The first
part covers the analysis starting from the electromagnetic wave equations, with
the given parameters, up to the point where we obtain the alternate form of
eigenvalue equation, given in Theorem 12. The second part is concerned with
the actual computation of eigenvalues from Theorem 12. The first part of the
above analysis was completely formal and thus 100% precise, since it was done
using the HOL theorem prover. The proof script for this theorem was less than
100 lines long, which clearly demonstrates the effectiveness of our work, as it was
mainly due to the availability of Theorem 11 that we were able to tackle this
kind of a verification problem with such minimal effort. The second part of the
analysis cannot be handled in HOL, because of the involvement of a transcen-
dental equation for which a closed form solution for k f does not exist. For this
part, we utilized Mathematica and obtained the desired eigenvalues. To the best
of our knowledge, no other approach based on simulation, numerical methods
or computer algebra systems can provide 100% precision and soundness in the
results like the proposed approach for the first part of the analysis. In the
second part, we have used a computer algebra system, which is the best
option available, in terms of precision, for this kind of analysis. Other approaches
used for the second part include graphical or numerical methods, which cannot
compete with computer algebra systems in precision. Thus, as far as the whole
analysis is concerned, the proposed method offers the most precise solution.
7 Conclusions
This paper presents the formal analysis of planar optical waveguides using a
higher-order-logic theorem prover. Planar optical waveguides are simple, yet
widely used optical structures and not only find their applications in wave guid-
ing, but also in coupling, switching, splitting, multiplexing and de-multiplexing
of optical signals. Hence, their formal analysis paves the way to the formal analy-
sis of many other optical systems as well. Since the analysis is done in a theorem
prover, the results can be termed as 100% precise, which is a novelty that cannot
be achieved by any other computer based optical analysis framework.
We mainly present the formalization of the electromagnetic field equations
for a planar waveguide in the TE mode. These definitions are then utilized to
formally reason about the eigenvalue equation, which plays a vital role in the
design of planar waveguides for various engineering and other scientific domains.
To illustrate the effectiveness and utilization of the formally verified eigenvalue
equation, we used it to reason about the eigenvalues of a planar asymmetric waveg-
uide. To the best of our knowledge, this is the first time that a formal approach
has been proposed for the analysis of optical systems.
The successful handling of the planar waveguide analysis clearly demonstrates
the effectiveness and applicability of higher-order-logic theorem proving for an-
alyzing optical systems. Some of the interesting future directions in this novel
domain include the verification of the eigenvalue equation for the planar waveg-
uide in the TM mode, which is very similar to the analysis presented in this
paper, and the analysis of couplers that represent two or more optical devices
linked together with an optical coupling relation, which can be done by building
on top of the results presented in this paper along with formalizing the coupled-
mode theory [8] in higher-order logic. Besides these, many safety-critical pla-
nar waveguide applications, including biosensors [23] and medical imaging [15],
can be formally analyzed by building on top of our results.
References
1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions with Formu-
las, Graphs, and Mathematical Tables. Dover, New York (1972)
2. Anderson, J.A.: Real Analysis. Gordon and Breach Science Publishers, Reading
(1969)
3. Costa, J., Pereira, D., Giarola, A.J.: Analysis of Optical Waveguides using Math-
ematica. In: Microwave and Optoelectronics Conference, pp. 91–95 (1997)
4. Gordon, M.J.C., Melham, T.F.: Introduction to HOL: A Theorem Proving Envi-
ronment for Higher-Order Logic. Cambridge University Press, Cambridge (1993)
5. Hafner, C.: The Generalized Multipole Technique for Computational Electromag-
netics. Artech House, Boston (1990)
6. Harrison, J.: Theorem Proving with the Real Numbers. Springer, Heidelberg (1998)
7. Harrison, J.: Formalizing Basic Complex Analysis. In: From Insight to Proof:
Festschrift in Honour of Andrzej Trybulec. Studies in Logic, Grammar and
Rhetoric, vol. 10, pp. 151–165. University of Bialystok (2007)
8. Haus, H., Huang, W., Kawakami, S., Whitaker, N.: Coupled-mode Theory of Op-
tical Waveguides. Lightwave Technology 5(1), 16–23 (1987)
9. Hayes, P.R., O’Keefe, M.T., Woodward, P.R., Gopinath, A.: Higher-order-compact
Time Domain Numerical Simulation of Optical Waveguides. Optical and Quantum
Electronics 31(9-10), 813–826 (1999)
10. Heinbockel, J.H.: Numerical Methods For Scientific Computing. Trafford (2004)
11. Jackson, J.D.: Classical Electrodynamics. John Wiley & Sons, Inc., Chichester
(1998)
12. Johnson, S.G., Joannopoulos, J.D.: Block-iterative Frequency Domain Methods for
Maxwell’s Equations in a Planewave Basis. Optics Express 8(3), 173–190 (2001)
13. Liu, Y., Sarris, C.D.: Fast Time-Domain Simulation of Optical Waveguide Struc-
tures with a Multilevel Dynamically Adaptive Mesh Refinement FDTD Approach.
Journal of Lightwave Technology 24(8), 3235–3247 (2006)
14. Mathematica (2009), http://www.wolfram.com
15. Moore, E.D., Sullivan, A.C., McLeod, R.: Three-dimensional Waveguide Arrays
via Projection Lithography into a Moving Photopolymer. Organic 3D Photonics
Materials and Devices II 7053, 309–316 (2008)
16. Ntogari, G., Tsipouridou, D., Kriezis, E.E.: A Numerical Study of Optical Switches
and Modulators based on Ferroelectric Liquid Crystals. Journal of Optics A: Pure
and Applied Optics 7(1), 82–87 (2005)
17. Optica (2009), http://www.opticasoftware.com/
18. Pollock, C.R.: Fundamentals of Optoelectronics. Tom Casson (1995)
19. Rumpf, R.C.: Design and Optimization of Nano-Optical Elements by Coupling
Fabrication to Optical Behavior. PhD thesis, University of Central Florida, Or-
lando, Florida (2006)
20. Schmidt, F., Zschiedrich, L.: Adaptive Numerical Methods for Problems of Inte-
grated Optics. In: Integrated Optics: Devices, Materials, and Technologies VII,
vol. 4987, pp. 83–94 (2003)
21. Yee, K.: Numerical Solution of Initial Boundary Value Problems involving Maxwell's
Equations in Isotropic Media. IEEE Transactions on Antennas and Propaga-
tion 14(3), 302–307 (1966)
22. Yin, L., Hong, W.: Domain Decomposition Method: A Direct Solution of Maxwell
Equations. In: Antennas and Propagation, pp. 1290–1293 (1999)
23. Zhian, L., Wang, Y., Allbritton, N., Li, G.P., Bachman, M.: Labelfree Biosensor
by Protein Grating Coupler on Planar Optical Waveguides. Optics Letters 33(15),
1735–1737 (2008)
The HOL-Omega Logic
Peter V. Homeier
U. S. Department of Defense
[email protected]
http://www.trustworthytools.com
Abstract. A new logic is posited for the widely used HOL theorem
prover, as an extension of the existing higher order logic of the HOL4
system. The logic is extended to three levels, adding kinds to the existing
levels of types and terms. New types include type operator variables and
universal types as in System F. Impredicativity is avoided through the
stratification of types by ranks according to the depth of universal types.
The new system, called HOL-Omega or HOLω, is a merging of HOL4,
HOL2P [11], and major aspects of System Fω from chapter 30 of [10].
This document presents the abstract syntax and semantics for the kinds,
types, and terms of the logic, as well as the new fundamental axioms
and rules of inference. As the new logic is constructed according to the
design principles of the LCF approach, the soundness of the entire system
depends critically and solely on the soundness of this core.
1 Introduction
The HOL theorem prover [3] has had a wide influence in the field of mechanical
theorem proving. Despite appearing in 1988 as one of the first tools in the field,
HOL has enjoyed wide acceptance around the world, and continues to be used
for many substantial projects, for example Anthony Fox’s model of the ARM
processor. HOL’s influence is seen in that three other major theorem provers,
HOL Light, ProofPower, and Isabelle/HOL, have used essentially the same logic.
One of the main reasons for HOL’s influence has been that the actual logic
implemented in the tool, higher order logic based on Church’s simple theory of
types, turns out to be both easy to work with and expressive enough to be able
to support most models of hardware and software that people have wished to
investigate. There are theorem provers with more powerful logics, and ones with
less powerful logics, but it seems that classical higher order logic fortuitously
found a “sweet-spot,” balancing strong expressivity with nimble ease of use.
However, despite HOL’s value, it has been recognized that there are some
useful concepts beyond the power of higher order logic to state. An example is
the practical device of monads. Monads are particularly useful in modelling, for
example, realistic computations involving state or exceptions, as a shallow em-
bedding in a logic which itself is strictly functional, without state or exceptions.
Individual monads can and have been expressed in HOL, and used to reduce
the complexity of proofs about such real-world computations.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 244–259, 2009.
The HOL-Omega Logic 245
However, stating the general properties of all monads, and proving results
about the class of all monads, has not been possible. The following shows why.
Let M be a postfix unary type operator that maps a type α to a type α M ,
unit a prefix unary term operator of type α → α M , and #= an infix binary term
operator of type α M → (α → β M ) → β M , where k a #= h is (k a) #= h.
Then M together with unit and #= is a monad iff the following properties hold:
left unit: unit a #= k = k a
right unit: m #= unit = m
associativity: m #= (λa. k a #= h) = (m #= k) #= h
There are two problems with this definition in higher order logic. First, while
higher order logic includes type operator constants like list and option, it does
not support type operator variables like M above.
But even if it did, consider the associativity property above. There are four
occurrences of #= in that property. Among these four instances are three dis-
tinct types. Unfortunately, in higher order logic, within a single expression a
variable may only have a single type. So this property would not type-check.
This is annoying because if #= were a constant instead of a variable, these
different instances of its basic type would be supported. What we need is a way
to give #= a single type which can then be specialized for each of #=’s four
instances to produce the three distinct types required.
One way is to introduce universal types, as in System F [10]. A universal type
is written ∀α.σ, where α is a type variable and σ is a type expression, possibly
including α. Such occurrences of α are bound by the universal quantification.
In addition, System F introduces abstractions of types over terms, written as
λ:α.t, where α is a type variable and t is a term. This yields a term, whose type
is a universal type. Specifically, if t has type σ, then λ:α.t has type ∀α.σ.
Given such an abstraction t, it is specialized for a particular type by t[:σ:].
This gives rise to a new form of beta-reduction on term-type applications, where
(λ:α.t)[:σ:] reduces to t[σ/α]. For convenience, we write t[:α, β:] for (t[:α:])[:β:].
Given these new forms, we can express the types of unit and #= as
unit : ∀α. α → α M
#= : ∀α β. α M → (α → β M ) → β M
and the three monad properties as
unit [:α:] a (#=[:α, β:]) k = k a
m (#=[:α, α:]) (unit[:α:]) = m
m (#=[:α, γ:]) (λa. k a (#=[:β, γ:]) h) = (m (#=[:α, β:]) k) (#=[:β, γ:]) h
What we have done here is take manual control of the typing. Since the normal
HOL parametric polymorphism was inadequate, we have added facilities for type
abstraction and instantiation of terms. This allows the single type of a variable
to be specialized for different occurrences within the same expression.
Given the existing polymorphism in HOL, in practice universal types are
needed only rarely; but when they are needed, they are absolutely essential.
246 P.V. Homeier
For reasons of space, we assume the reader is familiar with the types, terms,
axioms, and rules of inference of the HOL logic, as described in [3,4,5,6]. This
section presents the abstract syntax of the new HOL-Omega logic.
In HOL-Omega, the syntax consists of ranks, kinds, types, and terms.
2.1 Ranks
Ranks are natural numbers indicating the depth of universal type quantification
present or permitted in a type. We use the variable r to range over ranks.
2.2 Kinds
HOL-Omega introduces kinds as a new level in the logic, not present in HOL.
Kinds control the proper formation of types just as types do for terms.
There are three varieties of kinds, namely the base kind (the kind of proper
types), kind variables, and arrow kinds (the kinds of type operators).
2.3 Types
Replacing HOL’s two varieties of types, HOL-Omega has five: type variables,
type constants, type applications, type abstractions, and universal types.
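Ranking:
\[
\frac{}{\alpha :\le \text{rank of } \alpha}
\qquad
\frac{\sigma_{opr} :\le r_2,\quad \sigma_{arg} :\le r_1}{\sigma_{arg}\,\sigma_{opr} :\le \max(r_1, r_2)}
\qquad
\frac{\alpha :\le r_1,\quad \sigma :\le r_2}{\lambda\alpha.\sigma :\le \max(r_1, r_2)}
\]
\[
\frac{}{\tau :\le \text{rank of } \tau}
\qquad
\frac{\sigma :\le r,\quad r \le r'}{\sigma :\le r'}
\qquad
\frac{\alpha :\le r_1,\quad \sigma :\le r_2}{\forall\alpha.\sigma :\le \max(r_1 + 1, r_2)}
\]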
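The ranking rules can be read as a recursive computation of the least rank of a type. The sketch below is an illustration only and not the HOL-Omega implementation: the class names are hypothetical, kinds are ignored, and the ranks of variables and constants are taken as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TyVar:            # type variable with its declared rank
    name: str
    rank: int = 0


@dataclass(frozen=True)
class TyConst:          # type constant with its declared rank
    name: str
    rank: int = 0


@dataclass(frozen=True)
class TyApp:            # postfix application: arg opr
    arg: object
    opr: object


@dataclass(frozen=True)
class TyAbs:            # type abstraction: λvar. body
    var: TyVar
    body: object


@dataclass(frozen=True)
class TyAll:            # universal type: ∀var. body
    var: TyVar
    body: object


def rank(ty) -> int:
    """Least r such that ty :≤ r under the ranking rules."""
    if isinstance(ty, (TyVar, TyConst)):
        return ty.rank
    if isinstance(ty, TyApp):
        return max(rank(ty.arg), rank(ty.opr))
    if isinstance(ty, TyAbs):
        return max(ty.var.rank, rank(ty.body))
    if isinstance(ty, TyAll):
        return max(ty.var.rank + 1, rank(ty.body))
    raise TypeError(f"not a type: {ty!r}")


def fun(dom, rng):
    """dom → rng, as the curried constant fun applied to dom and then rng."""
    return TyApp(rng, TyApp(dom, TyConst("fun")))
```

With α and M of rank 0, the type ∀α. α → α M is built as `TyAll(a, fun(a, TyApp(a, M)))` and gets rank 1, because the quantifier adds one to the rank of its bound variable.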
Existing types of HOL are fully supported in HOL-Omega. HOL type variables
are represented as HOL-Omega type variables of kind ty and rank 0. HOL type
applications of a type constant to a list of type arguments are represented in
HOL-Omega as a curried type constant applied to the arguments in sequence,
as (α1 , ..., αn )τ = αn (... (α1 τ )...).
We write σ : k :≤ r to say that type σ has kind k and rank r.
Proper types are types of kind ty; only these types can be the type of a term.
In a type application of a type operator to an argument, the operator must
have an arrow kind, and the domain of the kind of the operator must equal the
kind of the argument. If so, the kind of the result of the type application will be
the range of the kind of the operator. Also, the body of a universal type must
have the base kind. These restrictions ensure types are well-kinded.
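The kinding restrictions in the preceding paragraph can be sketched as a small checker. This is an illustration with hypothetical names, not HOL-Omega code. Two points are assumptions of the sketch rather than statements from the text: a type abstraction λα.σ is given the arrow kind from the kind of α to the kind of σ, and a universal type is given the base kind. Kind variables and ranks are omitted.

```python
from dataclasses import dataclass

TY = "ty"                       # the base kind


@dataclass(frozen=True)
class Arrow:                    # arrow kind k1 ⇒ k2
    dom: object
    rng: object


@dataclass(frozen=True)
class Var:                      # type variable or constant with a declared kind
    name: str
    kind: object = TY


@dataclass(frozen=True)
class App:                      # postfix application: arg opr
    arg: object
    opr: object


@dataclass(frozen=True)
class Abs:                      # λvar. body
    var: Var
    body: object


@dataclass(frozen=True)
class All:                      # ∀var. body
    var: Var
    body: object


class KindError(Exception):
    """Raised when a type is not well-kinded."""


def kind_of(ty):
    if isinstance(ty, Var):
        return ty.kind
    if isinstance(ty, App):
        k_opr, k_arg = kind_of(ty.opr), kind_of(ty.arg)
        if not isinstance(k_opr, Arrow):
            raise KindError("operator must have an arrow kind")
        if k_opr.dom != k_arg:
            raise KindError("operator domain must equal the argument kind")
        return k_opr.rng
    if isinstance(ty, Abs):
        return Arrow(ty.var.kind, kind_of(ty.body))   # assumption of the sketch
    if isinstance(ty, All):
        if kind_of(ty.body) != TY:
            raise KindError("body of a universal type must have the base kind")
        return TY                                      # assumption of the sketch
    raise KindError(f"not a type: {ty!r}")
```

For example, with `list_ = Var("list", Arrow(TY, TY))` and `a = Var("a")`, the application `App(a, list_)` (written α list) has kind ty, while `App(list_, a)` is rejected because α does not have an arrow kind.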
In both universal types and type abstractions, the type variable is bound
over the type body. This binding structure introduces the notions of alpha and
beta equivalence, as direct analogs of the corresponding notions for terms. In
fact, types are identified up to alpha-beta equivalence. The following denote the
same type: λα.α, λβ.β, λβ.β(λα.α), γ(λα.λβ.β). Beta reduction is of the form
σ2 (λα.σ1 ) = σ1 [σ2 /α], where σ1 [σ2 /α] is the result of substituting σ2 for all free
occurrences of α in σ1 , with bound type variables in σ1 renamed as necessary.
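The identification of types up to alpha-beta equivalence can be sketched with a small normalizer. This is an illustrative sketch, not the HOL-Omega implementation: types are encoded as tuples, application is postfix as in the text, and fresh names of the form `_v0`, `_v1`, ... are assumed not to clash with user-chosen names. Normalization is only expected to terminate on well-kinded types.

```python
import itertools

# Types: ("var", name) | ("app", arg, opr) | ("lam", name, body)
_fresh = (f"_v{i}" for i in itertools.count())


def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}


def subst(t, name, s):
    """t[s/name], renaming bound variables of t as necessary."""
    if t[0] == "var":
        return s if t[1] == name else t
    if t[0] == "app":
        return ("app", subst(t[1], name, s), subst(t[2], name, s))
    _, v, body = t
    if v == name:
        return t
    if v in free_vars(s):                    # avoid capturing a free variable of s
        w = next(_fresh)
        body, v = subst(body, v, ("var", w)), w
    return ("lam", v, subst(body, name, s))


def normalize(t):
    """Beta-normal form, using sigma2 (lam a. sigma1) = sigma1[sigma2/a]."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return ("lam", t[1], normalize(t[2]))
    arg, opr = normalize(t[1]), normalize(t[2])
    if opr[0] == "lam":
        return normalize(subst(opr[2], opr[1], arg))
    return ("app", arg, opr)


def alpha_eq(a, b, env_a=(), env_b=()):
    """Equality up to renaming of bound variables (innermost binder first)."""
    if a[0] != b[0]:
        return False
    if a[0] == "var":
        ia = env_a.index(a[1]) if a[1] in env_a else None
        ib = env_b.index(b[1]) if b[1] in env_b else None
        return ia == ib and (ia is not None or a[1] == b[1])
    if a[0] == "app":
        return (alpha_eq(a[1], b[1], env_a, env_b)
                and alpha_eq(a[2], b[2], env_a, env_b))
    return alpha_eq(a[2], b[2], (a[1],) + env_a, (b[1],) + env_b)


def same_type(a, b):
    return alpha_eq(normalize(a), normalize(b))
```

Under this encoding the four expressions listed above, λα.α, λβ.β, λβ.β(λα.α) and γ(λα.λβ.β), all normalize to the same type up to renaming.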
A type σ′ is an instance of σ if σ′ = σ[θr][θk][θσ] for some rank, kind, and type
substitutions θr ∈ N, θk mapping kind variables to kinds, and θσ mapping type
variables to types. The substitutions are applied in sequence, with θr first.
When matching two types, the matching is higher order, so the pattern
α → α μ (where μ : ty ⇒ ty) matches β → β, yielding [α ↦ β, μ ↦ λα.α].
The primeval environment contains the type constants bool, ind, and fun as
in HOL, where bool : ty, ind : ty, and fun : ty ⇒ ty ⇒ ty, and all three have
rank 0. fun is usually written as the binary infix type operator →, and for a
function type σ1 → σ2 , we say that the domain is σ1 and the range is σ2 . Also,
for a universal type ∀α.σ, we say that the domain is α and the range is σ.
2.4 Terms
HOL-Omega adds to the existing four varieties of terms two new varieties,
namely term-type applications and type-term abstractions. We use x to range
over term variables, c over term constants, and t over terms.
Arrow. Given $T_i$ and $U_i$, we can construct $T_{i+1}$ by iteration over the ordi-
nals [9]. Let $S_0 = T_i$. For all ordinals $\alpha$, let $S_{\alpha+1}$ be the closure under Sub and
Fun of
\[ S_\alpha \cup \{\, \textstyle\prod_{X \in K} f\,X \mid K \in U_i \wedge f : K \to S_\alpha \,\}. \]
For limit ordinals $\lambda$, let $S_\lambda = \bigcup_{\alpha<\lambda} S_\alpha$, which is closed under Sub and Fun.
Let $n = |\bigcup_{K \in U_i} K|$. Then $|K| \le n$ for all $K \in U_i$. Let $m = n^{+}$, the least
cardinal $> n$. Then $m$ is a regular cardinal [9, p. 146] $> |K|$ for all $K \in U_i$. Then
we define $T_{i+1} = S_m$, which is sufficiently large by the following theorem.
Theorem 1. Sm is closed under Univ (as well as Sub and Fun).
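Proof. Suppose $K \in U_i$ and $f : K \to S_m$. $S_m = \bigcup_{\alpha<m} S_\alpha$, so for each $X \in K$
define $\gamma_X$ = the smallest $\alpha$ s.t. $f\,X \in S_\alpha$, thus $\gamma_X < m$. Define $\Gamma = \{\gamma_X \mid X \in K\}$.
Then $\Gamma \subseteq m$, and $|K| < m$ so $|\Gamma| < m$, thus $\bigcup\Gamma < m$ since $m$ is regular. The
image of $f \subseteq S_{\bigcup\Gamma}$, so by the definition of $S_{\alpha+1}$, $\prod_{X \in K} f\,X \in S_{(\bigcup\Gamma)+1} \subseteq S_m$.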
If T ∈ T |r, then T ↓r = T
If T ∈ (K1 → K2 )|r, then T ↓r = {(x↓r, y↓r) | (x, y) ∈ T ∧ x ∈ K1 |r }
If K = K1 → K2 , by the definition of T ↓r, T ↓r ⊆ K1 ⇓r×K2 ⇓r, and by T ∈ K|r,
T ↓r is a function, so T ↓r ∈ K1 ⇓r → K2 ⇓r = (K1 → K2 )⇓r = K⇓r.
We can define ⇑r : Ur →U and ↑r : K⇓r→K|r as the inverses of ⇓r and ↓r,
so that (K⇑r)⇓r = K for all K ∈ Ur and (T ↑r)↓r = T for all T ∈ K ∈ Ur .
Tr ⇑r = T
(K1 → K2 )⇑r = K1 ⇑r → K2 ⇑r
If T ∈ T ⇓r, then T ↑r = T
If T ∈ (K1 → K2 )⇓r, then T ↑r = λ(x ∈ K1 ). if x ∈ K1 |r then (T (x↓r))↑r
else chtype(K2 , r)
where chtype(K, r) = (chtyr (K⇓r))↑r
[[ty]]ξ = T
[[κ]]ξ = ξ κ
[[k1 ⇒ k2 ]]ξ = [[k1 ]]ξ → [[k2 ]]ξ
[[bool]]ζ,ξ,ρ = B
[[ind]]ζ,ξ,ρ = I
[[ σ1 → σ2 ]]ζ,ξ,ρ = [[σ1 ]]ζ,ξ,ρ → [[σ2 ]]ζ,ξ,ρ
[[τ ]]ζ,ξ,ρ = M (ζ, ξ) τ
[[α]]ζ,ξ,ρ = ρ (ζ, ξ) α
[[ σarg σopr ]]ζ,ξ,ρ = [[σopr ]]ζ,ξ,ρ [[σarg ]]ζ,ξ,ρ
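\[
[\![\lambda(\alpha : k :\le r).\,\sigma]\!]_{\zeta,\xi,\rho} = \lambda T \in [\![k]\!]_\xi.\
\begin{cases}
[\![\sigma]\!]_{\zeta,\xi,\rho[\alpha \mapsto T]} & \text{if } T \in [\![k]\!]_\xi \,|\, [\![r]\!]_\zeta \\
\mathrm{chtype}([\![k_\sigma]\!]_\xi, [\![r_\sigma]\!]_\zeta) & \text{otherwise}
\end{cases}
\]
\[
[\![\forall(\alpha : k :\le r).\,\sigma]\!]_{\zeta,\xi,\rho} = \prod_{T \in [\![k]\!]_\xi \Downarrow [\![r]\!]_\zeta} [\![\sigma]\!]_{\zeta,\xi,\rho[\alpha \mapsto T\uparrow[\![r]\!]_\zeta]}
\]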
where for [[ λ(α : k :≤ r). σ ]]ζ,ξ,ρ , if T has rank larger than the variable α, an
arbitrary type of the kind kσ and rank rσ of σ is returned, essentially as an error.
By induction over the structure of types, it can be demonstrated that the
semantics of types is consistent with the semantics of kinds and ranks, i.e.,
\[
\frac{\Gamma \vdash t}{\Gamma[\sigma_1,\ldots,\sigma_n/\alpha_1,\ldots,\alpha_n] \vdash t[\sigma_1,\ldots,\sigma_n/\alpha_1,\ldots,\alpha_n]}\ \text{(INST TYPE)}
\]
– Rule INST KIND says that consistently substituting kinds for kind variables
throughout a theorem yields a theorem.
\[
\frac{\Gamma \vdash t}{\Gamma[k_1,\ldots,k_n/\kappa_1,\ldots,\kappa_n] \vdash t[k_1,\ldots,k_n/\kappa_1,\ldots,\kappa_n]}\ \text{(INST KIND)}
\]
– Rule INST RANK says that consistently incrementing by n ≥ 0 the rank of all
type variables throughout a theorem yields a theorem. z is the rank variable.
\[
\frac{\Gamma \vdash t}{\Gamma[(z+n)/z] \vdash t[(z+n)/z]}\ \text{(INST RANK)}
\]
– Rule TY ABS says that if two terms are equal, then their type abstractions
are equal, where α is not free in Γ .
\[
\frac{\Gamma \vdash t_1 = t_2}{\Gamma \vdash (\lambda{:}\alpha.\,t_1) = (\lambda{:}\alpha.\,t_2)}\ \text{(TY ABS)}
\]
– Rule TY BETA CONV describes the equality of type beta-conversion, where
t[σ/α] denotes the result of substituting σ for free occurrences of α in t.
\[
\frac{}{\vdash (\lambda{:}\alpha.\,t)[{:}\sigma{:}] = t[\sigma/\alpha]}\ \text{(TY BETA CONV)}
\]
6 Examples
The HOL-Omega logic makes it straightforward to express many concepts from
category theory, such as functors and natural transformations. Much of the first
two examples below is ported from HOL2P [11]; the main difference is that
the higher-order type abbreviations and type inference of HOL-Omega allow a
more pleasing presentation. We focus on the category Type whose objects are
the proper types of the HOL-Omega logic, and whose arrows are the (total)
term functions from one type to another. The source and target of an arrow are
the domain and range of the type of the function. The identity arrows are the
identity functions on each type. The composition of arrows is normal functional
composition. The customary check that the target of one arrow is the source of
the other is accomplished automatically by the strong typing of the logic.
6.1 Functors
Functors map objects to objects and arrows to arrows. In the category Type,
the first mapping is represented as a type F of kind ty ⇒ ty, and the second as
a function of the type F functor, where functor is the type abbreviation
functor = λF. ∀α β. (α → β) → (α F → β F ).
functor (F : F functor) =
  (∀:α. F (I : α → α) = I) ∧                                          (Identity)
  (∀:α β γ. ∀(f : α → β)(g : β → γ). F (g ◦ f) = F g ◦ F f)           (Composition)
functor (F : F functor) =
  (∀:α. F [:α, α:] (I : α → α) = I) ∧                                 (Identity)
  (∀:α β γ. ∀(f : α → β)(g : β → γ).                                  (Composition)
     F [:α, γ:] (g ◦ f) = F [:β, γ:] g ◦ F [:α, β:] f)
In what follows, these type applications will normally be omitted for clarity.
In HOL, list : ty ⇒ ty is the type of finite lists. It is defined as a recursive
datatype with two constructors, [] : α list and :: : α → α list → α list.
:: is infix. The function MAP : (α → β) → (α list → β list) is defined by
MAP f [] = []
MAP f (x :: xs) = f x :: MAP f xs
Then MAP can be proven to be a functor: functor ((λ:α β. MAP) : list functor).
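The two functor laws can be spot-checked on concrete data. The sketch below mirrors the two defining equations of MAP in Python and tests Identity and Composition on sample lists only, so it illustrates the statement rather than proving it; the helper names are hypothetical.

```python
def MAP(f):
    """Lift f to lists, following the two defining equations of MAP."""
    def go(xs):
        if not xs:                       # MAP f [] = []
            return []
        return [f(xs[0])] + go(xs[1:])   # MAP f (x :: xs) = f x :: MAP f xs
    return go


def identity(x):
    return x


def compose(g, f):
    """Ordinary function composition g ∘ f."""
    return lambda x: g(f(x))


def functor_laws_hold(fmap, f, g, samples):
    """Check Identity and Composition on the given sample values only."""
    ident_ok = all(fmap(identity)(xs) == identity(xs) for xs in samples)
    comp_ok = all(
        fmap(compose(g, f))(xs) == compose(fmap(g), fmap(f))(xs)
        for xs in samples
    )
    return ident_ok and comp_ok
```

A mapping that reverses its input while applying f passes Composition on some inputs but fails Identity, which is why both laws are part of the predicate.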
A simple functor is the identity function I: functor ((λ:α β. I) : I functor).
The composition of two functors is a functor. We overload ◦ to define this:
The result has type (F o G)functor. As an example, (λ:α β. MAP) ◦ (λ:α β. MAP) =
(λ:α β. MAP ◦ MAP) : (list o list)functor is a functor. The type composition
operator o reflects the category theory composition of two functors’ mappings
on objects. In HOL2P, the MAP functor composition example is expressed as:
TYINST (θ ↦ λα. (α list)list) functor (λ:α β. λf. MAP (MAP f))
Here the notation has been adjusted to that of this paper, for ease of com-
parison. TYINST is needed to manually instantiate a free type variable θ of the
functor predicate with the type for this instance, which must be stated as a type
abstraction. HOL-Omega’s kinds and type inference enable a clearer statement:
functor (λ:α β. MAP ◦ MAP)
Beyond the power of HOL2P, HOL-Omega supports quantification over functors:
∃:F . ∃(F : F functor). functor F.
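The MAP ◦ MAP example can be mirrored on nested lists. In the sketch below, the composite functor's action on arrows sends f to MAP (MAP f), as in the HOL2P statement above; `compose_functors` is a hypothetical helper, and Python carries none of the type-level information (the type composition operator o) that HOL-Omega tracks.

```python
def MAP(f):
    """Arrow mapping of the list functor."""
    return lambda xs: [f(x) for x in xs]


def compose_functors(fmap_outer, fmap_inner):
    """Arrow mapping of the composite: lift with inner, then with outer."""
    return lambda f: fmap_outer(fmap_inner(f))


# Acts on lists of lists, corresponding to (list o list) on objects.
MAP_MAP = compose_functors(MAP, MAP)
```

Applying `MAP_MAP(f)` to a list of lists applies f to every innermost element and preserves the nesting structure.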
6.3 Monads
Wadler [12] has proposed using monads to structure functional programming.
He defines a monad as a triple (M , unit, #=) of a type operator M and two
term operators unit and #= (where #= is an infix operator) obeying three laws.
We express this definition in HOL-Omega as follows.
We define two type abbreviations unit and bind:
unit = λM. ∀α. α → α M
bind = λM. ∀α β. α M → (α → β M ) → β M
We define a monad to be two term operators, unit and #=, with a single
common free type variable M : ty⇒ty, satisfying a predicate of the three laws:
monad (unit : M unit, #= : M bind) =
  (∀:α β. ∀(a : α)(k : α → β M ).
     unit a #= k = k a) ∧                                    (Left unit)
  (∀:α. ∀(m : α M ).
     m #= unit = m) ∧                                        (Right unit)
  (∀:α β γ. ∀(m : α M )(k : α → β M )(h : β → γ M ).
     (m #= k) #= h = m #= (λa. k a #= h))                    (Associative)
As an example, we define the unit and #= operations for a state monad as
state = λσ α. σ → α × σ
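For intuition, a state monad over the type σ → α × σ can be sketched as follows. The definitions of `state_unit` and `state_bind` below are the standard ones and are an assumption here, not a transcription of the HOL-Omega definitions. The three laws are checked by running both sides on sample initial states, which illustrates the laws but does not prove them.

```python
def state_unit(a):
    """Return a without touching the state: s -> (a, s)."""
    return lambda s: (a, s)


def state_bind(m, k):
    """Run m, then feed its result and the updated state to k."""
    def run(s):
        a, s1 = m(s)
        return k(a)(s1)
    return run


def same_on(states, m1, m2):
    """Extensional comparison of two state computations on sample states."""
    return all(m1(s) == m2(s) for s in states)


def monad_laws_hold(unit, bind, a, m, k, h, states):
    left = same_on(states, bind(unit(a), k), k(a))
    right = same_on(states, bind(m, unit), m)
    assoc = same_on(
        states,
        bind(bind(m, k), h),
        bind(m, lambda x: bind(k(x), h)),
    )
    return left and right and assoc
```

Here the state is an integer in the usage below, but nothing in the sketch depends on that choice.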
map = λM. ∀α β. (α → β) → (α M → β M )
join = λM. ∀α. α M M → α M
Given a monad defined using unit, map, and join, the corresponding #=
operator BIND(map, join) may also be constructed automatically:
E.g., for the state monad,
  state map = MMAP (state unit, state bind)
  state join = JOIN (state unit, state bind)
  state bind = BIND (state map, state join).
Then it can be proven that these two definitions of a monad are equivalent.
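The passage between the two presentations of a monad can be sketched directly. The names MMAP, JOIN and BIND follow the text, but their bodies below are the standard constructions and are an assumption of this sketch. The state monad is redefined locally so that the block is self-contained.

```python
def state_unit(a):
    return lambda s: (a, s)


def state_bind(m, k):
    def run(s):
        a, s1 = m(s)
        return k(a)(s1)
    return run


def MMAP(unit, bind):
    """map f m = m #= (unit ∘ f)"""
    return lambda f: lambda m: bind(m, lambda a: unit(f(a)))


def JOIN(unit, bind):
    """join mm = mm #= (λm. m); unit is unused but kept to match the text."""
    return lambda mm: bind(mm, lambda m: m)


def BIND(mmap, join):
    """m #= k = join (map k m)"""
    return lambda m, k: join(mmap(k)(m))


state_map = MMAP(state_unit, state_bind)
state_join = JOIN(state_unit, state_bind)
state_bind_again = BIND(state_map, state_join)
```

Rebuilding the bind operator from the derived map and join gives a function that agrees with the original `state_bind` on every input tried in the accompanying checks.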
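[Diagrams: a commutative square for associativity, with arrows tμ and μt from t3 to t2 and arrows μ from t2 to t; and commutative triangles for the unit laws, with arrows tη and ηt from t to t2, μ from t2 to t, and the identity arrow 1 on t.]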
It can be proven that this is equivalent to the (unit, map, join) definition:
7 Conclusion
This document has presented a description of the core logic of the HOL-Omega
theorem prover. This has been implemented as a variant of the HOL4 theorem
prover. The implementation may be downloaded by the command
Also, the nimble ease of use of HOL has been largely preserved. For example, the
type inference algorithm is a pure extension, so that all classic terms have the same
types successfully inferred. Inference of most general types for all terms is not always
possible, as also seen in System F, and type inference may fail even for typeable
terms, but in practice a few user annotations are usually sufficient.
The system is still being developed but is currently useful. All of the examples pre-
sented have been mechanized in the examples/HolOmega subdirectory, along with
further examples from Algebra of Programming [1] ported straightforwardly from
HOL2P, including homomorphisms, initial algebras, catamorphisms, and the
banana split theorem. While maintaining backwards compatibility with the existing
HOL4 system and libraries, the additional expressivity and power of HOL-Omega
make this tool applicable to a wide range of new problems.
References
1. Bird, R., de Moor, O.: Algebra of Programming. Prentice Hall (1997)
2. Coquand, T.: A new paradox in type theory. In: Prawitz, D., Skyrms, B., Westerstahl,
D. (eds.) Proceedings 9th Int. Congress of Logic, Methodology and Philosophy of
Science, pp. 555–570. North-Holland, Amsterdam (1994)
3. Gordon, M.J.C., Melham, T.F.: Introduction to HOL. Cambridge University Press,
Cambridge (1993)
4. Gordon, M.J.C., Pitts, A.M.: The HOL Logic and System. In: Bowen, J. (ed.) Towards
Verified Systems, ch. 3, pp. 49–70. Elsevier Science B.V., Amsterdam (1994)
5. The HOL System DESCRIPTION (Version Kananaskis 4),
http://downloads.sourceforge.net/hol/kananaskis-4-description.pdf
6. The HOL System LOGIC (Version Kananaskis 4),
http://downloads.sourceforge.net/hol/kananaskis-4-logic.pdf
7. Lack, S., Street, R.: The formal theory of monads II. Journal of Pure and Applied
Algebra 175, 243–265 (2002)
8. Melham, T.F.: The HOL Logic Extended with Quantification over Type Variables.
Formal Methods in System Design 3(1-2), 7–24 (1993)
9. Monk, J.D.: Introduction to Set Theory. McGraw-Hill, New York (1969)
10. Pierce, B.C.: Types and Programming Languages. MIT Press, Cambridge (2002)
11. Völker, N.: HOL2P - A System of Classical Higher Order Logic with Second Order
Polymorphism. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732,
pp. 334–351. Springer, Heidelberg (2007)
12. Wadler, P.: Monads for functional programming. In: Jeuring, J., Meijer, E. (eds.) AFP
1995. LNCS, vol. 925. Springer, Heidelberg (1995)
A Purely Definitional Universal Domain
Brian Huffman
1 Introduction
One of the main attractions of pure functional languages like Haskell is that
they promise to be easy to reason about. However, that promise has not yet
been fulfilled. To illustrate this point, let us define a couple of datatypes and
functions, and try to prove some simple properties.
data Cont r a = MkCont ((a -> r) -> r)
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 260–275, 2009.
© Springer-Verlag Berlin Heidelberg 2009
operation for the Resumption monad; together with Done as the monadic unit,
we should expect bind to satisfy the monad laws.
The first monad law follows trivially from the definition of bind. Instead,
let’s consider the second monad law (also known as the right-unit law) which
states that bind r Done = r. How can we go about proving this, formally or
otherwise?
It might be worthwhile to try case analysis on r, for a start. If r is equal
to Done x, then from the definition of bind we have bind (Done x) Done =
Done x, so the law holds in this case. Furthermore, if r is equal to ⊥, then
from the strictness of bind we have bind ⊥ Done = ⊥, so the law also holds
for ⊥. Finally, we must consider the case when r is equal to More c. Using the
definition of bind we obtain the following:
Now, if we could only rewrite the bind r Done on the right-hand side to r,
then we could use the functor identity law for mapCont to simplify the entire
right-hand side to More c. Perhaps an appropriate induction rule could help.
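To make the case analysis concrete, here is a Haskell sketch of definitions consistent with the prose above. Only the Cont declaration is taken verbatim from the text; mapCont, Resumption, and bind are reconstructed from the properties the discussion relies on (bind is strict, bind (Done x) f = f x, and mapCont is the functor action for Cont), so they should be read as assumptions. The helper run is hypothetical and exists only so that examples can be evaluated.

```haskell
data Cont r a = MkCont ((a -> r) -> r)

-- Functor action for Cont: apply f before calling the continuation.
mapCont :: (a -> b) -> Cont r a -> Cont r b
mapCont f (MkCont c) = MkCont (\k -> c (k . f))

-- Indirect recursion: Resumption occurs underneath Cont.
data Resumption r a = Done a | More (Cont r (Resumption r a))

bind :: Resumption r a -> (a -> Resumption r b) -> Resumption r b
bind (Done x) f = f x
bind (More c) f = More (mapCont (\r -> bind r f) c)

-- Unfolding the More case for the right-unit law gives
--   bind (More c) Done = More (mapCont (\r -> bind r Done) c)

-- Hypothetical helper: run a resumption whose result type is the answer type.
run :: Resumption r r -> r
run (Done x)          = x
run (More (MkCont c)) = c run
```

Evaluating run on both sides of the right-unit law for a particular resumption gives some evidence that the law holds, but a proof for all values still needs an induction principle.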
When doing induction over simple datatypes like lists, the inductive hypoth-
esis simply assumes that the property being proved holds for an immediate
subterm: We get to assume P(xs) in order to show P(x : xs). This kind of
inductive hypothesis will not work for type Resumption, because of the indirect
recursion in its definition.
In fact, an induction rule for Resumption appropriate for our proof does exist.
(The proof of the second monad law using this induction scheme is left as an
exercise for the reader.)
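\[
\frac{
  \begin{array}{l}
  \mathit{admissible}(P) \\
  P(\mathit{undefined}) \\
  \forall x.\ P(\mathit{Done}\ x) \\
  \forall f\ c.\ (\forall x.\ P(f\ x)) \longrightarrow P(\mathit{More}\ (\mathit{mapCont}\ f\ c))
  \end{array}
}{\forall x.\ P(x)}
\tag{1}
\]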
This induction rule is rather unusual—the inductive step quantifies over a func-
tion f, and also mentions mapCont. It is probably not obvious to most readers
that it is correct. How can we trust it? It would be desirable to formally prove
such rules using a theorem prover.
Unfortunately, a fully mechanized semantics of general recursive datatypes
does not yet exist. Various theorem provers have facilities for defining recur-
sive datatypes, but none can properly deal with datatype definitions like the
Resumption type introduced earlier. The non–strictly positive recursion causes
the definition to be rejected by both Isabelle/HOL’s datatype package and Coq’s
inductive definition mechanism.
Of all the currently available theorem proving tools, the Isabelle/HOLCF
domain package is the closest to being able to support such datatypes. It uses
the continuous function space, so it is not limited to strictly positive recursion.
However, the domain package has some problems due to the fact that it generates
non-trivial axioms “on the fly”: For each type definition, the domain package
declares the existence of the new type (without defining it), and asserts an
appropriate type isomorphism and induction rule. The most obvious worry with
this design is the potential for unsoundness. On the other hand, the desire to
avoid unsoundness can lead to an implementation that is overly conservative.
In contrast with the current domain package, the Isabelle/HOL inductive
datatype package [14] is purely definitional. It uses a parameterized universe
type, of which new datatypes are defined as subsets. Induction rules are not
asserted as axioms; rather, they are proved as theorems. Using a similar design
for the HOLCF domain package would allow strong reasoning principles to be
generated, with soundness ensured by construction.
The original contributions of this paper are as follows:
– A new construction of a universal domain that can represent a wide variety of
types, including sums, products, continuous function space, powerdomains,
and recursive types built from these. Universal domain elements are defined
in terms of sets of natural numbers, using ideal completion—thus the con-
struction is suitable for simply-typed, higher-order logic theorem provers.
– A formalization of this construction in the HOLCF library of the Isabelle
theorem prover. The formalization is fully definitional; no new axioms are
asserted.
Section 2 reviews various domain theory concepts used in the HOLCF formal-
ization. The construction of the universal domain type itself, along with embed-
ding and projection functions, are covered in Section 3. Section 4 describes how
the universal domain can be used to define recursive types. After a discussion of
related work in Section 5, conclusions and directions for future work are found
in Section 6.
2 Background Concepts
This paper assumes some familiarity with basic concepts of domain theory: A
partial order is a set with a reflexive, transitive, antisymmetric relation (⊑). A
chain is an increasing sequence indexed by the naturals; a complete partial order
(cpo) has a least upper bound (lub) for every chain. A pointed cpo also has a
least element, ⊥. A continuous function preserves lubs of chains. An admissible
predicate holds for the lub of a chain, if it holds for all elements of the chain.
HOLCF [13] is a library of domain theory built on top of the Isabelle/HOL
theorem prover. HOLCF defines all of the standard notions listed above; it also
defines standard type constructors like the continuous function space, and strict
sums and products. The remainder of this section is devoted to some more
specialized concepts included in HOLCF that support the formalization of the
universal domain.
2.2 Deflations
Cpos may contain other cpos as subsets. A deflation¹ is a way to encode such a
sub-cpo as a continuous function. Let B be a cpo, and d : B → B be a continuous
function. Then d is a deflation if d ◦ d = d ⊑ Id_B. The image set of a deflation
d : B → B gives a sub-cpo of B.
Essentially, a deflation is a value that represents a type. For example, the
function deflate in Fig. 1 is a deflation; its image set consists of exactly those
values of type Tree that contain no Leaf constructors. Note that while the
definition of deflate does not mention type Shrub at all, its image set
is isomorphic to type Shrub—in other words, deflate (a function value) is a
representation of Shrub (a type).
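A deflation of this kind might look as follows in Haskell. This is a sketch under assumptions: the constructor names Twig and Node are hypothetical, and the only properties taken from the text are that deflate acts on Tree and that its image contains no Leaf constructors. The helper leafFree is added for experimentation.

```haskell
-- Hypothetical Tree type; a Shrub would be the same shape without Leaf.
data Tree = Leaf | Twig | Node Tree Tree
  deriving (Eq, Show)

-- A deflation on Tree: it is idempotent, and it lies below the identity
-- because it can only replace parts of a tree by ⊥ (undefined).
deflate :: Tree -> Tree
deflate Leaf       = undefined
deflate Twig       = Twig
deflate (Node l r) = Node (deflate l) (deflate r)

-- Helper for experiments: does a total tree avoid Leaf entirely?
leafFree :: Tree -> Bool
leafFree Leaf       = False
leafFree Twig       = True
leafFree (Node l r) = leafFree l && leafFree r
```

On Leaf-free trees deflate behaves as the identity, while a tree such as Node Twig Leaf is mapped to a partial value whose right subtree is undefined.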
¹ My usage of deflation follows Gunter [6]. Many authors use the term projection to
refer to the same concept, but I prefer deflation because it avoids confusion with the
second half of an ep-pair.
While types can be represented by deflations, type constructors (which are like
functions from types to types) can be represented as functions from deflations
to deflations. For example, the map function represents Haskell’s list type con-
structor: While deflate is a deflation on type Tree that represents type Shrub,
map deflate is a deflation on type [Tree] that represents type [Shrub].
Deflations and ep-pairs are closely related. Given an ep-pair (e, p) from cpo A
into cpo B, the composition e◦p is a deflation on B whose image set is isomorphic
to A. Conversely, every deflation d : B → B also gives rise to an ep-pair. Define
the cpo A to be the image set of d; also define e to be the inclusion map from A
to B, and define p = d. Then (e, p) is an embedding-projection pair. So saying
that there exists an ep-pair from A to B is equivalent to saying that there exists
a deflation on B whose image set is isomorphic to A.
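The correspondence between ep-pairs and deflations can be sketched in Haskell. The datatypes and the names embed, project, and deflateViaEP are hypothetical; the sketch only assumes that Shrub has the same shape as Tree minus the Leaf constructor.

```haskell
data Tree  = Leaf | Twig | Node Tree Tree deriving (Eq, Show)
data Shrub = Tip | Branch Shrub Shrub     deriving (Eq, Show)

-- Embedding: every Shrub is a Tree without Leaf.
embed :: Shrub -> Tree
embed Tip          = Twig
embed (Branch l r) = Node (embed l) (embed r)

-- Projection: Leaf has no counterpart in Shrub, so it is sent to ⊥.
project :: Tree -> Shrub
project Leaf       = undefined
project Twig       = Tip
project (Node l r) = Branch (project l) (project r)

-- The deflation on Tree induced by the ep-pair (embed, project).
deflateViaEP :: Tree -> Tree
deflateViaEP = embed . project
```

Here project . embed is the identity on Shrub, and embed . project only ever loses information, which is the ep-pair condition stated informally.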
Finally we are ready to talk about what it means for a cpo to be a universal
domain. A cpo U is universal for a class of cpos, if for every cpo D in the class,
there exists an ep-pair from D into U . Equivalently, for every D there must exist
a deflation on U with an image set isomorphic to D.
Lazy recursive datatypes often have infinite as well as finite values.2 For example,
we can define a datatype of recursive lazy lists of booleans:
Finite values of type BoolList include total values like Cons False Nil, and
Cons True (Cons False Nil), along with partial finite values like Cons False
undefined. On the other hand, recursive definitions can yield infinite values:
trues :: BoolList
trues = Cons True trues
The function approx is so named because for any input value xs it generates
a sequence of finite approximations to xs. For example, the first few approxi-
mations to trues are ⊥, Cons True ⊥, Cons True (Cons True ⊥), and so on.
Each is finite, but the least upper bound of the sequence is the infinite value
trues. This property of a cpo, where every infinite value can be written as the
least upper bound of a chain of finite values, is called algebraicity. Thus BoolList
is an algebraic cpo.
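The behaviour described for approx can be reproduced with the following Haskell sketch. The definitions of BoolList and approx are assumptions chosen to match the approximations listed above (⊥, Cons True ⊥, and so on); prefix is a hypothetical observer that reads a bounded number of elements without forcing the undefined tail.

```haskell
data BoolList = Nil | Cons Bool BoolList

trues :: BoolList
trues = Cons True trues

-- approx n xs keeps at most n Cons cells of xs and replaces whatever
-- follows by ⊥ (undefined); approx 0 is the constant ⊥ function.
approx :: Int -> BoolList -> BoolList
approx n _ | n <= 0  = undefined
approx _ Nil         = Nil
approx n (Cons b xs) = Cons b (approx (n - 1) xs)

-- Hypothetical observer: read off at most n elements, never forcing more.
prefix :: Int -> BoolList -> [Bool]
prefix n _ | n <= 0  = []
prefix _ Nil         = []
prefix n (Cons b xs) = b : prefix (n - 1) xs
```

For example, prefix 2 (approx 2 trues) can be evaluated safely, whereas inspecting a third element of approx 2 trues would hit the undefined tail.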
The sequence of deflations approx n is a chain of functions whose least upper
bound is the identity function. In terms of image sets, we have a sequence of
partial orders whose limit is the whole type BoolList.
A further property of approx which may not be immediately apparent is that
for any n, the image of approx n is a finite set. This means that image sets of
approx n yield a sequence of finite partial orders. Because it is a limit of finite partial
orders, we say that type BoolList is a bifinite cpo. More precisely, as a limit of
countably many finite partial orders, BoolList is an omega-bifinite cpo.³
The omega-bifinites are a useful class of cpos because bifiniteness is pre-
served by all of the type constructors defined in HOLCF. Furthermore, all
Haskell datatypes are omega-bifinite. Basically any type constructor that pre-
serves finiteness will preserve bifiniteness as well. More details about the formal-
ization of omega-bifinite domains in HOLCF can be found in [10].
In an algebraic cpo the set of finite elements, together with the ordering relation
on them, completely determines the structure of the entire cpo. We say that the
set of finite elements forms a basis for the cpo, and the entire cpo is a completion
of the basis.
Given a basis B with ordering relation (≼), we can reconstruct the whole
algebraic cpo. The standard process for doing this is called ideal completion, and
it is done by considering the set of ideals over the basis.
An ideal is a non-empty, downward-closed, directed set; directed means that it
contains an upper bound for any of its finite subsets. A principal ideal is an ideal of the form
{y. y ≼ x} for some x, denoted ↓x. The set of all ideals over (B, ≼) is denoted
Idl(B); when ordered by subset inclusion, Idl(B) forms an algebraic cpo. The
compact elements of Idl(B) are exactly those represented by principal ideals.
Note that the relation (≼) does not need to be antisymmetric. For x and y
that are equivalent (that is, both x ≼ y and y ≼ x) the principal ideals ↓x and
↓y are equal. This means that the ideal completion construction automatically
takes care of quotienting by the equivalence induced by (≼).
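The quotienting effect can be observed on a small finite example. The following Haskell sketch uses a hypothetical six-element basis with a preorder that is deliberately not antisymmetric; the names basis, below, principal, and isIdeal are illustrative only, and real ideal completions generally involve infinite ideals that a list-based model cannot capture.

```haskell
-- Hypothetical finite basis with a non-antisymmetric preorder:
-- x is below y iff x `div` 2 <= y `div` 2, so 0 ~ 1, 2 ~ 3, and 4 ~ 5.
basis :: [Int]
basis = [0 .. 5]

below :: Int -> Int -> Bool
below x y = x `div` 2 <= y `div` 2

-- Principal ideal of x: all basis elements below x.
principal :: Int -> [Int]
principal x = [y | y <- basis, y `below` x]

-- An ideal is non-empty, downward-closed, and directed. For a finite
-- non-empty set, pairwise upper bounds suffice for directedness.
isIdeal :: [Int] -> Bool
isIdeal s = not (null s) && downClosed && directed
  where
    downClosed = and [y `elem` s | x <- s, y <- basis, y `below` x]
    directed   = and [or [x `below` z && y `below` z | z <- s] | x <- s, y <- s]
```

In this model principal 2 and principal 3 are the same list, so the two equivalent basis elements are identified in the completion.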
³ “SFP domain” is another name, introduced by Plotkin [15], that is used for the same
concept as omega-bifinite cpo; the name stands for Sequence of Finite Posets.
Just as the structure of an algebraic cpo is completely determined by its
basis, a continuous function from an algebraic cpo to another cpo is completely
determined by its action on basis elements. This suggests a method for defining
continuous functions over ideal completions: First, define a function f from basis
B to cpo C such that f is monotone, i.e. x ≼ y implies f(x) ⊑ f(y). Then we
can define the continuous extension of f as $\hat{f}(S) = \bigsqcup_{x \in S} f(x)$. The function
$\hat{f}$ is the unique continuous function of type Idl(B) → C that agrees with f on
principal ideals; that is, for all x : B, $\hat{f}(\downarrow x) = f(x)$.
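A finite toy model shows why the continuous extension agrees with the original function on principal ideals. In the Haskell sketch below the basis (divisors of 12 under divisibility), the monotone function f, and the name extend are all hypothetical; the target cpo is Int under (<=), so the least upper bound over a finite ideal is simply a maximum.

```haskell
-- Hypothetical finite basis: divisors of 12 ordered by divisibility.
basis :: [Int]
basis = [1, 2, 3, 4, 6, 12]

below :: Int -> Int -> Bool
below x y = y `mod` x == 0

principal :: Int -> [Int]
principal x = [y | y <- basis, y `below` x]

-- A monotone map from the basis into Int: the number of basis elements
-- below x. If x divides y, everything below x is also below y.
f :: Int -> Int
f x = length (principal x)

-- Continuous extension: the lub of g over an ideal. In this finite toy
-- setting the ideal is a non-empty list and the lub is a maximum.
extend :: (Int -> Int) -> [Int] -> Int
extend g ideal = maximum (map g ideal)
```

Because f is monotone and x is the largest element of its own principal ideal, extend f (principal x) equals f x for every x in the basis.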
In the next section, all of the constructions related to the universal domain will
be done in terms of basis values: The universal domain itself will be defined using
ideal completion, and the embedding and projection functions will be defined as
continuous extensions.
HOLCF includes a formalization of ideal completion and continuous exten-
sions, which was created to support the definition of powerdomains [10].
Fig. 2. A sequence of finite posets P0, P1, P2, P3. Each Pn can be embedded into Pn+1; black nodes
indicate the range of the embedding function.
The strategy for embedding a bifinite domain into the universal domain is
built around increments. The universal domain is designed so that if a finite
partial order P is representable (i.e. by a deflation), and there is an increment
from P to P′, then P′ will also be representable.
For all embeddings from Pn to Pn+1 that add more than one new value, we
will need to decompose the single large embedding into a sequence of smaller
increments. The challenge, then, is to determine in which order the new elements
should be inserted. The order matters: Adding elements in the wrong order can
cause problems, as shown in Fig. 3.
Fig. 3. The right (top) and wrong (bottom) way to order insertions. No ep-pair exists
between the 3-element and 4-element posets on the bottom row.
Fig. 4. A sequence of four increments going from P2 to P3 . Each new node may have
any number of upward edges, but only one downward edge.
Armed with this strategy, we can finally formalize the complete sequence of
increments for type D. To each element x of the basis of D we must assign
a sequence number place(x)—this numbering tells in which order to insert the
values. The HOLCF formalization breaks up the definition of place as follows.
First, each basis value is assigned to a rank, where rank (x) = n means that the
basis value x first appears in the poset Pn. Equivalently, rank(x) is the least
n such that approx_n(x) = x. Then an auxiliary function pos assigns sequence
numbers to values in finite sets, by repeatedly removing an arbitrary maximal
element until the set is empty. Finally, place(x) is defined as the sequence number
of x within its (finite) rank set, plus the total size of all earlier ranks.
For the remainder of this paper, it will be sufficient to note that the place function
satisfies the following two properties:
– Values in earlier ranks come before values in later ranks: If rank (x) <
rank (y), then place(x) < place(y).
– Within the same rank, larger values come first: If rank(x) = rank(y) and
x ⊑ y with x ≠ y, then place(y) < place(x).
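The construction of place can be sketched for a finite poset in Haskell. This is an illustration under assumptions, not the HOLCF definition: the function names, the choice of the first maximal element found as the "arbitrary" one, and the example poset with its made-up ranks are all hypothetical.

```haskell
import Data.List (elemIndex)

-- Order the elements of one rank by repeatedly removing a maximal element.
removeMaximal :: Eq a => (a -> a -> Bool) -> [a] -> [a]
removeMaximal _ [] = []
removeMaximal leq xs =
  case filter isMaximal xs of
    (m : _) -> m : removeMaximal leq (filter (/= m) xs)
    []      -> error "no maximal element: leq is not a partial order on xs"
  where
    isMaximal x = all (\y -> y == x || not (x `leq` y)) xs

-- Full insertion order: earlier ranks first, maximal-first within a rank.
insertionOrder :: Eq a => (a -> Int) -> (a -> a -> Bool) -> [a] -> [a]
insertionOrder rank leq xs =
  concat [removeMaximal leq [x | x <- xs, rank x == n] | n <- [0 .. top]]
  where
    top = maximum (0 : map rank xs)

-- place x is the index of x in the insertion order.
place :: Eq a => (a -> Int) -> (a -> a -> Bool) -> [a] -> a -> Maybe Int
place rank leq xs x = elemIndex x (insertionOrder rank leq xs)

-- Hypothetical example: divisors of 12 under divisibility, with made-up ranks.
exampleElems :: [Int]
exampleElems = [1, 2, 3, 4, 6, 12]

divides :: Int -> Int -> Bool
divides x y = y `mod` x == 0

exampleRank :: Int -> Int
exampleRank x
  | x == 1          = 0
  | x `elem` [2, 4] = 1
  | otherwise       = 2
```

With these definitions, both properties of place listed above can be checked exhaustively over all pairs of distinct elements in the example.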
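[Fig. 5: Hasse diagram of the poset P3 with nodes a to h, shown alongside the basis encoding of each element.]
a = ⊥
b = ⟨1, a, {}⟩
c = ⟨2, a, {}⟩
d = ⟨3, a, {b}⟩
e = ⟨4, b, {}⟩
f = ⟨5, d, {e}⟩
g = ⟨6, c, {}⟩
h = ⟨7, a, {e, f, g}⟩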
Figure 5 shows how this system works for embedding all the elements from
the poset P3 into the basis datatype. The elements have letter names from a–
h, assigned alphabetically by insertion order. In the datatype encoding of each
element, the subordinate and superiors are selected from the set of previously
inserted elements. Serial numbers are assigned sequentially.
The serial number is necessary to distinguish multiple values that are inserted
in the same position. For example, in Fig. 5, elements b and c both have a as the
subordinate, and neither has any superiors. The serial number is the only way
to tell such values apart.
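One possible Haskell rendering of this encoding is given below, together with the eight elements of Fig. 5. The constructor names, the use of a list for the set of superiors, and the conventions serial Bottom = 0 and subordinate Bottom = Bottom are assumptions made for the sketch.

```haskell
-- Hypothetical rendering of the basis datatype: either bottom, or a node
-- carrying a serial number, a subordinate, and a collection of superiors.
data Basis = Bottom | Node Int Basis [Basis]
  deriving (Eq, Show)

-- Assumed convention: bottom has serial number 0.
serial :: Basis -> Int
serial Bottom       = 0
serial (Node i _ _) = i

-- Assumed convention: bottom is its own subordinate.
subordinate :: Basis -> Basis
subordinate Bottom       = Bottom
subordinate (Node _ x _) = x

superiors :: Basis -> [Basis]
superiors Bottom        = []
superiors (Node _ _ ss) = ss

-- The eight elements of Fig. 5, in insertion order.
a, b, c, d, e, f, g, h :: Basis
a = Bottom
b = Node 1 a []
c = Node 2 a []
d = Node 3 a [b]
e = Node 4 b []
f = Node 5 d [e]
g = Node 6 c []
h = Node 7 a [e, f, g]
```

The values b and c agree on subordinate and superiors and differ only in their serial numbers, which is exactly the situation the serial number is meant to resolve.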
Note that the basis datatype seems to contain some junk—some subordi-
nate/superiors combinations are not well formed. For example, in any valid in-
crement, all of the superiors are positioned above the subordinate. One way to
take care of this requirement would be to define a well-formedness predicate for
basis elements. However, it turns out that it is possible (and indeed easier) to
simply ignore any invalid elements. In the set of superiors, only those values that
are above the subordinate will be considered. (This will be important to keep in
mind when we define the basis ordering relation.)
There is also a possibility of multiple representations for the same value.
For example, in Fig. 5 the encoding of h is given as ⟨7, a, {e, f, g}⟩, but the
representation ⟨7, a, {f, g}⟩ would work just as well (since the sets have the same
upward closure). One could consider having a well-formedness requirement for
the set of superiors to be upward-closed. But this turns out not to be necessary,
since the extra values do not cause problems for any of the formal proofs.
The subordinate value a is computed using a helper function sub, which is defined
as sub(x) = approx_{n−1}(x), where n = rank(x). The ordering produced by the
place function ensures that no previously inserted value with the same rank as
x will be below x. Therefore the previously inserted value immediately below x
must be sub(x), which comes from the previous rank.
In order to complete the continuous extension, it is necessary to prove that
the basis embedding function is monotone. That is, we must show that for any
x and y in the basis of D, x ⊑ y implies emb(x) ≼ emb(y). The proof is by
well-founded induction over the maximum of place(x) and place(y). There are
two main cases to consider:
– Case place(x) < place(y): Since x ⊑ y, it must be the case that rank(x) <
rank(y). Then, using the definition of sub it can be shown that x ⊑ sub(y);
thus by the inductive hypothesis we have emb(x) ≼ emb(sub(y)). Also, from
Eq. (6) we have emb(sub(y)) ≼ emb(y). Finally, by transitivity we have
emb(x) ≼ emb(y).
– Case place(y) < place(x): From the definition of sub we have sub(x) ⊑ x. By
transitivity with x ⊑ y this implies sub(x) ⊑ y; therefore by the inductive
hypothesis we have emb(sub(x)) ≼ emb(y). Also, using Eq. (8), we have that
emb(y) is one of the superiors of emb(x). Ultimately, from Eq. (7) we have
emb(x) ≼ emb(y).
The projection function prj from U to D is also defined using continuous ex-
tension. The action of prj on basis elements is specified by the following recursive
definition:
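\[
\mathit{prj}(a) =
\begin{cases}
\mathit{emb}^{-1}(a) & \text{if } \exists x.\ \mathit{emb}(x) = a \\
\mathit{prj}(\mathit{subordinate}(a)) & \text{otherwise}
\end{cases}
\tag{9}
\]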
To ensure that prj is well-defined, there are a couple of things to check. First
of all, the recursion always terminates: In the worst case, repeatedly taking the
subordinate of any starting value will eventually yield ⊥, at which point the first
branch will be taken since emb(⊥) = ⊥. Secondly, note that emb⁻¹ is uniquely
defined, because emb is injective. Injectivity of emb is easy to prove, since each
embedded value has a different serial number.
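The recursion in Eq. (9) can be sketched in Haskell by modelling emb⁻¹ as a finite lookup table. The Basis type, the name prjBasis, and the table representation are assumptions for illustration; the requirement that the table contain Bottom mirrors emb(⊥) = ⊥, which is what guarantees termination.

```haskell
data Basis = Bottom | Node Int Basis [Basis]
  deriving (Eq, Show)

subordinate :: Basis -> Basis
subordinate Bottom       = Bottom
subordinate (Node _ x _) = x

-- embInverse is a finite table for the inverse of emb. It must contain
-- Bottom; otherwise walking down the subordinates would never stop.
prjBasis :: [(Basis, d)] -> Basis -> d
prjBasis embInverse a =
  case lookup a embInverse of
    Just x  -> x
    Nothing
      | a == Bottom -> error "embInverse must map Bottom"
      | otherwise   -> prjBasis embInverse (subordinate a)
```

A basis element outside the range of emb is thus projected to the image of the nearest subordinate that is in the range.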
Just like with emb, we also need to prove that the basis projection function
prj is monotone. That is, we must show that for any a and b in the basis of
U, a ≼ b implies prj(a) ⊑ prj(b). Remember that the basis preorder (≼) is
an inductively defined relation; accordingly, the proof proceeds by induction on
a ≼ b. Compared to the proof of monotonicity for emb, the proof for prj is
relatively straightforward; details are omitted here.
Finally, we must prove that emb and prj form an ep-pair. The proof of prj ◦
emb = Id_D is easy: Let x be any value in the basis of D. Then using Eq. (9), we
have prj(emb(x)) = emb⁻¹(emb(x)) = x. Since this equation is an admissible
predicate on x, proving it for compact x is sufficient to show that it holds for all
values in the ideal completion.
The proof of emb ◦ prj ⊑ Id_U takes a bit more work. As a lemma, we can show
that for any a in the basis of U, prj(a) is always equal to emb⁻¹(b) for some
b ≼ a that is in the range of emb. Using this lemma, we then have emb(prj(a)) =
emb(emb⁻¹(b)) = b ≼ a. Finally, using admissibility, this is sufficient to show
that emb(prj(a)) ⊑ a for all a in U.
To summarize the results of this section: We have formalized a type U , and
two polymorphic continuous functions emb and prj. For any omega-bifinite do-
main D, emb and prj form an ep-pair that embeds D into U . The full proof
scripts are available as part of the distribution of Isabelle2009, in the theory file
src/HOLCF/Universal.thy.
As a recursive datatype, ResumptionT uses the fixed point operator in its def-
inition. Also note that the definition of ResumptionT refers to ContT on the
right-hand side—since ContT is a continuous function, it may be used freely
within other recursive definitions. Thus it is not necessary to transform indirect
recursion into mutual recursion, like the Isabelle datatype package does.
Once the deflations have been constructed, the actual Cont and Resumption
types can be defined using the image sets of their respective deflations. That
the Resumption type satisfies the appropriate domain isomorphism follows from
the fixed-point definition. Also, a simple induction principle (a form of take
induction, like what would be axiomatized by the current domain package) can
be derived from the fact that ResumptionT is a least fixed-point.
Finally, the simple take induction rule can be used to derive the higher-level
induction rule shown in Eq. (1). The appearance of mapCont is due to the fact
that (modulo some coercions to and from U ) it coincides with the deflation
combinator (λD. ContT (R, D)). (This is similar to how the function map doubles
as the deflation combinator for lists.)
5 Related Work
An early example of the purely definitional approach to defining datatypes is
described by Melham, in the context of the HOL theorem prover [12]. Melham
defines a type (α)Tree of labelled trees, from which other recursive types are
defined as subsets. The design is similar in spirit to the one presented in this
paper—types are modeled as values, and abstract axioms that characterize each
datatype are proved as theorems. The main differences are that it uses ordinary
types instead of bifinite domains, and ordinary subsets instead of deflations.
The Isabelle/HOL datatype package uses a design very similar to the HOL
system. The type α node, which was originally used for defining recursive types
in Isabelle/HOL, was introduced by Paulson [14]; it is quite similar to the HOL
system’s (α)Tree type. Gunter later extended the labelled tree type of HOL to
support datatypes with arbitrary branching [9]. Berghofer and Wenzel used a
similarly extended type to implement Isabelle’s modern datatype package [4].
Agerholm used a variation of Melham’s labelled trees to define lazy lists and
other recursive domains in the HOL-CPO system [1]. Agerholm’s cpo of infinite
trees can represent arbitrary polynomial datatypes as subsets; however, negative
recursion is not supported.
Recent work by Benton, et al. uses the colimit construction to define recursive
domains in Coq [3]. Like the universal domain described in this paper, their
technique can handle both positive and negative recursion. Using colimits avoids
the need for a universal domain, but it requires a logic with dependent types;
the construction will not work in ordinary higher-order logic.
On the theoretical side, various publications by Gunter [6,7,8] were the pri-
mary sources of ideas for my universal domain construction. The construction
of the sequence of increments in Section 3 is just as described by Gunter [7, §5].
However, the use of ideal completion is original—Gunter defines the universal
domain using a colimit construction instead. Given a cpo D, Gunter defines a
type D⁺ that can embed any increment from D to D′. The universal domain is
then defined as a solution to the domain equation D = D⁺. The construction
of D⁺ is similar to my Basis datatype, except that it is non-recursive and does
not include serial numbers.
It provides a universal domain type U , into which any omega-bifinite domain can
be embedded. It also provides a type T of algebraic deflations, which represent
bifinite domains as values. Both are included as part of the Isabelle2009 release.
While the underlying theory is complete, the automation is not yet finished.
The first area of future work is to connect the new theories to the existing domain
package, so that instead of axiomatizing the type isomorphism and induction
rules, the domain package can prove them from the fixed-point definitions.
The domain package will also need to be extended with automation for
indirect-recursive datatypes. Such datatypes may have various possible induc-
tion rules, so this will require some design decisions about how to formulate the
rules, in addition to work on automating the proofs.
Other future directions explore limitations in the current design:
– Higher-order type constructors. Higher-order types can be represented by
deflation combinators with types like (T → T ) → T . The problem is that
Isabelle’s type system only supports first-order types, although [11] describes
an admittedly complicated workaround.
– Non-regular (nested) datatypes [5]. Deflation combinators for non-regular
datatypes can be defined by taking least fixed points at type T → T , rather
than type T . However, since Isabelle does not support type quantification or
polymorphic recursion, induction rules and recursive functions could not be
defined in the normal way.
– Higher-rank polymorphism. This is not supported by Isabelle’s type system.
However, the universal domain U could be used to model such types, using
the construction described by Amadio and Curien [2].
– Generalized algebraic datatypes (GADTs). These are usually modeled in terms
of some kind of type equality constraints. For example, type equality con-
straints are a central feature of System FC [16], a compiler intermediate
language used to represent Haskell programs. But to the extent of this au-
thor’s knowledge, there is no way to model type equality constraints using
deflations.
References
1. Agerholm, S.: A HOL Basis for Reasoning about Functional Programs. PhD thesis,
University of Aarhus (1994)
2. Amadio, R.M., Curien, P.-L.: Domains and Lambda-Calculi. Cambridge University
Press, New York (1998)
3. Benton, N., Kennedy, A., Varming, C.: Some domain theory and denotational
semantics in Coq. In: Proc. 22nd International Conference on Theorem Proving
in Higher Order Logics (TPHOLs 2009). LNCS, vol. 5674. Springer, Heidelberg
(2009)
4. Berghofer, S., Wenzel, M.: Inductive datatypes in HOL - lessons learned in formal-
logic engineering. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin, C., Théry, L.
(eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 19–36. Springer, Heidelberg (1999)
5. Bird, R.S., Meertens, L.G.L.T.: Nested datatypes. In: Jeuring, J. (ed.) MPC 1998.
LNCS, vol. 1422, pp. 52–67. Springer, Heidelberg (1998)
6. Gunter, C.: Profinite Solutions for Recursive Domain Equations. PhD thesis, Uni-
versity of Wisconsin at Madison (1985)
7. Gunter, C.A.: Universal profinite domains. Information and Computation 72(1),
1–30 (1987)
8. Gunter, C.A.: Semantics of Programming Languages: Structures and Techniques.
In: Foundations of Computing. MIT Press, Cambridge (1992)
9. Gunter, E.L.: A broader class of trees for recursive type definitions for HOL. In:
Joyce, J.J., Seger, C.-J.H. (eds.) HUG 1993. LNCS, vol. 780, pp. 141–154. Springer,
Heidelberg (1994)
10. Huffman, B.: Reasoning with powerdomains in Isabelle/HOLCF. In: Mohamed,
O.A., Muñoz, C., Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp. 45–56.
Springer, Heidelberg (2008)
11. Huffman, B., Matthews, J., White, P.: Axiomatic constructor classes in Is-
abelle/HOLCF. In: Hurd, J., Melham, T. (eds.) TPHOLs 2005. LNCS, vol. 3603,
pp. 147–162. Springer, Heidelberg (2005)
12. Melham, T.F.: Automating recursive type definitions in higher order logic. In:
Current Trends in Hardware Verification and Automated Theorem Proving, pp.
341–386. Springer, Heidelberg (1989)
13. Müller, O., Nipkow, T., von Oheimb, D., Slotosch, O.: HOLCF = HOL + LCF.
Journal of Functional Programming 9, 191–223 (1999)
14. Paulson, L.C.: Mechanizing coinduction and corecursion in higher-order logic. Jour-
nal of Logic and Computation 7 (1997)
15. Plotkin, G.D.: A powerdomain construction. SIAM J. Comput. 5(3), 452–487
(1976)
16. Sulzmann, M., Chakravarty, M.M.T., Jones, S.P., Donnelly, K.: System F with
type equality coercions. In: TLDI 2007: Proceedings of the 2007 ACM SIGPLAN
international workshop on Types in languages design and implementation, pp. 53–
66. ACM, New York (2007)
Types, Maps and Separation Logic
1 Introduction
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 276–292, 2009.
© Springer-Verlag Berlin Heidelberg 2009
memory, usually in a hardware-defined page table structure, and they are often
manipulated through the virtual memory layer.
As an example, the completion of the very first C implementation (at the time
untried and unverified) of the formally verified seL4 microkernel [8] in our group
was celebrated by loading the code onto our ARMv6¹ development board and
starting the boot process to generate a hello-world message. Quite expectedly,
nothing at all happened. The board was unresponsive and no debug information
was forthcoming. It took 3 weeks to write the C implementation following a
precise specification. It took 5 weeks debugging to get it running. It turned out
that the boot code had not set up the initial page table correctly, and since no
page fault handler was installed, the machine just kept faulting. This was the
first of a number of virtual-memory related bugs. What is worse, our verification
framework for C would, at the time, not have caught any of these bugs. We have
since explicitly added the appropriate virtual memory proof obligations. They
are derived, in part, from the work presented in this paper.
We present a framework in Isabelle/HOL for the verification of low-level C
code with separation logic in the presence of virtual memory. The framework
itself is abstract and generic. In earlier work [16], we described a preliminary
version of it, instantiated to a hypothetical simple page table and a toy lan-
guage. In that work we concentrated on showing that the logic of the framework
is indeed an instance of abstract separation logic [5] and that it supports the
usual separation logic reasoning, including the frame rule. Here, we concentrate
on making the framework applicable to the verification of real C code. We have
instantiated the framework to the high-fidelity memory model for C by Tuch et
al [24] and connected it with the same C-parsing infrastructure for Isabelle/HOL
that was used there. On the hardware side, we have instantiated the framework
to a detailed and precise model of ARMv6 2-level hardware page tables. To
our knowledge, this is the first formalisation of the ARMv6 memory translation
mechanism. The resulting instantiation is a foundational, yet practical verifica-
tion framework for a large subset of standard C99 [13] with the ability to reason
about the effects of virtual memory when necessary and the ability to reason
abstractly in the traditional separation logic style when virtual memory is not
the focus.
The separation logic layer of the framework makes three additional basic pred-
icates available: mapping from a virtual address to a value, mapping from a phys-
ical address to a value, and mapping from a virtual to a physical address. For
the user of the framework, these integrate seamlessly with other separation logic
formulae and they support all expected, traditional reasoning principles. Inside
the framework, we invest significant effort to provide this nice abstraction, to
support the frame rule, and to shield the verification user from the considerable
complexity of the hardware page table layout in a modern architecture.
Our envisaged application area for this framework is low-level OS kernel code
that manipulates page tables and user-level page fault handlers in microkernel
systems. To stay in the same, foundational framework, it can also be used for
the remaining OS kernel without any significant reasoning overhead in a
separation logic setting. Our direct application area is the verification of the seL4
microkernel [8].
¹ The ARMv6 is a popular processor architecture for embedded systems, such as the
iPhone or Android.
The remainder of this paper is structured as follows. After introducing nota-
tion in Sect. 2, we describe in Sect. 3 an abstract type class for encoding arbitrary
C types in memory. Sect. 4 describes our abstract, generic page table framework
and Sect. 5 instantiates this to ARMv6. Sect. 6 integrates virtual memory into
our abstract separation logic framework, first at the byte level, and then at the
structured types level. Sect. 7 makes the connection to C, and, finally, Sect. 8
discusses how translation caching mechanisms can be integrated into the model.
2 Notation
The option type adjoins a new element None to a type 'a. We use 'a option to model partial
functions, writing ⌊a⌋ instead of Some a and 'a ⇀ 'b instead of 'a ⇒ 'b option.
The Some constructor has an underspecified inverse called the, satisfying the ⌊x⌋
= x. Lifting functions to the option type is achieved by
option-map = (λf y. case y of None ⇒ None | ⌊x⌋ ⇒ ⌊f x⌋)
Function update is written f (x := y) where f :: 'a ⇒ 'b, x :: 'a and y :: 'b, and
f (x → y) stands for f (x := Some y). Finite integers are represented by the
type 'a word where 'a determines the word length in bits. The type supports
the usual bit operations like left-shift (<<) and bitwise and (&&). The function
unat converts to natural numbers (u for unsigned). Separation logic uses the
concepts of disjoint maps ⊥ and map addition ++. They are defined below.
m₁ ⊥ m₂ ≡ dom m₁ ∩ dom m₂ = ∅
m₁ ++ m₂ ≡ λx. case m₂ x of None ⇒ m₁ x | ⌊y⌋ ⇒ ⌊y⌋
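As an informal illustration outside Isabelle/HOL, the notation above can be mimicked in Python. This is only a sketch under stated assumptions: partial maps are modelled as dictionaries, None stands for the None of the option type (so a stored value of None cannot be distinguished from an absent one), and the function names are hypothetical.

```python
def option_map(f, y):
    """Lift f to optional values: None stays None, x becomes f(x)."""
    return None if y is None else f(y)

def disjoint(m1, m2):
    """m1 ⊥ m2: the two partial maps have no key in common."""
    return not (m1.keys() & m2.keys())

def map_add(m1, m2):
    """m1 ++ m2: entries of m2 take precedence over those of m1."""
    merged = dict(m1)
    merged.update(m2)
    return merged
```

For disjoint maps, map addition is commutative, which is the case separation logic relies on.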
class to represent these types. This section describes the abstract operations of
this class and its axioms. The first such operations are serialising and restoring
a value into and from bytes:
to-bytes :: 't::mem-type ⇒ byte list
from-bytes :: byte list ⇒ 't::mem-type
from-bytes (to-bytes v) = v
For a particular type, all values occupy the same, non-zero number of bytes in
memory. We will refer to the number of these bytes as the size. The length of a
type’s serialisation is equal to its size. The term TYPE('t) of type 't itself makes
an Isabelle type available as a term.
size-of :: 't::mem-type itself ⇒ nat
length (to-bytes v) = size-of TYPE('t)
0 < size-of TYPE('t)
For treating types as first-class values, we require each to map to a unique tag:
type-tag :: t::mem-type itself ⇒ type-tag
In order to respect the alignment requirements of C types, mem-type instances
carry alignment information. Types may only be aligned to sizes which are di-
visors of both the physical and virtual address space sizes:
align-of :: t::mem-type itself ⇒ nat
align-of TYPE( a) dvd memory-size ∧ align-of TYPE( a) dvd addr-space-size
The model we present in this paper allows representation of all packed C types,
i.e. atomic types such as int, array, and structs without padding. Tuch’s work on
structured C types [23] demonstrates how to extend this model to allow padding.
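The axioms of this section can be spot-checked on a concrete example. The following Python sketch is not part of the formal development: the class U32, the little-endian byte order, and the two address-space sizes are assumptions chosen for illustration.

```python
import struct

MEMORY_SIZE = 2 ** 32        # assumed physical address space size
ADDR_SPACE_SIZE = 2 ** 32    # assumed virtual address space size

class U32:
    """Hypothetical mem-type instance: a packed 32-bit unsigned word."""
    SIZE = 4    # plays the role of size-of
    ALIGN = 4   # plays the role of align-of

    @staticmethod
    def to_bytes(v):
        return list(struct.pack("<I", v))   # little-endian is an assumption

    @staticmethod
    def from_bytes(bs):
        return struct.unpack("<I", bytes(bs))[0]

def check_mem_type_axioms(ty, samples):
    """Check the round-trip, length, size and alignment axioms on samples."""
    round_trip = all(ty.from_bytes(ty.to_bytes(v)) == v for v in samples)
    length_ok = all(len(ty.to_bytes(v)) == ty.SIZE for v in samples)
    align_ok = MEMORY_SIZE % ty.ALIGN == 0 and ADDR_SPACE_SIZE % ty.ALIGN == 0
    return round_trip and length_ok and ty.SIZE > 0 and align_ok
```

Such a check only samples values; the type class axioms themselves are statements about all values of the type.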
4 Virtual Memory
This section defines addressing and pointer conventions and describes our ab-
stract interface to page table encodings.
where a is the underlying address size (e.g. 32 word for 32-bit) and p is a tag:
one of physical or virtual. For particular architectures, we instantiate addr-t into
specific virtual and physical addresses. For the ARMv6 both virtual and physical
addresses are 32-bit words, yielding the instantiations:
vaddr = (32 word, virtual) addr-t
paddr = (32 word, physical) addr-t
ARMv6 is capable of natively addressing 8, 16 and 32 bit values in memory
(corresponding to char, short and int in C). We have shown that these are
instances of mem-type. We use addr-val (Addr a) = a to extract the address.
We now introduce our abstract interface to page table encodings. There are many
such possible encodings: one-level tables, fixed multi-level tables, variable-depth
guarded page tables or even just hash tables. Usually, mappings are encoded in
blocks of addresses (pages, superpages, etc.), which are hardware-defined. The
page table also encodes extra information such as permissions and hardware-
defined flags. We generalise our previous abstract page table interface [16] slightly
to accommodate multiple page sizes and briefly summarise the other definitions.
ptable-lift :: (paddr ⇀ 'val) ⇒ 'base ⇒ vaddr ⇀ paddr
ptable-trace :: (paddr ⇀ 'val) ⇒ 'base ⇒ vaddr ⇒ paddr set
get-page :: (paddr ⇀ 'val) ⇒ 'base ⇒ vaddr ⇒ 'a
We use ptable-lift to extract a virtual map from memory, ptable-trace to find
all the physical addresses used looking up a virtual to a physical address, and
get-page to find which page a virtual address is on including any machine-specific
flags (such as permissions) that might be attached to it. The types paddr and
vaddr represent physical and virtual pointers, while 'base says where we can
find the page table in physical memory (e.g. the root of a two-level page table).
We leave 'a for a generic representation of what a page is.
In order to reason about memory access in the presence of a page table, we
require page table functions to conform to the rules in Fig. 1. Firstly, changing
memory in areas not related to a page table lookup must not affect the lookup:
if evaluation of ptable-lift and ptable-trace succeeds on a smaller heap, it will also
succeed on a larger one. This corresponds to the safety monotonicity property of
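p ∉ ptable-trace h r vp    ptable-lift h r vp = p
--------------------------------------------------
ptable-trace (h(p → v)) r vp = ptable-trace h r vp

p ∉ ptable-trace h r vp    ptable-lift h r vp = p
--------------------------------------------------
ptable-lift (h(p → v)) r vp = p

ptable-lift (h₀ ++ h₁) r vp = p    h₀ ⊥ h₁
--------------------------------------------------
ptable-lift h₀ r vp = p ∨ ptable-lift h₀ r vp = None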
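To make the intent of these rules concrete, here is a small Python sketch of a toy one-level page table that satisfies them on an example. The encoding is entirely hypothetical (one heap cell per entry, holding a frame base; PAGE, ROOT and H are made up) and is unrelated to the ARMv6 instantiation.

```python
PAGE = 16    # hypothetical page size
ROOT = 100   # hypothetical page table base

def ptable_lift(h, root, vp):
    """Translate vp, or return None if its entry is not in the heap."""
    entry = h.get(root + vp // PAGE)
    return None if entry is None else entry + vp % PAGE

def ptable_trace(h, root, vp):
    """Physical addresses read during a successful lookup of vp."""
    slot = root + vp // PAGE
    return {slot} if slot in h else set()

def update(h, p, v):
    """Functional heap update h(p -> v)."""
    return {**h, p: v}

# Example heap: two page table entries and one data cell.
H = {100: 0x200, 101: 0x300, 0x205: 7}
```

Writing to the data cell 0x205 leaves both the trace and the translation of virtual address 5 unchanged, and a lookup that succeeds on a combined heap either gives the same result on a sub-heap or fails there.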
The function works by looking up a virtual address just like the ARM hardware.
First, we look at the top 12 bits of the address as an index into the page directory.
We then shift the index left by 2, as each PDE is 4 bytes in size, and add it to the
base address of the page directory (root). We decode the PDE at this address to
decide what to do next: fail on invalid/reserved, pass through the base address
for sections/supersections, and go look in the second-level table in the case of a
PTE pointer. We omit the definitions of decode-pde and decode-pte; they work
as described in the ARMv6 manual [3]. Second-level lookup is defined similarly:
Starting at the physical address of the second-level table, we use the next 8 bits
of the virtual address (bits 12-19) as an index, decode the PTE there, fail on
invalid or return the base address of the frame along with its size.
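The index arithmetic described above can be sketched in Python as follows. Only the bit positions are taken from the text; the helper names pde_address, pte_address and translate are hypothetical, descriptor decoding is left out, and the frame size is passed in as a number of offset bits (for instance 20 for a 1 MB section or 12 for a 4 KB page).

```python
MASK32 = 0xFFFFFFFF

def vaddr_pd_index(va):
    """Top 12 bits of a 32-bit virtual address: page directory index."""
    return (va >> 20) & 0xFFF

def vaddr_pt_index(va):
    """Next 8 bits (bits 12-19): index into a second-level table."""
    return (va >> 12) & 0xFF

def pde_address(root, va):
    """Address of the 4-byte page directory entry for va."""
    return (root + (vaddr_pd_index(va) << 2)) & MASK32

def pte_address(pt_base, va):
    """Address of the 4-byte second-level entry for va."""
    return (pt_base + (vaddr_pt_index(va) << 2)) & MASK32

def translate(frame_base, frame_bits, va):
    """Add the in-frame offset (low frame_bits bits of va) to the frame base."""
    return (frame_base + (va & ((1 << frame_bits) - 1))) & MASK32
```

For the virtual address 0x12345678, the page directory index is 0x123 and the second-level index is 0x45.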
Using get-frame, we can then implement the main lookup function ptable-lift
by masking out the appropriate bits from the virtual address and adding them
to the physical address of the frame:
addr-seq p 0 = []
addr-seq p (Suc n) = p·addr-seq (p + 1) n
The final function needed to instantiate the abstract page table model from
Sect. 4 is ptable-trace. The trace contains the bytes in any page directory or
table entry which has successfully contributed to looking up the virtual address:
ptable-trace h root vp ≡
  let vp-val = addr-val vp; pd-idx-offset = vaddr-pd-index vp-val << 2;
      pt-idx-offset = vaddr-pt-index vp-val << 2;
      pd-touched = set (addr-seq (root + pd-idx-offset) 4);
      pt-touched = λpt-base. set (addr-seq (pt-base + pt-idx-offset) 4)
  in case decode-pde h (root + pd-idx-offset) of None ⇒ ∅
     | ⌊PageTablePDE pt-base⌋ ⇒ pd-touched ∪ pt-touched pt-base
     | ⌊-⌋ ⇒ pd-touched
We have proved that the ptable-lift, ptable-trace and get-page functions in this sec-
tion instantiate the abstract model from Sect. 4, including the axioms of Fig. 1.
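A Python rendering of addr-seq and ptable-trace may help in reading the definition above. It is a sketch only: since decode-pde is omitted in the text, the decoder is passed in as a parameter, and the stand-in toy_decode, which reads already-decoded tuples from a dictionary, is purely hypothetical.

```python
MASK32 = 0xFFFFFFFF

def addr_seq(p, n):
    """The n consecutive 32-bit addresses starting at p."""
    return [(p + i) & MASK32 for i in range(n)]

def ptable_trace(decode_pde, h, root, va):
    """Bytes of the directory and table entries used to look up va."""
    pd_off = ((va >> 20) & 0xFFF) << 2
    pt_off = ((va >> 12) & 0xFF) << 2
    pd_touched = set(addr_seq(root + pd_off, 4))
    pde = decode_pde(h, root + pd_off)
    if pde is None:                 # invalid or reserved entry
        return set()
    kind, base = pde
    if kind == "page_table":        # pointer to a second-level table
        return pd_touched | set(addr_seq(base + pt_off, 4))
    return pd_touched               # section or supersection

def toy_decode(h, addr):
    """Hypothetical decoder: the heap stores decoded entries directly."""
    return h.get(addr)
```

With a page directory at 0x4000 and a second-level table at 0x8000, looking up 0x12345678 touches the four bytes at 0x448C and the four bytes at 0x8114.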
Based on our abstract page table interface of Sect. 4, we can now construct a
separation logic framework for reasoning about pointer programs with types.
This framework is independent of the particular page table instantiation.
Separation logic [18] is a tool for conveniently reasoning about memory and
aliasing. It views memory as a partial heap from addresses to values, allowing
for predicates which precisely state which part of the heap they hold on. At
its core is the concept of separating conjunction: when the assertion P ∧∗ Q
holds on a heap, the heap can be split into two disjoint parts, where P holds
on one part and Q on the other. Predicates which precisely define the domain
of the heap they hold on allow for convenient local reasoning. This leads to the
concept of local actions and the frame rule: for an action f , we can conclude
{P ∧∗ R} f {Q ∧∗ R} from {P } f {Q} for any R. This expresses that the actions
of f are local to the heaps described by P and Q, and therefore cannot affect
any separate heap described by R. We also say that predicates consume parts of
the heap under separating conjunction, because other predicates cannot depend
on the same parts of this heap.
The basic assertion of separation logic is the maps-to arrow, holding on a
heap containing only one address-value pair. From this simple assertion, more
complex ones can be built. For a simple heap (paddr byte) it takes the form:
(address → value) h ≡ h address = value ∧ dom h = {address}
Under separating conjunction, it consumes address in the heap. Tuch et al extend
this basic concept all the way to reasoning about C code with structures [23].
Fig. 3. The three maps-to assertions
A naive addition of virtual memory to separation logic breaks the concept of
separating conjunction, the frame rule, as well as the assumption of Tuch’s work of
values being stored contiguously in the heap. In previous work [16], we addressed
the first two in a simplified setting. In this section, we solve them in a realistic
setting and extend them to reasoning about typed pointers. We introduce new
maps-to arrows, as well as a new, more complex state that we use instead of a
simple heap.
Our eventual goal is to be able to write the new arrows of Fig. 3 with physical
or virtual addresses on the left and complex, typed C values on the right. The
new arrows in Fig. 3 describe (from left to right): mappings from physical address
to value, from virtual to physical address, and from virtual address to value. The
next section will introduce arrows that allow raw, single bytes and explicit type
information on the right. The section after that will lift this information to allow
structured C types on the right.
Following Tuch et al and our own previous work, to support both types and
virtual memory, we annotate the heap with extra information, extending the
state for our assertions in a first step to:
(paddr ⇀ type-data × byte) × ptable-base
where ptable-base is any extra information needed by the virtual memory sub-
system, such as the page table root (paddr in the case of ARMv6); type-data
annotates which higher-level type a byte is part of. On this level it is just passed
through; we will explain its purpose in Sect. 6.2.
Fig. 4. Two virtual addresses resolving through the same page table entry
For our maps-to assertions to be useful in separation logic, we must define which
parts of the heap they consume (what their domain is). Here we run into a problem,
illustrated in Fig. 4: two distinct virtual addresses map to two values via distinct
physical addresses, but using the same page table entry for the lookup. Writing to
one virtual pointer does not affect the value at the other, so in this sense the two
maps-to predicates are separate. However, a single page table entry is involved in
the lookup of both virtual
pointers. Under separating conjunction we can allow the entry to be consumed
by either mapping or neither mapping, but not both mappings. If one consumes
it, the other lacks information for a successful lookup. If neither consumes it,
we lose locality: we could state the entry is separate from both mappings even
though updating the entry can affect both virtual addresses!
The solution to this problem is to divide the page table entry up into two parts
and share the slices between the maps-to predicates involved in the separating
conjunction. This idea is similar to that of the fractional permission model of
Bornat [4], with three important differences. Firstly, we do not wish to perform
any explicit accounting of fractions in the most common case of the page table
not being modified. Secondly, the number of virtual addresses an entry can map
varies with the type of page table and the size of the mapped page. Thirdly,
we want to utilise rather than recreate the proofs about partial maps and map
disjunction in Isabelle/HOL. These issues are addressed by using a constant,
large-enough number of slices for entries in the heap and placing them in the
domain. The maximum useful number of slices is one entry mapping all virtual
addresses. Thus our final state for assertions is:
fheap-state = (paddr × vaddr ⇀ type-data × byte) × ptable-base
We refer to the first component of this state as the typed, fragmented heap tfh.
With this new state, our physical memory maps-to predicate becomes:
p :→p v ≡ λ(h, r). (∀vp. h (p, vp) = ⌊v⌋) ∧ dom h = {p} × U
Like the simple maps-to predicate shown earlier, the heap at address p evaluates
to value v. In the new state, it does so for all vp slices. The domain covers all
slices of p, i.e. the universal set U. This arrow works for the physical-to-value
level. To define the virtual-to-physical arrow, we use our abstract page table
interface. Unfortunately, this page table model knows nothing about slices and
type annotations. So, to perform a lookup on vp, we derive a view of the heap
tfh containing only slices associated with vp and discard type annotations:
h-view tfh vp ≡ option-map snd ◦ (λp. tfh (p, vp))
We can now define the virtual-to-physical arrow for mapping vp to p. It is just
a ptable-lift on a heap made of slices associated with vp. The assertion consumes
the vp slice of each byte used in its lookup, i.e. in ptable-trace:
vp :→v p ≡ λ(h, r). let heap = h-view h vp; vmap = ptable-lift heap r
           in vmap vp = ⌊p⌋ ∧ dom h = ptable-trace heap r vp × {vp}
The virtual-to-value mapping is then just the separating conjunction of virtual-
to-physical and physical-to-value.
vp :→ v ≡ λs. ∃ p. (vp :→v p ∧∗ p :→p v ) s
P ∧∗ Q ≡ λ(h, r ). ∃ h 0 h 1 . h 0 ⊥ h 1 ∧ h = h 0 ++ h 1 ∧ P (h 0 , r ) ∧ Q (h 1 , r )
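The following Python sketch exercises these definitions on a tiny example and shows how slices let two virtual addresses share one page table entry. Everything concrete here is an assumption made for illustration: the two-element virtual address space VADDRS, the toy one-level page table, the type tags "pte" and "byte", and the brute-force search over heap splits in sep_conj.

```python
from itertools import combinations

VADDRS = (0, 1)   # hypothetical, tiny virtual address space (the set U)
PAGE = 2          # hypothetical page size: both addresses share one entry
ROOT = 0          # hypothetical page table base

def ptable_lift(heap, root, vp):
    entry = heap.get(root + vp // PAGE)
    return None if entry is None else entry + vp % PAGE

def ptable_trace(heap, root, vp):
    slot = root + vp // PAGE
    return {slot} if slot in heap else set()

def h_view(tfh, vp):
    """Keep only the vp slices and drop the type annotation."""
    return {p: cell[1] for (p, s), cell in tfh.items() if s == vp}

def maps_to_p(p, v):
    """Physical-to-value arrow: every slice of p holds v, and nothing else."""
    return lambda st: st[0] == {(p, vp): v for vp in VADDRS}

def maps_to_v(vp, p):
    """Virtual-to-physical arrow: consumes the vp slice of the lookup trace."""
    def pred(st):
        h, r = st
        heap = h_view(h, vp)
        return (ptable_lift(heap, r, vp) == p
                and set(h) == {(q, vp) for q in ptable_trace(heap, r, vp)})
    return pred

def sep_conj(P, Q):
    """Separating conjunction, by trying every split of the heap."""
    def pred(st):
        h, r = st
        keys = list(h)
        for k in range(len(keys) + 1):
            for left in combinations(keys, k):
                h0 = {key: h[key] for key in left}
                h1 = {key: h[key] for key in keys if key not in h0}
                if P((h0, r)) and Q((h1, r)):
                    return True
        return False
    return pred

def maps_to(vp, v):
    """Virtual-to-value arrow: some p with vp :->v p and p :->p v."""
    def pred(st):
        phys = {p for (p, _) in st[0]}
        return any(sep_conj(maps_to_v(vp, p), maps_to_p(p, v))(st) for p in phys)
    return pred

# One page table entry at address 0 (frame base 10), two data bytes.
TFH = {(0, 0): ("pte", 10), (0, 1): ("pte", 10),
       (10, 0): ("byte", 7), (10, 1): ("byte", 7),
       (11, 0): ("byte", 9), (11, 1): ("byte", 9)}
```

In this example the separating conjunction of the two virtual-to-value assertions holds on TFH, because each assertion consumes only its own slice of the shared entry. Removing one slice of the entry makes the conjunction fail.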
For any of these levels, we can define the usual arrow variations [18]:
(p :→ –) s ≡ ∃v. (p :→ v) s
(p :↪ v) s ≡ (p :→ v ∧∗ sep-true) s
(p :↪ –) s ≡ ∃v. (p :→ v ∧∗ sep-true) s
sep-true ≡ λs. True
One property of this framework is that it is mostly independent of the value
space, the right-hand side of the maps-to arrows. Only in the interface to the page
table have we touched it at all, and then only to discard additional type informa-
tion. The basic assertions we get from this section are of the form vp :→ (b, t )
where b is the byte at virtual address vp, and t is the associated type annotation.
This section lifts the arrows for bytes and type information we have just defined
to higher-level, typed assertions for any mem-type values. We define the concept
We can now define maps-to predicates on typed pointers. Like Tuch et al [24]
we employ an arbitrary guard on the pointer itself to enforce constraints such as
alignment. We have not found it necessary yet to let the guard depend on the
state, but this could be added easily. Compared to Tuch et al, lifting sequences
of bytes to structured values is much simpler, because we already have byte-level
assertions available. Between virtual and physical levels only the arrows differ.
g p →p v ≡ ptr-seq p TYPE('t) [:→p] value-seq v ⌊∧⌋ (λs. g p)
g vp →v p ≡ ptr-seq vp TYPE('t) [:→v] ptr-seq p TYPE('t) ⌊∧⌋ (λs. g vp)
g vp → v ≡ ptr-seq vp TYPE('t) [:→] value-seq v ⌊∧⌋ (λs. g vp)
7 Connecting with C
In this section, we will connect the framework to C and define loading and storing
of typed values in the program state. In the previous section, we have enriched
the usual C heap with additional information: slices for specifying the domain
of predicates under separating conjunction and type annotation information.
We therefore need to be careful to not introduce unwanted dependencies on the
additional information in the state and we need to make sure that C updates
operate consistently on the extended state. We formalise load and store for vir-
tually addressed access. Direct physical access would be similar, but simpler.
In C, loading and storing are total functions. Loading from a wrongly typed
or unmapped address or storing to it will produce garbage. For our intended
application (seL4), we do not need to model page faults directly, but we anno-
tate the C program with guards that make sure no page faults will occur. These
annotations are added automatically during the translation into Isabelle/HOL
and will produce proof obligations. Should a page-fault model be required for
different applications, it is easy to add: an access to an unmapped page, instead
of a guard, simply produces a branch to the page fault handler.
For a generic map h from pointers p to values, loading a mem-type value at p
is merely loading its size’s worth of sequential bytes starting at p (load-list-basic),
making sure h contains no gaps in that range (deoption-list) and passing it to
from-bytes from the type class interface.
load-list-basic h 0 p = []
load-list-basic h (Suc n) p = h p·load-list-basic h n (p + 1)
deoption-list xs ≡ if None ∈ set xs then None else ⌊map the xs⌋
load-list h n p ≡ deoption-list (load-list-basic h n p)
load-value h p ≡ option-map from-bytes (load-list h (size-of TYPE( t)) p)
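These load functions translate almost directly into Python. In the sketch below, the heap is a dictionary from addresses to bytes, and the type's size and from-bytes function are passed explicitly because Python has no type classes; the little-endian decoder le_word is an assumption for the example.

```python
def load_list_basic(h, n, p):
    """The n heap cells starting at p, with None for missing cells."""
    return [h.get(p + i) for i in range(n)]

def deoption_list(xs):
    """None if any element is missing, otherwise the plain list."""
    return None if any(x is None for x in xs) else list(xs)

def load_list(h, n, p):
    return deoption_list(load_list_basic(h, n, p))

def load_value(h, p, size, from_bytes):
    """Load a value of the given size at p, or None if the range has a gap."""
    bs = load_list(h, size, p)
    return None if bs is None else from_bytes(bs)

def le_word(bs):
    """Hypothetical from-bytes for a little-endian unsigned word."""
    return int.from_bytes(bytes(bs), "little")
```

A load succeeds only when every byte in the range is present; shifting the start address by one in a four-byte heap yields None.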
A pointer access in C is then just an application of load-value to the address-
space view of memory, ignoring any read failures. We drop the additional type
information that is only used in assertions, not in C, resulting in the heap type
load-value expects. The as-view function is similar to h-view, but uses ptable-lift
to arrive at a map from virtual addresses to values.
load-value-c s vp ≡ the (load-value (as-view s) (ptr-val vp))
As mentioned above, this function is total. The guard generated for each such
access is c-guard vp ↪ –, ensuring that load-value-c will produce a valid
result. The predicate c-guard p ensures that p is not Null and is correctly aligned
for its type size.
Heap updates are similar. For a single physical address, we update all slices at
that address and we leave the type annotation untouched. We can ignore entries
with None, because, again, the generated guard c-guard vp ↪ – will ensure
this case does not occur. We then lift the single-byte update first to the virtual
layer to provide address translation via vmap-view, and then like in Tuch et al
to byte sequences to accommodate structured types.
tfheap-update tfh p v ≡
  λppv. if fst ppv = p then option-map (λ(td, -). (td, v)) (tfh ppv)
        else tfh ppv
state-update-v s vp v ≡
  case vmap-view s vp of None ⇒ s | ⌊p⌋ ⇒ (tfheap-update (fst s) p v, snd s)
state-update-v-list s [] = s
state-update-v-list s ((vp, v)·us) = state-update-v-list (state-update-v s vp v) us
c-state-update vp v s ≡ state-update-v-list s (zip (ptr-seq vp TYPE('a)) (to-bytes v))
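The update functions can be sketched in Python on the sliced heap. The sketch rests on several assumptions: the toy one-level page table and its constants are hypothetical, vmap_view is taken to be a ptable-lift on the slice view of the heap, and c_state_update receives the value already serialised as a list of bytes.

```python
PAGE = 2   # hypothetical page size of the toy page table

def ptable_lift(heap, root, vp):
    entry = heap.get(root + vp // PAGE)
    return None if entry is None else entry + vp % PAGE

def h_view(tfh, vp):
    """Keep only the vp slices and drop the type annotation."""
    return {p: cell[1] for (p, s), cell in tfh.items() if s == vp}

def vmap_view(state, vp):
    """Assumed reading of vmap-view: translate vp through the page table."""
    tfh, root = state
    return ptable_lift(h_view(tfh, vp), root, vp)

def tfheap_update(tfh, p, v):
    """Write v to all slices at p, leaving type annotations untouched."""
    return {key: ((cell[0], v) if key[0] == p else cell)
            for key, cell in tfh.items()}

def state_update_v(state, vp, v):
    p = vmap_view(state, vp)
    if p is None:                   # unmapped: state is left unchanged
        return state
    return (tfheap_update(state[0], p, v), state[1])

def c_state_update(vp, data, state):
    """Store the bytes in data at consecutive virtual addresses from vp."""
    for offset, byte in enumerate(data):
        state = state_update_v(state, vp + offset, byte)
    return state

# One page table entry at address 0 (frame base 10), two data bytes.
TFH = {(0, 0): ("pte", 10), (0, 1): ("pte", 10),
       (10, 0): ("byte", 7), (10, 1): ("byte", 7),
       (11, 0): ("byte", 9), (11, 1): ("byte", 9)}
```

Storing two bytes at virtual address 0 rewrites every slice of physical addresses 10 and 11 and leaves the page table entry as it was.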
For interfacing to C code, we have adapted the C parser of Tuch et al [24]. It
translates a significant subset of the C99 programming language into SIMPL [19],
a generic, imperative language framework in Isabelle/HOL.
As in the framework by Tuch et al, we cannot prove the frame rule generically,
but we can prove it automatically for each individual program. This automatic
proof ultimately reduces everything to valid memory accesses and updates, based
on the following rule:
(c-guard vp → – ∧∗ P) s
------------------------------------------------
(c-guard vp → v ∧∗ P) (c-state-update vp v s)
With P = sep-true, this rule becomes the state update rule by Tuch et al, corre-
sponding to the assignment axiom in standard separation logic. The correspond-
ing rule for memory access holds as well, of course:
(g vp → v) s
---------------------
load-value-c s vp = v
Fig. 5 shows an excerpt of typical page table manipulation code that this frame-
work can handle. The last line of this code, for instance, would be translated
into the following SIMPL statement with guard:
Guard C-Guard {|c-guard ´ptSlot|}
(´globals :== heap-upd (c-state-update ´ptSlot ´pte))
The heap-upd function updates the C heap (our extended state) which is merely
a global variable in the semantics of the C program. The guard statement Guard
throws the guard error C-Guard if the condition {|c-guard ´ptSlot |} is false, and
otherwise executes the statement. In previous work [16], we have conducted a
detailed case study demonstrating how page table manipulations can be verified
in this framework for a simple, one-level page table. Reasoning on the C and
ARM level has precisely the same structure; it just involves more detail.
8 Translation Caching
Page table lookups are expensive; they potentially involve multiple memory
reads. To decrease this cost, these lookups are cached in most architectures
in a translation lookaside buffer (TLB). Abstractly, the TLB can be seen as a
finite, small set of virtual-to-physical mappings. They may include lookups for
code instructions as well as data. It is architecture-dependent whether these are
handled separately from each other or not, how large the TLBs are, and when a
mapping is removed from the TLB and replaced by another. Most architectures
provide assembler instructions for explicitly removing all or specific mappings
from the TLB, which is called flushing.
Although the page table should ultimately define what a mapping is, the hard-
ware will always first consult the TLB and ignore the contents of the page table if
a TLB entry is found. When we change the page table and the TLB contains the
mapping being changed, we may introduce an inconsistency. This inconsistency
can be resolved by flushing the TLB such that the new page table contents will
be loaded for future lookups. However, indiscriminate TLB flushes are expen-
sive, because they will incur additional memory reads. Kernel programmers like
to optimise by deferring TLB flushes as far as possible and by making them as
specific as possible.
In our model, we can add the TLB by reducing it to its safety-relevant content:
whether the lookup for any specific virtual address may be inconsistent or not.
What makes a TLB entry inconsistent is a change to the page table. We can
turn this view around and instead keep track of inconsistent page table entries,
that is, those that have been written to since the last flush. We can reduce machinery
by not caring whether a memory location currently is a page table entry or not:
we just keep track of all locations that have been changed since the last TLB
flush. If any memory read or write involves a page table entry whose location is
in this set, the TLB might be inconsistent for this lookup. We can now generate
guards that test for this case and require us to prove its absence.
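The bookkeeping described here is small enough to sketch in a few lines of Python. The class and method names are hypothetical; the sketch only records which locations were written since the last flush and checks a lookup's trace against that set.

```python
class TlbModel:
    """Tracks locations written since the last TLB flush."""

    def __init__(self):
        self.dirty = set()

    def write(self, paddr):
        """Record a memory write, whether or not paddr is a table entry."""
        self.dirty.add(paddr)

    def flush(self):
        """A TLB flush makes all cached translations consistent again."""
        self.dirty.clear()

    def lookup_consistent(self, trace):
        """Guard: no location used by this lookup was written since the flush."""
        return self.dirty.isdisjoint(trace)
```

A write to an unrelated location leaves a lookup consistent, a write into its trace does not, and a flush restores consistency.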
This TLB model integrates nicely with separating conjunction, because the
set mentioned above can be implemented as an additional boolean next to the
type information on the right-hand side of the maps-to arrow. Apart from the
type, none of the generic framework definitions would need to change.
9 Related Work
Our work touches three main areas: separation logic, virtual memory, and C
verification. For an overview on OS verification in general, see Klein [14].
Separation logic was originally conceived by O’Hearn and Reynolds et al.
[12,18] and has been formalised in mechanised theorem proving systems be-
fore [25,1]. We enhance these models with the ability to reason about properties
on virtual memory, adding new basic predicates, but preserving the feel and
reasoning principles of separation logic.
References
1. Affeldt, R., Marti, N.: Separation logic in Coq (2008),
http://savannah.nongnu.org/projects/seplog
2. Alkassar, E., Schirmer, N., Starostin, A.: Formal pervasive verification of a pag-
ing mechanism. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS,
vol. 4963, pp. 109–123. Springer, Heidelberg (2008)
3. ARM Limited. ARM Architecture Reference Manual (June 2000)
4. Bornat, R., Calcagno, C., O’Hearn, P., Parkinson, M.: Permission accounting in
separation logic. In: Proc. 32nd POPL, pp. 259–270. ACM, New York (2005)
5. Calcagno, C., O’Hearn, P.W., Yang, H.: Local action and abstract separation logic.
In: Proc. 22nd LICS, pp. 366–378. IEEE Computer Society, Los Alamitos (2007)
6. Cohen, E., Moskal, M., Schulte, W., Tobies, S.: A precise yet efficient memory
model for C (2008),
http://research.microsoft.com/apps/pubs/default.aspx?id=77174
7. Dalinger, I., Hillebrand, M.A., Paul, W.J.: On the verification of memory man-
agement mechanisms. In: Borrione, D., Paul, W.J. (eds.) CHARME 2005. LNCS,
vol. 3725, pp. 301–316. Springer, Heidelberg (2005)
8. Elphinstone, K., Klein, G., Derrin, P., Roscoe, T., Heiser, G.: Towards a practical,
verified kernel. In: Proc. 11th HOTOS, pp. 117–122 (2007)
9. Filliâtre, J.-C., Marché, C.: Multi-prover verification of C programs. In: Davies, J.,
Schulte, W., Barnett, M. (eds.) ICFEM 2004. LNCS, vol. 3308, pp. 15–29. Springer,
Heidelberg (2004)
10. Fox, A.: Formal specification and verification of ARM6. In: Basin, D., Wolff, B.
(eds.) TPHOLs 2003. LNCS, vol. 2758, pp. 25–40. Springer, Heidelberg (2003)
11. Hillebrand, M.: Address Spaces and Virtual Memory: Specification, Implementa-
tion, and Correctness. PhD thesis, Saarland University, Saarbrücken (2005)
12. Ishtiaq, S.S., O’Hearn, P.W.: BI as an assertion language for mutable data struc-
tures. In: Proc. 28th POPL, pp. 14–26. ACM, New York (2001)
13. Programming languages—C, ISO/IEC 9899:1999 (1999)
14. Klein, G.: Operating system verification—An overview. Sādhanā 34(1), 27–69
(2009)
15. Klein, G., Tuch, H.: Towards verified virtual memory in L4. In: Slind, K. (ed.)
TPHOLs Emerging Trends 2004, Park City, Utah, USA (2004)
16. Kolanski, R., Klein, G.: Mapped separation logic. In: Shankar, N., Woodcock, J.
(eds.) VSTTE 2008. LNCS, vol. 5295, pp. 15–29. Springer, Heidelberg (2008)
17. Mürk, O., Larsson, D., Hähnle, R.: KeY-C: A tool for verification of C programs. In:
Pfenning, F. (ed.) CADE 2007. LNCS, vol. 4603, pp. 385–390. Springer, Heidelberg
(2007)
18. Reynolds, J.C.: Separation logic: A logic for shared mutable data structures. In:
Proc. 17th IEEE Symposium on Logic in Computer Science, pp. 55–74 (2002)
19. Schirmer, N.: Verification of Sequential Imperative Programs in Isabelle/HOL. PhD
thesis, Technische Universität München (2006)
20. Tews, H.: Formal methods in the Robin project: Specification and verification of
the Nova microhypervisor. In: C/C++ Verification Workshop, Technical Report
ICIS-R07015, Oxford, UK, July 2007, pp. 59–68. Radboud University Nijmegen
(2007)
21. Tews, H., Weber, T., Völp, M.: Formal memory models for the verification of low-
level operating-system code. JAR 42(2–4), 189–227 (2009)
22. Tuch, H.: Formal Memory Models for Verifying C Systems Code. PhD thesis, School
Comp. Sci. & Engin., University NSW, Sydney 2052, Australia (August 2008)
23. Tuch, H.: Formal verification of C systems code: Structured types, separation logic
and theorem proving. JAR 42(2–4), 125–187 (2009)
24. Tuch, H., Klein, G., Norrish, M.: Types, bytes, and separation logic. In: Hofmann,
M., Felleisen, M. (eds.) POPL 2007, pp. 97–108. ACM, New York (2007)
25. Weber, T.: Towards mechanized program verification with separation logic. In:
Marcinkowski, J., Tarlecki, A. (eds.) CSL 2004. LNCS, vol. 3210, pp. 250–264.
Springer, Heidelberg (2004)
Acyclic Preferences and
Existence of Sequential Nash Equilibria:
A Formal and Constructive Equivalence
Stéphane Le Roux,
1 Introduction
In game theory a few classes of games, together with related concepts, can model
a wide range of real-world competitive interactions between agents. Game theory
is applied to economics, biology, computer science, political science, etc.
Sequential games (a.k.a. games in extensive form) are a widely studied class
of games. They may help model games where agents play in turn, such as Chess
in [22]. Given an arbitrary set of outcomes, an (abstract) sequential game is a
finite rooted tree where each internal node is owned by an agent and each leaf
encloses an outcome. The left-hand game below involves agents a and b and
outcomes oc1 , oc2 and oc3 .
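[Figure: a sequential game (left) and a strategy profile for it (right). In both, the root is owned by agent a and has two children: a node owned by agent b, whose children are the leaves oc1 and oc2, and the leaf oc3.]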
http://www.lix.polytechnique.fr/Labo/Stephane.Leroux
Anonymous referees, especially one of them, made very constructive comments.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 293–309, 2009.
© Springer-Verlag Berlin Heidelberg 2009
Informally, a play starts at the root. If the root is a leaf, the play ends with the
enclosed outcome; otherwise the root owner chooses in which child, i.e. subgame,
the play continues. However, the concept of play is not needed. A strategy profile
(profile for short) is a game where each internal node has chosen a child. Choices
of a profile induce a unique path from the root to a leaf, which induces a unique
outcome. The right-hand strategy profile above, where choices are represented
by double lines, induces outcome oc3 . The left-hand profile below induces oc1 .
An agent can convert a profile into another one by changing its own nodes’
choices. For instance agent a can convert the right-hand profile above into the
left-hand one below; and below, agent b can convert the left-hand one into the
right-hand one. Note that for each agent, convertibility is an equivalence relation.
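[Figure: two strategy profiles for the same game. The root is owned by agent a and has two children: a node owned by agent b, whose children are the leaves oc1 and oc2, and the leaf oc3.]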
To each agent, an arbitrary binary relation over outcomes is given. It is called
the agent’s preference and it induces a relation over profiles via their induced
outcomes. Generally speaking, a Nash equilibrium (NE for short) is a situation,
i.e. a profile in the present case, that makes every agent happy, where an agent
is happy if it cannot convert the situation into another situation that it prefers.
This concept is defined in [13] and [14], and it captures the notion of NE in
different types of games.
Traditional game theory involves real-valued payoff functions instead of ab-
stract outcomes, i.e. functions mapping agents to real numbers. Agents implicitly
prefer greater payoffs. The profiles below involve payoff functions where the first
figure relates to agent a. The first profile is not a NE since agent a is not happy:
if it played according to the profile, it would get 2, whereas by changing its choice
from right to left it converts the profile into the right-hand one yielding payoff
3. The second profile below is a NE: by changing its choice agent a gets payoff
1 instead of 2 (so it is happy with the current profile), and b has no influence on
the induced outcome so it is happy too. The third profile below is also a NE.
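[Figure: three strategy profiles on the same game. Agent a owns the root, whose children are a node owned by agent b and a leaf with payoffs 2, 2; the node owned by b has two leaves with payoffs 1, 0 and 3, 1. The profiles differ in the choices of a and b.]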
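The three profiles discussed above can be checked mechanically. The following Python sketch enumerates, for each agent, the profiles it can convert the current one into and compares payoffs. The tuple encoding and the names leaf, node, conversions, is_ne and example are hypothetical and serve only as an illustration for real-valued payoffs.

```python
from itertools import product


def leaf(**payoffs):
    return ("leaf", payoffs)


def node(agent, children, choice):
    return ("node", agent, tuple(children), choice)


def induced(s):
    """Payoffs at the leaf reached by following the choices."""
    while s[0] == "node":
        s = s[2][s[3]]
    return s[1]


def conversions(agent, s):
    """All profiles obtained from s by changing choices at nodes owned by agent."""
    if s[0] == "leaf":
        return [s]
    _, owner, children, choice = s
    choices = range(len(children)) if owner == agent else [choice]
    return [("node", owner, kids, c)
            for kids in product(*(conversions(agent, k) for k in children))
            for c in choices]


def is_ne(s, agents):
    """No agent can convert s into a profile that pays it strictly more."""
    return all(induced(t)[a] <= induced(s)[a]
               for a in agents for t in conversions(a, s))


def example(a_choice, b_choice):
    b = node("b", [leaf(a=1, b=0), leaf(a=3, b=1)], b_choice)
    return node("a", [b, leaf(a=2, b=2)], a_choice)


first, second, third = example(1, 1), example(1, 0), example(0, 1)
```

With agents a and b, `is_ne` rejects the first profile and accepts the second and third, in line with the discussion above.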
Subgame perfect equilibria [18] (SPE for short) are NE each of whose subgames
is also an SPE. The second profile above is not an SPE because the subgame
whose root is owned by agent b is not a NE. The third profile above is an SPE.
Kuhn [9] proved that all sequential games involving real-valued payoff func-
tions have NE. His proof uses a recursive procedure, called backward induction in
game theory, to build from each game an SPE (also NE). The backward induc-
tion on the left-hand game below starts by letting agent b (more generally agents
at nodes closer to the leaves) maximise its payoff, in the middle picture. Then
agent a (more generally agents at nodes closer to the root) maximises its payoff
according to what has been chosen by b (more generally in all the subtrees).
Acyclic Preferences and Existence of Sequential Nash Equilibria 295
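[Figure: backward induction in three pictures. Left: the game. Middle: agent b has chosen the leaf with payoffs 3, 1. Right: agent a has in turn chosen the subgame owned by b.]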
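Backward induction for real-valued payoffs can be sketched in a few lines of Python. The tuple encoding of games and the function name backward_induction are assumptions of this illustration; ties are broken in favour of the leftmost child, which is one possible convention among others.

```python
def backward_induction(g):
    """Return (profile, payoffs) for a game g.

    A game is ("leaf", payoffs) or ("node", agent, children); the returned
    profile adds the index of the chosen child as a fourth component.
    """
    if g[0] == "leaf":
        return g, g[1]
    _, agent, children = g
    # Solve the subgames first (nodes closer to the leaves).
    solved = [backward_induction(c) for c in children]
    # The owner then maximises its own payoff given the subgame solutions.
    best = max(range(len(solved)), key=lambda i: solved[i][1][agent])
    profile = ("node", agent, tuple(s for s, _ in solved), best)
    return profile, solved[best][1]


game = ("node", "a", (
    ("node", "b", (("leaf", {"a": 1, "b": 0}), ("leaf", {"a": 3, "b": 1}))),
    ("leaf", {"a": 2, "b": 2}),
))
profile, payoffs = backward_induction(game)
```

On the example game, agent b selects its second leaf and agent a then selects the subgame owned by b.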
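[Figure: two sequential games in which agent a owns both the root and one child of the root. The left-hand game has vector payoffs [1, 1], [2, 2] and [0, 3]; the right-hand game has abstract outcomes x, y and z.]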
Nonetheless, Krieger [8] proved that every multi-criteria sequential game has
a NE. His proof uses probabilities and strategic games, a class of games into
which sequential games can be embedded. However, Krieger’s result still does
not account for all games with abstract outcomes and moreover his proof would
not be easily formalised. Fortunately a generalisation of both [8] and [16] is given
in [13] (ch. 4). Furthermore, instead of a mere sufficient condition on preferences
as in [16], the following three propositions are proved equivalent.
1. The preference of each agent is acyclic.
2. Every sequential game has a Nash equilibrium.
3. Every sequential game has a subgame perfect equilibrium.
Existence of Nash equilibria in the traditional and multicriteria frameworks
is a direct corollary of the theorem above. This is also true for other frameworks
of interest as detailed in [13] (ch. 4).
The result above may be proved via three implications. 3) ⇒ 2) by definition.
2) ⇒ 1) is proved by contraposition: let an agent a prefer x1 to x0 , x2 to x1 ,
and so on, and x0 to xn . The game displayed below has no Nash equilibrium.
[Figure: a game whose root is owned by agent a and whose children are the leaves enclosing x0, x1, ..., xn.]
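The contraposition argument can be replayed on the one-node game above. In the Python sketch below, the function name has_nash_equilibrium and the convention that prefers(x, y) means "the agent prefers y to x" are assumptions of the illustration.

```python
def has_nash_equilibrium(outcomes, prefers):
    """One-node game: the single agent picks one leaf among outcomes.

    A choice x is an equilibrium iff the agent prefers no available
    outcome y to x, where prefers(x, y) reads "y is preferred to x".
    """
    return any(not any(prefers(x, y) for y in outcomes) for x in outcomes)


n = 4
outcomes = list(range(n + 1))


def cyclic(x, y):
    # x1 preferred to x0, ..., xn preferred to x(n-1), and x0 preferred to xn
    return y == (x + 1) % (n + 1)


def acyclic(x, y):
    # the same chain without the closing arc
    return y == x + 1
```

With the cyclic preference every choice can be improved upon, so no equilibrium exists; dropping the closing arc makes the choice of xn an equilibrium.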
The main implication is 1) ⇒ 3). It may be proved as follows: first, since the
preferences are acyclic, they can be linearly extended; second, following Kuhn’s
proof structure, for any game there exists an SPE w.r.t. the linear preferences;
third, this SPE is also valid w.r.t. the smaller original preferences. This triple
equivalence is formalised constructively in Coq (v8.1). In terms of proof burden,
the main implication is still 1) ⇒ 3). However, the existence of a linear extension
constitutes a substantial part¹ of the formal proof. In the second proof step
described above one cannot merely follow Kuhn’s proof structure: the definitions
and proofs have to be completely rephrased (often inductively, as in [21]) and
simplified in order to keep things practical and clear. Also, the formalisation is
constructive: it yields an algorithm for computing equilibria, which is the main
purpose of algorithmic game theory for various classes of games. Here in addition,
the algorithm is certified since it was written in Coq.
¹ The linear extension proof was also slightly modified to be part of the Coq-related
CoLoR library [4].
This paragraph suggests that all the ingredients used in this generalisation
were already well-known, although not yet put together: Utility theory prescribes
embedding abstract outcomes and preferences into the real numbers and their
usual total order, thus performing more than a linear extension, whereas the
first proof step above is a mere linear extension; Choice theory uses abstract
preferences and is aware of property preservation by preference inclusion, as
invoked in the third proof step above. Also, [16] uses reflexive preferences; Kreps
[7] uses irreflexive ones; this paper assumes nothing on preferences but the way
they relate to NE. Osborne and Rubinstein were most likely aware of the above-
mentioned facts and techniques, but their totally preordered preferences seem
utterly general and non-improvable at first glance. On the contrary, when considering
the inverse of the negation of their preferences (i.e. a strict weak order), one
sees that generality was just an illusion. Like natural languages, mathematical
notations structure and drive our thoughts! All this suggests that in general,
formalisation (and the related mindset) may not only provide a guarantee of
correctness but also help build a deeper insight into the field being formalised.
Two alternative proofs of this paper’s main result are mentioned below. Unlike
the first proof, they cannot proceed by structural induction on trees, but strong
induction on the number of internal nodes works well. One proof of 1) ⇒ 2) is
given in [13] (ch. 5). It uses only transitive closure instead of linear extension,
but the proof technique suits only NE, not SPE. The (polymorphic) proof below
works for both 1) ⇒ 2) and 1) ⇒ 3). Note that both alternative proofs show
that the notion of SPE is not required to prove NE existence.
Proof. It suffices to prove it for strict weak orders. Assume a game g with n + 1
internal nodes. (The 0 case is a leaf case.) Pick one whose children are all leaves.
This node is the root of a subgame g0 and is owned by an agent a. Let x be an a-maximal
outcome occurring in g0. In g replace g0 with a leaf enclosing x. This new game g' has n
or fewer internal nodes, so there is a NE (resp. SPE) s' for g'. In s' replace the leaf
enclosing x by a profile on g0 where a chooses a leaf enclosing x. This yields a
NE (resp. SPE) for g. (Consider happiness of agent a, then other agents.)
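The proof above is effectively an algorithm, sketched below in Python. The tuple encoding, the helper names find_bottom, replace_at and solve, and the use of one rank function per agent (so that each preference is a strict weak order) are assumptions of this illustration and do not come from the Coq development.

```python
def find_bottom(g, path=()):
    """Path to, and value of, an internal node whose children are all leaves."""
    for i, c in enumerate(g[2]):
        if c[0] == "node":
            return find_bottom(c, path + (i,))
    return path, g


def replace_at(t, path, new):
    """Replace the subtree of t found at path by new (works on games and profiles)."""
    if not path:
        return new
    i = path[0]
    kids = t[2][:i] + (replace_at(t[2][i], path[1:], new),) + t[2][i + 1:]
    return t[:2] + (kids,) + t[3:]


def solve(g, rank):
    """Build a profile for g; rank[agent](outcome) encodes the agent's preference."""
    if g[0] == "leaf":
        return g
    path, g0 = find_bottom(g)
    agent, leaves = g0[1], g0[2]
    # x is an outcome of g0 that is maximal for the owner of g0.
    best = max(range(len(leaves)), key=lambda i: rank[agent](leaves[i][1]))
    s0 = ("node", agent, leaves, best)
    if not path:  # g0 is the whole game
        return s0
    smaller = replace_at(g, path, leaves[best])  # one internal node fewer
    return replace_at(solve(smaller, rank), path, s0)


game = ("node", "a", (
    ("node", "b", (("leaf", (1, 0)), ("leaf", (3, 1)))),
    ("leaf", (2, 2)),
))
rank = {"a": lambda o: o[0], "b": lambda o: o[1]}
result = solve(game, rank)
```

The recursion is on the number of internal nodes, mirroring the strong induction of the proof rather than a structural induction on trees.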
This diversity of proofs not only proposes alternative viewpoints on the structure
of sequential equilibria, but also constitutes a pool of reusable techniques for
generalising the result, e.g., in graphs as started in [13] (ch. 6 and 7). Nonetheless,
only the first proof has been formalised.
Section 2 summarises the Coq proof for topological sorting, which was also
published [12] as an emerging trend in this conference. Section 3 deals with game
theory, and is also meant for readers that are not too familiar with Coq.
2 Topological Sorting
The calculus of binary relations was developed by De Morgan around 1860. Then
the notion of transitive closure of a binary relation (smallest transitive binary
relation including a given binary relation) was defined in different manners by
different people around 1890. See Pratt [17] for a historical account. In 1930,
Szpilrajn [20] proved that, assuming the axiom of choice, any partial order has a
linear extension, i.e., is included in some total order. The proof invokes a notion
close to transitive closure. In the late 1950s, the US Navy [1] designed PERT
(Program Evaluation Research Task or Project Evaluation Review Techniques)
for management and scheduling purposes. This tool partly consists in splitting
a big project into small jobs on a chart and expressing with arrows when one
job has to be done before another one can start up. In order to study the re-
sulting directed graph, Jarnagin [15] introduced a finite and algorithmic version
of Szpilrajn’s result. This gave birth to the widely studied topological sorting,
which spread to the industry in the early 1960’s (see [10] and [5]). Some technical
details and computer-oriented examples can be found in Knuth’s book [6].
Section 2 summarises a few folklore results involving transitive closure and
linear extension. No proof is given, but hopefully the definitions, statements,
and explanations will help understand the overall structure of the development.
This section requires a basic knowledge of Coq and its standard library.
In the remainder of this section, A is a Set ; x and y have type A; R is a
binary relation over A; l is a list over A; and n is a natural number. A finite
“subset” of A is represented by any list involving all the elements of the subset.
For the sake of readability, types will sometimes be omitted according to the
above convention, even in formal statements where Coq could not infer them.
Proving constructively or computing properties about binary relations will
require the following definitions about excluded middle and decidability.
In the Coq development similar results were proved for both excluded middle and
decidability, but this section focuses on excluded middle. The main result of the
section, which is invoked in section 3, says that given a middle-excluding relation
(rel_midex), it is acyclic and equality on its domain is middle-excluding iff its
restriction to any finite set has a middle-excluding irreflexive linear extension.
Section 2.1 gives basic new definitions about lists, relations, and finite restric-
tions, as well as part of the required lemmas; section 2.2 gives the definition of
paths and relates it to transitive closure; section 2.3 designs increasingly complex
functions leading to linear extension and the main result.
The lemma below will be used to extract simple, i.e. loop-free, paths from paths
represented by lists. It says that if equality on A is middle-excluding and if an
element occurs in a list over A, then the list can be decomposed into three parts:
a list, one occurrence of the element, and a second list free of the element.
The predicate repeat_free in Prop says that no element occurs more than once
in a list. It is defined by recursion and used in the lemma below that will help
prove that a simple path is not longer than the path from which it is extracted.
2.2 Paths
Transitive closure will help guarantee acyclicity and build linear extensions. But
its definition is not convenient if a witness is needed. A path is a witness, i.e. a
list recording consecutive steps of a given relation. It is formally defined below.
(Note that if relations were decidable, is_path could return a Boolean.) The next
two lemmas state an equivalence between paths and transitive closure.
Fixpoint is_path R x y l {struct l} : Prop :=
match l with
| nil ⇒ R x y
| z :: l’ ⇒ R x z ∧ is_path R z y l’
end.
Lemma clos_trans_path : ∀ x y, clos_trans R x y → ∃ l, is_path R x y l.
Lemma path_clos_trans : ∀ y l x, is_path R x y l → clos_trans R x y.
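For a decidable relation, the path predicate can return a Boolean, as remarked above. The Python function below is a direct transcription of the recursion under that assumption; representing R as a Python function returning a Boolean is a choice made for this illustration.

```python
def is_path(R, x, y, l):
    """l lists the intermediate points of a path from x to y along R."""
    for z in l:
        if not R(x, z):   # case z :: l': require R x z, then continue from z
            return False
        x = z
    return R(x, y)        # case nil: require R x y


def succ(u, v):
    """Successor relation on integers, used as an example relation."""
    return v == u + 1
```

For instance, `[1, 2]` witnesses a path from 0 to 3 along the successor relation, whereas `[1]` does not.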
The next lemma states that a path can be transformed into a simple path. It
will help bound the computation of transitive closure for finite relations.
Lemma path_repeat_free_length : eq_midex → ∀ y l x, is_path R x y l →
∃ l’, ¬In x l’ ∧ ¬In y l’ ∧ repeat_free l’ ∧
length l’ ≤ length l ∧ incl l’ l ∧ is_path R x y l’.
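One way to obtain such a simple path is to cut loops while scanning the list once. The Python sketch below is a possible procedure under the assumption that equality is decidable; the function name simplify is hypothetical and the Coq proof may well proceed differently.

```python
def is_path(R, x, y, l):
    """l lists the intermediate points of a path from x to y along R."""
    for z in l:
        if not R(x, z):
            return False
        x = z
    return R(x, y)


def simplify(x, y, l):
    """Given a path l from x to y, return a loop-free sub-path avoiding x and y."""
    out = []
    for z in l:
        if z == y:
            break                          # y is reached early: stop here
        if z == x:
            out = []                       # back at the start: forget the loop
        elif z in out:
            out = out[:out.index(z) + 1]   # cut the loop from z back to z
        else:
            out.append(z)
    return out


edges = {(0, 1), (1, 2), (2, 1), (1, 3), (2, 0)}


def R(u, v):
    return (u, v) in edges
```

Each step keeps `out` a valid path from x to the element just visited, so the result is a path from x to y that is no longer than the input and uses only elements of it.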
The predicate bounded_path R n below says whether two given elements are
related by a path of length n or less. It is intended to abstract over the list
witness of a path while bounding its length, which transitive closure cannot do.
Inductive bounded_path R n : A → A → Prop :=
| bp_intro : ∀ x y l, length l ≤ n → is_path R x y l → bounded_path R n x y.
Below, two lemmas show that bounded_path is included in clos_trans in general,
and that the two coincide on finite sets for some bound.
Lemma bounded_path_clos_trans : ∀ R n,
sub_rel (bounded_path R n) (clos_trans R).
Lemma clos_trans_bounded_path : eq_midex → ∀ R l,
is_restricted R l → sub_rel (clos_trans R) (bounded_path R (length l)).
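On a relation restricted to a finite list, bounded paths can be computed by a frontier search. The Python sketch below assumes a decidable relation given as a Boolean function and a carrier list to which the relation is restricted; the explicit carrier argument is an addition of this illustration.

```python
def bounded_path(R, carrier, n, x, y):
    """True iff some path from x to y has at most n intermediate points.

    R is assumed to be restricted to carrier, so every point reachable
    from x in at least one step belongs to carrier.
    """
    frontier = {x}
    for _ in range(n + 1):  # a path with m intermediate points has m + 1 steps
        frontier = {z for u in frontier for z in carrier if R(u, z)}
        if y in frontier:
            return True
    return False


def clos_trans(R, carrier, x, y):
    """By the second lemma, the bound length(carrier) suffices on a finite set."""
    return bounded_path(R, carrier, len(carrier), x, y)


carrier = [0, 1, 2, 3]


def R(u, v):
    return u in carrier and v in carrier and v == u + 1
```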
The first lemma below states that if a finite relation is middle-excluding, so
are its "bounded transitive closures". Thanks to this and the lemma above, the
second lemma states that this also holds for the transitive closure.
Lemma bounded_path_midex : ∀ R l n,
is_restricted R l → rel_midex R → rel_midex (bounded_path R n).
Lemma restricted_midex_clos_trans_midex : eq_midex → ∀ R l,
rel_midex R → is_restricted R l → rel_midex (clos_trans R).
The following theorems state the equivalence between decidability of a relation
and uniform decidability of the transitive closures of its finite restrictions. Note
that decidable equality is required only for the second implication. These results
remain correct when considering excluded middle instead of decidability.
Theorem clos_trans_restriction_dec_R_dec : ∀ R,
(∀ l, rel_dec (clos_trans (restriction R l))) → rel_dec R.
Theorem R_dec_clos_trans_restriction_dec : eq_dec → ∀ R,
rel_dec R → ∀ l, rel_dec (clos_trans (restriction R l)).
This section presents a way of extending linearly an acyclic finite relation. (This
is not the fastest topological sort algorithm though.) The intuitive idea is to
repeat the following while it is possible: take the transitive closure of the rela-
tion and add an arc to the relation without creating 2-step cycles. Repetition
ensures saturation, absence of 2-step cycle ensures acyclicity, and finiteness en-
sures termination. In this section this idea is implemented through several stages
of increasing complexity. This not only helps describe the procedure clearly, but
it also facilitates the proof of correctness by splitting it into intermediate results.
Total relations will help define linear extensions. They are defined below.
Definition trichotomy R x y : Prop := R x y ∨ x = y ∨ R y x.
Definition total R l : Prop := ∀ x y, In x l → In y l → trichotomy R x y.
The definition below adds an arc to a relation unless doing so creates a 2-step cycle.
Inductive try_add_arc R x y : A → A → Prop :=
| keep : ∀ z t, R z t → try_add_arc R x y z t
| try_add : x ≠ y → ¬R y x → try_add_arc R x y x y.
As stated below, try_add_arc creates no cycle in strict partial orders.
Lemma try_add_arc_irrefl : eq_midex → ∀ R x y,
transitive R → irreflexive R → irreflexive (clos_trans (try_add_arc R x y)).
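The intuitive procedure of this section (close transitively, try to add an arc, repeat) can be sketched in Python on relations represented as finite sets of pairs. This representation and the function linear_extension are assumptions of the illustration; as noted above, this is not the fastest topological sort.

```python
def clos_trans(rel):
    """Transitive closure of a finite relation given as a set of pairs."""
    closure = set(rel)
    while True:
        new = {(x, t) for (x, y) in closure for (z, t) in closure if y == z}
        new -= closure
        if not new:
            return closure
        closure |= new


def try_add_arc(rel, x, y):
    """Add the arc (x, y) unless x = y or the reverse arc is already present."""
    if x != y and (y, x) not in rel:
        return rel | {(x, y)}
    return rel


def linear_extension(rel, carrier):
    """Linearly extend an acyclic relation restricted to carrier."""
    rel = set(rel)
    for x in carrier:
        for y in carrier:
            rel = try_add_arc(clos_trans(rel), x, y)
    return clos_trans(rel)


carrier = ["a", "b", "c"]
result = linear_extension({("a", "b"), ("c", "b")}, carrier)
```

On this input the procedure orders a before c before b. Taking the closure before each attempt is what makes the absence of 2-step cycles sufficient for acyclicity.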
3 Sequential Games
Section 3.1 presents three preliminary concepts and a lemma on lists and pred-
icates; section 3.2 defines sequential games and strategy profiles; section 3.3
defines functions on games and profiles; section 3.4 defines the notions of pref-
erence, NE, and SPE; and section 3.5 shows that universal existence of these
equilibria is equivalent to acyclicity of preferences.
3.1 Preliminaries
The function listforall expects a predicate on a Set called A, and returns a
predicate on lists stating that all the elements in the list comply with the original
predicate. This will help define, e.g., the third concept below. It is recursively
defined along the inductive structure of the list argument. It is typed as follows.
listforall : (A → Prop) → list A → Prop
The function rel_vector expects a binary relation and two lists over A and states
that the lists are element-wise related (which implies that they have the same
length). This will help define the convertibility between two profiles. It is
recursively defined along the first list argument and it is typed as follows.
rel_vector : (A → A → Prop) → list A → list A → Prop
Given a binary relation, the predicate is_no_succ returns a proposition saying
that no element in a given list is the successor of a given element.
Definition is_no_succ P x l := listforall (fun y ⇒ ¬P x y) l.
The definition above helps state lemma Choose_and_split, which expects a decidable
relation over A and a non-empty list over A and splits the list into one element and two
lists by choosing the first (from the head) element that is maximal among the
remaining elements. The example below involves divisibility over the naturals.
2 :: 3 :: 9 :: 4 :: 9 :: 6 :: 2 :: 16 :: nil
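The three preliminary concepts and the splitting can be mimicked in Python for decidable predicates. In the sketch below, "maximal among the remaining elements" is read as "has no successor among the elements after it"; this reading, the reflexive divisibility relation, and the function name choose_and_split are assumptions of the illustration and may differ from the Coq statement.

```python
def listforall(pred, l):
    """All elements of l satisfy pred."""
    return all(pred(x) for x in l)


def rel_vector(R, l1, l2):
    """l1 and l2 have the same length and are element-wise related by R."""
    return len(l1) == len(l2) and all(R(x, y) for x, y in zip(l1, l2))


def is_no_succ(P, x, l):
    """No element of l is a successor of x with respect to P."""
    return listforall(lambda y: not P(x, y), l)


def choose_and_split(P, l):
    """Split l at the first element with no successor among later elements."""
    for i, x in enumerate(l):
        if is_no_succ(P, x, l[i + 1:]):
            return l[:i], x, l[i + 1:]
    raise ValueError("the list must be non-empty")


def divides(x, y):
    return y % x == 0


split = choose_and_split(divides, [2, 3, 9, 4, 9, 6, 2, 16])
```

Under these assumptions the list above is split at the second occurrence of 9, since 9 divides none of 6, 2 and 16.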
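[Figure: graphical representation of games and strategy profiles: a leaf enclosing an outcome oc, a game node owned by agent a with a first child g and a list l of further children, and a profile node owned by a with a chosen child s between two lists r and t.]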
In Coq, given two sets Outcome and Agent, sequential games and strategy
profiles are defined as below. (gL stands for game leaf and gN for game node.)
Variables (Outcome : Set )(Agent : Set ).
Inductive Game : Set :=
| gL : Outcome → Game
| gN : Agent → Game → list Game → Game.
Inductive Strat : Set :=
| sL : Outcome → Strat
| sN : Agent → list Strat → Strat → list Strat → Strat.
The induction principle that Coq automatically associates with games (resp.
profiles) ignores the inductive structure of lists. Mutually defining lists with
games may solve this but rules out using the Coq standard library for these new
lists. The principle stated below is built manually via a Fixpoint. There are four
premises: two for the horizontal induction along empty lists and compound lists,
and two for the vertical induction along leaf games and compound games.
Game_ind2 : ∀ (P : Game → Prop) (Q : Game → list Game → Prop),
(∀ oc, P (gL oc)) →
(∀ g, P g → Q g nil ) →
(∀ g, P g → ∀ g’ l, Q g’ l → Q g’ (g :: l )) →
(∀ g l, Q g l → ∀ a, P (gN a g l )) →
∀ g, P g
In order to prove a property ∀ g : Game, P g with the induction principle
Game_ind2, the user has to provide a predicate Q that is easily (yet appar-
ently not automatically in general) derived from P. Also note that the induction
principle for profiles requires one more premise since two lists are involved. These
principles are invoked in most of the proofs of the Coq development.
[Figure: a compound strategy profile owned by agent a, with chosen child s between the lists of children l and l’.]
The function s2g expects a profile and returns its underlying game by forget-
ting the nodes’ choices. It will help state that given a game (and preferences),
a well-chosen profile is a NE for this game. s2g is computed by the 3-step rule
below. (The Coq definition is omitted.) Note that the two big steps are a case
splitting along the structure of the first list, i.e., whether it is empty or not.
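[Figure: the three rules computing s2g. A leaf profile sL oc is mapped to the leaf game gL oc. A compound profile whose first list is empty, sN a nil s l’, is mapped to gN a (s2g s) (map s2g l’). A compound profile sN a (s0 :: l) s l’ is mapped to gN a (s2g s0) (map s2g (l ++ s :: l’)).]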
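The forgetful function s2g can be sketched in Python on tuples that mirror the constructors gL, gN, sL and sN. The tuple encoding is an assumption of this illustration; the Coq definition itself is omitted in the text.

```python
def s2g(s):
    """Underlying game of a profile.

    Profiles: ("sL", oc) or ("sN", agent, left, chosen, right).
    Games:    ("gL", oc) or ("gN", agent, first_child, other_children).
    """
    if s[0] == "sL":
        return ("gL", s[1])
    _, agent, left, chosen, right = s
    # Forget which child was chosen: the first child overall becomes the
    # distinguished first child of the game node, whether left is empty or not.
    first, *rest = left + [chosen] + right
    return ("gN", agent, s2g(first), [s2g(t) for t in rest])


profile = ("sN", "a", [("sL", "oc1")], ("sL", "oc2"), [("sL", "oc3")])
game = s2g(profile)
```

The single line computing `first` and `rest` covers both big steps of the rule, i.e. the empty and the non-empty first list.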
Osborne and Rubinstein’s results into the abstract sequential game formalism
of this paper. However, it is not even needed in the remainder.
General case. Until now equilibrium and related concepts have been defined
w.r.t. implicit given preferences, but from now on these concepts take arbitrary
preferences as an explicit parameter. For instance, instead of writing Eq s, one
shall write Eq OcPref s to say that s is a NE with respect to the family OcPref.
The following lemma says two things: first, the equilibrium-hood of a pro-
file depends only on the restrictions of agent preferences to the outcomes that
are used in (the underlying game of) the profile; second, removing arcs from
agent preferences preserves equilibrium-hood (informally because less demand-
ing agents are more likely to be happy). A similar result holds for SPE.
Lemma Eq_order_inclusion : ∀ OcPref OcPref’ s,
(∀ a, sub_rel (restriction (OcPref a) (UsedOutcomes (s2g s))) (OcPref’ a)) →
Eq OcPref’ s → Eq OcPref s.
The theorem below constitutes the main implication of the triple equivalence
referred to in section 1. It generalises the results in [9], [16] and [8] to acyclic
preferences. It invokes theorem linearly_extendable from section 2.3.
4 Conclusion
This paper and the related Coq development have thus abstracted, generalised,
formalised, and clarified existing game-theoretic results in [9], [16] and [8]: in-
stead of real-valued (vector) payoffs they refer to abstract outcomes; instead of
the usual order over the reals they refer to abstract preferences; instead of a mere
sufficient condition on preferences they prove a necessary and sufficient condi-
tion; instead of a set-theoretic approach they use an inductive-type approach.
The difficulties that were encountered in this proof are of two sorts: first, the
more general framework brings new issues, e.g. collecting the outcomes used in a
game, topological sorting, defining backward induction for arbitrary preferences;
second, the formal proof must cope with issues that were ignored in pen-and-
paper proofs, e.g. rigorous definitions, associated proof principles, underlying
game of a strategy profile. These second issues are already addressed in [21].
In the abstract-preference framework, it may sound tempting to prove for-
mally the necessary and sufficient condition only for binary trees and then argue
informally that the general case is similar or reducible to the binary case. Such
an argument may contradict the idea that formalisation is a useful step towards
guarantee of correctness and deeper insight: it ignores that general trees require
more complex representations and induction principles and that some property
holds for binary trees but not in general. Nonetheless the author believes that it
would be possible to first prove the result for binary trees and then reduce
formally general trees to binary trees. However, this would require defining games,
profiles, convertibility, equilibria, etc. (except backward induction) for both the
binary and the general settings, and it would still require the proof of topological
sorting: so it would be more complex than the proof discussed in this paper.
In section 3.3 convertibility between strategy profiles was defined as an inductive
type in Prop. This avoids assuming decidability of agents’ equality.
Alternatively, assuming decidability of agents’ equality would allow defining
convertibility as a recursive function onto Booleans. More generally, if the Coq
development were to be rephrased or generalised to graph structure instead of
trees (as in [13], chap. 6 and 7), it might be interesting to discharge part of
the proof burden on Coq’s computational ability and keep in the proof script
only the subtle reasoning. This may be achieved in part by using more recursive
Boolean functions at the acceptable expense of decidability assumptions.
References
15. Jarnagin, M.P.: Automatic machine methods of testing PERT networks for
consistency. Technical Memorandum K-24/60, U. S. Naval Weapons Laboratory,
Dahlgren, Va (1960)
16. Osborne, M.J., Rubinstein, A.: A Course in Game Theory. The MIT Press, Cam-
bridge (1994)
17. Pratt, V.: Origins of the calculus of binary relations. In: Logic in Computer Science
(1992)
18. Selten, R.: Spieltheoretische Behandlung eines Oligopolmodells mit
Nachfrageträgheit. Zeitschrift für die gesamte Staatswissenschaft 121 (1965)
19. Simon, H.A.: A behavioral model of rational choice. The Quarterly Journal of
Economics 69(1), 99–118 (1955)
20. Szpilrajn, E.: Sur l’extension de l’ordre partiel. Fund. Math. 16 (1930)
21. Vestergaard, R.: A constructive approach to sequential Nash equilibria. Information
Processing Letters 97, 46–51 (2006)
22. Zermelo, E.: Über eine Anwendung der Mengenlehre auf die Theorie des
Schachspiels. In: Proceedings of the Fifth International Congress of Mathemati-
cians, vol. 2 (1912)
Formalising FinFuns – Generating Code for
Functions as Data from Isabelle/HOL
Andreas Lochbihler
Abstract. FinFuns are total functions that are constant except for a fi-
nite set of points, i.e. a generalisation of finite maps. We formalise them
in Isabelle/HOL and present how to safely set up Isabelle’s code genera-
tor such that operations like equality testing and quantification on Fin-
Funs become executable. On the code output level, FinFuns are explicitly
represented by constant functions and pointwise updates, similarly to as-
sociative lists. Inside the logic, they behave like ordinary functions with
extensionality. Via the update/constant pattern, a recursion combinator
and an induction rule for FinFuns allow for defining and reasoning about
operators on FinFuns that directly become executable. We apply the ap-
proach to an executable formalisation of sets and use it for the semantics
for a subset of concurrent Java.
1 Introduction
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 310–326, 2009.
Formalising FinFuns 311
The very same problems reoccur when provably correct code from a formalisation
is to be extracted, although one is willing to commit more effort in adjusting
the formalisation and setting up the code generator for it in that case. To apply
quickcheck to their formalisations, end-users expect to supply little or no effort.
In the area of programming languages, states (like memories, stores, and
thread pools) are usually finite, even though the identifiers (addresses, variable
names, thread IDs, ...) are typically taken from an infinite pool. Such a state is
most easily formalised as a (partial) function from identifiers to values. Hence,
enumerating all threads or comparing two stores is not executable by default.
Yet, a finite set of identifier-value pairs could easily store such state information,
which is normally modified pointwise. Explicitly using associative lists
in one’s formalisation, however, incurs a lot of work because one state has in
general multiple representations and AC1 unification is not supported.
For this kind of data, we propose to use a new type FinFun of total functions
that are constant except for finitely many points. They generalise maps, which
formally are total functions of type 'a ⇒ 'b option that map to None (“undefined”)
almost everywhere, in two ways: First, they can replace (total) functions of
arbitrary type 'a ⇒ 'b. Second, their default value is not fixed to a predetermined
value (like None). Our main technical contributions are:¹
1. On the code level, every FinFun is represented as explicit data via two
datatype constructors: constant FinFuns and pointwise update (cf. Sec. 2).
quickcheck is set up for FinFuns and working.
2. Inside the logic, FinFuns feel very much like ordinary functions (e.g.
extensionality: f = g ←→ (∀ x. f x = g x)) and are thus easily integrated into
existing formalisations. We demonstrate this in two applications (Sec. 5):
(a) A formalisation of sets as FinFuns allows sets to be represented explicitly
in the generated code.
(b) We report on our experience in using FinFuns to represent state informa-
tion for JinjaThreads [12], a semantics for a subset of concurrent Java.
3. Equality tests on, quantification over and other operators on FinFuns are all
handled by Isabelle’s new code generator (cf. Sec. 3).
4. All equations for code generation have passed through Isabelle’s inference
kernel, i.e., the trusted code base cannot be compromised by ad-hoc transla-
tions where constants in the logic are explicitly substituted by functions of
the target language.
5. A recursion combinator allows one to directly define functions that are recursive
in an argument of FinFun type (Sec. 4).
¹ The FinFun formalisation is available in the Archive of Formal Proofs [13].
312 A. Lochbihler
To start with, we construct the new type 'a ⇒f 'b for FinFuns. This type contains
all functions from 'a to 'b which map only finitely many points a :: 'a to some
value other than some constant b :: 'b, i.e. are constant except for finitely many
points. We show that all elements of this type can be built from two constructors:
The everywhere constant FinFun and pointwise update of a FinFun (Sec. 2.1).
Code generated for operators on FinFuns will be recursive via these two kernel
functions (cf. Sec. 2.2).
In Isabelle/HOL, a new type is declared by specifying a non-empty carrier set
as a subset of an already existing type. The new type for FinFuns is isomorphic
to the set of functions that deviate from a constant at only finitely many points:
Apart from the new type ('a, 'b) finfun (written 'a ⇒f 'b), this introduces
the set finfun :: ('a ⇒ 'b) set given on the right-hand side and the two bijection
functions Abs-finfun and Rep-finfun between the sets UNIV :: ('a ⇒f 'b) set and
finfun such that Rep-finfun is surjective and they are inverses of each other:
Having manually defined the type, we now show that every FinFun can be
generated from two kernel functions similarly to a datatype element from
its constructors: The constant function and pointwise update. For b :: 'b, let
K f b :: 'a ⇒f 'b represent the FinFun that maps everything to b. It is defined
by lifting the constant function λx :: 'a. b via Abs-finfun to the FinFun type.
Similarly, pointwise update finfun-update, written ( :=f ), is defined in terms of
pointwise function update on ordinary functions:
Note that these two kernel functions replace λ-abstraction of ordinary func-
tions. Since the code generator will internally use these two constructors to
represent FinFuns as data objects, proper λ-abstraction (via Abs-finfun) is not
executable and is therefore deprecated. Consequently, all executable operators
on FinFuns are to be defined (recursively) in terms of these two kernel func-
tions. On the logic level, λ-abstraction is of course available via Abs-finfun, but
it will be tedious to reason about such functions: Arbitrary λ-abstraction does
not guarantee the finiteness constraint in the type definition for 'a ⇒f 'b, hence
this constraint must always be shown separately.
We can now already define what function application on 'a ⇒f 'b will be,
namely Rep-finfun. To facilitate replacing ordinary functions with FinFuns in
existing formalisations, we write function applications as a postfix subscript f :
f̂ f a ≡ Rep-finfun f̂ a. This directly gives the kernel functions their semantics:
Moreover, we already see that extensionality for HOL functions carries over to
FinFuns, i.e. = on FinFuns does denote what it intuitively ought to:
f̂ = ĝ ←→ (∀ x. f̂ f x = ĝ f x) (5)
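The behaviour of the two kernel functions, of application, and of extensional equality can be mimicked by a small Python class. The class below is only an illustrative model, not the Isabelle definition: it assumes an infinite domain (so that two FinFuns with different default values can never be equal) and stores only the updates that differ from the default value.

```python
class FinFun:
    """A total function that is constant except at finitely many points."""

    def __init__(self, default, updates=None):
        self.default = default
        # Keep only the points whose value differs from the default.
        self.updates = {a: b for a, b in (updates or {}).items() if b != default}

    def update(self, a, b):
        """Pointwise update, returning a new FinFun."""
        return FinFun(self.default, {**self.updates, a: b})

    def __call__(self, a):
        """Function application."""
        return self.updates.get(a, self.default)

    def __eq__(self, other):
        # Under the infinite-domain assumption, the normalised representation
        # is unique, so comparing it decides extensional equality.
        return (self.default, self.updates) == (other.default, other.updates)


f = FinFun(0).update(1, 5).update(2, 6)
```

Here `FinFun(0)` plays the role of the constant FinFun with value 0, and `f.update(1, 0)` is equal to `FinFun(0).update(2, 6)` because both agree at every point.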
There are only a few characteristic theorems about these two kernel functions.
In particular, they are not free constructors, as e.g. the following equalities hold:
This is natural, because FinFuns are meant to behave like ordinary functions and
these equalities correspond to the standard ones for pointwise update on ordinary
functions. Only K f is injective: (K f b) = (K f b') ←→ b = b'. From a logician’s
point of view, non-free constructors are not desirable because recursion and
case analysis become much more complicated. However, the savings in proof
automation that extensionality for FinFuns permits are worth the extra effort
when it comes to defining operators on FinFuns.
More importantly, these two kernel functions exhaust the type 'a ⇒f 'b. This
is most easily stated by the following induction rule, which is proven by induction
on the finite set on which Rep-finfun ĝ does not take the default value:
∀ b. P (K f b)      ∀ f̂ a b. P f̂ −→ P f̂ (a :=f b)
-------------------------------------------------- (9)
P ĝ
Intuitively, P holds already for all FinFuns ĝ if (i) P (K f b) holds for all constant
FinFuns K f b and (ii) whenever P f̂ holds, then P f̂ (a :=f b) holds, too. From
this, a case distinction theorem is easily derived:
Both induction rule and case distinction theorem are weak in the sense that
the f̂ in the case for point-wise update is quantified without further constraints.
Since K f and pointwise update are not distinct – cf. (6), proofs that do case
analysis on FinFuns must always handle both cases even for constant FinFuns.
Stronger induction and case analysis theorems could, however, be derived.
default value. Using ( :=f ) in the logic ensures that on the code level, every
FinFun is stored with as few updates as possible given the fixed default value.2
Let, e.g., f̂ = (K f 0)(|1 :=f 5|)(|2 :=f 6|). When f̂ is updated at 1 to 0, f̂ (1 :=f 0)
evaluates on the code level to (K f 0)(|2 :=f 6|), where all redundant updates at 1
have been removed. If the explicit code update function had been used instead,
the last update would have been added to the list of updates: f̂ (|1 :=f 0|) evaluates
to (K f 0)(|1 :=f 5|)(|2 :=f 6|)(|1 :=f 0|). Exactly this problem of superfluous updates
would occur if ( :=f ) were used directly as a constructor in the exported code.
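The example above can be replayed on a small model of the code-level representation. The following Python sketch is illustrative only: the names Const, Upd, update, update_code and apply are hypothetical, and the recursive cases of update are an assumption chosen to agree with the example in the text, not a transcript of the generated code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Const:
    """Code-level constant FinFun: every point maps to `value`."""
    value: Any


@dataclass(frozen=True)
class Upd:
    """Code-level update constructor: `base` with `point` remapped to `value`."""
    base: Any
    point: Any
    value: Any


def update(f, a, b):
    """Logical update: never leaves a redundant update at point `a`."""
    if isinstance(f, Const):
        # Updating a constant FinFun to its own default changes nothing.
        return f if f.value == b else Upd(f, a, b)
    if f.point == a:
        # An older update at `a` is overwritten, so it is dropped.
        return update(f.base, a, b)
    # Push the new update below the updates at other points.
    return Upd(update(f.base, a, b), f.point, f.value)


def update_code(f, a, b):
    """Explicit code update: just adds one more update on top."""
    return Upd(f, a, b)


def apply(f, x):
    """Look up `x`: the outermost update at `x` wins, else the default."""
    while isinstance(f, Upd):
        if f.point == x:
            return f.value
        f = f.base
    return f.value


# (K 0)(|1 := 5|)(|2 := 6|) from the text.
f = update_code(update_code(Const(0), 1, 5), 2, 6)
lean = update(f, 1, 0)         # redundant updates at 1 are removed
bulky = update_code(f, 1, 0)   # the stale update 1 := 5 is kept
```

Both results denote the same function; they differ only in how many updates are stored, which is exactly the extra-logical difference discussed above.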
In case this optimisation is undesired, one can use finfun-update-code instead
of finfun-update. Redundant updates in the representation on the code level can
subsequently be deleted by invoking the finfun-clearjunk operator. Semantically,
this is the identity function, finfun-clearjunk ≡ id, but it is implemented using
the following two equations that remove all redundant updates:
finfun-clearjunk (K f b) = (K f b) and finfun-clearjunk f̂ (|a :=f b|) = f̂ (a :=f b)
This operator is most useful when two FinFuns are to be combined pointwise
by some combinator h, which is then ◦f -composed with this diagonal operator:
Suppose, e.g., that f̂ and ĝ are two integer FinFuns and we need their pointwise
sum, which is (λ(x, y). x + y) ◦f (f̂ , ĝ)f , i.e. h is uncurried addition. The code
equations are straightforward again:
(K f b, K f c)f = K f (b, c) (15)
(K f b, ĝ(|a :=f c|))f = (K f b, ĝ)f (|a :=f (b, c)|) (16)
(f̂ (|a :=f b|), ĝ)f = (f̂ , ĝ)f (a :=f (b, ĝ f a)) (17)
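The three product equations and the pointwise sum can be tried out on a toy model. This Python sketch is not the generated code: Const, Upd, update, apply, compose, diag and pointwise_sum are hypothetical names, and the two-case definition of compose is an assumption (a recursion over the representation that applies g to each stored value).

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Const:
    value: Any


@dataclass(frozen=True)
class Upd:
    base: Any
    point: Any
    value: Any


def apply(f, x):
    """FinFun application: the outermost update at `x` wins."""
    while isinstance(f, Upd):
        if f.point == x:
            return f.value
        f = f.base
    return f.value


def update(f, a, b):
    """Logical update that avoids redundant updates at `a`."""
    if isinstance(f, Const):
        return f if f.value == b else Upd(f, a, b)
    if f.point == a:
        return update(f.base, a, b)
    return Upd(update(f.base, a, b), f.point, f.value)


def compose(g, f):
    """Apply the ordinary function `g` to every value of the FinFun `f`."""
    if isinstance(f, Const):
        return Const(g(f.value))
    return update(compose(g, f.base), f.point, g(f.value))


def diag(f, g):
    """Product of two FinFuns, one case per equation."""
    if isinstance(f, Const):
        if isinstance(g, Const):
            return Const((f.value, g.value))                      # (15)
        return Upd(diag(f, g.base), g.point, (f.value, g.value))  # (16)
    paired = (f.value, apply(g, f.point))
    return update(diag(f.base, g), f.point, paired)               # (17)


def pointwise_sum(f, g):
    """h is uncurried addition, composed with the product."""
    return compose(lambda pair: pair[0] + pair[1], diag(f, g))


f = Upd(Const(0), 1, 5)     # 0 everywhere except 1 -> 5
g = Upd(Const(10), 2, 1)    # 10 everywhere except 2 -> 1
s = pointwise_sum(f, g)
```

Here s maps 1 to 15, 2 to 1, and every other point to 10.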
Clearly, finfun-All = ff-All [] holds. The extra list as keeps track of which points
have already been updated and can be ignored in recursive calls:
ff-All as (K f b) ←→ b ∨ set as = UNIV (18)
ff-All as P̂(|a :=f b|) ←→ (a ∈ set as ∨ b) ∧ ff-All (a·as) P̂ (19)
In the recursive case, the update a to b must either be overwritten by a previous
update (a ∈ set as ) or have b equal to True. Then, for the recursive call, a is
added to the list as of visited points. In the constant case, either the constant is
True itself or all points of the domain a have been updated (set as = UNIV ).
Via finfun-All = ff-All [], finfun-All is now executable, provided the test
set as = UNIV can be operationalised. Since as:: a list is a (finite) list, set as
is by construction always finite. Thus, for infinite domains a, this test always
fails. Otherwise, if a is finite, such a test can be easily implemented.
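Equations (18) and (19) translate almost literally into a recursive function. In this Python sketch the names are hypothetical, and the test set as = UNIV is modelled by an explicit universe argument: a finite Python set for finite domains, or None for infinite ones, in which case the test always fails as described above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Const:
    value: Any


@dataclass(frozen=True)
class Upd:
    base: Any
    point: Any
    value: Any


def ff_all(seen, p, universe=None):
    """ff-All: `seen` lists the points already decided by outer updates."""
    if isinstance(p, Const):
        covered = universe is not None and set(seen) == universe
        return bool(p.value) or covered                            # (18)
    ok_here = p.point in seen or bool(p.value)
    return ok_here and ff_all([p.point] + seen, p.base, universe)  # (19)


def finfun_all(p, universe=None):
    return ff_all([], p, universe)


BOOLS = {False, True}
# Default False, but both points of the finite domain are updated to True.
all_true = Upd(Upd(Const(False), False, True), True, True)
# Over the integers: a False update that is later overwritten by True.
masked = Upd(Upd(Const(True), 3, False), 3, True)
```

The first example is only accepted because the universe is finite and fully covered by updates; the second shows why overwritten updates must be skipped.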
Note that this distinction can be directly made on the basis of type informa-
tion. Hence, we shift this subtle distinction into a type class such that the code
automatically picks the right implementation for set as = UNIV based on type
information. Axiomatic type classes [7] allow HOL constants to be safely
overloaded for different types and are correctly handled by Haftmann’s code
generator [6]. If the output language supports type classes, as Haskell does,
this feature is directly employed. Otherwise, functions in generated code are
provided with an additional dictionary parameter that selects the appropriate
implementation for overloaded constants at runtime.
For our purpose, we introduce a new type class card-UNIV with one parameter
card-UNIV and the axiom that card-UNIV :: a itself ⇒ nat returns the cardinality
of a’s universe:
card-UNIV x = card UNIV (20)
By default, the cardinality of a type’s universe is just a natural number of type
nat, which itself is not related to a at all. Hence, card-UNIV takes an artificial
parameter of type a itself, where itself represents types at the level of values:
TYPE( a) is the value associated with the type a.
As every HOL type is inhabited, card-UNIV TYPE( a) can indeed be used to
discriminate between types with finite and infinite universes by testing against 0:
finite (UNIV :: a set) ←→ 0 < card-UNIV TYPE( a)
Moreover, the test set as = UNIV can now be written as is-list-UNIV as with
is-list-UNIV as ≡
let c = card-UNIV TYPE ( a) in if c = 0 then False else |remdups as| = c
where remdups as removes all duplicates from the list as.
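A direct transcription of is-list-UNIV, as a minimal Python sketch. The cardinality is passed in as a plain number (0 standing for an infinite universe) instead of being selected by a type class; that parameter and the function names are assumptions made for illustration. The test is only meaningful if every element of the list belongs to the type in question.

```python
def remdups(xs):
    """Remove duplicates from a list (quadratic, like a naive list version)."""
    out = []
    for x in xs:
        if x not in out:
            out.append(x)
    return out


def is_list_univ(xs, card_univ):
    """True iff the elements of `xs` exhaust a universe of size `card_univ`.

    `card_univ == 0` encodes an infinite universe, which a finite list
    can never cover.
    """
    c = card_univ
    return False if c == 0 else len(remdups(xs)) == c


covers_bool = is_list_univ([True, False, True], 2)
misses_bool = is_list_univ([True, True], 2)
never_nat = is_list_univ([0, 1, 2], 0)
```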
Note that the constraint (20) on the type class parameter card-UNIV, which
is to be overloaded, is purely definitional. Thus, every type could be made a
member of the type class card-UNIV by instantiating card-UNIV to λa. card UNIV.
However, for executability, it must be instantiated such that the code generator
can generate code for it. This has been done for the standard HOL types like
unit, bool, char, nat, int, and a list, for which it is straightforward if one remem-
bers that card A = 0 for all infinite sets A. For the type bool, e.g., card-UNIV
a ≡ 2 for all a::bool itself . The cardinality of the universe for polymorphic type
constructors like e.g. a × b is computed by recursion on the type parameters:
card-UNIV TYPE( a × b) = card-UNIV TYPE( a) · card-UNIV TYPE( b)
We have similarly instantiated card-UNIV for the type constructors a ⇒ b,
a option and a + b.
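The recursion on type parameters can be mimicked with a small interpreter over type descriptions. This Python sketch is hypothetical throughout: types are encoded as tuples, and only the product rule is taken from the text. The rules for option, sum and function space are assumptions derived from elementary cardinal arithmetic (with 0 meaning infinite), not a transcript of the actual instantiations.

```python
UNIT, BOOL, NAT, INT = ("unit",), ("bool",), ("nat",), ("int",)


def card_univ(ty):
    """Cardinality of a type's universe, with 0 standing for 'infinite'."""
    tag = ty[0]
    if tag == "unit":
        return 1
    if tag == "bool":
        return 2
    if tag in ("nat", "int", "list"):
        return 0                      # infinite universes
    if tag == "prod":
        # Every type is inhabited, so 0 * c = 0 is the right answer.
        return card_univ(ty[1]) * card_univ(ty[2])
    if tag == "option":
        c = card_univ(ty[1])
        return 0 if c == 0 else c + 1
    if tag == "sum":
        ca, cb = card_univ(ty[1]), card_univ(ty[2])
        return 0 if ca == 0 or cb == 0 else ca + cb
    if tag == "fun":
        ca, cb = card_univ(ty[1]), card_univ(ty[2])
        if ca != 0 and cb != 0:
            return cb ** ca
        # Functions into a one-element type are unique even on infinite domains.
        return 1 if cb == 1 else 0
    raise ValueError(f"unknown type constructor: {tag}")


pair_of_bools = card_univ(("prod", BOOL, BOOL))
bool_times_nat = card_univ(("prod", BOOL, NAT))
```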
As we have the universal quantifier finfun-All, the executable existential
quantifier is straightforward by duality: finfun-Ex P̂ ≡ ¬ finfun-All (Not ◦f P̂). As
before, the pretty-print syntax ∃ x. P x for Ex (λx. P x) in HOL cannot be trans-
ferred to FinFuns because λ-abstraction is not suited for code generation.
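The duality between the two quantifiers can be checked on a toy model. As before, this Python sketch uses hypothetical names; the definitions of update and compose are assumptions about how the code equations might look, and the universe argument (None for infinite domains) stands in for the type-class based test.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Const:
    value: Any


@dataclass(frozen=True)
class Upd:
    base: Any
    point: Any
    value: Any


def update(f, a, b):
    """Logical update that avoids redundant updates at `a`."""
    if isinstance(f, Const):
        return f if f.value == b else Upd(f, a, b)
    if f.point == a:
        return update(f.base, a, b)
    return Upd(update(f.base, a, b), f.point, f.value)


def compose(g, f):
    """Apply the ordinary function `g` to every value of the FinFun `f`."""
    if isinstance(f, Const):
        return Const(g(f.value))
    return update(compose(g, f.base), f.point, g(f.value))


def ff_all(seen, p, universe=None):
    if isinstance(p, Const):
        return bool(p.value) or (universe is not None and set(seen) == universe)
    ok_here = p.point in seen or bool(p.value)
    return ok_here and ff_all([p.point] + seen, p.base, universe)


def finfun_all(p, universe=None):
    return ff_all([], p, universe)


def finfun_ex(p, universe=None):
    """Existential quantifier by duality: not (for all x. not P x)."""
    return not finfun_all(compose(lambda v: not v, p), universe)


some_true = Upd(Const(False), 3, True)                   # True only at 3
overwritten = Upd(Upd(Const(False), 3, True), 3, False)  # constantly False
```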
3.5 Complexity
In this section, we briefly discuss the complexity of the above operators. We
assume that equality tests require constant time. For a FinFun f̂, let #f̂ denote
the number of updates in its code representation. For an ordinary function g, let
#g denote the complexity of evaluating g a for any a.
K f has constant complexity as it is a finfun constructor. Since ( :=f )
automatically removes redundant updates (11, 12), f̂ ( :=f ) is linear in #f̂, and
so is application f̂ f (4). For g ◦f f̂, eq. (13) is recursive in f̂ and each recursion
step involves ( :=f ) and evaluating g, so the complexity is O((#f̂)² + #f̂ · #g).
For the product (f̂ , ĝ)f , we get: The base case (K f b, ĝ)f (15, 16) is linear in
#ĝ and we have #(K f b, ĝ)f = #ĝ. An update in the first parameter (f̂ (|a :=f b|),
ĝ)f (17) executes ĝ f a (O(#ĝ)), the recursive call and the update (O(#(f̂ , ĝ)f )).
Since there are #f̂ recursive calls and #(f̂ , ĝ)f ≤ #f̂ + #ĝ, the total complexity
is bounded by O(#f̂ · (#f̂ + #ĝ)).
Since finfun-All is directly implemented in terms of ff-All, it is sufficient to anal-
yse the latter’s complexity: The base case (18) essentially executes is-list-UNIV.
If we assume that the cardinality of the type universe is computed in constant
time, is-list-UNIV as is bounded by O(|as|²) since remdups as takes O(|as|²) steps. In
case of an update (19), the updated point is checked against the list as (O(|as|))
and the recursive call is executed with the list as being one element longer, i.e.
|as| grows by one for each recursive call. As there are #P̂ many recursive calls,
ff-All as P̂ has complexity #P̂ · O(#P̂ + |as|) + O((#P̂ + |as|)²) = O((#P̂ + |as|)²).
Hence, finfun-All P̂ has complexity O((#P̂)²).
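The bound for ff-All can be spelled out as a short sum. This only restates the argument above under the same assumptions; the abbreviations n = #P̂ and m = |as| are introduced here for readability.

```latex
\begin{align*}
T(n, m) &= \underbrace{\sum_{i=0}^{n-1} O(m + i)}_{\text{membership tests in (19)}}
          \;+\; \underbrace{O\bigl((m + n)^2\bigr)}_{\text{base case (18)}} \\
        &= O\Bigl(n\,m + \tfrac{n(n-1)}{2}\Bigr) + O\bigl((m + n)^2\bigr)
         \;=\; O\bigl((n + m)^2\bigr).
\end{align*}
```

For finfun-All the list starts empty, so m = 0 and the bound becomes O(n²).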
Equality on FinFuns f̂ and ĝ is then straightforward (21): (f̂ , ĝ)f is in
O(#f̂ · (#f̂ + #ĝ)). Composing this with λ(x, y). x = y takes O((#(f̂ , ĝ)f )²)
⊆ O((#f̂ + #ĝ)²). Finally, executing finfun-All is quadratic in #((λ(x, y). x = y)
◦f (f̂ , ĝ)f ) ≤ #(f̂ , ĝ)f . In total, f̂ = ĝ has complexity O((#f̂ + #ĝ)²).
4 A Recursion Combinator
In the previous section, we have presented several operators on FinFuns that
suffice for most purposes, cf. Sec. 5. However, we had to define function com-
position with FinFuns on either side and operations on products manually by
going back to the type’s carrier set finfun via Rep-finfun and Abs-finfun. This is
not only inconvenient, but also loses the abstraction from the details of the finite
set of updated points that FinFuns provide. In particular, one has to derive extra
recursion equations for the code generator and prove each of them correct.
Yet, the induction rule (9) states that the recursive equations uniquely deter-
mine any function that satisfies these. Operations on FinFuns could therefore
be defined by primitive recursion similarly to datatypes (cf. [2]). Alas, the two
FinFun constructors are not free, so not every pair of recursive equations does
indeed define a function. It might also well be the case that the equations are
contradictory: For example, suppose we want to define a function count that
counts the number of updates, i.e. count (K f c) = 0 and count f̂ (|a :=f b|) = count
f̂ + 1. Such a function does not exist for FinFuns in Isabelle, although it could
be defined in Haskell to, e.g., compute extra-logic data such as memory con-
sumption. Take, e.g., f̂ ≡ (K f 0)(|0 :=f 0|). Then, count f̂ = count (K f 0) + 1 = 1,
but f̂ = (K f 0) by (6) and thus count f̂ = 0 would equally have to hold, because
equality is congruent w.r.t. function application, a contradiction.
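The contradiction is easy to observe outside the logic. In the following Python sketch (hypothetical names, a toy model of the code representation), count is perfectly definable on the representation, but it distinguishes two representations of the same function, which is why it cannot be a function on FinFuns themselves.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Const:
    value: Any


@dataclass(frozen=True)
class Upd:
    base: Any
    point: Any
    value: Any


def apply(f, x):
    while isinstance(f, Upd):
        if f.point == x:
            return f.value
        f = f.base
    return f.value


def count(f):
    """Number of stored updates: a property of the representation only."""
    n = 0
    while isinstance(f, Upd):
        n += 1
        f = f.base
    return n


f_hat = Upd(Const(0), 0, 0)   # (K 0)(|0 := 0|)
k_zero = Const(0)             # K 0

same_function = all(apply(f_hat, x) == apply(k_zero, x) for x in range(-5, 6))
counts = (count(f_hat), count(k_zero))
```

On the sampled points the two values behave identically, yet counts is (1, 0).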
u a b (c b) = c b (24)
u a b″ (u a b′ (c b)) = u a b″ (c b) (25)
a ≠ a′ −→ u a b (u a′ b′ d) = u a′ b′ (u a b d) (26)
finite UNIV −→ fold (λa. u a b′) (c b) UNIV = c b′ (27)
Eq. (24), (25), and (26) naturally reflect the equalities between the constructors
from (6), (7), and (8), respectively. It is sufficient to restrict overwriting updates
(25) to constant FinFuns because the general case directly follows from this by
induction and (26). The last equation (27) arises from the identity
finite UNIV −→ fold (λa f̂ . f̂ (a :=f b′)) (K f b) UNIV = (K f b′). (28)
Eq. (24), (25), and (26) are sufficient for proving (23). For a FinFun operator
like ◦f , these constraints must be shown for specific c and u, which is usually
completely automatic. Even though (27), which is required to deduce (22), must
usually be proven by induction, this normally is also automatic, because for finite
types a, a ⇒ b and a ⇒f b are isomorphic via Abs-finfun and Rep-finfun.
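To get a feeling for the first three side conditions, one can test them by brute force on small samples. The Python sketch below is not the Isabelle combinator: finfun_rec and respects_constructor_equalities are hypothetical names, the check is a finite sample test and not a proof, and the reading of the conditions (update to the default is absorbed, overwriting on constants, commutation at distinct points) is an interpretation of (24) to (26). Condition (27) for finite universes is not covered.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any


@dataclass(frozen=True)
class Const:
    value: Any


@dataclass(frozen=True)
class Upd:
    base: Any
    point: Any
    value: Any


def finfun_rec(c, u, f):
    """Fold over the representation: c for constants, u for updates."""
    if isinstance(f, Const):
        return c(f.value)
    return u(f.point, f.value, finfun_rec(c, u, f.base))


def respects_constructor_equalities(c, u, points, values, results):
    """Sample-based check of the conditions corresponding to (24)-(26)."""
    for a, b in product(points, values):
        if u(a, b, c(b)) != c(b):                                # (24)
            return False
    for a, b, b1, b2 in product(points, values, values, values):
        if u(a, b2, u(a, b1, c(b))) != u(a, b2, c(b)):           # (25)
            return False
    for a, a1, b, b1, d in product(points, points, values, values, results):
        if a != a1 and u(a, b, u(a1, b1, d)) != u(a1, b1, u(a, b, d)):  # (26)
            return False
    return True


X = 1  # application at the fixed point 1, expressed as a recursion


def apply_c(b):
    return b


def apply_u(a, b, d):
    return b if a == X else d


def count_c(b):
    return 0


def count_u(a, b, d):
    return d + 1


sample = range(3)
apply_ok = respects_constructor_equalities(apply_c, apply_u, sample, sample, sample)
count_ok = respects_constructor_equalities(count_c, count_u, sample, sample, sample)
value_at_1 = finfun_rec(apply_c, apply_u, Upd(Upd(Const(0), 1, 5), 2, 6))
```

Application at a fixed point passes the sample check, whereas the update counter already fails the first condition.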
5 Applications
In this section, we present two applications for FinFuns to demonstrate that the
operations from Sec. 3 form a reasonably complete set of abstract operations.
1. They can be used to represent sets as predicates with the standard opera-
tions all being executable: membership and subset test, union, intersection,
complement and bounded quantification.
2. FinFuns have been inspired by the needs of JinjaThreads [12], which is a
formal semantics of multithreaded Java in Isabelle. We show how FinFuns
prove essential on the way to generating an interpreter for concurrent Java.
However, if we were to reason with them directly, most theorems about sets
(as predicates) would have to be replicated for FinFuns. Although this would be
straightforward, loads of redundancy would be reintroduced this way. Instead,
we propose to inject FinFun sets via f into ordinary sets and use the standard
operations on sets to work with them. The code generator is set up such that
it preprocesses all equations for code generation and automatically replaces set
operations with their FinFun equivalents by unfolding equations such as Af ⊆ B f
←→ A ⊆f B and Af ∪ B f = (A ∪f B)f . This approach works for quickcheck , too.
Besides the above operations, bounded quantification is also straightforward:
finfun-Ball  P ≡ ∀ x∈Âf . P x and finfun-Bex  P ≡ ∃ x∈Âf . P x
Clearly, they are not executable right away. Take, e.g., Â = (K f True), i.e. the
universal set, then finfun-Ball  P ←→ (∀ x. P x), which is undecidable if x ranges
over an infinite domain. However, if we go for partial correctness, correct code can
be generated: Like for the universal quantifier finfun-All for FinFun predicates
(cf. Sec. 3.3), ff-Ball is introduced which takes an additional parameter xs to
remember the list of points which have already been checked at previous calls.
ff-Ball xs  P ≡ ∀ a∈Âf . a ∈ set xs ∨ P a.
This now permits setting up recursive equations for the code generator:
ff-Ball xs (K f b) P ←→ ¬ b ∨ set xs = UNIV ∨ loop (λu. ff-Ball xs (K f b) P)
ff-Ball xs Â(|a :=f b|) P ←→ (a ∈ set xs ∨ (b −→ P a)) ∧ ff-Ball (a·xs) Â P
In the constant case, if b is false, i.e. the set is empty, ff-Ball holds. It also
holds if all elements of the universe have been checked already; this test is again
implemented by the overloaded term is-list-UNIV xs (Sec. 3.3). Otherwise, one would
have to check whether P holds at all points except xs, which is not computable
for arbitrary P and a. Thus, instead of evaluating its argument, the code for
loop never terminates. In Isabelle, however, loop is simply the unit-lifted identity
function: loop f ≡ f (). Of course, an exception could equally be raised in place of
non-termination. The bounded existential quantifier is implemented analogously.
5.2 JinjaThreads
Jinja [9] is an executable formal semantics for a large subset of Java source-
code and bytecode in Isabelle/HOL. JinjaThreads [11] extends Jinja with Java’s
thread features on both levels. It contains a framework semantics which inter-
leaves the individual threads whose small-step semantics is given to it as a pa-
rameter. This framework semantics takes care of all management issues related
to threads: The thread pool itself, the lock state, monitor wait sets, spawning
and joining a thread, etc. Individual threads communicate via the shared mem-
ory with each other and via thread actions like Lock, Unlock, Join, etc. with the
framework semantics. At every step, the thread specifies which locks to acquire
or release how many times, which thread to create or join on. In our previous
work [12], this communication was modelled as a list of such actions, and a lot
of pointless work went into identifying permutations of such lists which are se-
mantically equivalent. Therefore, this has been changed such that every lock of
type l now has its own list. Since only finitely many locks need to be changed
in any single step, these lists are stored in a FinFun such that checking whether
a step’s actions are feasible in a given state is executable.
Moreover, in developing JinjaThreads, we have found that most lemmas about
the framework semantics contain non-executable assumptions about the thread
pool or the lock state, in particular universal quantifiers or predicates defined in
terms of them. Therefore, we replaced ordinary functions that model the lock
state (type l ⇒ t lock ) and the thread pool (type t ⇀ ( x, l) thread ) with
FinFuns. Rewriting the existing proofs took very little effort: mostly, only
subscript or superscript f’s had to be added to the proof texts, because
Isabelle’s simplifier and classical reasoner are set up such that FinFuns indeed
behave like ordinary functions.
So as not to break the proofs, we did not remove the universal quantifiers in the
definitions of predicates themselves, but provided simple lemmas to the code
generator. For example, locks-ok ls t las checks whether all lock requests las of
thread t can be met in the lock state ls and is defined as locks-ok ls t las ≡ ∀ l.
lock-ok (ls f l) t (las f l), whereas the equation for code generation is
References
1. Berghofer, S., Nipkow, T.: Random testing in Isabelle/HOL. In: Proc. SEFM 2004,
pp. 230–239. IEEE Computer Society, Los Alamitos (2004)
2. Berghofer, S., Wenzel, M.: Inductive datatypes in HOL – lessons learned in formal-
logic engineering. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin, C., Théry, L.
(eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 19–36. Springer, Heidelberg (1999)
3. Berghofer, S., Nipkow, T.: Executing higher order logic. In: Callaghan, P., Luo, Z.,
McKinna, J., Pollack, R. (eds.) TYPES 2000. LNCS, vol. 2277, pp. 24–40. Springer,
Heidelberg (2002)
4. Collins, G., Syme, D.: A theory of finite maps. In: Schubert, E.T., Alves-Foss, J.,
Windley, P. (eds.) HUG 1995. LNCS, vol. 971, pp. 122–137. Springer, Heidelberg
(1995)
5. Dybjer, P., Haiyan, Q., Takeyama, M.: Combining testing and proving in dependent
type theory. In: Basin, D., Wolff, B. (eds.) TPHOLs 2003. LNCS, vol. 2758, pp.
188–203. Springer, Heidelberg (2003)
6. Haftmann, F., Nipkow, T.: A code generator framework for Isabelle/HOL. Techni-
cal Report 364/07, Dept. of Computer Science, University of Kaiserslautern (2007)
7. Haftmann, F., Wenzel, M.: Constructive type classes in Isabelle. In: Altenkirch,
T., McBride, C. (eds.) TYPES 2006. LNCS, vol. 4502, pp. 160–174. Springer,
Heidelberg (2007)
8. Harrison, J.: Metatheory and reflection in theorem proving: A survey and critique.
Technical Report CRC-053, SRI International Cambridge Computer Science Re-
search Centre (1995)
9. Klein, G., Nipkow, T.: A machine-checked model for a Java-like language, virtual
machine and compiler. ACM TOPLAS 28, 619–695 (2006)
10. Krauss, A.: Partial recursive functions in higher-order logic. In: Furbach, U.,
Shankar, N. (eds.) IJCAR 2006. LNCS, vol. 4130, pp. 589–603. Springer, Hei-
delberg (2006)
11. Lochbihler, A.: Jinja with threads. The Archive of Formal Proofs. Formal proof
development (2007), http://afp.sf.net/entries/JinjaThreads.shtml
12. Lochbihler, A.: Type safe nondeterminism - a formal semantics of Java threads. In:
FOOL 2008 (2008)
13. Lochbihler, A.: Code generation for functions as data. The Archive of Formal
Proofs. Formal proof development (2009),
http://afp.sf.net/entries/FinFun.shtml
14. Nipkow, T., Paulson, L.C.: Proof pearl: Defining functions over finite sets. In: Hurd,
J., Melham, T. (eds.) TPHOLs 2005. LNCS, vol. 3603, pp. 385–396. Springer,
Heidelberg (2005)
15. Nipkow, T., Pusch, C.: AVL trees. The Archive of Formal Proofs. Formal proof
development (2004), http://afp.sf.net/entries/AVL-Trees.shtml
16. Urban, C.: Nominal techniques in Isabelle/HOL. Journal of Automatic Reason-
ing 40(4), 327–356 (2008)
17. Urban, C., Berghofer, S.: A recursion combinator for nominal datatypes implemen-
ted in Isabelle/HOL. In: Furbach, U., Shankar, N. (eds.) IJCAR 2006. LNCS,
vol. 4130, pp. 498–512. Springer, Heidelberg (2006)
A Notation
Isabelle/HOL formulae and propositions are close to standard mathematical
notation. This subsection introduces non-standard notation, a few basic data
types and their primitive operations.
Types is the set of all types which contains, in particular, the type of truth
values bool, natural numbers nat, integers int, and the singleton type unit with its
only element (). The space of total functions is denoted by a ⇒ b. Type variables
are written a, b, etc. The notation t::τ means that the HOL term t has type τ .
Pairs come with two projection functions fst and snd. Tuples are identified
with pairs nested to the right: (a, b, c) is identical to (a, (b, c)) and a × b × c
to a × ( b × c). Dually, the disjoint union of a and b is written a + b.
Sets are represented as predicates (type a set is shorthand for a ⇒ bool ), but
follow the usual mathematical conventions. UNIV :: a set is the set of all elements
of type a. The image operator f ‘ A applies the function f to every element of
A, i.e. f ‘ A ≡ {y | ∃ x∈A. y = f x}. The predicate finite on sets characterises all
finite sets. card A denotes the cardinality of the finite set A, or 0 if A is infinite.
fold f z A folds a left-commutative³ function f :: a ⇒ b ⇒ b over a finite set A ::
a set with initial value z:: b.
Lists (type a list) come with the empty list [] and the infix constructor ·
for consing. Variable names ending in “s” usually stand for lists and |xs| is the
length of xs. The function set converts a list to the set of its elements.
Function update is defined as follows: Let f :: a ⇒ b, a:: a and b:: b. Then
f (a := b) ≡ λx. if x = a then b else f x.
The option data type a option adjoins a new element None to a type a. All
existing elements in type a are also in a option, but are prefixed by Some. For
succinctness, we write ⌊a⌋ for Some a. Hence, for example, bool option has the
values None, ⌊True⌋ and ⌊False⌋.
Partial functions are modelled as functions of type a ⇒ b option where None
represents undefined and f x = ⌊y⌋ means x is mapped to y. Instead of a ⇒ b
option, we write a ⇀ b and call such functions maps. f (x ↦ y) is shorthand for
f (x := ⌊y⌋). The domain of f (written dom f ) is the set of points at which f is
defined, ran f denotes the range of f. The function map-default b f takes a partial
function f and continues it at its undefined points with b.
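For readers more used to programming notation, function update, maps and map-default can be paraphrased as follows. This Python sketch is only an analogy: the encoding of Some y as the pair ("Some", y), the use of Python's None for the undefined case, and the explicit list of candidate points passed to dom are all assumptions made for illustration.

```python
def fun_upd(f, a, b):
    """f(a := b): like f, except that a is mapped to b."""
    return lambda x: b if x == a else f(x)


def empty_map(x):
    """The map that is undefined everywhere."""
    return None


def map_upd(m, x, y):
    """Map update: shorthand for m(x := Some y)."""
    return fun_upd(m, x, ("Some", y))


def dom(m, candidates):
    """Points among `candidates` at which the map is defined."""
    return {x for x in candidates if m(x) is not None}


def map_default(b, m):
    """Continue the partial function m with b at its undefined points."""
    def total(x):
        result = m(x)
        return b if result is None else result[1]
    return total


m = map_upd(map_upd(empty_map, 1, "a"), 2, "b")
total = map_default("?", m)
```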
The definite description ιx. Q x is known as Russell’s ι-operator. It denotes
the unique x such that Q x holds, provided exactly one exists.
³ f is left-commutative if it satisfies f x (f y z) = f y (f x z) for all x, y, and z.
Packaging Mathematical Structures
1 Introduction
Large developments of formalized mathematics demand a careful organization.
Fortunately mathematical theories are quite organized, e.g., every algebra text-
book [1] describes a hierarchy of structures, from monoids and groups to rings
and fields. There is a substantial literature [2,3,4,5,6,7] devoted to their formal-
ization within formal proof systems.
In spite of this body of prior work, however, we have found it difficult to make
practical use of the algebraic hierarchy in our project to formalize the
Feit-Thompson Theorem in the Coq system; this paper describes some of the
problems we have faced and how they were resolved. The proof of the Feit-Thompson
Theorem covers a broad range of mathematical theories, and organizing this
formalization into modules is central to our research agenda. We’ve developed [8] an
extensive set of modules for the combinatorics and set and group theory required
for the “local analysis” part of the proof, which includes a rudimentary algebraic
hierarchy needed to support combinatorial summations [9].
Extending this hierarchy to accommodate the linear algebra, Galois theory
and representation theory needed for the “character theoretic” part of the proof
has proved problematic. Specifically, we have found that well-known encodings
of algebraic structures using dependent types and records [2] break down in the
face of complexity; we address this issue in section 2 of this paper.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 327–342, 2009.
© Springer-Verlag Berlin Heidelberg 2009
F. Garillot et al.
Many of the cited works focused on the definition of the hierarchy rather than
its use, making simplifying assumptions that would have masked the problems
we encountered. For example some assume that only one or two structures are
involved at any time, or that all structures are explicitly specified. The examples
in section 4 show that such assumptions are impractical: they involve several
different structures, often within the same expression, and some of which need
to be synthesized for existing types.
We have come to realize that algebraic structures are not “modules” in the
software engineering sense, but rather “interfaces”. Indeed, the mathematical
theory of, say, an abstract ring, is fairly thin. However, abstract rings provide
an interface that allows “modules” with actual contents, such as polynomials
and matrices, to be defined and, crucially, composed. The main function of an
algebraic structure is to provide common notation for expressions and for proofs
(e.g., basic lemmas) to facilitate the composition and application of these generic
modules. Insisting that an interface be instantiated explicitly each time it is used
negates this function, so it is critical that structures be inferred on the fly; we’ll
see in the next section how this can be accomplished.
Similarly, we must ensure that our algebraic interfaces are consistent with
the other modules in our development: in particular they should integrate the
existing combinatoric interfaces [8], as algebra requires equality. As described in
section 3, we have therefore adapted classical algebra to our constructive com-
binatorics. In addition to philosophical motivations (viz., allowing constructive
proof of a finitary result like the Feit-Thompson Theorem), we have practical uses
for a constructive framework: it provides basic but quite useful proof automa-
tion, via the small-scale reflection methodology supported by the SSReflect
extension to Coq [10].
Due to space constraints, we will assume some familiarity with the Coq type
system [11] (dependent types and records, proof types, type inference with im-
plicit terms and higher-order resolution) in section 2, and with the basic design
choices in the Feit-Thompson Theorem development [8] (boolean reflection, con-
crete finite sets) in sections 3 and 4.
2 Encoding Structures
2.1 Mixins
An algebraic or combinatorial structure comprises representation types (usually
only one), constants and operations on the type(s), and axioms satisfied by the
operations. Within the propositions-as-types framework of Coq, the interface
for all of these components can be uniformly described by a collection of depen-
dent types: the type of operations depends on the representation type, and the
statement (also a “type”) of axioms depends on both the representation type and
the actual operations.
For example, a path in a combinatorial graph amounts to
– a representation type T for nodes
– an edge relation e : rel T
– an initial node x0 : T
– the sequence p : seq T of nodes that follow x0
– the axiom pP : path e x0 p asserting that e holds pairwise along x0 :: p.
The path “structure” is actually best left unbundled, with each component being
passed as a separate argument to definitions and theorems, as there is no one-to-
one relation between any of the components (there can be multiple paths with the
same starting point and relation, and conversely a given sequence can be a path
for different relations). Because it depends on all the other components, only the
axiom pP needs to be passed around explicitly; type inference can figure out T ,
e, x0 and p from the type of pP , so that in practice the entire path “structure”
can be assimilated to pP .
While this unbundling allows for maximal flexibility, it also induces a prolifer-
ation of arguments that is rapidly overwhelming. A typical algebraic structure,
such as a ring, involves half a dozen constants and even more axioms. More-
over such structures are often nested, e.g., for the Cayley-Hamilton theorem one
needs to consider the ring of polynomials over the ring of matrices over a
general commutative ring. The size of the terms involved grows as C^n, where C is
the number of separate components of a structure, and n is the structure nest-
ing depth. For Cayley-Hamilton we would have C = 15 and n = 3, and thus
terms large enough to make theorem proving impractical, given that algorithms
in user-level tactics are more often than not nonlinear.
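As a rough worked instance of this growth estimate, taking the figures quoted above at face value (constant factors are not given in the text):

```latex
\[
  C^{n} \;=\; 15^{3} \;=\; 3375 ,
\]
```

so term sizes are already in the thousands at nesting depth 3, before any nonlinear tactic is run on them.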
Thus, at the very least, related operations and axioms should be packed using
Coq’s dependent records (Σ-types); we call such records mixins. Here is, for
example, the mixin for a Z-module, i.e., the additive group of a vector space or
a ring:
Module Zmodule.
Record mixin_of (M : Type) : Type := Mixin {
zero : M; opp : M -> M; add : M -> M -> M;
_ : associative add; _ : commutative add;
_ : left_id zero add; _ : left_inverse zero opp add
}. ...
End Zmodule.
Here we are using a Coq Module solely to avoid name clashes with similar mixin
definitions.
Note that mixins typically provide only part of a structure; for instance a ring
structure would actually comprise a representation type and three mixins: one
for equality, one for the additive group, and one for the multiplicative monoid
together with distributivity. A mixin can depend on another one: e.g., the ring
multiplicative mixin depends on the additive one for its distributivity axioms.
Since types don’t depend on mixins (it’s the converse) type inference usually
cannot fill in omitted mixin parameters; however, the type class mechanism of
Coq 8.2 [12] can do so by running ad hoc tactics after type inference.
that will fail dramatically when used in practice for even moderate values of n.
The only case when this does not occur is with C = 1 — when each structure
is encapsulated into a single object. Thus, in addition to aesthetics, there is a
strong pragmatic rationale for achieving full encapsulation.
While mixins provide some degree of packaging, they fall short of C = 1:
mixins require one object per level in the structure hierarchy. This
is far from C = 1 because theorem proving requires deeper structure hierarchies
than programming, as structures with identical operations can differ by axioms;
indeed, despite our best efforts, our algebraic hierarchy is nine levels deep.
For the topmost structure in the hierarchy, encapsulation just amounts to
using a dependent record to package a mixin with its representation type. For
example, the top structure in our hierarchy, which describes a type with an
equality comparison operation (see [8]), could be defined as follows:
Module Equality.
Record mixin_of (T : Type) : Type :=
Mixin {op : rel T; _ : forall x y, reflect (x = y) (op x y)}.
Structure type : Type :=
Pack {sort :> Type; mixin : mixin_of sort}.
End Equality.
Notation eqType := Equality.type.
Notation EqType := Equality.Pack.
Definition eq_op T := Equality.op (Equality.mixin T).
Notation "x == y" := (@eq_op _ x y).
Coq provides two features that support this style of interface, Coercion and
Canonical Structure. The sort :> Type declaration above makes the sort pro-
jection into a coercion from type to Type. This form of explicit subtyping allows
any T : eqType to be used as a Type, e.g., the declaration x : T is understood
as x : sort T . This allows x == x to be understood as @eq_op T x x by simple
first-order unification in the Hindley-Milner type inference, as @eq_op α expects
arguments of type sort α.
Coercions are mostly useful for establishing generic theorems for abstract
structures. A different mechanism is needed to work with specific structures and
types, such as integers, permutations, polynomials, or matrices, as this calls for
construing a more specific Type as a structure object (e.g., an eqType): coercions
and more generally subtyping will not do, as they are constrained to work in the
opposite direction.
Coq solves this problem by using higher-order unification in combination
with Canonical Structure hints. For example, assuming int is the type of signed
integers, and given
2.3 Telescopes
This makes zmodType a subtype of eqType and (transitively) of Type, and allows
for the declaration of generic operator syntax (0, x + y, −x, x − y, x ∗ i), and
the declaration of canonical structures such as
Canonical Structure int_zmodType := Zmodule.Pack int_zmodMixin.
Many authors [2,13,7,5] have formalized an algebraic hierarchy using such nested
packed structures, which are sometimes referred to as telescopes [14], the term
we shall use henceforth.
As the coercion of a telescope to a representation Type is obtained by transitiv-
ity, it comprises a chain of elementary coercions: given T : zmodType, the declara-
tion x : T is understood as x : Equality.sort(Zmodule.sort T ). It is this explicit
chain that drives the resolution of higher-order unification problems and allows
structure inference for specific types. For example, the implicit α : zmodType in
the term 2 + 2 is resolved as follows: first Hindley-Milner type inference gener-
ates the constraint Equality.sort(Zmodule.sort α) ≡βιδ int. Coq then looks
up the Canonical Structure int_eqType declaration associated with the pair
(Equality.sort, int), reduces the constraint to Zmodule.sort α ≡βιδ int_eqType
which it solves using the Canonical Structure int_zmodType declaration asso-
ciated with the pair (Zmodule.sort, int_eqType). Note that int_eqType is an
eqType, not a Type: canonical projection values are not restricted to types.
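The two-step resolution just described can be pictured as successive table look-ups keyed by a projection and a value. The Python sketch below is a deliberately naive analogy, not Coq's unification algorithm: the CANONICAL table and the resolve function are hypothetical, and structures are represented by their names only.

```python
# Canonical Structure hints, indexed by (projection, projected value).
CANONICAL = {
    ("Equality.sort", "int"): "int_eqType",
    ("Zmodule.sort", "int_eqType"): "int_zmodType",
}


def resolve(coercion_chain, carrier):
    """Solve chain(alpha) = carrier, peeling the outermost projection first.

    Each step replaces the current right-hand side by the canonical
    structure registered for the pair (projection, right-hand side).
    """
    value = carrier
    for projection in coercion_chain:
        key = (projection, value)
        if key not in CANONICAL:
            raise LookupError(f"no canonical structure for {key}")
        value = CANONICAL[key]
    return value


# Equality.sort (Zmodule.sort alpha) = int  gives  alpha = int_zmodType
alpha = resolve(["Equality.sort", "Zmodule.sort"], "int")
```

The intermediate value int_eqType is itself a structure and not a type, mirroring the remark that canonical projection values are not restricted to types.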
Although this clever double use of coercion chains makes telescopes the sim-
plest way of packing structure hierarchies, it raises several theoretical and prac-
tical issues for deep or complex hierarchies.
Perhaps the most obvious one is that telescopes are restricted to single inher-
itance. While multiple inheritance is rare, it does occur in classical algebra, e.g.,
rings can be unitary and/or commutative. It is possible to fake multiple inheritance
by extending one base structure with the mixin of a second one (similarly to
what we do in Section 3.2), provided this mixin was not inlined in the definition
of the second base structure.
A more serious limitation is that the head constant of the representation type
of any structure in the hierarchy is always equal to the head of the coercion chain,
i.e., the Type projection of the topmost structure (Equality.sort here). This is
a problem because for both efficiency and robustness, coercions and canonical
projections for a type are determined by its head constant, and the topmost
projection says very little about the properties of the type (e.g., only that it has
equality, not that it is a ring or field).
There is also a severe efficiency issue: the complexity of Coq’s term compar-
ison algorithm is exponential in the length of the coercion chain. While this is
clearly a problem specific to the current Coq implementation, it is hard and
unlikely to be resolved soon, so it seems prudent to seek a design that does not
run into it.
332 F. Garillot et al.
Fig. 1. Telescopes for Equality and Zmodule
Fig. 2. Packed class for Zmodule
The definitions of the class_of and type records are straightforward; unpack
is a general dependent destructor for cT : type whose type is expressed in terms
of sort cT and class cT. Almost all of the code is fixed by the design pattern (footnote 1);
indeed the definitions of type and unpack are literally identical for all packed
classes, while usually only the name of the parent class module (here, Equality)
changes in the definitions of class_of and pack.
Indeed, the code assumes that Module Equality is similarly defined. Because
Equality is a top structure, the definitions of class_of and pack in Equality
reduce to
Footnote 1: It is nevertheless impractical to use the Coq Module construct to package these three
fixed definitions, because of its verbose syntax and technical limitations.
Packaging Mathematical Structures 333
While Pack is the primitive constructor for type, the usual constructor is pack,
whose only explicit argument is a Z-module mixin: it uses Equality.unpack to
break the packed eqType supplied by type inference into a type and class, which
it combines with the mixin to create the packed zmodType class. Note that pack
ensures that the canonical Type projections of the eqType and zmodType structure
are exactly equal.
The inconspicuous Canonical Structure Zmodule.eqType declaration is the
keystone of the packed class design, because it allows Coq’s higher-order unification
to unify Equality.sort and Zmodule.sort. Note that, crucially, int_eqType
and Zmodule.eqType int_zmodType are convertible; this holds in general because
Zmodule.eqType merely rearranges pieces of a zmodType. For a deeper structure,
we will need to define one such conversion for each parent of the structure.
This is hardly inconvenient since each definition is one line, and the convertibility
property holds for any composition of such conversions.
Figure 3 gives an account of the organization of the main structures defined in
our libraries. Starred blocks denote algebraic structures that would collapse on an
unstarred one in either a classical or an untyped setting. The interface for each
structure supplies notation, definitions, basic theory, and generic connections
with other structures (like a field being a ring).
In the following, we comment on the main design choices governing the definition
of interfaces. For more details, including the complete description of all the structures
and their related theory, see module ssralg on http://coqfinitgroup.gforge.inria.fr/.
We do not package as interfaces all the possible combinations of the mixins
we define: a structure is only packaged when it will be populated in practice. For
instance integral domains and fields are defined on top of commutative rings as
in standard textbooks [1], and we do not develop a theory for non commutative
algebra, which hardly shares results with its commutative counterpart.
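[Fig. 3: diagram of the structure hierarchy. Blocks recoverable from the extracted text: Type, Equality, Choice, SubType, CountType, FinType, Zmodule, Ring, Commutative Ring, Unit Ring, Commutative Unit Ring, IntegralDomain, Field, Decidable Field, Closed Field. The edges and the assignment of stars to blocks are not recoverable.]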
This interface gathers a new type sub_sort for the inhabitants of type T satisfying
the boolean predicate P, with a projection val on type T, a Sub constructor,
and an elimination scheme. Now, the val projection can be proved injective: to
compare two elements of a subType structure on type T it is enough to compare
their projections on T. A simple example of subType structure equips the type of
finite ordinals:
Inductive ordinal (n : nat) := Ordinal m of m < n.
where < stands for the boolean strict order on natural numbers. Crucially, replac-
ing a primitive Coq Σ-type by this encoding makes it possible to coerce ordinal
The xfun choice operator for boolean predicates should return a witness satis-
fying P, given a proof of the existence of such a witness. It is extensional with
respect to both the proofs of existence and the predicates.
Countable structures. A choice structure is still not automatically transmitted to
every construction (such as products) over types that themselves carry a choice
structure. Types with countably many inhabitants, on the other hand, transmit
their countability more readily. This leads us to define a structure for
these countable types, by requiring an injection pickle : T -> nat on the
underlying type T.
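To illustrate why countability is transmitted so easily to products, here is a small Python sketch. It is an illustration only, not the library's Coq code: `pickle_pair`, `pickle_nat` and `pickle_bool` are hypothetical names, and the Cantor pairing function is one assumed choice of encoding among many.

```python
def pickle_pair(pickle_a, pickle_b):
    """Build an injection (A x B) -> nat from injections A -> nat and B -> nat."""
    def pickle(pair):
        a, b = pickle_a(pair[0]), pickle_b(pair[1])
        # Cantor pairing is a bijection nat x nat -> nat, hence injective.
        return (a + b) * (a + b + 1) // 2 + b
    return pickle

def pickle_nat(n):
    """Natural numbers pickle to themselves."""
    return n

def pickle_bool(b):
    """Booleans pickle to 0 or 1."""
    return 1 if b else 0

pickle_nat_bool = pickle_pair(pickle_nat, pickle_bool)

# Sanity check on a finite sample: distinct pairs receive distinct codes.
codes = [pickle_nat_bool((n, b)) for n in range(50) for b in (False, True)]
assert len(set(codes)) == len(codes)
```

The product instance is obtained from the component injections alone, which is the sense in which countability composes where a bare choice structure does not.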
Since the Calculus of Inductive Constructions [11] validates the axiom of
countable choice, it is possible to derive a Choice structure from any count-
able type. However since a generic choice construction on arbitrary countable
types would not always lead to the expected choice operator, we prefer to embed
a Choice structure as base class for the Countable structure.
Finite types structures. The structure of types with a finite number of inhab-
itants is at the heart of our formalization of finite quotients [8]. The Finite mixin
still corresponds to the description given in this reference, but the FinType struc-
ture now packs this mixin with a Countable base instead of an eqType. Proofs
like the cardinal of the cartesian product of finite types make the most of this
computational content for the enumeration. Indeed the use of (computations of)
list iterators shrinks the sizes of such proofs by a factor of five compared to the
abstract case.
Its class packages the class of a ComRing structure with the mixin of a UnitRing
(which reflects a natural order for further instantiation). The base1 projection
coerces the ComUnitRing class to its ComRing base class. Note that this definition
does not provide the required coercion path from a ComUnitRing class to its
underlying UnitRing class, which is only provided by base2. Now the canonical
structures of ComRing and UnitRing for a ComUnitRing structure will let the latter
enjoy both theories with a correct treatment of type constraints.
Closed fields. Algebraically closed fields are defined by requiring that any non
constant monic polynomial has a root. Since such a structure enjoys quantifier
elimination, any closed field canonically enjoys a structure of decidable field.
The correspondence of this lemma with its natural language counterpart be-
comes straightforward, once we dispense with a few notations:
%R : is a scope notation for ring operations.
{group gT} :
The types we defined in section 3 are convenient for framing their elements in
a precise algebraic setting. However, since a large proportion of the properties
However, writing functions that deal with those types implies some challenges,
among which is dealing with size arguments. We want our library to simplify
this task, while sharing operator symbols, and exposing structural properties of
objects as soon as their shape ensures they are valid.
The rest of our notations can be readily interpreted, with 1 and 0 coercing
respectively to identity and null matrices of the right dimension, and (A i j)
returning the appropriate a_{i,j} coefficient of A through coercion.
Footnote 3: As crafted in line 2 above.
Correctness of the algorithm. We will omit in this article some of the steps
involved in proving that the LUP decomposition is correct: showing that P is a
permutation matrix, for instance, involved building a theory about those ma-
trices that correspond to a permutation map over a finite vector. But while
studying the behavior of this subclass with respect to matrix operations gave
some hint of the usability of our matrix library (footnote 4), it is not the part where our
infrastructure shines the most.
The core of the correctness proof lies in the following equation:
Lemma cormen_lup_correct : forall n A,
let: (P, L, U) := @cormen_lup F n A in P * A = L * U.
Its proof proceeds by induction on the size of the matrix. Once we make sure
that A’ and Q (line 7) are defined coherently, it is not hard to see that what we are
proving is (footnote 5):
\[
\begin{pmatrix} 1 & 0 \\ 0 & P' \end{pmatrix} * A'
=
\begin{pmatrix} 1 & 0 \\ a_{k,0}^{-1} \cdot P' *_m A'_{ll} & L' \end{pmatrix}
*
\begin{pmatrix} A'_{ul} & A'_{ur} \\ 0 & U' \end{pmatrix}
\qquad (1)
\]
Notice that we transcribe the distinction Coq makes between the three product operations
involved: the scalar multiplication (·), the square matrix product (*), and
the matrix product accepting arbitrarily sized matrices (*m). Using block product
expansion and a few easy lemmas allows us to transform (1) into:
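\[
\begin{pmatrix} A'_{ul} & A'_{ur} \\ P' *_m A'_{ll} & P' *_m A'_{lr} \end{pmatrix}
=
\begin{pmatrix}
A'_{ul} & A'_{ur} \\
a_{k,0}^{-1} \cdot P' *_m A'_{ll} *_m A'_{ul} &
a_{k,0}^{-1} \cdot P' *_m A'_{ll} *_m A'_{ur} + L' *_m U'
\end{pmatrix}
\qquad (3)
\]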
At this stage, we would like to rewrite our goal with (2), named IHn in our
script, even though its right-hand side does not occur exactly in the equation.
However, SSReflect has no trouble expanding the definition of the ring multiplication
provided in (2) to see that it exactly matches the pattern -[L’ *m U’]IHn (footnote 6).
We conclude by identifying the blocks of (3) one by one. The most tedious
step consists in treating the lower left block, which depends on whether we have
been able to choose a non-null pivot in creating A’ from A. Each alternative is
resolved by case analysis on the coefficients of that block, and it is only in that part that
we use the fact that the matrix coefficients belong to a field. The complete proof
is fourteen lines long.
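The statement P * A = L * U can also be checked numerically on small matrices. The Python sketch below is not the Coq function cormen_lup; it is an assumed reconstruction of the recursive scheme suggested by equation (1), recursing on the Schur complement as in the textbook algorithm, with exact rational arithmetic. The names `lup` and `matmul`, and the convention that the inverse of a zero pivot is taken to be 0, are assumptions of this sketch.

```python
from fractions import Fraction

def matmul(X, Y):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lup(A):
    """Return (P, L, U) such that P * A == L * U, by recursion on the size."""
    n = len(A)
    if n == 1:
        return [[Fraction(1)]], [[Fraction(1)]], [[Fraction(A[0][0])]]
    # Pick a row k whose first coefficient is non-zero, defaulting to row 0.
    k = next((i for i in range(n) if A[i][0] != 0), 0)
    Ap = [[Fraction(x) for x in row] for row in A]
    Ap[0], Ap[k] = Ap[k], Ap[0]                      # A' = Q * A
    inv = 1 / Ap[0][0] if Ap[0][0] != 0 else Fraction(0)
    col = [Ap[i][0] * inv for i in range(1, n)]      # a^-1 . A'_ll
    schur = [[Ap[i][j] - col[i - 1] * Ap[0][j] for j in range(1, n)]
             for i in range(1, n)]                   # A'_lr - a^-1 . A'_ll * A'_ur
    P1, L1, U1 = lup(schur)
    pcol = [sum(P1[i][j] * col[j] for j in range(n - 1)) for i in range(n - 1)]
    zero_row = [Fraction(0)] * (n - 1)
    Q = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    Q[0], Q[k] = Q[k], Q[0]                          # swap of rows 0 and k
    lifted = [[Fraction(1)] + zero_row] + [[Fraction(0)] + P1[i] for i in range(n - 1)]
    P = matmul(lifted, Q)
    L = [[Fraction(1)] + zero_row] + [[pcol[i]] + L1[i] for i in range(n - 1)]
    U = [Ap[0][:]] + [[Fraction(0)] + U1[i] for i in range(n - 1)]
    return P, L, U

A = [[0, 2, 1], [1, 1, 0], [2, 0, 3]]
P, L, U = lup(A)
assert matmul(P, A) == matmul(L, U)
```

Such a test only samples the equation on one input; the point of the formal proof is that the same identity is established for every size n and every matrix over a field.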
Footnote 4: The theory, while expressed in a general manner, is less than ninety lines long.
Footnote 5: We will write block expressions modulo associativity and commutativity, to reduce
parenthesis clutter.
Footnote 6: See [10] for details on the involved notation for the rewrite tactic.
5 Related Work
The need for packaging algebraic structures and formalizing their relative in-
heritance and sharing inside proof assistants is reported in literature as soon as
these tools prove mature enough to allow the formalisation of significant pieces
of algebra [2]. The set-theoretic Mizar Mathematical Library (MML) certainly
features the largest corpus of formalized mathematics, yet it covers rather different
theories than the algebraic ones we presented here. Few reports are available
on the organization and revision of this collection of structures, apart from comments
[7] on the difficulty of maintaining it. The Isabelle/HOL system provides
foundations for developing abstract algebra in a classical framework contain-
ing algebraic structures as first-class citizens of the logic and using a type-class
like mechanism [6]. This library proves Sylow theorems on groups and the basic
theory of rings of polynomials.
Two main algebraic hierarchies have been built using the Coq system: the
seminal abstract Algebra repository [4], covering algebraic structures from mo-
noids to modules, and the CCorn hierarchy [5], mainly devoted to a constructive
formalisation of real numbers, and including a proof of the fundamental theorem
of algebra. Both are axiomatic, constructive, and setoid based. They have proved
rather difficult to extend with theories like linear or multilinear algebra, and to
populate with more concrete instances. In both cases, limitations mainly come
from the pervasive use of setoids and the drawbacks of telescope-based hierarchies
pointed out in Section 2.
The closest work to ours is certainly the hierarchy built in Matita [21], us-
ing telescopes and a more liberal system of coercions. This hierarchy, despite
including a large development in constructive analysis [22], is currently less pop-
ulated than ours. For example, no counterpart of the treatment of polynomials
presented in section 4 is described in the Matita system.
We are currently extending our hierarchy and infrastructure to cover the
generic theory of vector spaces and modules.
References
8. Gonthier, G., Mahboubi, A., Rideau, L., Tassi, E., Théry, L.: A Modular Formali-
sation of Finite Group Theory. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007.
LNCS, vol. 4732, pp. 86–101. Springer, Heidelberg (2007)
9. Bertot, Y., Gonthier, G., Ould Biha, S., Pasca, I.: Canonical big operators. In:
Mohamed, O.A., Muñoz, C., Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp.
86–101. Springer, Heidelberg (2008)
10. Gonthier, G., Mahboubi, A.: A small scale reflection extension for the Coq system.
INRIA Technical report, http://hal.inria.fr/inria-00258384
11. Paulin-Mohring, C.: Définitions Inductives en Théorie des Types d’Ordre
Supérieur. Habilitation à diriger les recherches, Université Claude Bernard Lyon I
(1996)
12. Sozeau, M., Oury, N.: First-Class Type Classes. In: Mohamed, O.A., Muñoz, C.,
Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp. 278–293. Springer, Heidelberg
(2008)
13. Pollack, R.: Dependently typed records in type theory. Formal Aspects of Com-
puting 13, 386–402 (2002)
14. Bruijn, N.G.D.: Telescopic mappings in typed lambda calculus. Information and
Computation 91, 189–204 (1991)
15. Paulson, L.C.: Defining Functions on Equivalence Classes. ACM Transactions on
Computational Logic 7(4), 658–675 (2006)
16. Barthe, G., Capretta, V., Pons, O.: Setoids in type theory. Journal of Functional
Programming 13(2), 261–293 (2003)
17. Altenkirch, T., McBride, C., Swierstra, W.: Observational equality, now! In: Pro-
ceedings of the PLPV 2007 workshop, pp. 57–68. ACM, New York (2007)
18. Olteanu, G.: Computing the Wedderburn decomposition of group algebras by the
Brauer-Witt theorem. Mathematics of Computation 76(258), 1073–1087 (2007)
19. Rotman, J.J.: An Introduction to the Theory of Groups. Springer, Heidelberg
(1994)
20. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms,
2nd edn. McGraw-Hill, New York (2003)
21. Sacerdoti Coen, C., Tassi, E.: Working with Mathematical Structures in Type
Theory. In: Miculan, M., Scagnetto, I., Honsell, F. (eds.) TYPES 2007. LNCS,
vol. 4941, pp. 157–172. Springer, Heidelberg (2008)
22. Sacerdoti Coen, C., Tassi, E.: A constructive and formal proof of Lebesgue Domi-
nated Convergence Theorem in the interactive theorem prover Matita. Journal of
Formalized Reasoning 1, 51–89 (2008)
Practical Tactics for Separation Logic
Andrew McCreight
1 Introduction
Separation logic [1] is an extension of Hoare logic for reasoning about shared mu-
table data structures. Separation logic specifies the contents of individual cells of
memory in a manner similar to linear logic [2], avoiding problems with reasoning
about aliasing in a very natural fashion. For this reason, it has been successfully
applied to the verification of a number of pointer-intensive applications such as
garbage collectors [3,4].
However, most work on separation logic has involved paper, rather than
machine-checkable, proofs. Mechanizing a proof can increase our confidence in
the proof and potentially automate away some of the tedium in its construction.
We would like to use separation logic in a proof assistant to verify deep proper-
ties of programs that may be hard to check fully automatically. This is difficult
because the standard tactics of proof assistants such as Coq [5] cannot effectively
deal with the linearity properties of separation logic. In contrast, work such as
Smallfoot [6] focuses on the automated verification of lightweight specifications.
We discuss other related work in Sect. 7.
In this paper, we address this problem with a suite of tools for separation-
logic-based program verification of complex pointer-intensive programs. These
tools are intended for the interactive verification of Cminor programs [7] in the
Coq proof assistant, but should be readily adaptable to similar settings. We
have chosen Cminor because it can be compiled using the CompCert verified
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 343–358, 2009.
© Springer-Verlag Berlin Heidelberg 2009
344 A. McCreight
compiler [7], allowing for some properties of source programs to be carried down
to executable code. We have tested the applicability of these tools by using them
to verify the safety of a Cheney garbage collector [8], as well as a number of
smaller examples.
The main contributions of this paper are a comprehensive set of tactics for
reasoning about separation logic assertions (including simplification, rearrang-
ing, splitting, matching and rewriting) and a program logic and accompanying
set of tactics for program verification using separation logic that strongly sep-
arate reasoning about memory from more standard reasoning. Together these
tactics essentially transform Coq into a proof assistant for separation logic. The
tactics are implemented in a combination of direct and reflective styles. The Coq
implementation is available online from http://cs.pdx.edu/~mccreigh/ptsl/.
Our tool suite has two major components. First, we have tactics for reason-
ing about separation logic assertions. These are focused on easing the difficulty
of working with a linear-style logic within a more conventional proof assistant.
These tools enable the simplification and manipulation of separation logic hy-
potheses and goals, as well as the discharging of goals that on paper would
be trivial. These tactics are fairly modular and should be readily adaptable to
other settings, from separation logic with other memory models to embeddings
of linear logic in proof assistants.
The second component of our tool set is a program logic and related tactics.
The program logic relates the dynamic semantics of the program to its specifica-
tion. The tactics step through a procedure one statement at a time, enabling the
“programmer’s intuition” to guide the “logician’s intuition”. At each program
step, there is a separation logic-based description of the current program state. A
verified verification condition generator produces a precondition given the post-
condition of the statement. The tactics are able to automatically solve many
such steps, and update the description of the state once the current statement
has been verified. Loop and branch join point annotations must be manually
specified.
2 Cminor
Our program tools verify programs written in Cminor, a C-like imperative lan-
guage. Cminor is an intermediate language of the CompCert [7] compiler, which
is a semantics preserving compiler from C to PowerPC assembly, giving us a
Practical Tactics for Separation Logic 345
that is a linked list and reverses the linked list in place, returning a pointer to
the new head of the list. A linked list structure at address a has two fields, at a
and a + 4.
Imperative programs often have complex data structures. To reason about these
data structures, separation logic assertions [1] describe memory by treating mem-
ory cells as a linear resource. In this section, we will describe separation logic
assertions and associated tactics for Cminor, but they should be applicable to
other imperative languages.
Fig. 3 gives the standard definitions of the separation logic assertions we use
in this paper. We write P for propositions and T for types in the underlying
logic, which in our case is the Calculus of Inductive Constructions [10] (CIC).
Propositions have type Prop. We write A and B for separation logic assertions,
implemented using a shallow embedding [11]. Each separation logic assertion is a
memory predicate with type Mem → Prop, so we write A m for the proposition
that memory m can be described by separation logic predicate A.
The separation logic assertion contains, written v ↦ v′, holds on a memory m
if v is an address that is the only element of the domain of m and m(v) = v′. The
empty assertion emp only holds on empty memory. The trivial assertion true
holds on every memory. The modal operator !P from linear logic (also adapted
to separation logic by Appel [12]) holds on a memory m if the proposition P
is true and m is empty. The existential ∃x : T. A is analogous to the standard
existential operator. We omit the type T when it is clear from context, and follow
common practice and write a ↦ − for ∃x. a ↦ x.
The final and most crucial separation logic operator we will be using in this
paper is separating conjunction, written A ∗ B. This holds on a memory m if m
can be split into two non-overlapping memories m1 and m2 such that A holds
on m1 and B holds on m2. (m = m1 ⊎ m2 holds if m is equal to m1 ∪ m2
and the domains of m1 and m2 are disjoint.) This operator is associative and
commutative, and we write (A ∗ B ∗ C) for (A ∗ (B ∗ C)). This operator is used to
specify the frame rule, which is written as follows in conventional Hoare logic:
\[
\frac{\{A\}\; s\; \{A'\}}{\{A * B\}\; s\; \{A' * B\}}
\]
B describes parts of memory that s does not interact with. The frame rule
is most commonly applied at procedure call sites. We have found that we do
not need to manually instantiate the frame rule, thanks to our tactics and
program logic.
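The assertions of this section can be made concrete on small finite memories. The following Python sketch is only an executable illustration, not the paper's Coq shallow embedding: memories are modelled as dictionaries from addresses to values, assertions as Boolean functions on such dictionaries, and all function names (`contains`, `emp`, `true`, `pure`, `star`) are chosen for this sketch. Separating conjunction is decided by brute-force enumeration of splits, which is only reasonable for tiny memories.

```python
from itertools import combinations

def contains(addr, val):
    """addr |-> val: the memory is exactly the single cell addr holding val."""
    return lambda m: m == {addr: val}

def emp(m):
    """emp: holds only on the empty memory."""
    return m == {}

def true(m):
    """true: holds on every memory."""
    return True

def pure(p):
    """!P: the proposition p is true and the memory is empty."""
    return lambda m: bool(p) and m == {}

def star(a, b):
    """A * B: m splits into disjoint m1 and m2 with a on m1 and b on m2."""
    def holds(m):
        keys = list(m)
        for r in range(len(keys) + 1):
            for chosen in combinations(keys, r):
                m1 = {k: m[k] for k in chosen}
                m2 = {k: m[k] for k in m if k not in chosen}
                if a(m1) and b(m2):
                    return True
        return False
    return holds

m = {8: 1, 12: 2}
print(star(contains(8, 1), contains(12, 2))(m))   # both cells accounted for
print(star(contains(8, 1), contains(8, 1))({8: 1}))  # a cell cannot be used twice
```

The second call shows the linear flavour of the logic: a single cell cannot satisfy both sides of a separating conjunction.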
We can use these basic operators in conjunction with Coq’s standard facilities
for inductive and recursive definitions to build assertions for more complex data
structures. For instance, we can inductively define a separation logic assertion
llist(v, l) that holds on a memory that consists entirely of a linked list with its
head at v containing the values in the list l. A list l (at the logical level) is either
empty (written nil) or contains an element X added to the front of another
list l′ (written X :: l′).
From this definition and basic lemmas about separation logic assertions, we
can prove a number of useful properties of linked lists. For instance, if llist(v, l)
holds on a memory and v is not NULL then the linked list is non-empty.
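As an illustration of the linked list predicate, here is a Python sketch on dictionary memories. It assumes the two-field layout described earlier (value at address a, next pointer at a + 4) and takes NULL to be 0; the function is written directly as a traversal rather than through the inductive definition, so it is a model of llist, not the paper's definition.

```python
NULL = 0  # assumption of this sketch: the null pointer is address 0

def llist(v, values):
    """Predicate on dict memories: the memory is exactly a linked list at v
    whose cells hold `values`, with the value at a and the next pointer at a + 4."""
    def holds(m):
        rest = dict(m)          # consume cells as they are visited
        addr = v
        for x in values:
            if addr == NULL or rest.get(addr) != x or addr + 4 not in rest:
                return False
            nxt = rest.pop(addr + 4)
            del rest[addr]
            addr = nxt
        # The list must end at NULL and no cell may be left over.
        return addr == NULL and rest == {}
    return holds

m = {16: 7, 20: 32, 32: 9, 36: NULL}
print(llist(16, [7, 9])(m))
```

Because visited cells are removed from the working copy, a cyclic structure or a memory with extra cells fails the predicate, which matches the requirement that the memory consist entirely of the list.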
We can use this predicate to define part of the loop invariant for the list
reversal example given in Section 2. If the variables x and y have the values v1 and
v2, then memory m must contain two separate, non-overlapping linked lists with
values l1 and l2. In separation logic, this is written (llist(v1, l1) ∗ llist(v2, l2)) m.
3.1 Tactics
Defining the basic separation logic predicates and verifying their basic properties
is not difficult, even in a mechanized setting. What can be difficult is actually
constructing proofs in a proof assistant such as Coq because we are attempting
to carry out linear-style reasoning in a proof assistant with a native logic that is
not linear.
If A, B, C and D are regular propositions, then the proposition that (A ∧
B ∧ C ∧ D) implies (B ∧ (A ∧ D) ∧ C) can be easily proved in a proof assistant.
The assumption and goal can be automatically decomposed into their respective
components, which can in turn be easily solved.
If A, B, C and D are separation logic assertions, proving the equivalent goal,
that for all m, (A ∗ B ∗ C ∗ D) m implies (B ∗ (A ∗ D) ∗ C) m, is more difficult.
Unfolding the definition of * from Fig. 3 and breaking down the assumption in a
similar way will involve large numbers of side conditions about memory equality
and disjointness. While Marti et al. [13] have used this approach, it throws
away the abstract reasoning of separation logic.
Instead, we follow the approach of Reynolds [1] and others and reason about
separation logic assertions using basic laws like associativity and commutativity.
However this is not the end of our troubles. Proving the above implication re-
quires about four applications of associativity and commutativity lemmas. This
can be done manually, but becomes tedious as assertions grow larger. In real
proofs, these assertions can contain more than a dozen components.
Next the simplification tactic extracts !(v = NULL), creating a new subgoal
v = NULL (not shown). Finally, a matching tactic cancels out the common parts
of the hypothesis and goal, leaving a smaller proof obligation.
    H : (B ∗ A) m            H : B m′
                      −→
    (A ∗ B′) m               B′ m′
The m′ in the final proof state is fresh: it must be shown that B m′ implies
B′ m′ for all m′ (footnote 1). This example shows most of our tactics for separation logic
Footnote 1: The final step could be more specific (for instance, we know that m′ must be a subset
of m), but this is rarely useful in practice.
tree [[3, 1], 2] would become the list [3, 1, 2] and the assertion (A ∗ B ∗ C) would
become the list [A, B, C]. The permutation list [3, 1, 2] is used to reorder the
assertion list [A, B, C]. In this example, the resulting assertion list is [C, A, B].
The initial assertion list is logically equivalent to the final one if the permuta-
tion list is a permutation of the indices of the assertion list. This requirement
is dynamically checked by the tactic. Reassociation is implemented directly by
examining the shape of the tree.
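The reordering step in this example can be sketched in a few lines of Python. This is an illustration of the list manipulation only, with hypothetical names `flatten` and `reorder`; the actual tactic works on Coq terms and also handles reassociation.

```python
def flatten(tree):
    """Flatten a nested list such as [[3, 1], 2] into [3, 1, 2]."""
    if isinstance(tree, list):
        return [leaf for sub in tree for leaf in flatten(sub)]
    return [tree]

def reorder(assertions, perm_tree):
    """Reorder `assertions` according to the 1-based indices in `perm_tree`."""
    perm = flatten(perm_tree)
    # Dynamic check: perm must be a permutation of 1..len(assertions).
    if sorted(perm) != list(range(1, len(assertions) + 1)):
        raise ValueError("not a permutation of the assertion indices")
    return [assertions[i - 1] for i in perm]

print(reorder(["A", "B", "C"], [[3, 1], 2]))   # ['C', 'A', 'B']
```

The permutation check is what guarantees that the reordered assertion is logically equivalent to the original one: no conjunct is dropped or duplicated.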
Splitting. The tactic ssplit subdivides a separation logic proof by creating a new
subgoal for each corresponding part of the hypothesis and goal. This uses the
standard separation logic property that if (∀m. A m → A′ m) and (∀m. B m →
B′ m), then (∀m. (A ∗ B) m → (A′ ∗ B′) m). Here is an example of the basic
use of this tactic:
    H : (A ∗ B ∗ C) m          H1 : A m1    H2 : B m2    H3 : C m3
                        −→
    (E ∗ F ∗ G) m              E m1         F m2         G m3
Initially there is a goal and a hypothesis H describing memory. Afterward,
there are three goals, each with a hypothesis. Memories m1 , m2 and m3 are
freshly created, and represent disjoint subsets covering the original memory m.
Splitting must be done with care as it can lead to a proof state where no further
progress is possible.
The splitting tactic also has a number of special cases to solve subgoals involving
↦, and applies heuristics to try to solve address equalities that are generated.
Here are two examples of this:
    H : (a ↦ v) m                      H : (a ↦ v) m
                    −→                                  −→
    (a ↦ v′) m          v = v′         (b ↦ v) m            a = b
Matching. The matching tactic searchMatch cancels out matching parts of the
hypothesis and goal. This matching is just syntactic equality, with the addition
of some special cases for ↦. Here is an example of this tactic:
    E : v1 = v1′                                      E : v1 = v1′
    H : (D ∗ A ∗ a ↦ v1 ∗ B ∗ b ↦ v2 ∗ true) m        H : (B ∗ true) m
                                               −→
    (B′ ∗ b ↦ − ∗ D ∗ a ↦ v1′ ∗ A ∗ true) m           (B′ ∗ true) m
The assertion for address a is cancelled due to the equality v1 = v1′. Notice that
the predicate true, present in both the hypothesis and goal, is not cancelled. If B
implies (B′ ∗ true) then cancelling out true from goal and hypothesis will cause
a provable goal to become unprovable. This is the same problem presented by
the additive unit in linear logic, which can consume any set of linear resources.
We do not have this problem for matching other separation logic predicates as
they generally do not have this sort of slack.
Matching is implemented by iterating over the assertions in the goal and
hypothesis, looking for matches. Any matches that are found are placed in cor-
responding positions in the assertions, allowing the splitting tactic to carry out
the actual cancellation.
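The cancellation performed by matching can be sketched on flat lists of conjuncts. The Python below is a simplified illustration and not the Coq tactic: it cancels directly instead of rearranging and delegating to splitting, opaque assertions are strings, a points-to assertion is a hypothetical tuple `("pts", address, value)` with `None` standing for the wildcard in `a ↦ −`, and known value equalities are supplied through an `equal` function.

```python
def matches(h, g, equal):
    """Does hypothesis conjunct h match goal conjunct g?"""
    if isinstance(h, tuple) and isinstance(g, tuple):
        # Same address, and either a wildcard in the goal or equal values.
        return h[1] == g[1] and (g[2] is None or equal(h[2], g[2]))
    return h == g  # otherwise: plain syntactic equality

def search_match(hyp, goal, equal=lambda x, y: x == y):
    """Cancel matching conjuncts; return what is left of hypothesis and goal."""
    hyp_left, goal_left = list(hyp), []
    for g in goal:
        hit = next((h for h in hyp_left if matches(h, g, equal)), None)
        if hit is None or g == "true":
            goal_left.append(g)      # no match, or `true`, which is never cancelled
        else:
            hyp_left.remove(hit)
    return hyp_left, goal_left

hyp = ["D", "A", ("pts", "a", "v1"), "B", ("pts", "b", "v2"), "true"]
goal = ["B'", ("pts", "b", None), "D", ("pts", "a", "v1'"), "A", "true"]
known = lambda x, y: x == y or {x, y} == {"v1", "v1'"}   # models E : v1 = v1'
print(search_match(hyp, goal, known))
```

Keeping `true` on both sides mirrors the discussion above: it may still be needed to absorb leftover memory once the remaining implication is proved.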
4 Program Logic
reduction of this term differs, as shown in Fig. 4. In the case where M = None,
evaluation has failed, so the entire VC becomes equivalent to False, because
failure is not allowed.
The cases for variable assignment and storing to memory follow the dynamic
semantics of the machine. Variable assignment attempts to evaluate the expression
e, then attempts to update the value of the variable x. If it succeeds, then
the postcondition q must hold on the resulting state σ′. Store works in the same
way: evaluation of the two expressions and a store are attempted. For the store
to succeed, v1 must be a valid address in memory. As with assignment, the post-
condition q must hold on the resulting state. With both of these statements, if
any of the intermediate evaluations fail then the entire VC will end up being
False, and thus impossible to prove.
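The two cases just described can be sketched as precondition transformers. The Python below is an assumed model, not the paper's stmPre: a state is a pair of a variable environment and a memory (both dictionaries), expressions are functions from states to a value or `None` on failure, and the names `vc_assign`, `vc_store`, `var` and `load` are hypothetical.

```python
def var(x):
    """Expression reading variable x; evaluates to None if x is undefined."""
    return lambda state: state[0].get(x)

def load(e):
    """Expression [e]: read memory at the address denoted by e."""
    def ev(state):
        addr = e(state)
        return None if addr is None else state[1].get(addr)
    return ev

def vc_assign(x, e, q):
    """Precondition of x := e with postcondition q; False if any step fails."""
    def pre(state):
        venv, mem = state
        v = e(state)
        if v is None or x not in venv:
            return False
        return q(({**venv, x: v}, mem))
    return pre

def vc_store(e1, e2, q):
    """Precondition of [e1] := e2; the address must already be in memory."""
    def pre(state):
        venv, mem = state
        addr, v = e1(state), e2(state)
        if addr is None or v is None or addr not in mem:
            return False
        return q((venv, {**mem, addr: v}))
    return pre

state = ({"x": 16, "y": 0}, {16: 7, 20: 0})
print(vc_assign("y", load(var("x")), lambda s: s[0]["y"] == 7)(state))
```

A failed evaluation makes the whole condition false, so the only way to prove it is to show, from the description of the state, that every intermediate step succeeds.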
The case for while loops is fairly standard. A state predicate I must be selected
as a loop invariant. I must hold on the initial state σ. Furthermore, for any other
state σ′ such that I σ′ holds, it must be possible to evaluate the expression e to
a value v, which cannot be Vundef. If the value v is a “true” value (i.e., is either
a pointer or a non-zero word value) then the precondition of the loop body s
must hold, where the postcondition of the body is I. If the value is false (equal
to Vword(0)) then the postcondition of the entire loop must hold.
We have mechanically verified the soundness of the verification condition gen-
erator as part of the safety of the program logic: if the program is well-formed,
then we can either take another step or we have reached a valid termination
state for the program.
4.3 Tactics
The tactic vcSteps lazily unfolds the VC and attempts to use the separation logic
description of the state to perform symbolic execution to step through the VC.
Fig. 5 shows the rough sequence of steps that vcSteps carries out automatically
at an assignment. Each numbered line below the horizontal line gives an intermediate
goal state as vcSteps is running. The two hypotheses V and H above
the line describe the variable environment and memory of the initial state σ,
and are the precondition of this statement. The first stmPre below the line is
the initial VC that must be verified. The tactic unfolds the definition of the VC
for a sequence (line 2), then for an assignment (line 3), then determines that
the value of the variable x is a by examining the hypothesis V (line 4). The
load expression now has a value as an argument, so the tactic will examine the
hypothesis H to determine that address a contains value v (line 5). The relevant
binding a → v can occur as any subtree of the separation logic assertion. Now
the do-notation can be reduced away (line 6).
Once this is done, all that remains is to actually perform the assignment. The
hypothesis V proves that y is in the domain of the variable file of σ, so setting the
value of y to v will succeed, producing a new state σ′. The tactic simplifies the
goal to step through this update, and uses V to produce a new V′ that describes
σ′. H can still be used, as the memory of σ′ is the same as the memory of σ. This
results in the proof state shown in Fig. 6. The tactic can now begin to analyze
the statement s in a similar manner.
This may seem like a lengthy series of steps, but it is largely invisible to
the user. Breaking down statements in this manner allows the tactics to easily
handle a wide variety of expressions. vcSteps will get “stuck” at various points
that require user intervention, such as loops and branches where invariants must
be supplied, and where the tactic cannot easily show that a memory or variable
operation is safe. In the latter case, the tactics described in the previous section
can be applied to manipulate the assertion to a form the program logic tactics
can understand, then vcSteps can be invoked again to pick up where it left off.
In addition to the tactic for reasoning about VCs, there is a tactic veEqvSolver
to automatically solve goals involving veEqv. This is straightforward, as it only
needs to reason about concrete finite sets.
In the first line, the predicate veEqv requires that, in the current state,
at least the program variables x, y, t are valid, and that the variables x and y
are equal to some values v1 and v2, respectively. The second line is a separation
logic predicate specifying that memory contains two disjoint linked lists as seen in
Sect. 3. From these two descriptions, we can deduce that the variable x contains
a pointer to a linked list containing the values l1 . Finally, the invariant requires
that reversing l1 and appending l2 results in the original list l0 . We write rev(l)
for the reversal of list l and l ++ l for appending list l to the end of l.
To save space, we will only go over the verification of the loop body and not
describe in detail the invocation of standard Coq tactics. In the loop body, we
know that the loop invariant inv holds on the current state and that the value
of x is not Vword(0) (i.e., NULL). We must show that, after the loop body has
executed, the loop invariant is reestablished.
Our initial proof state is thus:
NE : v1 ≠ Vint 0
L : rev(l1 ) ++ l2 = rev(l0 )
H : (llist(v1 , l1 ) ∗ llist(v2 , l2 )) (mem(σ))
V : veEqv {x, y, t} {(x, v1 ), (y, v2 )} (venv(σ))
stmPre (t := [x + 4]; [x + 4] := y; y := x; x := t) (inv l0 ) σ
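As a rough executable illustration of the loop body and invariant above, the following sketch runs the four statements on a toy memory and checks rev(l1) ++ l2 = rev(l0) before every iteration. It is only a test-style model, not the Coq development: the dictionary-based memory, the helper names build_list and read_list, and the choice of base address are assumptions made for this example.

```python
def build_list(mem, values, base=8):
    """Lay out `values` as a linked list in `mem` (hypothetical helper).

    Each cell occupies two words: address a holds the value and
    address a + 4 holds the pointer to the next cell (0 means NULL).
    Returns the head pointer.
    """
    head = 0
    addr = base + 8 * (len(values) - 1)
    for v in reversed(values):
        mem[addr] = v
        mem[addr + 4] = head
        head = addr
        addr -= 8
    return head


def read_list(mem, head):
    """Read back the logical list stored at `head` (the role of llist)."""
    out = []
    while head != 0:
        out.append(mem[head])
        head = mem[head + 4]
    return out


def reverse_in_place(mem, x):
    """Run the loop whose body is t := [x+4]; [x+4] := y; y := x; x := t."""
    l0 = read_list(mem, x)
    y = 0
    while x != 0:
        # Loop invariant: rev(l1) ++ l2 = rev(l0)
        l1, l2 = read_list(mem, x), read_list(mem, y)
        assert list(reversed(l1)) + l2 == list(reversed(l0))
        t = mem[x + 4]
        mem[x + 4] = y
        y = x
        x = t
    return y
```

On exit x is NULL, so l1 is empty and the invariant says that the list at y is the reversal of l0.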
Our database of rewriting rules for separation logic data structures includes
the following rule: if v is not 0, then a linked list llist(v, l) must have at least one
element. Thus, applying our rewriting tactic to the hypothesis H triggers this
rule for v = v1 . After applying a standard substitution tactic, we have this proof
state (where everything that is unchanged is left as ...):
Practical Tactics for Separation Logic 355
...
L : rev(v :: l1′) ++ l2 = rev(l0)
H : ((v1 ↦ v) ∗ (v1+4 ↦ v1′) ∗ llist(v1′, l1′) ∗ llist(v2, l2)) (mem(σ))
...
Now that we know that the address v1 + 4 contains the value v1′, we can
show that it is safe to execute the loop body. The tactic vcSteps, described in
Sect. 4.3, is able to automatically step through the entire loop body, leaving an
updated state description and the goal of showing that the loop invariant holds
on the final state of the loop σ′:
...
H : ((v1 ↦ v) ∗ (v1+4 ↦ v2) ∗ llist(v1′, l1′) ∗ llist(v2, l2)) (mem(σ′))
V : veEqv {x, y, t} {(x, v1′), (y, v1), (t, v1′)} (venv(σ′))
inv l0 σ′
Now we must instantiate two existential variables and show that they are
the values of the variables x and y (and that x, y and t are valid variables).
These existentials can be automatically instantiated, and the part of the goal
using veEqv solved, using a few standard Coq tactics along with the veEqvSolver
described in Sect. 4.3.
The remaining goal is
...
∃l3, l4. (llist(v1′, l3) ∗ llist(v1, l4)) (mem(σ′)) ∧ rev(l3) ++ l4 = rev(l0)
We manually instantiate the existentials with l1′ and (v :: l2) and split the
resulting conjunction using standard Coq tactics. This produces two subgoals.
The second subgoal is rev(l1′) ++ (v :: l2) = rev(l0) and can be solved using
standard tactics. The first subgoal is an assertion containing llist(v1, v :: l2),
which can be simplified using standard tactics, leaving the proof state
...
(llist(v1′, l1′) ∗ (∃x′. (v1 ↦ v) ∗ (v1+4 ↦ x′) ∗ llist(x′, l2))) (mem(σ′))
Invoking our simplification tactic ssimpl replaces the existential with a Coq
meta-level existential variable “?100”, leaving the goal
...
H : ((v1 ↦ v) ∗ (v1+4 ↦ v2) ∗ llist(v1′, l1′) ∗ llist(v2, l2)) (mem(σ′))
(llist(v1′, l1′) ∗ (v1 ↦ v) ∗ (v1+4 ↦ ?100) ∗ llist(?100, l2)) (mem(σ′))
tactics. Tuch et al. [17] define a mechanized program logic for reasoning about
C-like memory models. They are able to verify programs using separation logic,
but do not have any complex tactics for separation logic connectives.
Other work, such as Smallfoot [6], has focused on automated verification of
lightweight separation logic specifications. This approach has been used as the
basis for certified separation logic decision procedures in Coq [18] and HOL [19].
Calcagno et al. [20] use separation logic for an efficient compositional shape
analysis that is able to infer some specifications.
Still other work has focused on mechanized reasoning about imperative pointer
programs outside of the context of separation logic [11,21,22] using either deep
or shallow embeddings. Expressing assertions via more conventional propositions
enables the use of powerful preexisting theorem provers. Another approach to
program verification decompiles imperative programs into functional programs
that are more amenable to analysis in a proof assistant [23,15].
The tactics we have described in this paper provide a solid foundation for
the use of separation logic in a proof assistant, but there is room for further
automation. Integrating a Smallfoot-like decision procedure into our tactics would
automate reasoning about standard data structures.
We have presented a set of separation logic tactics that allows the verification
of programs using separation logic in a proof assistant. These tactics allow Coq
to be used as a proof assistant for separation logic by allowing the assertions
to be easily manipulated via simplification, rearranging, splitting, matching and
rewriting. They also provide tactics for proving a verification condition by means
of a separation logic based description of the program state. These tactics are
powerful enough to verify a garbage collector.
References
1. Reynolds, J.C.: Separation logic: A logic for shared mutable data structures. In:
LICS 2002, Washington, DC, USA, pp. 55–74. IEEE Computer Society, Los Alami-
tos (2002)
2. Girard, J.Y.: Linear logic. Theoretical Computer Science 50, 1–102 (1987)
3. Birkedal, L., Torp-Smith, N., Reynolds, J.C.: Local reasoning about a copying
garbage collector. In: POPL 2005, pp. 220–231. ACM Press, New York (2004)
4. McCreight, A., Shao, Z., Lin, C., Li, L.: A general framework for certifying gcs and
their mutators. In: PLDI 2007, pp. 468–479. ACM, New York (2007)
5. The Coq Development Team: The Coq proof assistant, http://coq.inria.fr
6. Berdine, J., Calcagno, C., O’Hearn, P.W.: Smallfoot: Modular automatic asser-
tion checking with separation logic. In: de Boer, F.S., Bonsangue, M.M., Graf,
S., de Roever, W.-P. (eds.) FMCO 2005. LNCS, vol. 4111, pp. 115–137. Springer,
Heidelberg (2006)
7. Leroy, X.: Formal certification of a compiler back-end, or: programming a compiler
with a proof assistant. In: POPL 2006, pp. 42–54. ACM Press, New York (2006)
358 A. McCreight
1 Introduction
Explicit pointer manipulation is an endless source of errors in low-level programs.
Functional programming languages hide pointers and thereby achieve a more
abstract programming environment. The downside with functional programming
(and Java/C# programming) is that the programmer has to trust automatic
memory management routines built into run-time environments.
In this paper we report on a case study, which we believe is the first to
produce a formally verified end-to-end implementation of a functional program-
ming language. We have implemented, in ARM, x86 and PowerPC machine code,
a program which parses, evaluates and prints LISP; and furthermore formally
proved that our implementation respects a semantics of the core of LISP 1.5 [6].
Instead of assuming correctness of run-time routines, we build on a verified im-
plementation of allocation and garbage collection.
For a flavour of what we have implemented and proved consider an example:
if our implementation is supplied with the following call to pascal-triangle,
(pascal-triangle ’((1)) ’6)
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 359–374, 2009.
© Springer-Verlag Berlin Heidelberg 2009
360 M.O. Myreen and M.J.C. Gordon
(label pascal-triangle
(lambda (rest n)
(cond ((equal n ’0) rest)
(’t (pascal-triangle
(cons (pascal-next ’0 (car rest)) rest) (- n ’1))))))
The theorem we have proved about our LISP implementation can be used to
show e.g. that running pascal-triangle will terminate and print the first n + 1
rows of Pascal’s triangle, without a premature exit due to lack of heap space. One
can use our theorem to derive sufficient conditions on the inputs to guarantee
that there will be enough heap space.
We envision that our verified LISP interpreter will provide a platform on top
of which formally verified software can be produced with much greater ease than
at lower levels of abstraction, i.e. in languages where pointers are made explicit.
Why LISP? We chose to implement and verify a LISP interpreter since LISP
has a neat definition of both syntax and semantics [12] and is still a very powerful
language as one can see, for example, in the success of ACL2 [8]. By choosing
LISP we avoided verifying machine code which performs static type checking.
Our proofs [14] are mechanised in the HOL4 theorem prover [19].
2 Methodology
Instead of delving into the many detailed invariants developed for our proofs,
this paper will concentrate on describing the methodology we used:
• First, machine code for various LISP primitives, such as car, cdr, cons, was
  written and verified (Section 3);
    ◦ The correctness of each code snippet is expressed as a machine-code
      Hoare triple [15]: { pre ∗ pc p } p : code { post ∗ pc (p + exit) }.
    ◦ For cons and equal we used previously developed proof automation [15],
      which allows for proof reuse between different machine languages.
• Second, the verified LISP primitives were input into a proof-producing
  compiler in such a way that the compiler can view the processor as a machine
  with six registers containing LISP s-expressions (Section 4);
    ◦ The compiler [16] we use maps tail-recursive functions, defined in the
      logic of HOL4, down to machine code and proves that the generated
      code executes the original HOL4 functions.
    ◦ Theorems describing the LISP primitives were input into the compiler,
      which can use them as building blocks when deriving new code/proofs.
Verified LISP Implementations on ARM, x86 and PowerPC 361
Sections 8 and 9 give quantitative data on the effort and discuss related work,
respectively. Some definitions and proofs are presented in the Appendixes.
3 LISP Primitives
LISP programs are expressed in and operate over s-expressions, expressions that
are either a (natural) number, a symbol or a pair of s-expressions. In HOL,
s-expressions are readily modelled using a data-type with constructors:
Num : ℕ → SExp
Sym : string → SExp
Dot : SExp → SExp → SExp
(car x) means Dot (Sym "car") (Dot (Sym "x") (Sym "nil"))
(1 2 3) means Dot (Num 1) (Dot (Num 2) (Dot (Num 3) (Sym "nil")))
’f means Dot (Sym "quote") (Dot (Sym "f") (Sym "nil"))
(4 . 5) means Dot (Num 4) (Num 5)
car (Dot x y) = x
cdr (Dot x y) = y
cons x y = Dot x y
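The data type and the three primitives above can be mirrored in a few lines of ordinary code. The sketch below is only an illustration of the definitions, not the HOL4 formalisation: the class layout, the helper from_list, and the decision to raise an error when car or cdr is applied to a non-pair are assumptions made for this example (the theorems later in the paper instead carry the precondition ∃x y. Dot x y = v).

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    n: int  # a natural number


@dataclass(frozen=True)
class Sym:
    s: str  # a symbol name


@dataclass(frozen=True)
class Dot:
    x: "SExp"
    y: "SExp"


SExp = Union[Num, Sym, Dot]
NIL = Sym("nil")


def car(v):
    if not isinstance(v, Dot):
        raise ValueError("car expects a Dot-pair")  # assumption of this sketch
    return v.x


def cdr(v):
    if not isinstance(v, Dot):
        raise ValueError("cdr expects a Dot-pair")  # assumption of this sketch
    return v.y


def cons(x, y):
    return Dot(x, y)


def from_list(items):
    """Build the nil-terminated chain of Dot-pairs written (a b c) in LISP."""
    result = NIL
    for item in reversed(items):
        result = Dot(item, result)
    return result
```

With these definitions, from_list([Num(1), Num(2), Num(3)]) produces the same nested term as the abbreviation (1 2 3) listed above.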
Before writing and verifying the machine code implementing primitive LISP
operations, a decision had to be made how to represent Num, Sym and Dot on a
real machine. To keep memory usage to a minimum each Dot-pair is represented
as a block of two pointers stored consecutively on the heap, each Num n is
represented as a 32-bit word containing 4 × n + 2 (i.e. only natural numbers
0 ≤ n < 2³⁰ are representable), and each Sym s is represented as a 32-bit word
containing 4 × i + 3, where i is the row number of symbol s in a symbol table which,
in our implementation, is a linked-list kept outside of the garbage-collected heap.
Here ‘+2’ and ‘+3’ are used as tags to make sure that the garbage collector
can distinguish Num and Sym values from proper pointers. Pointers to Dot-pairs
are word-aligned, i.e. a mod 4 = 0, a condition the collector tests by computing
a & 3 = 0, where & is bitwise-and.
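The tagging scheme can be illustrated with a little arithmetic on plain integers. This is a sketch of the encoding as described in the text, not the verified machine code: the function names are hypothetical, and the treatment of the remaining tag value 1 (rejected here) is an assumption, since the text does not say how it is used.

```python
def encode_num(n):
    """Represent Num n as the word 4*n + 2."""
    if not 0 <= n < 2**30:
        raise ValueError("only 0 <= n < 2**30 fits in a 32-bit word")
    return 4 * n + 2


def encode_sym(i):
    """Represent a symbol by its row number i in the symbol table: 4*i + 3."""
    return 4 * i + 3


def classify(word):
    """Distinguish pointers, numbers and symbols by the two low bits."""
    tag = word & 3
    if tag == 0:
        return "pointer"  # word-aligned address of a Dot-pair
    if tag == 2:
        return "num"
    if tag == 3:
        return "sym"
    raise ValueError("tag 1 is not described in the text")  # assumption


def decode_num(word):
    return (word - 2) // 4


def add_tagged(a, b):
    """Add two tagged numbers: (4m+2) + (4n+2) - 2 = 4(m+n) + 2."""
    return a + b - 2
```

The last function shows why addition on this representation can be very cheap: one addition followed by subtracting the duplicated tag.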
This simple and small representation of SExp allows most LISP primitives
from the previous section to be implemented in one or two machine instruc-
tions. For example, taking car of register 3 and storing the result in register 4 is
implemented on ARM as a load instruction:
Similarly, ARM code for performing LISP operation plus of register 3 and 4, and
storing the result into register 3 is implemented by:
the level of abstraction to a level where specific machine instructions make the
processor seem as if it has six registers containing s-expressions, of type SExp.
(∃x y. Dot x y = v1 ) ⇒
{ lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc p }
p : E5934000
{ lisp (v1 , car v1 , v3 , v4 , v5 , v6 , l) ∗ pc (p + 4) }
lisp′ (v1, v2, v3, v4, v5, v6, l) =
∃x1 x2 x3 x4 x5 x6 m1 m2 m3 a. m m1 ∗ m m2 ∗ m m3 ∗
eax x1 ∗ ecx x2 ∗ edx x3 ∗ ebx x4 ∗ esi x5 ∗ edi x6 ∗ ebp a ∗
lisp_inv (v1, v2, v3, v4, v5, v6, l) (x1, x2, x3, x4, x5, x6, a, m1, m2, m3)
lisp″ (v1, v2, v3, v4, v5, v6, l) =
∃x1 x2 x3 x4 x5 x6 m1 m2 m3 a temp. m m1 ∗ m m2 ∗ m m3 ∗
r2 temp ∗ r3 x1 ∗ r4 x2 ∗ r5 x3 ∗ r6 x4 ∗ r7 x5 ∗ r8 x6 ∗ r10 a ∗
lisp_inv (v1, v2, v3, v4, v5, v6, l) (x1, x2, x3, x4, x5, x6, a, m1, m2, m3)
The following examples will use only lisp defined for ARM.
The specification of cons guarantees that its implementation will always succeed
as long as the number of reachable Dot-pairs is less than the capacity of the
heap, i.e. less than l. This precondition under-approximates pointer aliasing.
size v1 + size v2 + size v3 + size v4 + size v5 + size v6 < l ⇒
{ lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc p }
p : E50A3018 E50A4014 E50A5010 E50A600C ... E51A8004 E51A7008
{ lisp (cons v1 v2 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc (p + 332) }
The implementation of cons includes a copying collector which implements
Cheney’s algorithm [2]. This copying collector requires the heap to be split into
two heap halves of equal size; only one of which is used for heap data at any
one point in time. When a collection request is issued, all live elements from the
currently used heap half are copied over to the currently unused heap half. The
proof of cons is outlined in the first author’s PhD thesis [14].
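For readers unfamiliar with Cheney's algorithm, the following sketch shows its overall shape on a toy heap. It is not the verified collector: the heap is a Python list of words (the word at byte address a lives at index a // 4), tagged words are recognised as in the encoding described earlier, and forwarding addresses are kept in a separate dictionary purely for readability. The names collect, is_pointer, roots and to_base are assumptions of this example.

```python
def is_pointer(word):
    """Pointers to Dot-pairs are word-aligned; Num and Sym words are tagged."""
    return word & 3 == 0


def collect(heap, roots, to_base):
    """Copy all pairs reachable from `roots` into the half starting at to_base.

    Returns the relocated roots and the first free byte address in the
    new half. Unreachable pairs are simply never copied.
    """
    free = to_base
    forward = {}  # old pair address -> new pair address (simplification)

    def copy(word):
        nonlocal free
        if not is_pointer(word):
            return word  # numbers and symbols are left as they are
        if word not in forward:
            forward[word] = free
            heap[free // 4] = heap[word // 4]          # first field
            heap[free // 4 + 1] = heap[word // 4 + 1]  # second field
            free += 8
        return forward[word]

    new_roots = [copy(r) for r in roots]
    scan = to_base
    # Cheney's scan pointer chases the free pointer; the region between
    # them acts as the queue of a breadth-first traversal, so no stack
    # is needed.
    while scan < free:
        heap[scan // 4] = copy(heap[scan // 4])
        heap[scan // 4 + 1] = copy(heap[scan // 4 + 1])
        scan += 8
    return new_roots, free
```

After a collection the two halves swap roles, and allocation continues from the returned free address.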
The fact that one half of the heap is left empty might seem to be a waste
of space. However, the other heap half need not be left completely unused, as
the implementation of equal can make use of it. The LISP primitive equal tests
whether two s-expressions are structurally identical by traversing the expression
tree as a normal recursive procedure. This recursive traversal requires a stack,
but the stack can in this case be built inside the unused heap half as the garbage
collector will not be called during the execution of equal. Thus, the implementa-
tion of equal uses no external stack and requires no conditions on the size of the
expressions v1 and v2 , as their depths cannot exceed the length of a heap half.
{ lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc p }
p : E1530004 03A0300F 0A000025 E50A4014 ... E51A7008 E51A8004
{ lisp (equal v1 v2 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc (p + 164) }
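The idea of replacing recursion by an explicit stack can be sketched as follows. This is an illustration only: s-expressions are modelled as nested Python tuples (a 2-tuple for a Dot-pair, an int for Num, a str for Sym), the stack is an ordinary Python list rather than the unused heap half, and the result is a Python boolean rather than a LISP value. All of these are assumptions of the example.

```python
def equal(v1, v2):
    """Structural equality using an explicit stack of pending comparisons."""
    stack = [(v1, v2)]
    while stack:
        a, b = stack.pop()
        a_pair, b_pair = isinstance(a, tuple), isinstance(b, tuple)
        if a_pair and b_pair:
            # Compare the first components next and the second ones later.
            stack.append((a[1], b[1]))
            stack.append((a[0], b[0]))
        elif a_pair or b_pair or a != b:
            return False
    return True
```

Because the pending work lives in a data structure rather than on the call stack, very deep expressions do not exhaust the call stack.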
The compiler can use such theorems to create branches on the expression as-
signed to status bits. The above theorem adds support for the if-statement:
Once the compiler was given sufficient Hoare-triple theorems it could be used
to compile functions operating over s-expressions into machine code. An example
will illustrate the process. From the following function
sumlist(v1 , v2 , v3 ) = if v1 = Sym "nil" then (v1 , v2 , v3 ) else
let v3 = car v1 in
let v1 = cdr v1 in
let v2 = plus v2 v3 in
sumlist(v1 , v2 , v3 )
the compiler produces the theorem below, containing the generated ARM machine
code and a precondition sumlist_pre(v1, v2, v3).
sumlist_pre(v1, v2, v3) ⇒
{ lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc p ∗ s }
p : E3330003 0A000004 E5935000 E5934004 E0844005 E2444002 EAFFFFF8
{ let (v1 , v2 , v3 ) = sumlist(v1 , v2 , v3 ) in
lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc (p + 28) ∗ s }
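To make the roles of sumlist and its precondition concrete, here is a small executable model. It is a sketch under stated assumptions, not output of the compiler: s-expressions are modelled as nested tuples (a 2-tuple for Dot, an int for Num, a str for Sym), the tail call is written as a loop, and the precondition follows the unfolded definition of sumlist_pre given in the appendix material of the paper.

```python
NIL = "nil"


def sumlist(v1, v2, v3):
    """Tail-recursive sumlist written as a loop over the three registers."""
    while v1 != NIL:
        v3 = v1[0]    # let v3 = car v1
        v1 = v1[1]    # let v1 = cdr v1
        v2 = v2 + v3  # let v2 = plus v2 v3
    return (v1, v2, v3)


def sumlist_pre(v1, v2, v3):
    """Side conditions collected along the same execution path."""
    while v1 != NIL:
        if not isinstance(v1, tuple):  # car and cdr need a Dot-pair
            return False
        v3 = v1[0]
        v1 = v1[1]
        both_nums = isinstance(v2, int) and isinstance(v3, int)
        if not (both_nums and v2 + v3 < 2**30):  # plus must not overflow
            return False
        v2 = v2 + v3
    return True
```

For the list (1 2 3), written (1, (2, (3, "nil"))) in this model, the precondition holds and sumlist returns the triple ("nil", 6, 3).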
  (p, ρ) →eval nil ∧ ([gl], ρ) →eval s
  --------------------------------------
  ([p → e; gl], ρ) →eval s

  (p, ρ) →eval x ∧ x ≠ nil ∧ (e, ρ) →eval s
  -------------------------------------------
  ([p → e; gl], ρ) →eval s
We have proved that whenever the relation for LISP 1.5 evaluation →eval
relates expression s under environment ρ to expression r, then lisp_eval will do
the same. Here t and u are translation functions, from one form of s-expressions
to another. Let nil = Sym "nil" and fst (x, y, . . .) = x.
∀s ρ r. (s, ρ) →eval r ⇒ fst (lisp_eval (t s, nil, nil, nil, u ρ, nil, l)) = t r
vacuously true? To remedy this shortcoming, we have verified machine code that
will set up an appropriate state from scratch.
The set-up and tear-down code includes a parser and printer that will, re-
spectively, read in an input s-expression and print out the resulting s-expression.
The development of the parser and printer started by first defining a function
sexp2string which lays down how s-expressions are to be represented in string
form (Appendix D). Then a function string2sexp was defined for which we proved:
Here sexp_ok s makes sure that s does not contain symbols that print ambigu-
ously, e.g. Sym "", Sym "(" and Sym "2". The parsing function was defined as a
composition of a lexer sexp_lex and a token parser sexp_parse (Appendix D).
string2sexp str = car (sexp_parse (reverse (sexp_lex str)) (Sym "nil") [])
Machine code was written and verified based on the high-level functions sexp_lex,
sexp_parse and sexp2string. Writing these high-level definitions first was a great
help when constructing the machine code (using the compiler from [16]).
The overall theorems about our LISP implementations are of the following
form. If →eval relates s with r under the empty environment (i.e. (s, []) →eval r),
no illegal symbols are used (i.e. sexp_ok (t s)), running lisp_eval on t s will not run
out of memory (i.e. lisp_eval_pre(t s, nil, nil, nil, nil, nil, l)), the string representation
of t s is in memory (i.e. string a (sexp2string (t s))), and there is enough space to
parse t s and set up a heap of size l (i.e. enough_space (t s) l), then the code will
execute successfully and terminate with the string representation of t r stored
in memory (i.e. string a (sexp2string (t r))). The ARM code expects the address
of the input string to be in register 3, i.e. r3 a.
∀s r l p.
(s, []) →eval r ∧ sexp_ok (t s) ∧ lisp_eval_pre(t s, nil, nil, nil, nil, nil, l) ⇒
{ ∃a. r3 a ∗ string a (sexp2string (t s)) ∗ enough_space (t s) l ∗ pc p }
p : ... code not shown ...
{ ∃a. r3 a ∗ string a (sexp2string (t r)) ∗ enough_space′ (t s) l ∗ pc (p+10404) }
The input needs to be in register 3 for PowerPC and the eax register for x86.
8 Quantitative Data
The idea for this project first arose approximately two years ago. Since then
a decompiler [15] and compiler [16] have been developed to aid this project,
which produced in total some 4,580 lines of proof automation and 16,130 lines
of interactive proofs and definitions, excluding the definitions of the instruction
set models [5,9,18]. Running through all of the proofs takes approximately 2.5
hours in HOL4 using PolyML.
References
1. Boyer, R.S., Yu, Y.: Automated proofs of object code for a widely used micropro-
cessor. J. ACM 43(1), 166–192 (1996)
2. Cheney, C.J.: A non-recursive list compacting algorithm. Commun. ACM 13(11),
677–678 (1970)
3. Chlipala, A.J.: A certified type-preserving compiler from lambda calculus to as-
sembly language. In: Programming Language Design and Implementation (PLDI),
pp. 54–65. ACM, New York (2007)
4. Dargaye, Z., Leroy, X.: Mechanized verification of CPS transformations. In: Der-
showitz, N., Voronkov, A. (eds.) LPAR 2007. LNCS, vol. 4790, pp. 211–225.
Springer, Heidelberg (2007)
5. Fox, A.: Formal specification and verification of ARM6. In: Basin, D., Wolff, B.
(eds.) TPHOLs 2003. LNCS, vol. 2758, pp. 25–40. Springer, Heidelberg (2003)
6. Gordon, M.: Defining a LISP interpreter in a logic of total functions. In: The ACL2
Theorem Prover and Its Applications, ACL2 (2007)
7. Guttman, J., Ramsdell, J., Wand, M.: VLISP: A verified implementation of scheme.
Lisp and Symbolic Computation 8(1/2), 5–32 (1995)
8. Kaufmann, M., Moore, J.S.: An ACL2 tutorial. In: Mohamed, O.A., Muñoz, C.,
Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp. 17–21. Springer, Heidelberg
(2008)
9. Leroy, X.: Formal certification of a compiler back-end, or: programming a compiler
with a proof assistant. In: Principles of Programming Languages (POPL), pp. 42–
54. ACM Press, New York (2006)
10. Li, G., Owens, S., Slind, K.: A proof-producing software compiler for a subset of
higher order logic. In: European Symposium on Programming (ESOP). LNCS, pp.
205–219. Springer, Heidelberg (2007)
11. Manolios, P., Strother Moore, J.: Partial functions in ACL2. J. Autom. Reason-
ing 31(2), 107–127 (2003)
12. McCarthy, J., Abrahams, P.W., Edwards, D.J., Hart, T.P., Levin, M.I.: LISP 1.5
Programmer’s Manual. The MIT Press, Cambridge (1966)
13. McCreight, A., Shao, Z., Lin, C., Li, L.: A general framework for certifying garbage
collectors and their mutators. In: Ferrante, J., McKinley, K.S. (eds.) Proceedings
of the Conference on Programming Language Design and Implementation (PLDI),
pp. 468–479. ACM, New York (2007)
14. Myreen, M.O.: Formal verification of machine-code programs. PhD thesis, Univer-
sity of Cambridge (2009)
15. Myreen, M.O., Slind, K., Gordon, M.J.C.: Machine-code verification for multiple
architectures – An application of decompilation into logic. In: Formal Methods in
Computer Aided Design (FMCAD). IEEE, Los Alamitos (2008)
16. Myreen, M.O., Slind, K., Gordon, M.J.C.: Extensible proof-producing compilation.
In: Compiler Construction (CC). LNCS. Springer, Heidelberg (2009)
17. Pike, L., Shields, M., Matthews, J.: A verifying core for a cryptographic language
compiler. In: Manolios, P., Wilding, M. (eds.) Proceedings of the Sixth Interna-
tional Workshop on the ACL2 Theorem Prover and its Applications. HappyJack
Books (2006)
18. Sarkar, S., Sewell, P., Nardelli, F.Z., Owens, S., Ridge, T., Braibant, T.,
Myreen, M.O., Alglave, J.: The semantics of x86-CC multiprocessor machine code. In: Principles
of Programming Languages (POPL). ACM, New York (2009)
19. Slind, K., Norrish, M.: A brief overview of HOL4. In: Mohamed, O.A., Muñoz, C.,
Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp. 28–32. Springer, Heidelberg
(2008)
builtin =
["nil"; "t"; "quote"; "+"; "-"; "*"; "div"; "mod"; "<"; "car"; "cdr";
"cons"; "equal"; "cond"; "atomp"; "consp"; "numberp"; "symbolp"; "lambda"]
The verification proofs of the primitive LISP operations build on lemmas about
lisp_inv. The following lemma is used in the proof of the theorem about car
described in Section 3.1. This lemma can be read as saying that, if lisp_inv relates
x1 to Dot-pair v1 , then x1 is a word-aligned address into memory segment m,
(∃x y. Dot x y = v1 ) ∧
lisp_inv (v1, v2, v3, v4, v5, v6, l) (x1, x2, x3, x4, x5, x6, a, m, m2, m3) ⇒
(x1 & 3 = 0) ∧ x1 ∈ domain m ∧
lisp_inv (v1, car v1, v3, v4, v5, v6, l) (x1, m(x1), x3, x4, x5, x6, a, m, m2, m3)
One of our tools derives the following Hoare triple theorem for the ARM instruc-
tion that is to be verified: ldr r4,[r3] (encoded as E5934000).
All of the primitive LISP operations were verified in the same manner. For the
HOL4 implementation, a 50-line ML program was written to automate these
proofs given the appropriate lemmas about lisp_inv.
Internally the compiler runs through a short proof when constructing the the-
orem presented in Section 4. This proof makes use of the following five proof
rules derived from the definition of our machine-code Hoare triple, developed in
previous work [15]. Formal definitions and detailed explanations are given in the
first author’s PhD thesis [14]. Here ∪ is simply set union.
The last rule mentions tailrec and pre, which are functions that satisfy:
∀x. tailrec(G, F, D)(x) = if G(x) then tailrec(G, F, D)(F (x)) else D(x)
∀x. pre(G, F, P )(x) = if G(x) then pre(G, F, P )(F (x)) ∧ P (x) else P (x)
Note that any tail-recursive function can be defined as an instance of tailrec, in-
troduced using a trick by Manolios and Moore [11]. Another noteworthy feature:
if pre(G, F, P )(x) is true then tailrec(G, F, D) terminates for input x.
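The two equations can be read as ordinary loops. The sketch below is an executable reading of them, not the HOL4 definitions: it computes the same values whenever the iteration terminates, but unlike the logical pre it cannot report false for an input on which the loop runs forever. Returning False as soon as P fails is justified because the conjunction pre(G, F, P)(F(x)) ∧ P(x) is false in that case. The example instance (summing the numbers from n down to 1) is hypothetical.

```python
def tailrec(G, F, D):
    """tailrec(G, F, D)(x) = if G(x) then tailrec(G, F, D)(F(x)) else D(x)."""
    def run(x):
        while G(x):
            x = F(x)
        return D(x)
    return run


def pre(G, F, P):
    """pre(G, F, P)(x) = if G(x) then pre(G, F, P)(F(x)) and P(x) else P(x)."""
    def run(x):
        while G(x):
            if not P(x):
                return False  # the conjunction with P(x) is already false
            x = F(x)
        return P(x)
    return run


# Hypothetical instance: the state is a pair (n, acc).
def guard(state):
    return state[0] != 0


def step(state):
    return (state[0] - 1, state[1] + state[0])


def result(state):
    return state[1]


def side_condition(state):
    return state[0] >= 0


sum_to = tailrec(guard, step, result)
sum_to_pre = pre(guard, step, side_condition)
```

Here sum_to_pre((-1, 0)) is False, which matches the fact that sum_to would not terminate on that input.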
The compiler starts its proof from the following theorems describing the test
v1 = Sym "nil" as well as operations car, cdr and plus.
1. { lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc p ∗ s }
p : E3330003
{ lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc (p + 4) ∗ sz (v1 = Sym "nil") ∗
∃n c v. sn n ∗ sc c ∗ sv v }
2. (∃x y. Dot x y = v1 ) ⇒
{ lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc p ∗ s }
p : E5935000
{ lisp (v1 , v2 , car v1 , v4 , v5 , v6 , l) ∗ pc (p + 4) }
3. (∃x y. Dot x y = v1 ) ⇒
{ lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc p ∗ s }
p : E5933004
{ lisp (cdr v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc (p + 4) }
4. (∃m n. Num m = v2 ∧ Num n = v3 ∧ m+n < 2³⁰) ⇒
{ lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc p ∗ s }
p : E0844005 E2444002
{ lisp (v1 , plus v2 v3 , v3 , v4 , v5 , v6 , l) ∗ pc (p + 4) }
The compiler next generates two branches to glue the code together; the branch
instructions have the following specifications:
The specifications above are collapsed into theorems describing one pass through
the code by composing 1,5 and 1,6,2,3,4,7, which results in:
Code extension is applied to theorem 8, and then the rule for introducing a
tail-recursive function is applied. The compiler produces the following total-
correctness specification.
sumlist_pre(v1, v2, v3) =
if v1 = Sym "nil" then true else
let cond = (∃x y. Dot x y = v1) in
let v3 = car v1 in
let cond = cond ∧ (∃x y. Dot x y = v1) in
let v1 = cdr v1 in
let cond = cond ∧ (∃m n. Num m = v2 ∧ Num n = v3 ∧ m+n < 2³⁰) in
let v2 = plus v2 v3 in
sumlist_pre(v1, v2, v3) ∧ cond
When the loop rule is applied above, its parameters are assigned values:
p = λ(v1 , v2 , v3 ). lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc p ∗ s
q = λ(v1 , v2 , v3 ). lisp (v1 , v2 , v3 , v4 , v5 , v6 , l) ∗ pc (p + 28) ∗ s
G = λ(v1, v2, v3). v1 ≠ Sym "nil"
F = λ(v1, v2, v3). (cdr v1, plus v2 (car v1), car v1)
D = λ(v1, v2, v3). (v1, v2, v3)
P = λ(v1, v2, v3). (v1 ≠ Sym "nil") ⇒
(∃x y. Dot x y = v1) ∧
(∃m n. Num m = v2 ∧ Num n = car v1 ∧ m+n < 2³⁰)
The lexing function sexp_lex splits a string into a list of strings, e.g.
sexp_lex "(car (’23 . y))" = ["(", "car", "(", "’", "23", ".", "y", ")", ")"]
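A lexer with this behaviour on the example can be sketched in a few lines. The formal definition is not reproduced here, so the token rules below are assumptions: parentheses and the quote character are single-character tokens, whitespace separates tokens, and every other run of characters (including a lone dot) forms one token. The ASCII quote character is used in place of the typographic one.

```python
def sexp_lex(text):
    """Split a string into tokens (a sketch, not the definition from the paper)."""
    tokens = []
    current = ""
    for ch in text:
        if ch in "()'" or ch.isspace():
            if current:
                tokens.append(current)  # finish the pending atom
                current = ""
            if not ch.isspace():
                tokens.append(ch)       # delimiters are tokens themselves
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens
```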
1 Introduction
Now and then we must program a partially recursive function whose domain
of definedness we cannot decide or is undecidable, e.g., an interpreter. Reactive
programs such as operating systems and data base systems are not supposed
to terminate. To reason about such programs properly, we need semantics that
account for both terminating and non-terminating program runs. Compilers, for
example, should preserve both terminating and non-terminating behaviors of
source programs [10,13]. Standard operational semantics ignore (or say too little
about) non-terminating runs, so finer semantic accounts are necessary.
In this paper, we present four coinductive semantics for the While language
that we claim to be both adequate for reasoning about non-terminating runs and
well-designed. They represent four different styles of operational seman-
tics: big-step and small-step relational and big-step and small-step functional
semantics. Our semantics are based on traces, defined coinductively as possibly
infinite non-empty sequences of states. What is more, the evaluation and normal-
ization relations and functions are also coinductive/corecursive. The functional
semantics are constructively possible thanks to the fact that in the trace-based
setting, While becomes a total rather than partial language (every run defines
a trace, even if it may be infinite). All four semantics are constructively equiva-
lent. We have formalized our development in the Coq proof assistant, using the
Ssreflect syntax extension, see http://cs.ioc.ee/~keiko/majas.tar.gz.
It might be objected against this paper that the results are unsurprising, since
the semantics appear simple and enjoy all expected properties. They are simple
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 375–390, 2009.
© Springer-Verlag Berlin Heidelberg 2009
376 K. Nakata and T. Uustalu
indeed, but in the case of the two big-step semantics, this is a consequence of
very careful design decisions. As a matter of fact, getting coinductive big-step
semantics right is tricky, and in this situation it is really fortunate that simple
solutions are available. In the paper, we discuss some of the design considerations
and also show some design options that we rejected deliberately. Previous work
in the literature [6,9,14] also contains some designs that are more complicated
than ours or fail to have some clearly desirable properties or both. A skeptical
reader may also worry that While is a toy language. We argue that While is
sufficient for highlighting all important issues. In fact, our designs scale without
pain to procedures and language constructs for effects such as exceptions, non-
determinism and interactive input-output.
Programming and reasoning with coinductive types in type theory require
taking special care about productivity. Here the type checker of Coq helps us avoid
mistakes by ruling out non-productive definitions. But some limitations are imposed by the
implementation. For instance, 15 years ago a type-based approach for ensuring
productivity of corecursive definitions was developed [8]. This approach is more
flexible than the syntactic guardedness approach [7] of Coq, but it has not been
implemented. Several coding techniques have been proposed to circumvent the
limitations [2,14]. In our development, we rely on syntactic productivity.
The remainder of the paper is organized as follows. We introduce traces in
Section 2. We present the big-step relational semantics in Section 3, the small-
step relational semantics in Section 4, and the big-step and small-step functional
semantics in Sections 5 and 6, proving their equivalence along the way. We discuss
related work in Section 7 and conclude in Section 8.
The language we consider is the While language, defined inductively by the
following productions:
2 Traces
We describe the semantics of statements in terms of traces. A trace is a possibly
infinite non-empty sequence of states, the sequence of all states that the run
of the statement passes through, including the given initial state. We enforce
non-emptiness by having the nil constructor also take a state as an argument.
Formally traces are defined coinductively by the following productions:
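  (x := e, σ) ⇒ σ :: ⟨σ[x ↦ ⟦e⟧σ]⟩          (skip, σ) ⇒ ⟨σ⟩

  (s0, σ) ⇒ τ    (s1, τ) ⇒∗ τ′
  ------------------------------
  (s0; s1, σ) ⇒ τ′

  σ ⊨ e    (st, σ :: ⟨σ⟩) ⇒∗ τ
  --------------------------------
  (if e then st else sf, σ) ⇒ τ

  σ ⊭ e    (sf, σ :: ⟨σ⟩) ⇒∗ τ
  --------------------------------
  (if e then st else sf, σ) ⇒ τ

  σ ⊨ e    (st, σ :: ⟨σ⟩) ⇒∗ τ    (while e do st, τ) ⇒∗ τ′
  -----------------------------------------------------------
  (while e do st, σ) ⇒ τ′

  σ ⊭ e
  -------------------------------
  (while e do st, σ) ⇒ σ :: ⟨σ⟩

  (s, σ) ⇒ τ
  ----------------
  (s, ⟨σ⟩) ⇒∗ τ

  (s, τ) ⇒∗ τ′
  --------------------------
  (s, σ :: τ) ⇒∗ σ :: τ′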
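The way these rules build traces can be imitated with lazy generators, which produce a trace one state at a time and therefore also cope with non-terminating runs. The sketch below is an informal functional reading of the big-step rules, not the Coq development: statements are encoded as tagged tuples, states as dictionaries, and expressions as Python functions on states, all of which are assumptions of this example. Each guard test contributes one repeated state, as in the rules for if and while.

```python
def steps(stmt, sigma):
    """Yield the states that follow `sigma` in the trace of `stmt`.

    The generator's return value is the last state of the trace, which
    is where the next statement of a sequence continues.
    """
    kind = stmt[0]
    if kind == "skip":
        return sigma                     # trace is just the initial state
    if kind == "assign":
        _, x, e = stmt
        new = {**sigma, x: e(sigma)}
        yield new
        return new
    if kind == "seq":
        _, s0, s1 = stmt
        mid = yield from steps(s0, sigma)
        return (yield from steps(s1, mid))
    if kind == "if":
        _, e, s_true, s_false = stmt
        yield sigma                      # testing the guard is a step
        return (yield from steps(s_true if e(sigma) else s_false, sigma))
    if kind == "while":
        _, e, body = stmt
        while True:
            yield sigma                  # testing the guard is a step
            if not e(sigma):
                return sigma
            sigma = yield from steps(body, sigma)
    raise ValueError("unknown statement: %r" % (kind,))


def trace(stmt, sigma):
    """The full trace: the initial state followed by all later states."""
    yield sigma
    yield from steps(stmt, sigma)
```

Because every loop iteration yields at least the guard-test state, even the trace of while true do skip can be inspected to any finite depth, for instance with itertools.islice.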
Discussions on alternative designs. In the rest of this section we reveal some sub-
tleties in designing coinductive big-step semantics, by looking at several seem-
ingly not so different but problematic alternatives that we reject.²
Since progress of loops is not required for wellformedness of the definitions of
⇒ and ⇒∗, one might be tempted to regard guard testing as instantaneous
and modify the rules for the while-loop to take the form

  σ ⊨ e    (st, σ) ⇒ τ    (while e do st, τ) ⇒∗ τ′
  ---------------------------------------------------
  (while e do st, σ) ⇒ τ′

  σ ⊭ e
  --------------------------
  (while e do st, σ) ⇒ ⟨σ⟩
² Our Coq development includes complete definitions of these alternative semantics.
Trace-Based Coinductive Operational Semantics for While 379
This leads to undesirable outcomes. We can derive (while true do skip, σ) ⇒ ⟨σ⟩,
which means that the non-terminating while true do skip is considered semanti-
cally equivalent to the terminal (immediately terminating) skip. Worse, we can
also derive (while true do skip; x := 17, σ) ⇒ σ :: ⟨σ[x ↦ 17]⟩, which is even
more inadequate: a sequence can continue to run after the non-termination of
the first statement. Yet worse, inspecting the rules closer we discover we are
also able to derive (while true do skip, σ) ⇒ τ for any τ ! Mathematically, giving
up insisting on progress in terms of growing the trace has also the consequence
that the relational semantics cannot be turned into a functional one, although
While should intuitively be total and deterministic. In a functional semantics,
evaluation must be a trace-valued function and in a constructive setting such a
function must be productive.
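The role of productivity can be illustrated outside Coq with a lazy evaluator. The following is a minimal Python sketch, not the Coq development: statements are encoded as tagged tuples and states as dictionaries (both encodings are assumptions made for illustration), and a trace is a generator that emits the initial state first. Since every assignment and every guard test emits a state, each request for the next element of the trace returns after finitely many steps, even for while true do skip.

```python
from itertools import islice

def run(stmt, state):
    """Lazily yield the trace of `stmt` started in `state` (initial state first)."""
    kind = stmt[0]
    if kind == "skip":                       # ("skip",): the one-state trace
        yield state
    elif kind == "assign":                   # ("assign", x, e), e maps a state to a value
        _, var, expr = stmt
        yield state
        yield {**state, var: expr(state)}
    elif kind == "seq":                      # ("seq", s0, s1)
        _, first, second = stmt
        last = state
        for last in run(first, state):       # never finishes if s0 diverges
            yield last
        # s1 starts in the last state of s0; drop the duplicated glue state
        yield from islice(run(second, last), 1, None)
    elif kind == "if":                       # ("if", guard, s_t, s_f)
        _, guard, then_branch, else_branch = stmt
        yield state                          # testing the guard is a step
        yield from run(then_branch if guard(state) else else_branch, state)
    elif kind == "while":                    # ("while", guard, body)
        _, guard, body = stmt
        current = state
        yield current
        while guard(current):                # each successful test re-emits the state
            for current in run(body, current):
                yield current
        yield current                        # the failing guard test
    else:
        raise ValueError(f"unknown statement: {stmt!r}")
```

With this encoding, the first few states of `run(("while", lambda s: True, ("skip",)), sigma)` are all copies of `sigma`, and in while true do skip; x := 17 the assignment is never reached, in contrast to the anomalous derivations discussed above.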
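Another option, where assignments and tests of guards are properly taken to
constitute steps, could be to define ⇒* by case distinction on the statement by
rules such as
\[
\frac{\tau \models^{*} e \quad (s_t, \mathit{duplast}\ \tau) \stackrel{*}{\Rightarrow} \tau' \quad (\mathsf{while}\ e\ \mathsf{do}\ s_t, \tau') \stackrel{*}{\Rightarrow} \tau''}
     {(\mathsf{while}\ e\ \mathsf{do}\ s_t, \tau) \stackrel{*}{\Rightarrow} \tau''}
\qquad
\frac{\tau \not\models^{*} e}
     {(\mathsf{while}\ e\ \mathsf{do}\ s_t, \tau) \stackrel{*}{\Rightarrow} \mathit{duplast}\ \tau}
\]
Here, duplast τ, defined corecursively, traverses τ and duplicates its last state,
if it is finite. Similarly, τ ⊨* e and τ ⊭* e traverse τ and evaluate e in the last
state, if it is finite:
\[
\frac{\tau \models^{*} e}{\sigma :: \tau \models^{*} e}
\qquad
\frac{\sigma \models e}{\langle\sigma\rangle \models^{*} e}
\qquad
\frac{\tau \not\models^{*} e}{\sigma :: \tau \not\models^{*} e}
\qquad
\frac{\sigma \not\models e}{\langle\sigma\rangle \not\models^{*} e}
\]
(The rules for skip and sequence are very simple and appealing in this design.)
The relation ⇒ would then be defined uniformly by the rule
\[
\frac{(s, \langle\sigma\rangle) \stackrel{*}{\Rightarrow} \tau}
     {(s, \sigma) \Rightarrow \tau}
\]
It turns out that we can still derive (while true do skip, σ) ⇒ τ for any τ. We can
even derive (while true do x := x + 1, σ) ⇒ τ for any τ!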
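The third alternative (Leroy and Grall use this technique in [14]) is the
closest to ours. It introduces, instead of our ⇒* relation, an auxiliary
relation split, defined coinductively by
\[
\frac{}{\mathit{split}\ \langle\sigma\rangle\ \langle\sigma\rangle\ \sigma\ \langle\sigma\rangle}
\qquad
\frac{}{\mathit{split}\ (\sigma :: \tau)\ \langle\sigma\rangle\ \sigma\ (\sigma :: \tau)}
\qquad
\frac{\mathit{split}\ \tau\ \tau_0\ \sigma'\ \tau_1}
     {\mathit{split}\ (\sigma :: \tau)\ (\sigma :: \tau_0)\ \sigma'\ \tau_1}
\]
so that split τ τ0 σ′ τ1 expresses that the trace τ can be split into a
concatenation of traces τ0 and τ1 glued together at a mid-state σ′. Then the
evaluation relation is defined by replacing the uses of ⇒* with split, e.g., the
rule for the sequence statement would be:
\[
\frac{\mathit{split}\ \tau\ \tau_0\ \sigma'\ \tau_1 \quad (s_0, \sigma) \Rightarrow \tau_0 \quad (s_1, \sigma') \Rightarrow \tau_1}
     {(s_0; s_1, \sigma) \Rightarrow \tau}
\]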
380 K. Nakata and T. Uustalu
This third alternative does not cause any outright anomalies for While. But
alarmingly s1 has to be run from some (underdetermined) state within a run
of s0 ; s1 even if the run of s0 does not terminate. In a richer language with
abnormal terminations, we get a serious problem: no evaluation is derived for
(while true do skip); abort although the abort statement should not be reached.
[Rules of the terminality predicate: skip is terminal; s0 ; s1 is terminal if s0 and s1 are terminal.]
functional semantics. The first approach is stronger in that it does not rely on the
determinism of the semantics, thus prepares a better avenue for generalization to
a language with non-determinism. (Our functional semantics deals with single-
valued functions and thus the second approach relies on the determinism.)
The following lemma connects the big-step semantics with the terminality
predicate and one-step reduction relation and is proved by induction.
Lemma 5. For any s, σ and τ, if (s, σ) ⇒ τ then either s is terminal and τ = ⟨σ⟩,
or else there are s′, σ′, τ′ such that (s, σ) → (s′, σ′) and τ ≈ σ :: τ′ and
(s′, σ′) ⇒ τ′.
Then correctness of the big-step semantics relative to the small-step semantics
follows by coinduction:
Proposition 1. For any s, σ and τ, if (s, σ) ⇒ τ then (s, σ) ⇝ τ.
The opposite direction, that the small-step semantics is correct relative to the
big-step semantics, is more interesting. The proof proceeds by coinduction. At
the crux is the case of the sequence statement: we are given a normalization
(s0 ; s1, σ) ⇝ τ and the coinduction hypotheses for s0 (resp. s1) that enable us to
deduce (s0, σ′) ⇒ τ′ (resp. (s1, σ′) ⇒ τ′) from (s0, σ′) ⇝ τ′ (resp. (s1, σ′) ⇝ τ′)
for any σ′, τ′. Naively, we have what we need to close the case. The assumption
(s0 ; s1, σ) ⇝ τ ensures that τ can be split into two parts τ0 and τ1 such that τ0
corresponds to running s0 and τ1 to running s1. If τ0 is finite, we can traverse
τ0 until we hit its last state, to then invoke the coinduction hypothesis on s1. If
τ0 is infinite, we can deduce τ ≈ τ0 and (s1, τ0) ⇒* τ0 by coinduction.
The actual proof is more involved. First we have to explicitly construct τ0 and
τ1. This is possible by examining the proof of (s0 ; s1, σ) ⇝ τ. Our proof defines
an auxiliary function midp (s0 s1 : stmt) (σ : state) (τ : trace)
(h : (s0 ; s1, σ) ⇝ τ) : trace by corecursion as follows. We look at the last
inference in the proof h of (s0 ; s1, σ) ⇝ τ. If s0 ; s1 is terminal, we return
⟨σ⟩. Otherwise we have a proof h0 of (s0 ; s1, σ) → (s′, σ′) and a proof h′ of
(s′, σ′) ⇝ τ′ for some σ′, τ′ such that τ = σ :: τ′. We look at the last inference
in h0. If s0 is terminal, we also return ⟨σ⟩. Else it must be the case that
(s0, σ) → (s0′, σ′) for some s0′ such that s′ = s0′ ; s1 and we return
σ :: midp s0′ s1 σ′ τ′ h′. The corecursive call is guarded by consing σ. The
following lemma is proved by coinduction.
Lemma 6. For any s0, s1, σ, τ, h : (s0 ; s1, σ) ⇝ τ, (s0, σ) ⇝ midp s0 s1 σ τ h.
Second, we cannot decide whether τ0 is finite, as this would amount to deciding
whether running s0 from σ terminates. Our big-step semantics was carefully
crafted to avoid stumbling upon this problem, by introduction of the coinductive
prefix closure ⇒* of ⇒ to uniformly handle the cases of both the finite and
infinite already accumulated trace. We need a small-step counterpart to it:
\[
\frac{(s, \sigma) \rightsquigarrow \tau}
     {(s, \langle\sigma\rangle) \stackrel{*}{\rightsquigarrow} \tau}
\qquad
\frac{(s, \tau) \stackrel{*}{\rightsquigarrow} \tau'}
     {(s, \sigma :: \tau) \stackrel{*}{\rightsquigarrow} \sigma :: \tau'}
\]
The proposition (s, τ) ⇝* τ′ states that running s from the last state of an
already accumulated trace τ (if it has one) results in the total trace τ′. The
following lemma is proved by coinduction.
Lemma 7. For any s0, s1, σ, τ, h : (s0 ; s1, σ) ⇝ τ, (s1, midp s0 s1 σ τ h) ⇝* τ.
Only now can we finally prove that the small-step relational semantics is correct
relative to the big-step relational semantics.
Proposition 2. For any s, σ, τ, τ′, the following two conditions hold:
– if (s, σ) ⇝ τ then (s, σ) ⇒ τ,
– if (s, τ) ⇝* τ′ then (s, τ) ⇒* τ′.
Proof. Both conditions are proved at once by mutual coinduction. We only show
the first condition in the case of the sequence statement, to demonstrate how
the relation ⇝* helps us avoid having to decide finiteness. Suppose we have
h : (s0 ; s1, σ) ⇝ τ. By Lemmata 6 and 7, we have (s0, σ) ⇝ midp s0 s1 σ τ h and
(s1, midp s0 s1 σ τ h) ⇝* τ. By invoking the coinduction hypothesis on them, we
obtain (s0, σ) ⇒ midp s0 s1 σ τ h and (s1, midp s0 s1 σ τ h) ⇒* τ, from which
we deduce (s0 ; s1, σ) ⇒ τ.
Differences between Coq’s Prop and Set force normalization to be Set-valued
rather than Prop-valued, since our definition of the trace-valued midp function
relies on case distinction on the proof of the given normalization proposition.
Case distinction on a proof of a Prop-proposition is not available for constructing
an element of a Set-set. This in turn requires the evaluation relation to also be
Set-valued, to be comparable to normalization. A further complication is that,
for technical reasons, the proofs of Lemmata 6 and 7 must rely on John Major
equality [15] and the principle that two JM-equal elements of the same type are
equal. Given that Coq’s support for programming with (co)inductive families (in
ML-style, as opposed to proving in the tactic language) is also weak (so midp
was easily manufactured in the tactic language, but we failed to construct it in
ML-style), one might wish to prove the equivalence of the big-step and small-
step semantics in some altogether different way. In the subsequent sections we
study functional semantics. These offer us a less direct route that is less painful
in the aspects we have just described.
Proposition 3 states that the inductive semantics is correct relative to the coin-
ductive semantics. Proposition 4 states that the coinductive semantics is correct
relative to the inductive semantics for terminating runs. Both propositions are
proved by induction.
Proposition 3. For any s and σ, if (s, σ) ⇝^ind σ′ then there is τ such that
(s, σ) ⇝ τ and τ ↓ σ′.
Proposition 4. For any s, τ, σ, σ′, if (s, σ) ⇝ τ and τ ↓ σ′ then (s, σ) ⇝^ind σ′.
The connection between our coinductive big-step semantics and the inductive
big-step semantics can now be concluded from the well-known equivalence be-
tween the inductive big-step and small-step semantics. Y. Bertot has formalized
the proof of this equivalence in Coq [3].
We conclude this section by citing an observation by V. Capretta [4]. The
infiniteness predicate on traces is defined coinductively by the rule
\[
\frac{\tau\uparrow}{(\sigma :: \tau)\uparrow}
\]
We can prove in Coq the proposition ∀τ, (¬∃σ, τ ↓ σ) → τ↑. However the
proposition ∀τ, (∃σ, τ ↓ σ) ∨ τ↑ can only be proved from ∀τ, (∃σ, τ ↓ σ) ∨ ¬(∃σ, τ ↓ σ).
Constructively, this instance of the classical law of excluded middle states de-
cidability of finiteness. For this reason, we reject what could be called sum-type
semantics. For instance, a relational semantics could relate a statement-state pair
to either a state for a terminating run or a special token ∞ for a non-terminating
run, i.e., an element from the sum type state +1, where 1 is the one-element type.
Or, it could be given as the disjunction of an inductive trace-based semantics,
describing terminating runs, and a coinductive trace-based semantics, describing
non-terminating runs, an approach studied in [14].
loop body from a state; p for testing the boolean guard on a state; and a state
σ, which is the initial state. loopseq takes a trace τ , the initial trace, instead of
a state, as the third argument. The two functions work as follows. loop takes
care of repeating the loop body, once the guard of a while loop has been
evaluated. It analyzes the result and, if the guard is false, the run of the
loop terminates. If it is true, the loop body is evaluated by calling k. loop
then constructs the trace of the loop body by examining the result of k. If the
loop body does not augment the trace, which can only happen if the loop body
is a sequence of skips, a new round of repeating the loop body is started by a
corecursive call to loop. The corecursive call is guarded by first augmenting the
trace, which corresponds to the new evaluation of the boolean guard. If the loop
body augments the trace, the new round is reached by reconstruction of the trace
of the current repetition with loopseq. On the exhaustion of this trace, loopseq
corecursively calls loop, again appropriately guarded. Our choice of augmenting
traces at boolean guards facilitates implementing loop in Coq: we exploit it to
satisfy Coq’s syntactic guardedness condition.
sequence, defined by simple corecursion, is similar to loopseq, but does not
involve repetition. It takes two arguments: k for running a statement (the second
Equipped with these lemmata we are in the position to show the big-step and
small-step functional semantics to agree up to bisimilarity.
We can now prove that the small-step relational semantics is correct relative to
the big-step relational semantics without having to rely on dependent pattern-
matching or JM equality by going through the functional semantics:
7 Related Work
X. Leroy and H. Grall [14] study two approaches to big-step relational semantics
for lambda-calculus, accounting for both terminating and non-terminating eval-
uations, fully formalized in Coq. In both approaches, evaluation relates lambda-
terms to normal forms or to reduction sequences.
The first approach, inspired by Cousot and Cousot [5], uses two evaluation
relations, an inductive one for terminating evaluations and a coinductive one
for non-terminating evaluations. The proof of equivalence to the small-step se-
mantics requires the use of an instance of the excluded middle, constructively
amounting to deciding halting. In essence, this means adopting a sum-type so-
lution. This has deep implications even for the big-step semantics alone: the
determinism of the evaluation relation, for example, can only be shown by going
through the small-step semantics.
Leroy used this approach in his work on a certified C compiler [13]. To the
best of our knowledge, this was the first practical application of mechanized
coinductive semantics. C is one of the most used languages for developing pro-
grams that are not supposed to terminate, such as operating systems. Hence
it is important that a certified compiler for C preserves the semantics of non-
terminating programs. The work on the Compcert compiler is a strong witness
of the importance and practicality of mechanized coinductive semantics.
In our approach to While, we have a single evaluation relation for both ter-
minating and non-terminating runs. The big-step semantics is equivalent to the
small-step semantics constructively. Furthermore, the big-step semantics is con-
structively deterministic and the proof of this is without an indirection through
the small-step semantics.
Leroy and Grall [14] also study a different big-step semantics where both
terminating and non-terminating runs are described by a single coinductively
defined evaluation relation (“coevaluation”) relating lambda-terms to normal
forms or reduction sequences. This semantics does not agree with the small-step
semantics, since it assigns a result even to an infinite reduction sequence and
continues reducing a function even after the argument diverges.
Coinductive big-step relational semantics for While similar in some aspects to
Leroy and Grall’s work on lambda-calculus appear in the works of Glesner [9] and
Nestra [16,17]. Regardless of whether evaluation relates statement-state pairs to
possibly infinite traces, possibly non-wellfounded trees of states (“fractions”) or
transfinite traces, these approaches have it in common that the result of a non-
terminating run can be non-deterministic even for While-programs, which should
be deterministic. For one technical reason or another, it becomes possible in all
these semantics that after an infinite number of small steps a run reaches an under-
determined limit state and continues from there. In the case of Nestra [16,17], this
seems intended: he devised his non-standard “fractional” and transfinite seman-
tics to justify a program slicing transformation that is unsound under the standard
semantics. Elsewhere, the outcome appears accidental and undesired.
In our approach, we take the result of a program run to be given precisely by
what can be finitely observed: we record the state of the program at every finite
time instant. We never run ahead of the clock by jumping over some intermediate
states (in particular, we never run ahead of the clock infinitely much) and we
reject transfinite time. As a result of this design decision, the big-step semantics
agrees precisely with the small-step semantics and does so even constructively.
Coinductive functional semantics similar to ours have appeared in the works of
J. Rutten and V. Capretta. A difference is that instead of trace-based semantics
they looked at delayed state based semantics, i.e., semantics that, for a given
statement-state pair, return a possibly infinitely delayed state. Delayed states,
or Burroni conaturals over states, are like conatural numbers (possibly infinite
natural numbers), except that instead of the number zero their deconstruction
terminates (if it does) with a state.
8 Conclusion
We have devised four trace-based coinductive semantics for While in different
styles of operational semantics. We were pleased to find that simple semantics
covering both terminating and non-terminating program runs are possible even
in the big-step relational and functional styles. The metatheory of our coinduc-
tive semantics is remarkably analogous to that of the textbook inductive seman-
tics and on finite runs they agree. Remarkably, everything can be arranged so
that in a constructive setting we never have to decide whether a trace is finite
or infinite.
References
1. Bertot, Y., Castéran, P.: Coq’Art: Interactive Theorem Proving and Program De-
velopment. Springer, Heidelberg (2004)
2. Bertot, Y.: Filters on coinductive streams, an application to Eratosthenes’ sieve. In:
Urzyczyn, P. (ed.) TLCA 2005. LNCS, vol. 3461, pp. 102–115. Springer, Heidelberg
(2005)
3. Bertot, Y.: A survey of programming language semantics styles. Coq development
(2007), http://www-sop.inria.fr/marelle/Yves.Bertot/proofs.html
4. Capretta, V.: General recursion via coinductive types. Logical Methods in Com-
puter Science 1(2), 1–18 (2005)
5. Cousot, P., Cousot, R.: Inductive definitions, semantics and abstract interpreta-
tion. In: Conf. Record of 19th ACM SIGPLAN-SIGACT Symp. on Principles of
Programming Languages, POPL 1992, Albuquerque, NM, pp. 83–94. ACM Press,
New York (1992)
6. Cousot, P., Cousot, R.: Bi-inductive structural semantics. Inform. and Com-
put. 207(2), 258–283 (2009)
7. Giménez, E.: Codifying guarded definitions with recursive schemes. In: Smith, J.,
Dybjer, P., Nordström, B. (eds.) TYPES 1994. LNCS, vol. 996, pp. 39–59. Springer,
Heidelberg (1995)
8. Giménez, E.: Structural recursive definitions in type theory. In: Larsen, K.G.,
Skyum, S., Winskel, G. (eds.) ICALP 1998. LNCS, vol. 1443, pp. 397–408. Springer,
Heidelberg (1998)
9. Glesner, S.: A proof calculus for natural semantics based on greatest fixed point
semantics. In: Knoop, J., Necula, G.C., Zimmermann, W. (eds.) Proc. of 3rd
Int. Wksh. on Compiler Optimization Meets Compiler Verification, COCV 2004,
Barcelona. Electron. Notes in Theor. Comput. Sci., vol. 132(1), pp. 73–93. Elsevier,
Amsterdam (2005)
10. Glesner, S., Leitner, J., Blech, J.O.: Coinductive verification of program optimiza-
tions using similarity relations. In: Knoop, J., Necula, G.C., Zimmermann, W.
(eds.) Proc. of 5th Int. Wksh. on Compiler Optimization Meets Compiler Verifi-
cation, COCV 2006, Vienna. Electron. Notes in Theor. Comput. Sci., vol. 176(3),
pp. 61–77. Elsevier, Amsterdam (2007)
11. Gonthier, G., Mahboubi, A.: A small scale reflection extension for the Coq system.
Technical Report RR-6455, INRIA (2008)
12. Hasuo, I., Jacobs, B., Sokolova, A.: Generic trace semantics via coinduction. Logical
Methods in Comput. Sci. 3(4), article 11 (2007)
13. Leroy, X.: The Compcert verified compiler. Commented Coq development (2008),
http://compcert.inria.fr/doc/
14. Leroy, X., Grall, H.: Coinductive big-step operational semantics. Inform. and Com-
put. 207(2), 285–305 (2009)
15. McBride, C.: Elimination with a motive. In: Callaghan, P., Luo, Z., McKinna, J.,
Pollack, R. (eds.) TYPES 2000. LNCS, vol. 2277, pp. 197–216. Springer, Heidelberg
(2002)
16. Nestra, H.: Fractional semantics. In: Johnson, M., Vene, V. (eds.) AMAST 2006.
LNCS, vol. 4019, pp. 278–292. Springer, Heidelberg (2006)
17. Nestra, H.: Transfinite semantics in the form of greatest fixpoint. J. of Logic and
Algebr. Program (to appear)
18. Rutten, J.: A note on coinduction and weak bisimilarity for While programs. Theor.
Inform. and Appl. 33(4–5), 393–400 (1999)
A Better x86 Memory Model: x86-TSO
University of Cambridge
http://www.cl.cam.ac.uk/users/pes20/weakmemory
1 Introduction
Most previous research on the semantics and verification of concurrent programs
assumes sequential consistency: that accesses by multiple threads to a shared
memory occur in a global-time linear order. Real multiprocessors, however, in-
corporate many performance optimisations. These are typically unobservable by
single-threaded programs, but some have observable consequences for the be-
haviour of concurrent code. For example, on standard Intel or AMD x86 proces-
sors, given two memory locations x and y (initially holding 0), if two processors
proc:0 and proc:1 respectively write 1 to x and y and then read from y and x,
as in the program below, it is possible for both to read 0 in the same execution.
One can view this as a visible consequence of write buffering: each processor
effectively has a FIFO buffer of pending memory writes (to avoid the need to
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 391–407, 2009.
© Springer-Verlag Berlin Heidelberg 2009
392 S. Owens, S. Sarkar, and P. Sewell
block while a write completes), so the reads from y and x can occur before the
writes have propagated from the buffers to main memory. Such optimisations
destroy the illusion of sequential consistency, making it impossible (at this level
of abstraction) to reason in terms of an intuitive notion of global time.
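The effect of write buffering on this example can be checked by brute force. The following Python sketch is an illustration only, not the memevents tool: it enumerates every interleaving of two small threads, once with writes going straight to memory (sequential consistency) and once with a per-processor FIFO buffer. The tuple encoding of instructions, the function names, and the register names EAX and EBX are assumptions made for the example.

```python
def put(pairs, key, value):
    """Functional update of a map stored as a sorted tuple of (key, value) pairs."""
    updated = dict(pairs)
    updated[key] = value
    return tuple(sorted(updated.items()))

def replace(items, index, value):
    """Return a copy of a tuple with one position replaced."""
    return items[:index] + (value,) + items[index + 1:]

def outcomes(threads, buffered=True):
    """Explore every interleaving; return the set of final register maps."""
    # state = (program counters, write buffers, memory, registers); memory starts at 0
    start = (tuple(0 for _ in threads), tuple(() for _ in threads), (), ())
    seen, todo, finals = {start}, [start], set()
    while todo:
        pcs, bufs, mem, regs = todo.pop()
        successors = []
        for p, prog in enumerate(threads):
            if bufs[p]:                      # drain the oldest buffered write of p
                (addr, val), rest = bufs[p][0], bufs[p][1:]
                successors.append((pcs, replace(bufs, p, rest), put(mem, addr, val), regs))
            if pcs[p] == len(prog):
                continue
            op, addr, arg = prog[pcs[p]]
            next_pcs = replace(pcs, p, pcs[p] + 1)
            if op == "write" and buffered:   # enqueue in p's own buffer
                new_buf = bufs[p] + ((addr, arg),)
                successors.append((next_pcs, replace(bufs, p, new_buf), mem, regs))
            elif op == "write":              # sequentially consistent write
                successors.append((next_pcs, bufs, put(mem, addr, arg), regs))
            else:                            # read: own buffer first, then memory
                pending = [v for a, v in bufs[p] if a == addr]
                val = pending[-1] if pending else dict(mem).get(addr, 0)
                successors.append((next_pcs, bufs, mem, put(regs, (p, arg), val)))
        if not successors:                   # all threads done, all buffers empty
            finals.add(regs)
        for state in successors:
            if state not in seen:
                seen.add(state)
                todo.append(state)
    return finals

SB = [[("write", "x", 1), ("read", "y", "EAX")],   # proc:0
      [("write", "y", 1), ("read", "x", "EBX")]]   # proc:1
BOTH_ZERO = (((0, "EAX"), 0), ((1, "EBX"), 0))
```

In this sketch `BOTH_ZERO in outcomes(SB, buffered=True)` holds, while the same outcome is absent from `outcomes(SB, buffered=False)`.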
To describe what programmers can rely on, processor vendors document ar-
chitectures. These are loose specifications, claimed to cover a range of past and
future actual processors, which should reveal enough for effective programming,
but without unduly constraining future processor designs. In practice, however,
they are informal prose documents, e.g. the Intel 64 and IA-32 Architectures
SDM [2] and AMD64 Architecture Programmer’s Manual [1]. Informal prose is
a poor medium for loose specification of subtle properties, and, as we shall see
in §2, such documents are often ambiguous, are sometimes incomplete (too weak
to program above), and are sometimes unsound (with respect to the actual pro-
cessors). Moreover, one cannot test programs above such a vague specification
(one can only run programs on particular actual processors), and one cannot use
them as criteria for testing processor implementations.
Architecture specifications are, therefore, prime targets for rigorous mech-
anised formalisation. In previous work [19] we introduced a rigorous x86-CC
model, formalised in HOL4 [11], based on the informal prose causal-consistency
descriptions of the then-current Intel and AMD documentation. Unfortunately
those, and hence also x86-CC, turned out to be unsound, forbidding some be-
haviour which actual processors exhibit.
In this paper we describe a new model, x86-TSO, also formalised in HOL4. To
the best of our knowledge, x86-TSO is sound, is strong enough to program above,
and is broadly in line with the vendors’ intentions. We present two equivalent def-
initions of the model: an abstract machine, in §3.1, and an axiomatic version, in
§3.2. We compensate for the main disadvantage of formalisation, that it can make
specifications less widely accessible, by extensively annotating the mathematical
definitions. To explore the consequences of the model, we have a hand-coded
implementation in our memevents tool, which can explore all possible executions
of litmus-test examples such as that above, and for greater confidence we have a
verified execution checker extracted from the HOL4 axiomatic definition, in §4.
We discuss related work in §5 and conclude in §6.
2.2 IWP/AMD64-3.14/x86-CC
In August 2007, an Intel White Paper [12] (IWP) gave a somewhat more pre-
cise model, with 8 informal-prose principles supported by 10 examples (known
as litmus tests). This was incorporated, essentially unchanged, into later revi-
sions of the Intel SDM (including rev.26–28), and AMD gave similar, though not
identical, prose and tests [1]. These are essentially causal-consistency models [4].
They allow independent readers to see independent writes (by different proces-
sors to different addresses) in different orders, as below (IRIW, see also [6]),
but require that, in some sense, causality is respected: “P5. In a multiprocessor
system, memory ordering obeys causality (memory ordering respects transitive
visibility)”.
These informal specifications were the basis for our x86-CC model, for which
a key issue was giving a reasonable interpretation to this “causality”. Apart
from that, the informal specifications were reasonably unambiguous — but they
turned out to have two serious flaws.
First, they are arguably rather weak for programmers. In particular, they
admit the IRIW behaviour above but, under reasonable assumptions on the
strongest x86 memory barrier, MFENCE, adding MFENCEs would not suffice
to recover sequential consistency [19, §2.12]. Here the specifications seem to be
much looser than the behaviour of implemented processors: to the best of our
knowledge, and following some testing, IRIW is not observable in practice. It
appears that some JVM implementations depend on this fact, and would not be
correct if one assumed only the IWP/AMD64-3.14/x86-CC architecture [9].
Second, more seriously, they are unsound with respect to current processors.
The following n6 example, due to Paul Loewenstein [14], shows a behaviour that
is observable (e.g. on an Intel Core 2 duo), but that is disallowed by x86-CC,
and by any interpretation we can make of IWP and AMD64-3.14.
n6      proc:0          proc:1
poi:0   MOV [x]←$1      MOV [y]←$2
poi:1   MOV EAX←[x]     MOV [x]←$2
poi:2   MOV EBX←[y]
Final: 0:EAX=1 ∧ 0:EBX=0 ∧ [x]=1
cc : Forbid; tso : Allow
To see why this may be allowed by multiprocessors with FIFO write buffers,
suppose that first the proc:1 write of [y]=2 is buffered, then proc:0 buffers its
write of [x]=1, reads [x]=1 from its own write buffer, and reads [y]=0 from main
memory, then proc:1 buffers its [x]=2 write and flushes its buffered [y]=2 and
[x]=2 writes to memory, then finally proc:0 flushes its [x]=1 write to memory.
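The schedule just described can be replayed step by step. The following Python sketch is only an illustration of that one interleaving, with a FIFO write buffer per processor; the helper names `write`, `read`, `flush` and `replay_n6`, and the register keys, are hypothetical.

```python
from collections import deque

def replay_n6():
    """Replay one write-buffer schedule for n6; return (registers, memory)."""
    memory = {"x": 0, "y": 0}
    buffers = {0: deque(), 1: deque()}       # per-processor FIFO write buffers
    regs = {}

    def write(p, addr, val):                 # buffer the write; memory is unchanged
        buffers[p].append((addr, val))

    def read(p, addr):                       # newest own buffered write wins, else memory
        for a, v in reversed(buffers[p]):
            if a == addr:
                return v
        return memory[addr]

    def flush(p):                            # move p's oldest buffered write to memory
        addr, val = buffers[p].popleft()
        memory[addr] = val

    write(1, "y", 2)                         # proc:1 buffers [y]=2
    write(0, "x", 1)                         # proc:0 buffers [x]=1
    regs["0:EAX"] = read(0, "x")             # proc:0 reads [x] from its own buffer
    regs["0:EBX"] = read(0, "y")             # proc:0 reads [y] from main memory
    write(1, "x", 2)                         # proc:1 buffers [x]=2
    flush(1)                                 # proc:1 flushes [y]=2
    flush(1)                                 # proc:1 flushes [x]=2
    flush(0)                                 # proc:0 flushes [x]=1 last
    return regs, memory
```

Calling `replay_n6()` yields 0:EAX=1, 0:EBX=0 and [x]=1, the final state listed for n6.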
Given these problems with the informal specifications, we cannot produce a use-
ful rigorous model by formalising the “principles” they contain (as we attempted
with x86-CC [19]). Instead, we have to build a reasonable model that is consis-
tent with the given litmus tests, with observed processor behaviour, and with
what we know of the needs of programmers and of the vendors' intentions.
The fact that write buffering is observable (iwp2.3.a/amd4 and n6) but IRIW
is not, together with the other tests that prohibit many other reorderings, strongly
suggests that, apart from write buffering, all processors share the same view of
memory (in contrast to x86-CC, where each processor had a separate view or-
der). This is broadly similar to the SPARC Total Store Ordering (TSO) memory
model [20,21], which is essentially an axiomatic description of the behaviour of
write-buffer multiprocessors. Moreover, while the term “TSO” is not used, infor-
mal discussions suggest this matches the intention behind the rev.29 informal
specification. Accordingly, we present here a rigorous x86-TSO model, with two
equivalent definitions.
The first definition, in §3.1, is an abstract machine with explicit write buffers.
The second definition, in §3.2, is an axiomatic model that defines valid executions
in terms of memory orders and reads-from maps. In both, we deal with x86
CISC instructions with multiple memory accesses, with x86 LOCK’d instructions
(CMPXCHG, LOCK;INC, etc.), with potentially non-terminating computations,
and with dependencies through registers. Together with our earlier instruction
semantics, x86-TSO thus defines a complete semantics of programs. The abstract
machine conveys the programmer-level operational intuition behind x86-TSO,
whereas the axiomatic model supports constraint-based reasoning about example
programs, e.g., by our memevents tool in §4.
The intended scope of x86-TSO, as for the x86-CC model, covers typical
user code and most kernel code: programs using coherent write-back memory,
without exceptions, misaligned or mixed-size accesses, ‘non-temporal’ operations
(e.g. MOVNTI), self-modifying code, or page-table changes.
Basic Types: Actions, Events, and Event Structures. As in our earlier
work, the action of (any particular execution of) a program is abstracted into a
set of events (with additional data) called an event structure. An event represents
a read or write of a particular value to a memory address, or to a register, or
the execution of a fence. Our earlier work includes a definition of the set of
event structures generated by an assembly language program. For any such event
structure, the memory model (there x86-CC, here x86-TSO) defines what a valid
execution is.
In more detail, each machine-code instruction may have multiple events asso-
ciated with it: events are indexed by an instruction ID iiid that identifies which
processor the event occurred on and the position in the instruction stream of the
instruction it comes from (the program order index, or poi). Events also have an
event ID eiid to identify them within an instruction (to permit multiple, other-
wise identical, events). An event structure indicates when one of an instruction’s
events has a dependency on another event of the same instruction with an intra
causality relation, a partial order over the events of each instruction. An event
structure also records which events occur together in a locked instruction with
atomicity data, a set of (disjoint, non-empty) sets of events which must occur
atomically together.
Expressing this in HOL, we index processors by a type proc = num, take types
address and value to both be the 32-bit words, and take a location to be either
a memory address or a register of a particular processor:
location = Location_reg of proc reg
         | Location_mem of address
The model is parameterised by a type reg of x86 registers, which one should
think of as an enumeration of the names of ordinary registers EAX, EBX, etc.,
the instruction pointer EIP, and the status flags. To identify an instance of an
instruction in an execution, we specify its processor and its program order index.
Finally, an event has an instruction instance id, an event id (of type eiid = num,
unique per iiid), and an action:
event =[ eiid : eiid; iiid : iiid; action : action]
An event structure E comprises a set of processors, a set of events, an intra-
instruction causality relation, and a partial equivalence relation (PER) capturing
sets of events which must occur atomically, all subject to some well-formedness
conditions which we omit here.
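For readers more used to programming notation, the basic types can be mirrored as immutable records. The following Python sketch is a loose transcription, not the HOL4 definitions: the class names, the `Access` action shape (modelled on labels of the form Access W (Location_reg p r) v), the fields of `Iiid`, and the address chosen for x are all assumptions for illustration, and barrier actions are left out.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class LocationReg:            # a register of a particular processor
    proc: int
    reg: str

@dataclass(frozen=True)
class LocationMem:            # a memory address
    address: int

Location = Union[LocationReg, LocationMem]

@dataclass(frozen=True)
class Iiid:                   # instruction instance: processor and program order index
    proc: int
    poi: int

@dataclass(frozen=True)
class Access:                 # assumed action shape: direction, location, value
    direction: str            # "R" or "W"
    location: Location
    value: int

@dataclass(frozen=True)
class Event:
    eiid: int                 # distinguishes events within one instruction instance
    iiid: Iiid
    action: Access

X = LocationMem(0x1000)       # hypothetical address standing for x

# Two writes of x, a read of x, and a register write that shares the read's instruction.
a = Event(0, Iiid(0, 0), Access("W", X, 1))
b = Event(1, Iiid(0, 1), Access("R", X, 2))
c = Event(2, Iiid(0, 1), Access("W", LocationReg(0, "EAX"), 2))
d = Event(3, Iiid(1, 0), Access("W", X, 2))
events = {a, b, c, d}
writes_to_x = {e for e in events
               if e.action.direction == "W" and e.action.location == X}
```

Freezing the records makes events hashable, so an event structure's event set can be an ordinary set, and relations such as intra-instruction causality can be sets of event pairs.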
There are four events, the inner (blue in the on-line version) boxes. The event
ids are pretty-printed alphabetically, as a,b,c,d, etc. We also show the assembly
instruction that gave rise to each event, e.g. MOV [x]←$1, though that is not
formally part of the event structure.
Note that events contain concrete values: in this particular event structure,
there are two writes of x, with values 1 and 2, a read of [x] with value 2, and
a write of proc:0's EAX register with value 2. Later we show two valid
executions for this program, one for this event structure and one for another
(note also that some event structures may not have any valid executions).
In the diagram, the instructions of each processor are clustered together,
into the outermost (magenta) boxes, with program order (po) edges between
them, and the events of each instruction are clustered together into the
intermediate (green) boxes, with intra-causality edges as appropriate; here,
in the MOV EAX←[x], the write of EAX is dependent on the read of x.
[Figure: tso1 rfmap 0 (of ess 0). Events: a: W [x]=1 (proc:0 poi:0, MOV [x]←$1);
b: R [x]=2 (proc:0 poi:1, MOV EAX←[x]); c: W 0:EAX=2 (proc:0 poi:1, MOV EAX←[x]);
d: W [x]=2 (proc:1 poi:0, MOV [x]←$2). Edge labels: po, rf, intra causality.]
[Figure: block diagram of the x86-TSO abstract machine. Two processors, each with
Computation, Registers and a Write Buffer (with a bypass), share a Lock and RAM.]
(Note that there is nothing specific to any particular memory model in this
interface.) The states of the x86-TSO machine are records, with fields R, giving
a value for each register on each processor; M , giving a value for each shared
memory location; B , modelling a write buffer for each processor, as a list of
address/value pairs; and L, which is a global lock, either Some p, if p holds the
lock, or None. The HOL type is below.
machine_state = [ R : proc → reg → value option;     (* per-processor registers *)
                  M : address → value option;        (* main memory *)
                  B : proc → (address#value) list;   (* per-processor write buffers *)
                  L : proc option                    (* which processor holds the lock *) ]
The behaviour of the x86-TSO machine, the transition relation $s \xrightarrow{l} s'$, is
defined by the rules in Fig. 2. The rules use two auxiliary definitions: processor
p is not blocked in machine state s if either it holds the lock or no processor
does; and there are no pending writes in a buffer b for address a if there are no
(a, v ) pairs in b. Restating the rules informally:
1. p can read v from memory at address a if p is not blocked, has no buffered
writes to a, and the memory does contain v at a;
A Better x86 Memory Model: x86-TSO 399
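Write to register
$$\frac{\top}{s \xrightarrow{\ \mathsf{Evt}\ p\ (\mathsf{Access}\ \mathsf{W}\ (\mathsf{Location\_reg}\ p\ r)\ v)\ } s \oplus [R := s.R \oplus (p \mapsto ((s.R\ p) \oplus (r \mapsto \mathsf{Some}\ v)))]}$$
Barrier
$$\frac{(b = \mathsf{Mfence}) \Longrightarrow (s.B\ p = [\,])}{s \xrightarrow{\ \mathsf{Evt}\ p\ (\mathsf{Barrier}\ b)\ } s}$$
Lock
$$\frac{(s.L = \mathsf{None}) \land (s.B\ p = [\,])}{s \xrightarrow{\ \mathsf{Lock}\ p\ } s \oplus [L := \mathsf{Some}\ p]}$$
Unlock
$$\frac{(s.L = \mathsf{Some}\ p) \land (s.B\ p = [\,])}{s \xrightarrow{\ \mathsf{Unlock}\ p\ } s \oplus [L := \mathsf{None}]}$$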
2. p can read v from its write buffer for address a if p is not blocked and has
v as the newest write to a in its buffer;
3. p can read the stored value v from its register r at any time;
4. p can write v to its write buffer for address a at any time;
5. if p is not blocked, it can silently dequeue the oldest write from its write
buffer to memory;
6. p can write value v to one of its registers r at any time;
7. if p’s write buffer is empty, it can execute an MFENCE (so an MFENCE
cannot proceed until all writes have been dequeued, modelling buffer flush-
ing); LFENCE and SFENCE can occur at any time, making them no-ops;
8. if the lock is not held, and p’s write buffer is empty, it can begin a LOCK’d
instruction; and
9. if p holds the lock, and its write buffer is empty, it can end a LOCK’d
instruction.
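The informal rules above can be sketched as a small executable model. This is only an illustration under stated assumptions: the class name `TSOMachine`, the method names, and the choice to store each buffer with the newest write last are hypothetical, registers are omitted, and memory is assumed to be initialised for every address used. The paper's machine is a labelled transition relation, whereas this sketch simply raises an error when a transition is not enabled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TSOMachine:
    """Toy x86-TSO state: memory M, per-processor write buffers B, lock L."""
    M: Dict[str, int]
    B: Dict[int, List[Tuple[str, int]]]  # newest write is last (assumption)
    L: Optional[int] = None

    def not_blocked(self, p: int) -> bool:
        # p is not blocked if it holds the lock or no processor does
        return self.L is None or self.L == p

    def _require(self, cond: bool, what: str) -> None:
        if not cond:
            raise RuntimeError("transition not enabled: " + what)

    def read(self, p: int, a: str) -> int:
        # Rules 1 and 2: newest buffered write to a if any, else memory
        self._require(self.not_blocked(p), "read while blocked")
        for addr, v in reversed(self.B[p]):
            if addr == a:
                return v
        return self.M[a]

    def write(self, p: int, a: str, v: int) -> None:
        # Rule 4: a write always goes to the local buffer first
        self.B[p].append((a, v))

    def dequeue(self, p: int) -> None:
        # Rule 5: silently move the oldest buffered write to memory
        self._require(self.not_blocked(p) and bool(self.B[p]), "dequeue")
        a, v = self.B[p].pop(0)
        self.M[a] = v

    def mfence(self, p: int) -> None:
        # Rule 7: MFENCE needs an empty buffer
        self._require(not self.B[p], "MFENCE with pending writes")

    def lock(self, p: int) -> None:
        # Rule 8: begin a LOCK'd instruction
        self._require(self.L is None and not self.B[p], "lock")
        self.L = p

    def unlock(self, p: int) -> None:
        # Rule 9: end a LOCK'd instruction
        self._require(self.L == p and not self.B[p], "unlock")
        self.L = None


def store_buffer_litmus() -> Tuple[int, int]:
    """Each processor writes one location, then reads the other one."""
    m = TSOMachine(M={"x": 0, "y": 0}, B={0: [], 1: []})
    m.write(0, "x", 1)
    m.write(1, "y", 1)
    r0 = m.read(0, "y")   # p0's write to x is still buffered
    r1 = m.read(1, "x")   # p1's write to y is still buffered
    m.dequeue(0)
    m.dequeue(1)
    return r0, r1
```

In this interleaving both reads are served from memory while the two writes are still buffered, which is the kind of behaviour the write buffers are meant to allow.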
Consider execution paths through the machine $s_0 \xrightarrow{l_1} s_1 \xrightarrow{l_2} s_2 \cdots$ consisting of
finite or infinite sequences of states and labels. We define okMpath to hold for
paths through the machine that start in a valid initial state (with empty write
buffers, etc.) and satisfy the following progress condition: for each memory write
in the path, the corresponding Tau transition appears later on. This ensures
that no write can stay in the buffer forever. (We actually formalize okMpath for
the event-annotated machine described below.)
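For finite paths, the progress condition can be checked by counting. The sketch below is a simplification and not the HOL definition of okMpath: the label encoding (`("W", p, addr, value)` for a memory write by processor p, `("Tau", p)` for a dequeue) is hypothetical, and it relies on buffers being FIFO, so that the k-th Tau of a processor corresponds to its k-th write.

```python
from collections import defaultdict
from typing import Sequence, Tuple


def satisfies_progress(path: Sequence[Tuple]) -> bool:
    """True if every memory write in a finite path is later dequeued."""
    pending = defaultdict(int)  # processor -> writes still in its buffer
    for label in path:
        if label[0] == "W":
            pending[label[1]] += 1
        elif label[0] == "Tau":
            if pending[label[1]] == 0:
                return False  # nothing buffered: not a machine path at all
            pending[label[1]] -= 1
        # other labels (reads, barriers, Lock/Unlock) do not affect buffers
    return all(count == 0 for count in pending.values())
```

A path that ends with a write still sitting in a buffer is rejected, which mirrors the requirement that no write stays in the buffer forever.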
We emphasise that this is an abstract machine: we are concerned with its
extensional behaviour: the (completed, finite or infinite) traces of labelled tran-
sitions it can perform (which should include the behaviour of real implementa-
tions), not with its internal states and the transition rules. The machine should
provide a good model for programmers, but may bear little resemblance to the
internal structure of implementations. Indeed, a realistic design would certainly
not implement LOCK’d instructions with a global lock, and would have many
other optimisations — the force of the x86-TSO model is that none of those
have programmer-visible effects, except perhaps via performance observations.
There are several variants of the machine with different degrees of locking which
we conjecture are observationally equivalent. For example, one could prohibit all
activity by other processors when one holds the lock, or not require write buffers
to be flushed at the start of a LOCK’d instruction.
We relate the machine to event structures in two steps, which we summarise
here (the HOL details can be found on-line [16]). First, we define a more in-
tensional event-machine: we annotate each memory and register location with
an event option, recording the most recent write event (if any) to that location,
refine write buffers to record lists of events rather than of plain location/value
pairs, and annotate labels with the relevant events. Second, we relate paths of
annotated labels and event structures with a predicate okEpath that holds when
the path is a suitable linearization of the event structure: there is a 1:1 corre-
spondence between non-Tau/Lock/Unlock labels of path and the events of E ,
the order of labels in path is consistent with program order and intra-causality,
and atomic sets are properly bracketed by Lock/Unlock pairs. Thus, okMpath
describes paths that are valid according to the memory model, and okEpath de-
scribes those that are valid according to an event structure (that encapsulates
the other aspects of processor semantics).
Theorem 1. The annotation-erasure of the event-machine is exactly the ma-
chine presented above. [HOL proof]
A final state of a valid execution takes the last write in memory order for
each memory location, together with a maximal write in program order for each
register (or the initial state, if there is no such write). This is uniquely defined
assuming that no instruction has multiple unrelated writes to the same register
— a reasonable property for x86 instructions.
The definition of valid_execution E X comprising the above conditions is
equivalent to one in which $<_{X.\mathit{memory\_order}}$ is required to be a linear order, not
just a partial order (again, the full details are on-line):
Theorem 2
1. If linear_valid_execution E X then valid_execution E X.
404 S. Owens, S. Sarkar, and P. Sewell
[Figure panels: tso1 vos 0 (of productive ess 0), with event c: W 0:EAX=2, and
tso1 vos 0 (of productive ess 2), with event c: W 0:EAX=1; in both panels the
event belongs to proc:0 poi:1, MOV EAX←[x].]
Fig. 3. Example valid execution witnesses (for two different event structures)
stream-like linear order over labels that satisfies several conditions (label_order
in the HOL sources) describing labels in an okMpath. We then have:
Theorem 4. For any well-formed event structure E, and valid execution X
for E, there exists some event-machine path, such that okEpath E path and
okMpath path, in which the memory reads and write-buffer flushes both respect
$<_{X.\mathit{memory\_order}}$. [hand proof, relying on the preceding lemma]
5 Related Work
There is an extensive literature on relaxed memory models, but most of it does
not address x86, and we are not aware of any previous model that addresses the
concerns of §2. We touch here on some of the most closely related work.
There are several surveys of weak memory models, including those by Adve
and Gharachorloo [3], and by Higham et al. [13]; the latter formalises a range of
models, including a TSO model, in both operational and axiomatic styles, and
proves equivalence results. Their axiomatic TSO model is rather closer to the
operational style than ours is, and both are idealised rather than x86-specific.
Burckhardt and Musuvathi [8, Appendix A] also give operational and axiomatic
definitions of a TSO model and prove equivalence, but only for finite executions.
Their models treat memory reads and writes and barrier events, but lack regis-
ter events and locked instructions with multiple events that happen atomically.
Hangal et al. [10] describe the Sun TSOtool, checking the observed behaviour
of pseudo-randomly generated programs against a TSO model. Roy et al. [17]
describe an efficient algorithm for checking whether an execution lies within an
approximation to a TSO model, used in Intel’s Random Instruction Test (RIT)
generator. Boudol and Petri [7] give an operational model with hierarchical write
buffers (thereby permitting IRIW behaviours), and prove sequential consistency
for data-race-free (DRF) programs. Loewenstein et al. [15] describe a “golden
memory model” for SPARC TSO, somewhat closer to a particular implementa-
tion microarchitecture than the abstract machine we give in §3.1, that they use
for testing implementations. They argue that the additional intensional detail
increases the effectiveness of simulation-based verification. Saraswat et al. [18]
also define memory models in terms of local reordering, and prove a DRF the-
orem, but focus on high-level languages. Several groups have used proof tools
to tame the intricacies of these models, including Yang et al. [22], using Pro-
log and SAT solvers to explore an axiomatic Itanium model, and Aspinall and
Ševčík [5], who formalised and identified problems with the Java Memory Model
using Isabelle/HOL.
6 Conclusion
We have described x86-TSO, a memory model for x86 processors that does not
suffer from the ambiguities, weaknesses, or unsoundnesses of earlier models. Its
abstract-machine definition should be intuitive for programmers, and its equiva-
lent axiomatic definition supports the memevents exhaustive search and permits
an easy comparison with related models; the similarity with SPARCv8 suggests
x86-TSO is strong enough to program above. Mechanisation in HOL4 revealed a
number of subtle points of detail, including some of the well-formed event struc-
ture conditions that we depend on (e.g. that instructions have no internal data
races). We hope that this will clarify the semantics of x86 architectures.
References
1. AMD64 Architecture Programmer’s Manual (3 vols). Advanced Micro Devices,
rev. 3.14 (September 2007)
2. Intel 64 and IA-32 Architectures Software Developer’s Manual (5 vols). Intel Cor-
poration, rev. 29 (November 2008)
3. Adve, S., Gharachorloo, K.: Shared memory consistency models: A tutorial. IEEE
Computer 29(12), 66–76 (1996)
4. Ahamad, M., Neiger, G., Burns, J., Kohli, P., Hutto, P.: Causal memory: Definitions,
implementation, and programming. Distributed Computing 9(1), 37–49 (1995)
5. Aspinall, D., Ševčík, J.: Formalising Java’s data race free guarantee. In: Schneider,
K., Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732, pp. 22–37. Springer,
Heidelberg (2007)
6. Boehm, H.-J., Adve, S.: Foundations of the C++ concurrency memory model. In:
Proc. PLDI (2008)
7. Boudol, G., Petri, G.: Relaxed memory models: an operational approach. In: Proc.
POPL, pp. 392–403 (2009)
8. Burckhardt, S., Musuvathi, M.: Effective program verification for relaxed memory
models. Technical Report MSR-TR-2008-12, Microsoft Research (2008); Gupta, A.,
Malik, S. (eds.) CAV 2008. LNCS, vol. 5123, pp. 107–120. Springer, Heidelberg (2008)
9. Dice, D.: Java memory model concerns on Intel and AMD systems (January 2008),
http://blogs.sun.com/dave/entry/java_memory_model_concerns_on
10. Hangal, S., Vahia, D., Manovit, C., Lu, J.-Y.J., Narayanan, S.: TSOtool: A program
for verifying memory systems using the memory consistency model. In: Proc. ISCA,
pp. 114–123 (2004)
11. The HOL 4 system, http://hol.sourceforge.net/
12. Intel. Intel 64 architecture memory ordering white paper. SKU 318147-001 (2007)
13. Higham, L., Kawash, J., Verwaal, N.: Defining and comparing memory consistency
models. PDCS, Full version as TR #98/612/03, U. Calgary (1997)
14. Loewenstein, P.: Personal communication (November 2008)
15. Loewenstein, P.N., Chaudhry, S., Cypher, R., Manovit, C.: Multiprocessor memory
model verification. In: Proc. AFM (Automated Formal Methods), FLoC workshop
(August 2006), http://fm.csl.sri.com/AFM06/
16. Owens, S., Sarkar, S., Sewell, P.: A better x86 memory model: x86-TSO (extended
version). Technical Report UCAM-CL-TR-745, Univ. of Cambridge (2009), Sup-
porting material at, www.cl.cam.ac.uk/users/pes20/weakmemory/
17. Roy, A., Zeisset, S., Fleckenstein, C.J., Huang, J.C.: Fast and generalized polyno-
mial time memory consistency verification. In: Ball, T., Jones, R.B. (eds.) CAV
2006. LNCS, vol. 4144, pp. 503–516. Springer, Heidelberg (2006)
18. Saraswat, V., Jagadeesan, R., Michael, M., von Praun, C.: A theory of memory
models. In: Proc. PPoPP (2007)
19. Sarkar, S., Sewell, P., Zappa Nardelli, F., Owens, S., Ridge, T., Braibant, T.,
Myreen, M., Alglave, J.: The semantics of x86-CC multiprocessor machine code.
In: Proc. POPL 2009 (January 2009)
20. Sindhu, P.S., Frailong, J.-M., Cekleov, M.: Formal specification of memory models.
In: Scalable Shared Memory Multiprocessors, pp. 25–42. Kluwer, Dordrecht (1991)
21. SPARC International, Inc. The SPARC architecture manual, v. 8. Revision
SAV080SI9308 (1992), http://www.sparc.org/standards/V8.pdf
22. Yang, Y., Gopalakrishnan, G., Lindstrom, G., Slind, K.: Nemos: A framework for
axiomatic and executable specifications of memory consistency models. In: IPDPS
(2004)
Formal Verification of Exact Computations
Using Newton’s Method
1 Introduction
The Standard Library of the Coq proof assistant [4,1] contains a formalization
of real numbers based on a set of axioms. This gives the real numbers all the
desired theoretical properties and makes theorem proving more agreeable and
close to “pencil and paper” proofs [16]. However, this formalization has no (or
little) computational meaning. Throughout this paper we shall refer to the reals from
this implementation as “axiomatic reals”. We note that Coq is not a special case
and proof assistants in general provide libraries with results from real analysis
[5,7,8,10], but with formalizations for real numbers that are not well suited for
computations. However, in a proof process, it is often the case that we are in-
terested in computing with the real numbers (or at least approximating such
computations), so a considerable effort has been invested in having libraries of
exact computations for proof systems [13,15,18]. We shall refer to numbers from
such implementations as “exact reals”. These libraries provide verified computa-
tions for a set of operations and elementary functions on real numbers.
The results in this paper are concerned with Newton’s method. Under certain
conditions, this method ensures the convergence at a certain speed towards a
root of the given function, the unicity of this root in a certain domain and the
local stability. But, as the “paper” proof for these results depends on non-trivial
theorems from analysis like the mean value theorem and concepts like continuity,
derivation etc. the formal development conducted around them is based on the
axiomatic reals of Coq. We would like to transfer these “theoretical” properties
to the computations done with exact reals. Our work is thus conducted in two
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 408–423, 2009.
© Springer-Verlag Berlin Heidelberg 2009
$$x^{(n+1)} = x^{(n)} - \frac{f(x^{(n)})}{f'(x^{(n)})}, \quad n = 0, 1, 2, \ldots \tag{1}$$
converges and $\lim_{n\to\infty} x^{(n)} = x^*$ is a solution of the initial system, so that
$|x^* - x^{(0)}| \le 2B_0 \le \varepsilon$.
The convergence of the process ensures that Newton’s method is indeed appro-
priate for determining the root of the function. The unicity of the solution in
a certain domain is used in practice for isolating the roots of the function. The
result on the speed of the convergence means we know a bound for the distance
between a given element of the sequence and the root of the function. This rep-
resents the precision at which an element of the sequence approximates the root.
In practice this theorem is used to determine the number of iterations needed
in order to achieve a certain precision for the solution. The result on the sta-
bility of the process will help with efficiency issues as it allows the use of an
approximation rather than an exact real.
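As a concrete illustration, the sketch below runs iteration (1) with exact rational arithmetic and computes the constants $A_0$, $B_0$ and $\mu_0 = 2A_0B_0C$ of Theorem 1 at the starting point. The function names and the example $f(x) = x^2 - \frac{1}{4}$ with $x^{(0)} = \frac{3}{5}$ are assumptions chosen for the illustration (for this f, the derivative is Lipschitz with $C = 2$); nothing here comes from the Coq development.

```python
from fractions import Fraction


def newton_sequence(f, df, x0, steps):
    """Return [x(0), ..., x(steps)] for x(n+1) = x(n) - f(x(n)) / f'(x(n))."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs


def initial_constants(f, df, x0, C):
    """Smallest admissible A0 and B0 at x0, and mu0 = 2 * A0 * B0 * C."""
    A0 = abs(1 / df(x0))
    B0 = abs(f(x0) / df(x0))
    return A0, B0, 2 * A0 * B0 * C


# Example data (assumed for the illustration): root 1/2 of x^2 - 1/4.
a = Fraction(1, 4)
f = lambda x: x * x - a
df = lambda x: 2 * x
x0 = Fraction(3, 5)

A0, B0, mu0 = initial_constants(f, df, x0, C=2)
xs = newton_sequence(f, df, x0, steps=4)
```

Here $\mu_0 \le 1$, and both the root $\frac{1}{2}$ and every computed iterate stay within $2B_0$ of $x^{(0)}$, as the theorem's conclusion predicts for the limit.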
We do not present here the proofs of the theorems; we just give a few elements
of these proofs that are needed in understanding the next section. For details
on the proofs we refer the reader to [6]. The central element of the proof is
an induction process that establishes a set of properties for each element of
the Newton sequence. The proof introduces the auxiliary sequences $\{A_n\}_{n\in\mathbb{N}}$,
$\{B_n\}_{n\in\mathbb{N}}$ and $\{\mu_n\}_{n\in\mathbb{N}}$:
$$A_n = 2A_{n-1} \tag{2}$$
$$B_n = A_{n-1} B_{n-1}^2 C = \frac{1}{2}\mu_{n-1} B_{n-1} \tag{3}$$
$$B = \frac{\mu_0^2 + 46\mu_0 + 17}{8(7 + \mu_0)\mu_0} B_0 \tag{6}$$
which gives
$$\mu = \frac{\mu_0^2 + 46\mu_0 + 17}{(7 + \mu_0)^2} < 1 \tag{7}$$
We summarize these results in:
Corollary 1. If the conditions of Theorem 1 are satisfied and if, additionally,
$0 < \mu_0 < 1$ and $[x^{(0)} - \frac{2}{\mu_0} B_0, x^{(0)} + \frac{2}{\mu_0} B_0] \subset [a, b]$, then for any initial
approximation $\tilde{x}^{(0)}$ that satisfies $|\tilde{x}^{(0)} - x^{(0)}| \le \frac{1-\mu_0}{4\mu_0} B_0$ the associated Newton's process
converges to the root $x^*$.
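Relations (6) and (7) and the perturbation bound of the corollary are easy to evaluate exactly. The following sketch (function names are hypothetical) computes them with rationals; the identity checked in the usage note, $1 - \mu = \frac{32(1-\mu_0)}{(7+\mu_0)^2}$, follows by expanding $(7+\mu_0)^2 - (\mu_0^2 + 46\mu_0 + 17)$ and explains why $\mu < 1$ exactly when $\mu_0 < 1$.

```python
from fractions import Fraction


def perturbed_constants(mu0, B0):
    """New constants B and mu given by relations (6) and (7)."""
    num = mu0**2 + 46 * mu0 + 17
    B = num / (8 * (7 + mu0) * mu0) * B0
    mu = num / (7 + mu0) ** 2
    return B, mu


def stability_radius(mu0, B0):
    """Largest allowed perturbation of the initial point: (1 - mu0)/(4 mu0) * B0."""
    return (1 - mu0) / (4 * mu0) * B0
```

For instance, with $\mu_0 = \frac{1}{2}$ and $B_0 = 1$ the initial point may be moved by at most $\frac{1}{4}$, and for every $\mu_0$ in $(0, 1)$ the new $\mu$ stays strictly below 1.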
then
a. the sequence $\{t^{(n)}\}_{n\in\mathbb{N}}$ converges and $\lim_{n\to\infty} t^{(n)} = x^*$ where $x^*$ is the root of
the function f given by Theorem 1
b. $\forall n, |x^* - t^{(n)}| \le \frac{1}{2^{n-1}} B_0$
The first hypothesis makes sure that the new value will also be in the range of
the function. The second and third hypotheses come from the use of the stabil-
ity property of the Newton sequence (see Corollary 1). The fourth hypothesis
controls the approximation we are allowed to make at each iteration. The con-
clusion gives us the convergence of the process to the same limit as Newton’s
method without approximations. Also we give an estimate of the distance from
the computed value to the root at each step.
Proof. Our proof is based on those for theorems 1 - 4 and corollary 1. To give
the intuition behind the proof, we decompose Newton’s perturbed process t(n)
as follows:
◦ At step 1. we start with the initial x(0) that satisfies the conditions in The-
orem 1. This means that Newton’s method from this initial point converges
to the root x∗ (cf. Theorem 1).
◦ At step 2. we consider a Newton sequence starting with x(1) . This sequence is
the same as the sequence at step 1. except that we “forget” the first element
of the sequence and start with the second. It is trivial that this sequence
converges to the root x∗ . We note that (cf. proof of Theorem 1) we can
associate the constants A1 , B1 to the initial iteration of this sequence and
get the corresponding hypotheses from Theorem 1.
(2), (3). We get that $\lim_{n\to\infty} \overline{Y}_0^n = x^*$ and $|x^* - \overline{Y}_0^0| = |x^* - Y_0^1| \le 2\overline{B}_0 =
2(A_0 B_0^2 C)$.
◦ Now we consider $\{Y_1^n\}_{n\in\mathbb{N}}$. The initial point of this sequence is
$Y_1^0 = rnd_1(Y_0^0 - f(Y_0^0)/f'(Y_0^0)) = rnd_1(\overline{Y}_0^0)$. We are in the situation of Corollary 1,
where we have a converging sequence ($\{\overline{Y}_0^n\}_{n\in\mathbb{N}}$) and we introduce an
approximation in the initial iteration. To be able to apply this corollary we need
to verify $0 < \overline{\mu}_0 < 1$, $[\overline{Y}_0^0 - \frac{2}{\overline{\mu}_0}\overline{B}_0, \overline{Y}_0^0 + \frac{2}{\overline{\mu}_0}\overline{B}_0] \subset [a, b]$ and
$|rnd_1(\overline{Y}_0^0) - \overline{Y}_0^0| \le \frac{1-\overline{\mu}_0}{4\overline{\mu}_0}\overline{B}_0$.
We will show later on that under our hypotheses these three conditions
are indeed verified. From Corollary 1 we get the new constants according
to relations (5), (6). We thus find ourselves again in the conditions of
Theorem 1 and we can deduce that $\lim_{n\to\infty} Y_1^n = x^*$ and
$|x^* - Y_1^0| \le 2B = 2\,\frac{\overline{\mu}_0^2 + 46\overline{\mu}_0 + 17}{8(7 + \overline{\mu}_0)\overline{\mu}_0}\,\overline{B}_0$.
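We are in the appropriate conditions to start this process again and explain
in the same manner the properties for $\{Y_2^n\}_{n\in\mathbb{N}}$, $\{Y_3^n\}_{n\in\mathbb{N}}$, etc. For the
perturbed process, the auxiliary sequences start from the constants $A_0$ and $B_0$
and are given by the following relations:
$$A_{n+1} = \frac{8}{7 + \overline{\mu}_n}(2A_n)$$
$$B_{n+1} = \frac{\overline{\mu}_n^2 + 46\overline{\mu}_n + 17}{8(7 + \overline{\mu}_n)\overline{\mu}_n}(A_n B_n^2 C)$$
we also consider
$$\mu_{n+1} = 2A_{n+1}B_{n+1}C = \frac{\overline{\mu}_n^2 + 46\overline{\mu}_n + 17}{(7 + \overline{\mu}_n)^2} = \frac{(\mu_n^2)^2 + 46\mu_n^2 + 17}{(7 + \mu_n^2)^2}$$
$$R_n = \frac{1 - \overline{\mu}_n}{4\overline{\mu}_n}\overline{B}_n = \frac{1 - \mu_n^2}{4\mu_n^2}\left(\frac{1}{2}\mu_n B_n\right) = \frac{1 - \mu_n^2}{8\mu_n} B_n$$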
Using the above reasoning steps, we get by induction that $|Y_n^0 - x^*| \le 2B_n$
and we also manage to show $\forall n, B_{n+1} \le \frac{1}{2} B_n \le \frac{1}{2^{n-1}} B_0$. The latter relation is
deduced from the above formulas by basic manipulations. It trivially implies the
convergence of the perturbed sequence to the root $x^*$.
We need some auxiliary results to ensure that Corollary 1 is applied in the
appropriate conditions each time we make a rounding. These results are as follows:
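◦ $0 < \frac{1}{2} \le \mu_0 \le \mu_n \le \mu_{n+1} \le \ldots < 1$
◦ $R_{n+1} \le \frac{1}{3} R_n \le \ldots \le \frac{1}{3^n} R_0 = \frac{1}{3^n}\frac{1-\overline{\mu}_0}{4\overline{\mu}_0}\overline{B}_0 = \frac{1}{3^n}\frac{1-\mu_0^2}{8\mu_0} B_0$
◦ $|Y_{n+1}^0 - Y_n^0| \le \frac{1}{2^n} B_0 + \frac{1}{3^n} R_0$
◦ $[\overline{Y}_n^0 - \frac{2}{\overline{\mu}_n}\overline{B}_n, \overline{Y}_n^0 + \frac{2}{\overline{\mu}_n}\overline{B}_n] \subseteq [Y_0^0 - 3B_0, Y_0^0 + 3B_0] \subset [a, b]$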
We do not discuss all the details as they are elementary reasoning steps concern-
ing inequalities, second degree equations or geometric series. All these results
have been formalized in Coq to ensure that no steps are overlooked.
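The halving of $B_n$ and the monotonicity of $\mu_n$ can be explored numerically with exact rationals. The sketch below is not the Coq proof: it assumes the recurrences read as $\mu_{n+1} = \frac{m^2 + 46m + 17}{(7+m)^2}$ and $B_{n+1} = \frac{m^2 + 46m + 17}{8(7+m)m}\cdot\frac{1}{2}\mu_n B_n$ with $m = \mu_n^2$, where $\frac{1}{2}\mu_n B_n$ stands for $A_n B_n^2 C$ as in relation (3). The function names are hypothetical.

```python
from fractions import Fraction


def step(mu, B):
    """One stage of the perturbed process: (mu_n, B_n) -> (mu_{n+1}, B_{n+1})."""
    m = mu * mu            # constant mu after one exact Newton step
    B_shift = mu * B / 2   # A_n * B_n^2 * C rewritten as (1/2) * mu_n * B_n
    num = m**2 + 46 * m + 17
    return num / (7 + m) ** 2, num / (8 * (7 + m) * m) * B_shift


def sequences(mu0, B0, n):
    """Return [(mu_0, B_0), ..., (mu_n, B_n)]."""
    out = [(mu0, B0)]
    for _ in range(n):
        out.append(step(*out[-1]))
    return out
```

Starting from $\mu_0 = \frac{1}{2}$, each stage at least halves $B_n$ while $\mu_n$ increases and stays below 1. Under the same assumed reading, a start at $\mu_0 = \frac{3}{10}$ does not halve $B_0$ in the first stage, which is consistent with the lower bound $\frac{1}{2} \le \mu_0$ used in the proof.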
A real number represented by a stream and for which we know the first digit
can be written as: $r = [\![s_1 :: s]\!]_\beta = \frac{s_1 + [\![s]\!]_\beta}{\beta}$.
Having signed digits makes our representation redundant. For example we
can represent $\frac{1}{3}$ as $[\![3::3::3::3 \ldots]\!]_{10}$ but also as $[\![4::-7::4::-7 \ldots]\!]_{10}$.
For each digit k the set of real numbers that admit a representation beginning
by this digit is: $[\frac{k-1}{\beta}, \frac{k+1}{\beta}]$. The sets associated to consecutive digits overlap with
a constant magnitude of β2 . The main benefit of this redundancy is that we are
able to design algorithms for which we can decide a possible first digit of the
output. Without redundancy this is in general undecidable. Take the example
of addition: $[\![0::3 \ldots]\!]_{10} + [\![0::6 \ldots]\!]_{10}$ may need infinite precision to decide
whether the first digit is 0 or 1. In the case of signed digits we give 1 as a first
digit knowing we can always go back to a smaller number by using a negative
digit. We also note that in our example it was sufficient to know two digits of
the input to decide the first digit of the output and this is true for addition in
general.
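The redundancy of signed digits can be observed on finite prefixes. The sketch below (the function name `prefix_value` is hypothetical) evaluates a finite list of signed digits by folding the relation "value of d::s is (d + value of s) / β" from the right, treating the unknown tail as zero.

```python
from fractions import Fraction
from typing import Sequence


def prefix_value(digits: Sequence[int], beta: int) -> Fraction:
    """Value of the finite signed-digit prefix d1 :: d2 :: ... :: dk in base beta."""
    value = Fraction(0)
    for d in reversed(digits):
        value = (d + value) / beta
    return value


threes = prefix_value([3] * 8, 10)          # 3::3::3::...
alternating = prefix_value([4, -7] * 4, 10)  # 4::-7::4::-7::...
```

Both eight-digit prefixes denote the same rational number, and since the tail of a stream denotes a real in [−1, 1], each prefix lies within $10^{-8}$ of $\frac{1}{3}$.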
Designing an algorithm therefore requires approximating the result to a pre-
cision that is sufficient to determine a possible first digit. Also, since our real
numbers are infinite streams, the algorithms need to be designed in such a way
that we are always able to provide an extra digit of the result. This is done by
co-recursive calls on our co-inductive streams.
416 N. Julien and I. Paşca
This relation also makes sure that streams only represent reals in [−1, 1] and
that the digits are in the set of the allowed signed digits.
The correctness of our algorithms is verified when we manage to express a
represents relation between our implementation and the standard in the Coq
library. For instance the proof that the multiplication is correct is formulated in
this way :
It means that every time we have an exact real (i.e. a stream of digits) x that
represents an axiomatic real vx and a y that represents a vy, then our
multiplication of streams x and y (here denoted ⊗) will represent the multiplication
of axiomatic reals vx and vy.
For further details on algorithms and proofs for this library we refer the reader
to [13].
The relation between elements of the same rank in the two sequences:
∀ n, represents (EXn g EX0 n) (Xn X0 f f’ n)
is almost trivial, if we have a represents relation for the initial iteration and
for the function.
Theorem EXn_correct : ∀ g ex0 f f’ x0 n, represents ex0 x0 →
(∀ x vx, represents x vx → represents (g x) (f vx / f’ vx)) →
(∀ n, −1 ≤ Xn x0 f f’ n ≤ 1) → represents (EXn g ex0 n) (Xn x0 f f’ n).
The proof follows from the correctness of the subtraction on streams with respect
to the subtraction on axiomatic reals.
This theorem allows us to transfer properties proved for Newton’s method
on axiomatic reals to the method implemented on exact reals. If we satisfy the
conditions of Theorem 1 for the function f and the initial iteration X0, then
we can compute the root of the function at an arbitrary accuracy, given by
Theorem 3 (speed of convergence). From the same theorem we get the rank to
which we need to compute for a given accuracy to be obtained. However, if we
wanted to increase this accuracy, we would need to redo all the computation for
the new rank. We want to avoid this and take advantage of the lazy evaluation
characteristic for streams: we can design an algorithm that uses Newton’s method
to compute an arbitrary number of digits for the root of a given function, under
certain conditions for this function.
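$$g_1(x) := \frac{f_1(x)}{f_1'(x)} = \frac{f\left(\frac{d_1 + x}{\beta}\right)}{\frac{1}{\beta} f'\left(\frac{d_1 + x}{\beta}\right)} = \beta \times g\left(\frac{d_1 + x}{\beta}\right)$$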
For the exact real implementation in Coq we express the algorithm on streams
of digits, so we recall that for the stream d1::x, we have $[\![d_1 :: x]\!]_\beta = \frac{d_1 + [\![x]\!]_\beta}{\beta}$.
CoFixpoint exact_newton (g: stream digit → stream digit) ex0 n :=
match make_digit (EXn g ex0 n) with
| d1 :: x’ ⇒ d1 :: exact_newton (fun x ⇒ (β & g (d1::x))) x’ n
end.
1. $f \in C^{(1)}(]-1, 1[)$
2. $\forall x, y \in ]-1, 1[, |f'(x) - f'(y)| \le C|x - y|$
3. $f'(x^{(0)}) \ne 0$ and $\left|\frac{1}{f'(x^{(0)})}\right| \le A_0$;
4. $\left|\frac{f(x^{(0)})}{f'(x^{(0)})}\right| \le B_0 \le \frac{\varepsilon}{2}$;
5. $\mu_0 = 2A_0B_0C \le 1$.
Relations 3. - 5. are given by the proof of Theorem 1. We are now able to prove
by co-induction that represents (exact_newton g ex0 n) x∗ .
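The digit-by-digit structure of exact_newton can be imitated with a Python generator over exact rationals. This is only an analogue under stated assumptions, not the Coq function: rationals replace digit streams, rounding `beta * y` to the nearest integer plays the role of choosing a possible first digit, and the names `newton_steps`, `rescale` and `first_digits` are hypothetical. As in the Coq version, the function g is wrapped in one more closure at every produced digit.

```python
from fractions import Fraction
from itertools import islice


def newton_steps(g, x, n):
    """Analogue of EXn: n iterations of x := x - g(x), where g = f / f'."""
    for _ in range(n):
        x = x - g(x)
    return x


def rescale(g, d, beta):
    """g1(x) = beta * g((d + x) / beta), the function used after emitting digit d."""
    return lambda x: beta * g((d + x) / beta)


def exact_newton(g, x0, n, beta=10):
    """Yield signed base-beta digits of the root approached from x0."""
    while True:
        y = newton_steps(g, x0, n)
        d = round(beta * y)      # a possible first digit of y
        yield d
        x0 = beta * y - d        # tail of the stream d :: x'
        g = rescale(g, d, beta)  # one more closure per digit


def first_digits(g, x0, n, count, beta=10):
    return list(islice(exact_newton(g, x0, n, beta), count))


# Example (assumed): g for f(x) = x^2 - 1/2, starting from 3/4.
g_half = lambda x: (x * x - Fraction(1, 2)) / (2 * x)
digits = first_digits(g_half, Fraction(3, 4), n=2, count=6)
```

Summing `d / 10**i` over the produced digits gives a rational whose square is close to $\frac{1}{2}$; negative digits appear whenever the running approximation overshoots.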
Though short, elegant and proven correct, the algorithm presented in this section
is not usable in practice as it is very slow. There are two main reasons for this:
1. The certified computations from the library require a precision of the operands
higher than that of the result. We saw that in the case of addition one extra
digit is required, but for other operations and function this precision can be
higher. When we have an expression where we perform several operations,
the precision demanded for each individual operand is a lot higher than the
precision of the output. In the case of Newton’s method, each iteration only
brings a certain amount of information, so using a higher precision will not
improve the result.
2. This approach relies on the higher-order capabilities of the functional pro-
gramming language: the first argument of the exact_newton function is
itself a function that becomes more and more complex as exact_newton
calls itself recursively. The management of this function is somehow trans-
parent to the programmer, but it has a cost: a new closure is built at every
recursive call to exact_newton and when the function g is called, all the
closures built since the initial call have to be unraveled to obtain the opera-
tions that really need to be performed. This cost can be avoided by building
directly a first order data structure.
We discuss two possible improvements of this algorithm, dealing with these two
issues. For the first point the solution is simple: use only the significant digits
in the stream. Determining which digits are significant and certifying the
result is still possible thanks to Theorem 5. We implement a truncate function
that given a stream s returns the stream containing the first n digits of s and
sets the rest to zero. This function represents the rnd function on axiomatic
reals (see Theorem 5).
Fixpoint truncate s n {struct n} :=
match n with | 0 ⇒ zero | S n’ ⇒
match s with | d :: s’ ⇒ d :: truncate s’ n’ end
end.
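The same truncation can be written lazily over Python iterators. This is a sketch of the idea only; the iterator-based encoding of streams is an assumption, and unlike the Coq definition it also accepts finite inputs.

```python
from itertools import chain, islice, repeat
from typing import Iterable, Iterator


def truncate(stream: Iterable[int], n: int) -> Iterator[int]:
    """First n digits of `stream`, followed by an infinite tail of zeros."""
    return chain(islice(stream, n), repeat(0))


# Usage: keep two digits of the infinite stream 3::3::3::...
kept = list(islice(truncate(repeat(3), 2), 5))
```

Because `islice` and `chain` are lazy, truncating an infinite stream terminates as long as only finitely many digits of the result are requested.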
The function φ controls the approximation we can make at each iteration and
follows the constraints imposed by Theorem 5. The exact_newton algorithm
will work in the same way with this sequence as with the original method.
CoFixpoint exact_newton_rnd (g: stream digit → stream digit) ex0 n :=
match (make_digit (Etn g ex0 n)) with
| d1 :: x’ ⇒ d1 :: exact_newton_rnd (fun x ⇒ (β & g (d1::x))) x’ n
end.
Though the proof for this new algorithm is not finalized yet, we feel there is
no real difficulty in obtaining it as both the algorithm and the optimization we
make are proven correct.
To tackle the second point in our list of possible improvements we make ex-
plicit the construction of the new function g in the co-recursive call.
CoFixpoint exact_newton_aux
(g : stream digit → stream digit) (Xn : stream digit) k n :=
let Xn’ := make_k_digits x0 (EXn g x0 n) k in
(nth k Xn’) :: exact_newton_aux g Xn’ (S k) n.
Newton’s method is commonly used for the implementation of nth root function
or division. We discuss the example of the square root to illustrate the behaviour
of our algorithms. The square root of a positive real number a is the root of the
function $f_{sqrt}(x) = x^2 - a$. The corresponding function $g_{sqrt}$ is
$\frac{f_{sqrt}(x)}{f'_{sqrt}(x)} = \frac{x}{2} - \frac{a}{2x}$.
Due to restrictions about implementing the inverse function of exact reals, the
The original algorithm is slow. For example the computation of the first digit
of $\sqrt{\frac{1}{2}}$ in base $2^{124}$ using the original algorithm blocks the system, while for the
same algorithm improved with approximations we get the equivalent precision of
37 decimal digits in 12 seconds. The second algorithm exact_newton2 brings
an improvement at each new digit we want to obtain, making the algorithm
run on average twice as fast. We should also take into consideration that using
$f(x) = \frac{a}{x^2} - 1$ can improve our execution times considerably as there is only
one division involved. Nevertheless, our intention here was not to implement
an efficient square root, but to test the capabilities of the previously presented
algorithms.
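To make the remark about divisions concrete, the two choices of f can be compared on rationals. The sketch below is an illustration with hypothetical function names, not the streams implementation. For $f(x) = x^2 - a$ the Newton step subtracts $\frac{x}{2} - \frac{a}{2x}$, which divides by the current iterate at every step; for $f(x) = \frac{a}{x^2} - 1$ one gets $\frac{f(x)}{f'(x)} = \frac{x^3}{2a} - \frac{x}{2}$, so the only division is by the constant $2a$ and can be done once.

```python
from fractions import Fraction


def sqrt_newton_div(a, x, steps):
    """Newton on f(x) = x^2 - a: one division by the iterate per step."""
    for _ in range(steps):
        x = x - (x / 2 - a / (2 * x))
    return x


def sqrt_newton_one_div(a, x, steps):
    """Newton on f(x) = a / x^2 - 1: a single division, by the constant 2a."""
    inv_2a = 1 / (2 * a)
    for _ in range(steps):
        x = x - (x**3 * inv_2a - x / 2)
    return x


a = Fraction(1, 4)
x0 = Fraction(3, 5)
r_div = sqrt_newton_div(a, x0, 4)
r_one = sqrt_newton_one_div(a, x0, 5)
```

Both iterations approach $\frac{1}{2}$ from this starting point; the second one trades the repeated division for extra multiplications.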
5 Related Work
This work presents different angles in the formal verification of a numerical al-
gorithm. A lot of work is being done concerning formally verified exact real
arithmetic libraries. Besides the library presented here, the development [15] for
PVS and [18] also for Coq are two of the most recent such implementations.
These two libraries have computations that are verified with respect to the real
analysis formalizations in PVS and C-CoRN [5], respectively. A significant part
of the work presented here could be reproduced in any of these libraries. In the
case of [18] the exact reals operations and functions are verified via an isomor-
phism between the exact reals and the C-CoRN real structure; there is also an
isomorphism between C-CoRN reals and Standard Library reals (see [14]), so in
theory it should be possible to verify computations by using the presented proofs
and the two isomorphisms.
Concerned with exact real arithmetic and also with co-inductive aspects we
mention the work of Niqui [17]. This work aims to obtain all field operations on
real numbers via the Edalat-Potts algorithm for lazy exact arithmetic.
Results of the convergence of Newton’s method with rounding have been
proved for some special cases like the inverse and the square root [3]. Of
course, in these cases the speed of convergence is better than in the general case.
The proof of correctness of square root algorithms has been the subject of
several formal developments. We mention [2] for the verification of the GMP
square root algorithm, [11] for an Intel architecture square root algorithm and
[20] for the verification of the square root algorithm in an IBM processor.
A general algorithm using Newton’s method was developed by Hur and Davenport
[12] on a different representation of exact reals but not in a formally
verified setting.
Acknowledgments
We thank Yves Bertot for his help and constructive suggestions.
References
1. Bertot, Y., Castéran, P.: Interactive Theorem Proving and Program Development,
Coq’Art:the Calculus of Inductive Constructions. Springer, Heidelberg (2004)
2. Bertot, Y., Magaud, N., Zimmermann, P.: A Proof of GMP Square Root. J. Autom.
Reasoning 29(3-4), 225–252 (2002)
3. Brent, R.P., Zimmermann, P.: Modern Computer Arithmetic (2006) (in preparation),
http://www.loria.fr/zimmerma/mca/pub226.html
4. Coq development team. The Coq Proof Assistant Reference Manual, version 8.1.
(2006)
5. Cruz-Filipe, L., Geuvers, H., Wiedijk, F.: C-CoRN: The Constructive Coq Repos-
itory at Nijmegen. In: Asperti, A., Bancerek, G., Trybulec, A. (eds.) MKM 2004.
LNCS, vol. 3119, pp. 88–103. Springer, Heidelberg (2004)
6. Démidovitch, B., Maron, I., et al.: Éléments de calcul numérique. Mir - Moscou
(1979)
7. Fleuriot, J.D.: On the mechanization of real analysis in Isabelle/HOL. In: Harrison,
J., Aagaard, M. (eds.) TPHOLs 2000. LNCS, vol. 1869, pp. 146–162. Springer,
Heidelberg (2000)
8. Gamboa, R., Kaufmann, M.: Nonstandard Analysis in ACL2. Journal of automated
reasoning 27(4), 323–428 (2001)
9. Giménez, E.: Codifying guarded definitions with recursive schemes. In: Dybjer, P.,
Nordström, B., Smith, J. (eds.) TYPES 1994. LNCS, vol. 996, pp. 39–59. Springer,
Heidelberg (1995)
10. Harrison, J.: Theorem Proving with the Real Numbers. Springer, Heidelberg (1998)
11. Harrison, J.: Formal verification of square root algorithms. Formal Methods in
System Design 22(2), 143–153 (2003)
12. Hur, N., Davenport, J.H.: A generic root operation for exact real arithmetic. In:
Blanck, J., Brattka, V., Hertling, P. (eds.) CCA 2000. LNCS, vol. 2064, pp. 82–87.
Springer, Heidelberg (2001)
13. Julien, N.: Certified exact real arithmetic using co-induction in arbitrary integer
base. In: Garrigue, J., Hermenegildo, M.V. (eds.) FLOPS 2008. LNCS, vol. 4989,
pp. 48–63. Springer, Heidelberg (2008)
14. Kaliszyk, C., O’Connor, R.: Computing with classical real numbers. In: CoRR,
abs/0809.1644 (2008)
15. Lester, D.R.: Real Number Calculations and Theorem Proving. In: Mohamed, O.A.,
Muñoz, C., Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp. 215–229. Springer,
Heidelberg (2008)
16. Mayero, M.: Formalisation et automatisation de preuves en analyses reelle et numerique.
Ph.D thesis, Université de Paris VI (2001)
17. Niqui, M.: Coinductive formal reasoning in exact real arithmetic. Logical Methods
in Computer Science 4(3-6), 1–40 (2008)
18. O’Connor, R.: Certified Exact Transcendental Real Number Computation in Coq.
In: Mohamed, O.A., Muñoz, C., Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170,
pp. 246–261. Springer, Heidelberg (2008)
19. Paşca, I.: A Formal Verification for Kantorovitch’s Theorem. Journées Francophones
des Langages Applicatifs, 15–29 (2008)
20. Sawada, J., Gamboa, R.: Mechanical verification of a square root algorithm using
taylor’s theorem. In: Aagaard, M., O’Leary, J.W. (eds.) FMCAD 2002. LNCS,
vol. 2517, pp. 274–291. Springer, Heidelberg (2002)
Construction of Büchi Automata for LTL Model
Checking Verified in Isabelle/HOL
1 Introduction
The term model checking [2] subsumes several algorithmic techniques for the
verification of reactive and concurrent systems, in particular with respect to
properties expressed as formulae of temporal logics. More specifically, the context
of our work is LTL model checking algorithms based on Büchi automata [19]. In this
approach, the system to be verified is modelled as a finite transition system and
the property is expressed as a formula ϕ of linear temporal logic (LTL). The formula
ϕ constrains executions, and the transition system is deemed correct (with
respect to the property) if all its executions satisfy ϕ. After translating the
formula into a Büchi automaton [1], the model checking problem can be rephrased in
terms of language inclusion between the transition system (interpreted as a Büchi
automaton) and the automaton representing ϕ or, technically more conveniently, as
an emptiness problem for the product of the transition system and the automaton
representing ¬ϕ.
In this paper, we present a verified implementation in Isabelle/HOL of the
classical translation algorithm due to Gerth et al. [7] of LTL formulae into Büchi
automata.1 The automaton translation is at the heart of automaton-based model
1 The Isabelle sources on which our paper is based are available at
http://www.informatik.uni-freiburg.de/~ki/papers/diplomarbeiten/LTL2LGBA.zip.
Extensive documentation on Isabelle can be found at http://isabelle.in.tum.de.
Throughout this paper Isabelle refers to Isabelle/HOL.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 424–439, 2009.
© Springer-Verlag Berlin Heidelberg 2009
2 Preliminaries
2.1 Linear Temporal Logic
Linear-time temporal logic LTL [13] is a popular formalism for expressing correctness
properties about (runs of) reactive systems. It extends propositional
logic by modal operators that refer to future points of time.
ξ ⊨ p iff p ∈ ξ₀ (p ∈ Prop)
ξ ⊨ ¬ϕ iff ξ ⊭ ϕ
ξ ⊨ ϕ ∨ ψ iff ξ ⊨ ϕ or ξ ⊨ ψ
ξ ⊨ X ϕ iff ξ|₁ ⊨ ϕ
ξ ⊨ ϕ U ψ iff there exists i ∈ ℕ such that ξ|ᵢ ⊨ ψ and ξ|ⱼ ⊨ ϕ for all 0 ≤ j < i.
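To make the clauses above concrete, the following Python sketch evaluates them on ultimately periodic words, i.e. words of the form prefix · loop^ω. That restriction is an assumption made here so that the quantifier in the U clause only has to range over finitely many positions. The tuple encoding of formulae and the name holds are illustrative and do not come from the Isabelle development.

```python
# Formulae are nested tuples (an encoding assumed for this sketch):
#   ("prop", p), ("not", f), ("or", f, g), ("X", f), ("U", f, g)
# A word is given as prefix . loop^omega; each letter is a set of propositions.

def holds(formula, prefix, loop, i=0):
    """Check whether the suffix at position i of prefix.loop^omega satisfies formula."""
    if not loop:
        raise ValueError("the loop must be non-empty")
    n = len(prefix) + len(loop)

    def norm(k):
        # Fold a position into the prefix followed by one copy of the loop.
        if k < len(prefix):
            return k
        return len(prefix) + (k - len(prefix)) % len(loop)

    def letter(k):
        k = norm(k)
        return prefix[k] if k < len(prefix) else loop[k - len(prefix)]

    tag = formula[0]
    if tag == "prop":
        return formula[1] in letter(i)
    if tag == "not":
        return not holds(formula[1], prefix, loop, i)
    if tag == "or":
        return (holds(formula[1], prefix, loop, i)
                or holds(formula[2], prefix, loop, i))
    if tag == "X":
        return holds(formula[1], prefix, loop, norm(i + 1))
    if tag == "U":
        k = norm(i)
        # Within n steps every position reachable from i has been visited.
        for _ in range(n):
            if holds(formula[2], prefix, loop, k):
                return True
            if not holds(formula[1], prefix, loop, k):
                return False
            k = norm(k + 1)
        return False
    raise ValueError("unknown connective: %r" % (tag,))
```

For instance, with p = ("prop", "p") and q = ("prop", "q"), the formula ("U", p, q) holds on the word with prefix [{"p"}] and loop [{"p"}, {"q"}], but not on the word that repeats {"p"} forever.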
– A = (Q, I, δ, F ) is a GBA;
– D is a finite set of labels;
– L : Q → 2^D is the label function.
In model checking, systems are modelled as Kripke structures, that is, finite
transition systems whose states are labelled with propositional interpretations.
A Kripke structure K is an LGBA whose underlying GBA has a trivial (empty)
acceptance family, and whose label function assigns a single propositional
interpretation to every state. Assuming that the LGBA A represents the complement
of the LTL formula ϕ (A accepts precisely those executions of which ϕ does not
hold), K is a model of ϕ if no execution is accepted by both K and A, i.e. if the
intersection of the languages accepted by the two automata is empty.
We recall the algorithm proposed by Gerth et al. [7] for computing an LGBA Aϕ
(with set of labels 2^Prop) for an LTL formula ϕ such that Aϕ accepts a temporal
interpretation ξ iff ξ ⊨ ϕ.
The construction of Aϕ proceeds in three stages. First, one builds the graph
of the underlying GBA, using a procedure similar to a tableau construction [4].
Second, the function for labelling states of the LGBA is defined. Finally, the
acceptance family is determined based on the set of “until” subformulae of ϕ.
We now describe each stage in more detail.
The first step builds a graph of nodes (which will become the automaton
states) that contain subformulae of ϕ. Intuitively, a node “promises” that the
formulae it contains hold of any temporal interpretation that has an accepting
run starting at that node. The construction is essentially based on “recursion
laws” of LTL such as
μ U ψ ≡ ψ ∨ (μ ∧ X(μ U ψ)) (1)
that are used to split a promised formula into promises for the current state
and for the successor state. The initial states of the automaton will be precisely
those nodes that promise ϕ.
Without loss of generality, we assume that ϕ is given in negation normal form
(NNF), i.e. the negation symbol is only applied to propositions. Transformation
to NNF is straightforward once we include the dual operators ∧ and V among
the set of logical connectives, using laws such as ¬(ϕ ∨ ψ) ≡ ¬ϕ ∧ ¬ψ.
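The NNF transformation mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions: formulae are encoded as nested tuples, nprop stands for a negated proposition, and the function name nnf is hypothetical. The dualities used for X, U and V are the standard ones.

```python
# Tuple encoding assumed for this sketch:
#   ("true",), ("false",), ("prop", p), ("nprop", p), ("not", f),
#   ("and", f, g), ("or", f, g), ("X", f), ("U", f, g), ("V", f, g)

DUAL = {"and": "or", "or": "and", "U": "V", "V": "U"}

def nnf(f):
    """Push negations inwards until they apply to propositions only."""
    tag = f[0]
    if tag in ("true", "false", "prop", "nprop"):
        return f
    if tag == "X":
        return ("X", nnf(f[1]))
    if tag in DUAL:
        return (tag, nnf(f[1]), nnf(f[2]))
    if tag == "not":
        g = f[1]
        inner = g[0]
        if inner == "true":
            return ("false",)
        if inner == "false":
            return ("true",)
        if inner == "prop":
            return ("nprop", g[1])
        if inner == "nprop":
            return ("prop", g[1])
        if inner == "not":
            return nnf(g[1])                      # double negation
        if inner == "X":
            return ("X", nnf(("not", g[1])))      # not X f  ==  X not f
        if inner in DUAL:
            # De Morgan for and/or, duality between U and V
            return (DUAL[inner], nnf(("not", g[1])), nnf(("not", g[2])))
    raise ValueError("unknown connective: %r" % (tag,))
```

Applied to ("not", ("or", p, q)) with p = ("prop", "p") and q = ("prop", "q"), the function returns ("and", ("nprop", "p"), ("nprop", "q")), matching the law ¬(ϕ ∨ ψ) ≡ ¬ϕ ∧ ¬ψ.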
Gerth et al. [7] represent each node of the graph by a record with the following
fields:
– Incoming: the set of names of all nodes that have an edge pointing to the
current node. Using this field, the entire graph is represented as the set of
its nodes.
– New : A set of LTL formulae promised by this node but that have not yet
been processed. This set is used during the construction and is empty for all
nodes of the final graph.
– Old : A set of LTL formulae promised by this node and that have already
been processed.
– Next : A set of LTL formulae that all successor nodes must promise.
– Father : During the construction, nodes will be split. This field contains the
name of the node from which the current one has been split. It is used by
Gerth et al. solely for reasoning about the algorithm, and we will not mention
it any further.
The algorithm successively moves formulae from New to Old, decomposing them,
and inserting subformulae into New and Next as appropriate. When the New
field is empty, a successor node is generated whose New field equals the Next
field of the current node. The algorithm maintains a list of all nodes generated so
far to avoid generating duplicate nodes; this is essential for ensuring termination
of the algorithm. More formally, the algorithm is realised by the function expand
whose pseudo-code is reproduced in Fig. 1. For reasons of space and clarity,
we omit some parts of the code in this presentation, in particular, some of the
cases for the currently considered formula η, while preserving the original line
numbering.
The automaton graph is constructed by the following function call:
where ϕ is the input LTL formula and init is a reserved identifier: all nodes whose
Incoming field contains init will be initial states of the automaton.
In the second step of the construction, we define the function labelling the
nodes with sets of propositional interpretations, each represented as the set of
propositions that evaluate to true. The label of a node q is defined as the set of
interpretations that are compatible with Old(q). Formally, let
It remains to define the acceptance family of the LGBA. Reconsider the “recursion
law” (1) for the U operator, which is implemented by lines 20–27 of
the code of Fig. 1. Every node “promising” a formula μ U ψ has one successor
promising ψ and a second successor promising μ and X(μ U ψ). Thus, the graph
of the LGBA may contain paths such that all nodes along the path promise μ
but no node promises ψ. Such paths are not models of μ U ψ, which requires ψ
to be true eventually, and the acceptance family is defined in order to exclude
them. Formally, we define for each formula μ U ψ the set of nodes
F_{μ U ψ} = {q ∈ Q | μ U ψ ∉ Old(q) or ψ ∈ Old(q)}, (4)
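As a rough illustration of definition (4), the Python sketch below computes one such set for every “until” subformula of the input formula. The representation is assumed for the example only: formulae are nested tuples, and the graph is a dictionary from node names to their Old sets. The names until_subformulae and acceptance_family are hypothetical.

```python
def until_subformulae(f):
    """List the distinct ("U", mu, psi) subformulae of f, in order of discovery."""
    found = []

    def visit(g):
        if g[0] == "U" and g not in found:
            found.append(g)
        for child in g[1:]:
            if isinstance(child, tuple):   # skip proposition names
                visit(child)

    visit(f)
    return found

def acceptance_family(phi, old_of):
    """One accepting set per until subformula, following definition (4).

    old_of maps each node name to the set of formulae in its Old field.
    """
    return [
        {q for q, old in old_of.items() if u not in old or u[2] in old}
        for u in until_subformulae(phi)
    ]
```

With p U q as the input formula, a node whose Old set contains p U q and p but not q is excluded from the accepting set, so a path that stays in such nodes forever is rejected.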
4 Implementation in Isabelle
4.1 LTL Formulae
We represent LTL formulae in Isabelle as an inductive data type. For the purposes
of this presentation, we restrict attention to NNF formulae, although our full
development also includes unrestricted LTL formulae and NNF transformation. For
simplicity, we represent atomic propositions as strings; alternatively, the type of
propositions could be made a parameter of the data type definition.
430 A. Schimpf, S. Merz, and J.-G. Smaus
datatype
frml = LTLTrue ("true")
| LTLFalse ("false")
| LTLProp string ("prop'(_')")
| LTLNProp string ("nprop'(_')")
| LTLAnd frml frml ("_ and _")
| LTLOr frml frml ("_ or _")
| LTLNext frml ("X _")
| LTLUntil frml frml ("_ U _")
| LTLUDual frml frml ("_ V _")
The above definition includes the concrete syntax for each clause of the data
type. For example, (X prop(''p'')) and prop(''q'') would be the Isabelle
representation of (X p) ∧ q.
We next introduce types for representing ω-words and temporal interpretations,
and define the semantics of LTL formulae by a straightforward primitive
recursive function definition.
types
'a word = "nat ⇒ 'a"
interprt = "(string set) word"
record 'q gba =
initial :: "'q list"
record 'q lgba =
gbauto :: "'q gba"
label :: "'q ⇀ string list list"
The use of the function “the” in the above code is a technicality related to the
fact that the node labelling function is, in principle, partial. What matters is
that “the (label A (σ i))” is of type string list list.
| "expand ((nprop(q))#fs, n) ns
= expand (fs, n(| old := (nprop(q))#(old n) |)) ns"
– compute the labelling of the states of the LGBA with sets of propositional
interpretations.
record node =
name :: nat
incoming :: "nat list"
old :: "frml list"
next :: "frml list"
types cnode = "frml list * node"
Figure 2 contains the fragment of the definition of function expand in Isabelle that
corresponds to the pseudo-code shown in Fig. 1. The function upd_nds merges
the incoming fields of the current node with those of the already constructed
nodes whose old and next fields agree with those of the current node.
For the sake of presentation, the code shown in Fig. 2 is somewhat simplified
with respect to our Isabelle theories: the actual definition produces a pair
consisting of a list of nodes and the highest used node name, which is used in
the (omitted) definition of the name of the node created in the second call to
expand in the clause for “until” formulae. Moreover, the actual definition checks
for duplicates whenever a formula is added to the old or next components of a
node.
The graph for an LTL formula is computed by the function create_graph,
which in analogy to (2) is defined as
We now address the second problem, i.e. the computation of the acceptance
family for an LTL formula and a graph represented as a list of nodes. The
following function accept_family is a quite direct transcription of the definition
of the acceptance family in (5):
where all_until_frmls computes the list of “until” subformulae of the argument
formula, without duplicates. It is now straightforward to define a function
create_gba that constructs a GBA (of type node gba) from a node list representing
the graph.
It remains to compute the function labelling the nodes with sets of propositional
interpretations, in order to obtain an LGBA. The following definitions
implement the labelling defined by (3) in a straightforward way.
definition
gen_label :: "[string list list, node] ⇒ string list list"
where
"gen_label lbls n
≡ [xs←lbls. set (pos_props (old n)) ⊆ set xs
∧ list_inter xs (neg_props (old n)) = []]"
definition
create_lgba :: "frml ⇒ node lgba"
where
"create_lgba ϕ
≡ (let ns = create_graph ϕ in
(| gbauto = create_gba ϕ ns,
label = [ns [↦] map (gen_label (list_Pow (get_props ϕ)))
ns] |))"
The auxiliary functions pos_props and neg_props compute the lists of positive
and negative literals contained in a list of formulae; get_props computes the list
of atomic propositions contained in a temporal formula.
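The filtering performed by gen_label can be mimicked in Python as follows. This is a sketch under assumptions rather than a rendering of the extracted code: formulae are nested tuples, the function takes the old list directly instead of a node record, and list_pow is a stand-in for list_Pow that enumerates all sublists.

```python
from itertools import combinations

def list_pow(props):
    """All subsets of props, each as a list (stand-in for list_Pow)."""
    return [list(c) for r in range(len(props) + 1)
            for c in combinations(props, r)]

def pos_props(formulae):
    """Names of the positive literals among the given formulae."""
    return [f[1] for f in formulae if f[0] == "prop"]

def neg_props(formulae):
    """Names of the negative literals among the given formulae."""
    return [f[1] for f in formulae if f[0] == "nprop"]

def gen_label(lbls, old):
    """Keep the interpretations that contain every positive literal of old
    and none of its negative literals."""
    pos, neg = set(pos_props(old)), set(neg_props(old))
    return [xs for xs in lbls if pos <= set(xs) and not (set(xs) & neg)]
```

For a node whose old list is [("prop", "p"), ("nprop", "q")] and the propositions p and q, the only compatible interpretation is ["p"]. A node with no literals in old is compatible with all four interpretations, which shows why the explicit enumeration grows exponentially in the number of propositions.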
from the Isabelle theory file. This command produces an OCaml module containing
the function create_lgba and all definitions and functions on which that
function depends.
In order to use this code we have manually written a parser and driver program
that parses an LTL formula, calls the function create_lgba, and outputs
the result. We have used this program to generate automata corresponding to
formulae ϕn that are representative of the verification of liveness properties under
fairness constraints4
not quite fair, because the other tools go on to translate the LGBA to ordinary
Büchi automata. We plan to formalise this additional (polynomial) translation
in the future, but take the present results as an indication that the execution
times of the implementation generated from Isabelle are not prohibitive.
We have used the LTL-to-Büchi translator testbench [17] for gaining additional
confidence in our program, including the hand-written driver. As expected,
our code passes all the tests.
5.1 Termination
HOL is a logic of total functions, and it is essential for consistency to prove that
every function that we define terminates. Indeed, Isabelle inserts a termination
predicate in all theorems that involve a function whose termination has not been
proven. Termination of the expand function (cf. Fig. 2) is not obvious at first
sight but, remarkably, is not discussed at all in the original paper [7].
Consider Fig. 2. A call to expand is of the form expand (fs,n) ns. Now in
all cases of the definition, except the first one, some formula is removed from
fs, suggesting a well-founded ordering based on the size of the list fs. (This
observation is also true of the cases of the definition omitted in Fig. 2.)
However, that simple ordering breaks down for the first case where argument
fs equals []. Indeed, the recursive call constructs a new node based on the
contents of the next field of the node n. In this case, the termination argument
must be based on the argument ns of the function call. The apparent difficulty
here is that this list does not become shorter on recursive calls, but (potentially)
longer, so it is not completely obvious how to define a well-founded order. The
solution here is to find a suitable upper bound for the argument ns. This can
be done using three facts: all the nodes that are ever constructed contain
subformulae of the input formula ϕ in their fields old and next; the same holds
for the argument fs of formulae to process; and no two different nodes containing
the same formulae in their old and next fields are ever constructed. It follows
that there are only finitely many possible nodes since there exist only finitely
many distinct sets of subformulae of ϕ. Very roughly speaking, the well-founded
order by which argument ns decreases is given by (LIM ϕ - ns) where LIM is
a function that calculates the appropriate upper bound given an LTL formula
ϕ. The actual definition of the upper bound, which appears in the definition
of the ordering below, depends on the arguments of function expand, not the
formula ϕ.
The two orderings are combined lexicographically, that is to say, either the
argument ns decreases w.r.t. the ordering discussed above, or the ns argument
stays the same and there is a decrease on the fs argument.
The termination proof is complicated further by the fact that we have a nested
recursive call in the last case. This is obvious in line 27 in Fig. 1, but the let
expression in Fig. 2 amounts to the same. We therefore start off by showing
a partial termination property, which states that if expand terminates, then
nds ⊇ ns, where nds is the result computed by the inner call (see Fig. 2). This
partial result is then used to show that the arguments of the outer recursive call
are smaller according to the well-founded ordering explained above.
The termination order is formally defined in Isabelle as follows:
abbreviation
"expand_term_ord ≡
inv_image (finite_psubset <*lex*> less_than)
(λ(n, ns). (nds_limit n ns - (old_next_pair ` set ns),
size_frml_list (fst n)))"
We explain this definition. The termination order compares pairs of the form
(n, ns) where n is a cnode and ns is a node list. This corresponds exactly to
the argument types of expand. The function λ(n, ns). … in the above definition
turns (n, ns) into another pair, say (st, sz), where st is obtained by taking the old and
next fields of all nodes in ns and subtracting those from the set of all possible
old and next fields, i.e., st states “how far ns is from the limit”. The second
component sz is simply the size (computed by size_frml_list) of the list appearing
as the first component of the pair n. To compare two pairs (n, ns) and (n′, ns′),
the function is used to compute the corresponding (st, sz) and (st′, sz′), and those
pairs are compared using a lexicographical combination of ⊂ and <.
The formal termination proof takes about 500 lines of Isar proof script.
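The shape of this measure can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the upper bound is passed in explicitly as a set limit of (old, next) pairs instead of being computed by nds_limit, nodes are dictionaries, and the length of fs stands in for size_frml_list. The names measure and smaller are hypothetical.

```python
def measure(fs, ns, limit):
    """Map the arguments of a call to the pair (st, sz).

    st: the (old, next) pairs from limit not yet used by a node in ns
    sz: a size for the list fs of formulae still to be processed
    """
    seen = {(frozenset(n["old"]), frozenset(n["next"])) for n in ns}
    return (frozenset(limit) - seen, len(fs))

def smaller(m1, m2):
    """Lexicographic order: st shrinks strictly, or st is unchanged and sz shrinks."""
    (st1, sz1), (st2, sz2) = m1, m2
    return st1 < st2 or (st1 == st2 and sz1 < sz2)   # '<' on frozensets is proper subset
```

A call that adds a new node to ns is smaller even if fs grows, because the first component shrinks. A call that leaves ns unchanged must shrink the second component.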
5.2 Correctness
We now address the proper correctness proof of the algorithm, whose idea is
presented in the original paper [7]. We have to prove that the LGBA computed
by function create_lgba ϕ accepts precisely those temporal structures that are
a model of ϕ. Formally, this is expressed as the Isabelle theorem
theorem lgba_correct:
assumes "∀ i. ξ i ∈ Pow (set (get_props ϕ))"
shows "lgba_accept (create_lgba ϕ) ξ ←→ ξ |= ϕ".
along any path starting at that node. However, the graph construction by itself
can ensure this only partly. For example, we can prove the following lemma
about “until” formulae promised by a node:
lemma L4_2a:
assumes "gba_path (gbauto (create_lgba ϕ)) σ"
and "f U g ∈ set (old (σ 0))"
shows "(∀ i. {f, f U g} ⊆ set (old (σ i))
∧ g ∉ set (old (σ i)))
∨ (∃ j. (∀ i<j. {f, f U g} ⊆ set (old (σ i)))
∧ g ∈ set (old (σ j)))".
In other words, we know for any path that starts at a node promising formula
f U g that f and f U g are promised as long as g is not promised. However, we
cannot be sure that g will indeed be promised by some node along the path. We
defined the acceptance family precisely in a way to make sure that such paths
are non-accepting, and indeed we can prove the following stronger lemma about
the accepting paths starting at a node promising some formula f U g:
lemma L4_2b:
assumes "gba_path (gbauto (create_lgba ϕ)) σ"
and "f U g ∈ set (old (σ 0))"
and "gba_accept (gbauto (create_lgba ϕ)) σ"
shows "∃ j. (∀ i<j. {f, f U g} ⊆ set (old (σ i)))
∧ g ∈ set (old (σ j))"
The proof of theorem lgba_correct above relies on similar lemmas for each
temporal operator, and then proves by induction on the structure of LTL formulae
that all formulae promised along an accepting path indeed hold of the
corresponding suffix of the temporal interpretation. For the proof of the “if”
direction of theorem lgba_correct we inductively construct an accepting path
for any temporal interpretation satisfying a formula. The length of the overall
correctness proof is about 4500 lines of Isar proof script. The effort of working
out the Isabelle proofs was around four person months.
6 Conclusion
In this paper we have presented a formally verified definition of labelled generalised
Büchi automata in the interactive proof assistant Isabelle. Our formalisation
is based on the classical algorithm by Gerth et al. [7], and Isabelle can
generate executable code from our definitions. In this way, we obtain a highly
trustworthy program for a critical component of a model checking engine for
LTL.
Few formalisations of similar translations have been studied in the literature.
Schneider [15] presents a HOL conversion for LTL that produces a symbolic
encoding of an LGBA, which can be used in connection with a symbolic (in particular
BDD-based) model checker. In contrast, our implementation produces
a full LGBA that can be used with explicit-state LTL model checkers. Moreover,
it generates a stand-alone program that can be used independently of any
particular proof assistant. The second author [11] previously presented a formalisation
of weak alternating automata (WAA [12]), including a translation of LTL
formulae into WAA. Due to their much richer combinatorial structure, WAA
afford a rather straightforward LTL translation of linear complexity, whereas
the translation into (generalised) Büchi automata is exponential. Indeed, the
main contribution of [11] was the formalisation of a game-theoretic argument
due to [10,18] that underlies a complementation procedure for WAA.
Since the translation of LTL formulae to Büchi automata is of exponential
complexity, one cannot expect to translate large formulae. Fortunately, the formulae
that express typical correctness properties of concurrent systems are quite
small. Although efficiency was not of much concern to us during the development
of our theories, our experiments so far indicate that the extracted program does
not behave significantly worse than existing implementations of the algorithm
of Gerth et al. Of course, several improvements to the code are possible. For
example, we could represent the sets of propositional interpretations labelling
the automaton states symbolically instead of through an explicit enumeration,
for example using a Boolean function that checks whether an interpretation is
consistent with the label. Optimisations at a lower level could be obtained by
replacing the list representation of finite sets with a more efficient data structure.
More significant optimisations could be achieved by basing the construction
on a different algorithm altogether. Although the construction of Gerth et al.
is well known and widely implemented, several alternative constructions have
been studied in the literature [3,16,6,5,8], and the algorithm presented in [6] is
widely considered to behave best in practice. This algorithm makes use of more
advanced automata-theoretic notions, including WAA and various simulation
relations on WAA and Büchi automata. These concepts have wider applications
than just the automata constructions used in model checkers, including the
complementation of ω-automata [9] and the synthesis of concurrent systems.
Encouraged by the success we have had so far, we would indeed like to formalise
the construction of [6] in future work. Our current formalisation will
continue to serve as an important building block that contains essential, fundamental
concepts.
References
4. Fitting, M.C.: Proof Methods for Modal and Intuitionistic Logic. Synthese Library:
Studies in Epistemology, Logic, Methodology and Philosophy of Science. D. Reidel,
Dordrecht (1983)
5. Fritz, C.: Constructing Büchi automata from linear temporal logic using simulation
relations for alternating Büchi automata. In: Ibarra, O.H., Dang, Z. (eds.) CIAA
2003. LNCS, vol. 2759, pp. 35–48. Springer, Heidelberg (2003)
6. Gastin, P., Oddoux, D.: Fast LTL to Büchi automata translation. In: Berry, G.,
Comon, H., Finkel, A. (eds.) CAV 2001. LNCS, vol. 2102, pp. 53–65. Springer,
Heidelberg (2001)
7. Gerth, R., Peled, D., Vardi, M.Y., Wolper, P.: Simple on-the-fly automatic verification
of linear temporal logic. In: Dembinski, P., Sredniawa, M. (eds.) 15th
Intl. Symp. Protocol Specification, Testing, and Verification (PSTV 1996). IFIP
Conference Proceedings, vol. 38, pp. 3–18. Chapman & Hall, Boca Raton (1996)
8. Gurumurthy, S., Kupferman, O., Somenzi, F., Vardi, M.Y.: On complementing
nondeterministic Büchi automata. In: Geist, D., Tronci, E. (eds.) CHARME 2003.
LNCS, vol. 2860, pp. 96–110. Springer, Heidelberg (2003)
9. Kupferman, O., Vardi, M.: Complementation constructions for nondeterministic
automata on infinite words. In: Halbwachs, N., Zuck, L. (eds.) TACAS 2005. LNCS,
vol. 3440, pp. 206–221. Springer, Heidelberg (2005)
10. Kupferman, O., Vardi, M.Y.: Weak alternating automata are not that weak. ACM
Trans. Comput. Log. 2(3), 408–429 (2001)
11. Merz, S.: Weak alternating automata in Isabelle/HOL. In: Aagaard, M.D., Harrison,
J. (eds.) TPHOLs 2000. LNCS, vol. 1869, pp. 424–441. Springer, Heidelberg
(2000)
12. Muller, D., Saoudi, A., Schupp, P.: Weak alternating automata give a simple explanation
of why most temporal and dynamic logics are decidable in exponential time.
In: 3rd IEEE Symp. Logic in Computer Science (LICS 1988), Edinburgh, Scotland,
pp. 422–427. IEEE Press, Los Alamitos (1988)
13. Pnueli, A.: The temporal semantics of concurrent programs. Theoretical Computer
Science 13, 45–60 (1981)
14. Schimpf, A.: Implementierung eines Verfahrens zur Erzeugung von Büchi-Automaten
aus LTL-Formeln in Isabelle. Diplomarbeit, Albert-Ludwigs-Universität
Freiburg (2008),
http://www.informatik.uni-freiburg.de/~ki/papers/diplomarbeiten/schimpf-diplomarbeit-08.pdf
15. Schneider, K., Hoffmann, D.W.: A HOL conversion for translating linear time temporal
logic to ω-automata. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin, C.,
Théry, L. (eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 255–272. Springer, Heidelberg
(1999)
16. Somenzi, F., Bloem, R.: Efficient Büchi automata from LTL formulae. In: Halbwachs,
N., Peled, D.A. (eds.) CAV 1999. LNCS, vol. 1633, pp. 257–263. Springer,
Heidelberg (1999)
17. Tauriainen, H., Heljanko, K.: Testing LTL formula translation into Büchi automata.
International Journal on Software Tools for Technology Transfer 4(1),
57–70 (2002), http://www.tcs.hut.fi/Software/lbtt/
18. Thomas, W.: Complementation of Büchi automata revisited. In: Rozenberg, G.,
Karhumäki, J. (eds.) Jewels are forever, Contributions on Theoretical Computer
Science in Honor of Arto Salomaa, pp. 109–122. Springer, Heidelberg (2000)
19. Vardi, M.Y., Wolper, P.: Reasoning about infinite computations. Information and
Computation 115(1), 1–37 (1994)
A Hoare Logic for the State Monad
Proof Pearl
Wouter Swierstra
1 Introduction
Monads help structure functional programs. Yet proofs about monadic programs
often start by expanding the definition of return and bind. This seems rather
wasteful. If we exploit this structure when writing programs, why should we
discard it when writing proofs? This pearl examines how to verify functional
programs written using the state monad. It is my express aim to take advantage
of the monadic structure of these programs to guide the verification process.
This pearl is a literate Coq script [15]. Most proofs have been elided from the
typeset version, but a complete development is available from my homepage.
Throughout this paper, I will assume that you are familiar with Coq’s syntax
and have some previous exposure to functional programming using monads [16].
Let me begin by motivating the state monad. Consider the following inductive
data type for binary trees:
Now suppose we want to define a function that replaces every value stored in a
leaf of such a tree with a unique integer, i.e., no two leaves in the resulting tree
should share the same label.
The obvious solution, given by the relabel function below, keeps track of a
natural number as it traverses the tree.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 440–451, 2009.
The relabel function uses its argument number as the new label for the leaves.
To make sure that no two leaves get assigned the same number, the number
returned at a leaf is incremented. In the Node case, the number is threaded
through the recursive calls appropriately.
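The explicit threading described above can be sketched in Python. This is a transliteration under assumptions, not the Coq definition from the paper: trees are encoded as tuples ("leaf", x) and ("node", left, right), and the function returns the relabelled tree together with the next unused number.

```python
def relabel(tree, n):
    """Replace every leaf value by a fresh number, starting from n.

    Returns the new tree and the first number that has not been used.
    """
    if tree[0] == "leaf":
        # Use n as the label and hand n + 1 to whoever comes next.
        return ("leaf", n), n + 1
    _, left, right = tree
    new_left, n1 = relabel(left, n)       # thread the counter through the left subtree
    new_right, n2 = relabel(right, n1)    # passing n here instead of n1 would reuse labels
    return ("node", new_left, new_right), n2
```

The comment on the second recursive call marks exactly the kind of slip the text warns about: nothing in the types prevents passing the stale counter.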
While this solution is correct, there is some room for improvement. It is all too
easy to pass the wrong number to a recursive call, thereby forgetting to update
the state. To preclude such errors, the state monad may be used to carry the
number implicitly as the tree is traversed.
For some fixed type of state s : Set, the state monad is:
A computation in the state monad State a takes an initial state as its argument.
Using this initial state, it performs some computation yielding a pair consisting
of a value of type a and a final state.
The two monadic operations, return and bind, are defined as follows:
The return function lifts any pure value into the state monad, leaving the state
untouched. Two computations may be composed using the bind function. It
passes both the state and the result arising from the first computation as arguments
to the second computation.
In line with the notation used in Haskell [11], I will use a pair of infix operators
to write monadic computations. Instead of bind, I will sometimes write
>>=, a right-associative infix operator. Secondly, I will write c1 >> c2 instead of
bind c1 (fun _ ⇒ c2). This operator binds two computations, discarding the
intermediate result.
Besides return and bind, there are two other operations that may be used to
construct computations in the state monad:
The get function returns the current state, whereas put overwrites the current
state with its argument.
We can now redefine the relabelling function to use the state monad as follows:
Note that the type variable s has been instantiated to nat – the state carried
around by the relabelling function is a natural number. By using the state monad,
we no longer need to pass around this number by hand. This definition is less
error prone: all the ‘plumbing’ is handled by the monadic combinators.
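For readers who prefer to see the plumbing spelled out, here is a Python sketch of the state monad operations and a relabelling function built from them. It is an illustration under assumptions: a computation is modelled as a function from a state to a (value, state) pair, unit plays the role of return (a reserved word in Python), trees are tuples as in ("leaf", x) and ("node", left, right), and flatten is assumed to list the leaf values from left to right.

```python
def unit(x):
    """Lift a pure value into a computation that leaves the state untouched."""
    return lambda s: (x, s)

def bind(c1, c2):
    """Run c1, then feed its result and its final state to c2."""
    def run(s1):
        x, s2 = c1(s1)
        return c2(x)(s2)
    return run

def get(s):
    """Return the current state as the result."""
    return (s, s)

def put(new_state):
    """Overwrite the current state; the result carries no information."""
    return lambda _old: (None, new_state)

def relabel(tree):
    """Build a stateful computation that relabels tree with fresh numbers."""
    if tree[0] == "leaf":
        return bind(get, lambda n:
               bind(put(n + 1), lambda _:
               unit(("leaf", n))))
    _, left, right = tree
    return bind(relabel(left), lambda l:
           bind(relabel(right), lambda r:
           unit(("node", l, r))))

def flatten(tree):
    """List the leaf values of tree from left to right."""
    if tree[0] == "leaf":
        return [tree[1]]
    return flatten(tree[1]) + flatten(tree[2])
```

Running relabel(t)(10) on a tree with three leaves yields the labels 10, 11 and 12 and the final state 13. The counter never appears as an argument of relabel itself, so there is no opportunity to pass a stale value.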
3 The Challenge
We will prove that for any tree t and number x, the list flatten (fst (relabel t x))
does not have any duplicates. This property does not completely characterise
relabelling – we should also check that the argument tree has the same shape as
the resulting tree. This is relatively straightforward to verify as the relabelling
function clearly maps leaves to leaves and nodes to nodes. Proving that the
resulting tree satisfies the proposed specification, however, is not so easy.
The relabel function in the previous section is simply typed. We can certainly use
proof assistants such as Coq to formalise equational proofs about such functions.
In this paper, however, I will take a slightly different approach.
In this paper I will use strong specifications, i.e., the type of the relabel function
will capture information about its behaviour. Simultaneously completing the
function definition and the proof that this definition satisfies its specification
yields programs that are correct by construction. This approach to verification
can be traced back to Martin-Löf [6].
To give a strong specification of the relabelling function, we decorate computations
in the state monad with additional propositional information. Recall
that the state monad is defined as follows:
We can refine this definition slightly: instead of accepting any initial state of type
s, the initial state should satisfy a given precondition. Furthermore, instead of
returning any pair, the resulting pair should satisfy a postcondition relating
the initial state, resulting value, and final state. Bearing these two points in
mind, we arrive at the following definition of a state monad enriched with Hoare
logic [2, 3].
This definition of the return is identical to the original definition of the state
monad: we have only made its behaviour evident from its type. The Program
framework automatically discharges the trivial proofs necessary to complete the
definition.
The corresponding revision of bind is a bit more subtle. Recall that the bind
of the state monad has the following type.
State a → (a → State b) → State b
You might expect the definition of the revised bind function to have a type of
the form:
HoareState P1 a Q1 → (a → HoareState P2 b Q2 ) → HoareState ... b ...
Before we consider the precondition and postcondition of the resulting computation,
note that we can generalise this slightly. In the above type signature, the
second argument of the bind function is not dependent. We can parametrise P2
and Q2 by the result of the first computation:
HoareState P1 a Q1
→ (forall (x : a), HoareState (P2 x) b (Q2 x))
→ HoareState ... b ...
This generalisation allows the pre- and postconditions of the second computation
to refer to the results of the first computation.
Now we need to choose a suitable precondition and postcondition for the com-
posite computation returned by the bind function. To motivate the choice of pre-
and postcondition, recall that the bind of the state monad is defined as follows:
Definition bind (a b : Set) : State a → (a → State b) → State b
:= fun c1 c2 s1 ⇒ let (x , s2 ) := c1 s1
in c2 x s2 .
The bind function starts by running the first computation, and subsequently
feeds its result to the second computation. So clearly the precondition of the
composite computation should imply the precondition of the first computation c1
– otherwise we could not justify running c1 with the initial state s1 . Furthermore
the postcondition of the first computation should imply the precondition of the
second computation – if this were not the case, we could not give grounds for the
call to c2 . These considerations lead to the following choice of precondition for
the composite computation:
fun s1 ⇒ P1 s1 ∧ forall x s2 , Q1 s1 x s2 → P2 x s2
What about the postcondition? Recall that a postcondition is a relation be-
tween the initial state, resulting value, and the final state. We would expect the
postcondition of both argument computations to hold after executing the com-
posite computation resulting from a call to bind. This composite computation,
however, cannot refer to the initial state passed to the second computation or
the results of the first computation: it can only refer to its own initial state
and results. To solve this, we existentially quantify over the result and final
state of the first computation, yielding the following postcondition for the bind operation.
fun s1 y s3 ⇒ exists x , exists s2 , Q1 s1 x s2 ∧ Q2 x s2 y s3
This definition does give rise to two proof obligations: the intermediate state
s2 must satisfy the precondition of the second computation c2 ; the application
c2 x s2 must satisfy the postcondition of bind. Both these obligations are fairly
straightforward to prove.
Before we have another look at the relabel function, we redefine the two aux-
iliary functions get and put to use the HoareState type:
Both functions have the trivial precondition top. The postcondition of the get
function guarantees that it will return the current state without modifying it.
The postcondition of the put function declares that the final state is equal to
put’s argument.
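As a rough illustration of the ideas in this section, the following Python sketch mimics HoareState with runtime assertions. It is only an approximation: in Coq the pre- and postconditions are part of the type and are proven once and for all, whereas here they are merely checked on each run. The class name, the helper names, and the trace dictionary that remembers the intermediate result are assumptions made for this sketch and are not taken from the Coq development. In particular, the universally quantified part of the precondition of bind cannot be tested at runtime, so the sketch only checks the precondition of the second computation on the intermediate state that actually arises.

```python
class HoareState:
    """A stateful computation bundled with a precondition and a postcondition."""

    def __init__(self, pre, post, step):
        self.pre = pre      # pre(initial_state) -> bool
        self.post = post    # post(initial_state, result, final_state) -> bool
        self._step = step   # initial_state -> (result, final_state)

    def run(self, state):
        assert self.pre(state), "precondition violated"
        result, final = self._step(state)
        assert self.post(state, result, final), "postcondition violated"
        return result, final


def ret(x):
    # Trivial precondition; the state is unchanged and the result is x.
    return HoareState(lambda s: True,
                      lambda i, y, f: i == f and y == x,
                      lambda s: (x, s))


def bind(c1, c2):
    trace = {}  # witnesses for the existentially quantified x and s2

    def step(s1):
        x, s2 = c1.run(s1)          # checks P1 and Q1
        second = c2(x)
        trace.update(x=x, s2=s2, second=second)
        return second.run(s2)       # checks P2 x and Q2 x on the actual s2

    def post(s1, y, s3):
        # exists x s2, Q1 s1 x s2 /\ Q2 x s2 y s3, using the recorded witnesses
        return (c1.post(s1, trace["x"], trace["s2"])
                and trace["second"].post(trace["s2"], y, s3))

    return HoareState(c1.pre, post, step)


# get returns the current state without modifying it.
get = HoareState(lambda s: True,
                 lambda i, x, f: i == f and x == i,
                 lambda s: (s, s))


def put(x):
    # The final state is equal to put's argument.
    return HoareState(lambda s: True,
                      lambda i, _unit, f: f == x,
                      lambda s: (None, x))


# Usage: return the current counter and increment it.
fresh = bind(get, lambda n: bind(put(n + 1), lambda _: ret(n)))
```

Running fresh on an initial state checks every intermediate pre- and postcondition along the way, which is the dynamic counterpart of the proof obligations that the Program framework generates statically.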
5 Relabelling Revisited
Finally, we return to the original question: how can we prove that the relabel
function satisfies its specification?
Using the HoareState type, we now arrive at the definition of the relabelling
function presented in Figure 1. The function definition of relabel is identical to
the version using the state monad in Section 3. The only novel aspect is the
choice of pre- and postcondition.
As we do not need any assumptions about the initial state, we choose the
trivial precondition top. The postcondition uses two auxiliary functions, size and
seq, and consists of two parts. First of all, the final state should be exactly size t
larger than the initial state, where t refers to the resulting tree. Furthermore,
when the relabelling function is given an initial state i, flattening t should yield
the sequence i, i + 1, ..., i + size t − 1.
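The two parts of this postcondition can be tried out on concrete trees. The sketch below is plain Python that threads the counter by hand rather than through the HoareState type; the class and function names are chosen for this illustration, and seq is assumed to behave like the seq function of the Coq standard library (seq start len lists len consecutive numbers beginning at start).

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: object


@dataclass
class Node:
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def size(t):
    """Number of leaves in the tree."""
    if isinstance(t, Leaf):
        return 1
    return size(t.left) + size(t.right)


def flatten(t):
    """Leaf values from left to right."""
    if isinstance(t, Leaf):
        return [t.value]
    return flatten(t.left) + flatten(t.right)


def seq(start, length):
    return list(range(start, start + length))


def relabel(t, i):
    """Relabel leaves with fresh numbers; returns (new_tree, final_state)."""
    if isinstance(t, Leaf):
        return Leaf(i), i + 1
    left, l_state = relabel(t.left, i)
    right, r_state = relabel(t.right, l_state)
    return Node(left, right), r_state


def meets_postcondition(i, t, f):
    """f = i + size t  and  flatten t = seq i (size t)."""
    return f == i + size(t) and flatten(t) == seq(i, size(t))
```

For example, relabelling Node(Node(Leaf('a'), Leaf('b')), Leaf('c')) from the initial state 10 yields leaves 10, 11, 12 and the final state 13.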
This definition gives rise to two proof obligations, one for each branch of the
pattern match in the relabel function. In the Leaf case, the proof obligation is
trivial. It is discharged automatically by the Program framework. To solve the
remaining obligation, we need to apply several tactics to trigger β-reduction
and introduce the assumptions. After giving the variables in the context more
meaningful names, we arrive at the proof state in Figure 2.
To complete the proof, we must prove that the postcondition holds for the
tree Node l r under the assumption that it holds for recursive calls to l and
r . The first part of the conjunction follows immediately from the assumptions
finalRes, sizeR, and sizeL and the associativity of addition. The second part of
the conjunction is a bit more interesting. After applying the induction hypothe-
ses, flattenL and flattenR, the remaining goal becomes:
=================================
seq i (size l ) ++ seq lState (size r ) = seq i (size l + size r )
1 subgoal
i : nat
t : Tree nat
n : nat
l : Tree nat
lState : nat
sizeL : lState = i + size l
flattenL : flatten l = seq i (size l )
r : Tree nat
rState : nat
sizeR : rState = lState + size r
flattenR : flatten r = seq lState (size r )
finalState : rState = n
finalRes : t = Node l r
============================
n = i + size t ∧ flatten t = seq i (size t)
To complete the proof we need to use the assumption sizeL. If we had chosen the
obvious postcondition flatten t = seq i (size t ) we would not have been able to
complete this proof. Once we apply sizeL we can use one last lemma to complete
the proof:
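The lemma itself is not reproduced in this text. After rewriting lState with sizeL, a statement of the shape seq x y ++ seq (x + y) z = seq x (y + z) would close the remaining goal. The Python check below only tests this shape on small inputs; it is an assumption about the form of the lemma and no substitute for the Coq proof.

```python
def seq(start, length):
    """Mirror of Coq's seq: `length` consecutive numbers from `start`."""
    return list(range(start, start + length))


def seq_split_holds(x, y, z):
    # seq x y ++ seq (x + y) z = seq x (y + z)
    return seq(x, y) + seq(x + y, z) == seq(x, y + z)
```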
6 Wrapping It Up
Now suppose we need to show that relabel satisfies a weaker postcondition. For
instance, consider the NoDup predicate on lists from the Coq standard libraries.
A list satisfies the NoDup predicate if it does not contain duplicates. The predi-
cate’s definition is given below.
How can we prove that the tree resulting from a call to the relabelling function
satisfies NoDup (flatten t )?
We cannot define a relabelling function that has this postcondition – the in-
duction hypotheses are insufficient to complete the required proofs in the Node
case. We can, however, weaken the postcondition and strengthen the precondi-
tion explicitly. In line with Hoare Type Theory [10, 9, 8], we call this operation
do:
Program Definition do (s a : Set) (P1 P2 : Pre s) (Q1 Q2 : Post s a) :
(forall i, P2 i → P1 i) → (forall i x f , Q1 i x f → Q2 i x f ) →
HoareState s P1 a Q1 → HoareState s P2 a Q2
:= fun c ⇒ c.
This function has no computational content. It merely changes the precondition
and postcondition associated with a computation in the HoareState type. We
can now define the final version of the relabelling function as follows:
Program Fixpoint finalRelabel (a : Set) (t : Tree a) :
HoareState (top nat) (Tree nat) (fun i t f ⇒ NoDup (flatten t ))
:= do (relabel a t ).
The precondition is unchanged. As a result, the first argument to the do func-
tion is trivial. To complete this definition, however, we need to prove that the
postcondition can be weakened appropriately. This proof boils down to showing
that the list seq i (size t ) does not have any duplicates. Using one last lemma,
forall n x y, x < y → ¬In x (seq y n), we complete the proof.
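The weakening step can be illustrated by testing both facts on small inputs. In the following Python sketch, no_dup is a stand-in for the NoDup predicate and lemma_holds encodes the lemma quoted above; both names are chosen for this illustration only.

```python
def seq(start, length):
    return list(range(start, start + length))


def no_dup(xs):
    """True if the list contains no duplicates."""
    return len(set(xs)) == len(xs)


def lemma_holds(n, x, y):
    # forall n x y, x < y -> not (In x (seq y n))
    return not (x < y) or x not in seq(y, n)
```

Exhaustively checking small values of n, x, and y gives some confidence in the statement, although only the Coq proof establishes it for all natural numbers.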
7 Discussion
Related Work
This pearl draws inspiration from many different sources. Most notably, it is
inspired by recent work on Hoare Type Theory [10, 9, 8]. Ynot, the implemen-
tation of Hoare Type Theory in Coq, postulates the existence of return, bind,
and do to use Hoare logic to reason about functions that use mutable references.
This paper shows how these functions may be defined in Coq, rather than postu-
lated. Furthermore, the HoareState type generalises their presentation somewhat:
where Hoare Type Theory has specifically been designed to reason about muta-
ble references, this pearl shows that the HoareState type can be used to reason
about any computation in the state monad.
The relabelling problem is taken from Hutton and Fulger [4], who give an
equational proof. Their proof, however, revolves around defining an intermediate
function relabel′ that carries around an (infinite) list of fresh labels.
relabel′ : forall a b, Tree a → State (list b) (Tree b)
To prove that relabel meets the required specification, Hutton and Fulger prove
various lemmas relating relabel and relabel′. It is not clear how their proof
techniques can be adapted to other functions in the state monad.
Similar techniques have been used by Leroy [5] in the Compcert project. His
solution, however, revolves around defining an auxiliary data type:
Where R is some relation between states. Unfortunately, the bind of this monad
yields less efficient extracted code, as it requires an additional pattern match on
the Res resulting from the first computation. Using the HoareState type, it may be
possible to rule out errors by strengthening the precondition, thereby eliminating
the need for this additional pattern match. Furthermore, the HoareState type
presented here is slightly more general as its postcondition may also refer to the
result of the computation.
Similar monadic structures to the one presented here have appeared in the ver-
ification of the seL4 microkernel [1] and security protocol verification [14]. There
are a few differences between these approaches and the development presented
here. Firstly, the postconditions presented here are ternary relations between
the initial state, result, and final state. As a result, we do not need to introduce
auxiliary variables to relate intermediate results. Sprenger and Basin [14] con-
struct a Hoare logic on top of a weakest-precondition calculus. They present a
shallow embedding of a series of logical rules that describe how the return and
bind behave. On the other hand, Cock et al. [1] present their rules
as predicate transformers, using Isabelle/HOL’s verification condition generator
to infer the weakest precondition of a computation. The approach taken here fo-
cuses on programming with strong specifications in type theory, where the type
of a computation fixes the desired pre- and postcondition.
Further Work
I have not provided justification for the choice of pre- and postcondition of bind
and return. Other choices are certainly possible. For instance, we could choose
the following type for return:
Clearly this is a bad choice – applying the return function will no longer yield
any information about the computation. It would be interesting to investigate if
the choices presented here are somehow canonical, for instance, by showing that
the HoareState type forms a monad in some category of strong specifications.
McKinna’s thesis [7] on the categorical structure of strong specifications may
form the starting point for such research.
Using the HoareState type to write larger programs will lead to larger proof
obligations. For this approach to scale, it is important to provide a suitable set
of custom tactics to alleviate the burden of proof. Some tactics that are already
provided by the Program framework proved useful in the development presented
here, but further automation might still be necessary.
References
[1] Cock, D., Klein, G., Sewell, T.: Secure microkernels, state monads and scalable
refinement. In: Munoz, C., Ait, O. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp.
167–182. Springer, Heidelberg (2008)
[2] Floyd, R.W.: Assigning meanings to programs. Mathematical Aspects of Com-
puter Science 19 (1967)
[3] Hoare, C.A.R.: An axiomatic basis for computer programming. Communications
of the ACM 12(10), 576–580 (1969)
[4] Hutton, G., Fulger, D.: Reasoning about effects: seeing the wood through the trees.
In: Proceedings of the Ninth Symposium on Trends in Functional Programming
(2008)
[5] Leroy, X.: Formal certification of a compiler back-end, or: programming a com-
piler with a proof assistant. In: POPL 2006: 33rd Symposium on Principles of
Programming Languages, pp. 42–54. ACM Press, New York (2006)
[6] Martin-Löf, P.: Constructive mathematics and computer programming. In: Pro-
ceedings of a discussion meeting of the Royal Society of London on Mathematical
logic and programming languages, pp. 167–184. Prentice-Hall, Inc., Englewood
Cliffs (1985)
[7] McKinna, J.: Deliverables: a categorical approach to program development in type
theory. Ph.D thesis, School of Informatics at the University of Edinburgh (1992)
[8] Nanevski, A., Morrisett, G.: Dependent type theory of stateful higher-order func-
tions. Technical Report TR-24-05, Harvard University (2005)
[9] Nanevski, A., Morrisett, G., Birkedal, L.: Polymorphism and separation in Hoare
Type Theory. In: ICFP 2006: Proceedings of the Eleventh ACM SIGPLAN International
Conference on Functional Programming (2006)
[10] Nanevski, A., Morrisett, G., Shinnar, A., Govereau, P., Birkedal, L.: Ynot: Rea-
soning with the awkward squad. In: ICFP 2008: Proceedings of the Twelfth ACM
SIGPLAN International Conference on Functional Programming (2008)
[11] Peyton Jones, S. (ed.): Haskell 98 Language and Libraries: The Revised Report.
Cambridge University Press, Cambridge (2003)
[12] Sozeau, M.: Subset coercions in Coq. In: Altenkirch, T., McBride, C. (eds.)
TYPES 2006. LNCS, vol. 4502, pp. 237–252. Springer, Heidelberg (2007)
1 Introduction
Termination provers for term rewrite systems (TRSs) have become increasingly
powerful in recent years. One reason is that a proof of termination is no longer
just some reduction order which contains the rewrite relation of the TRS.
Currently, most provers construct a proof in the dependency pair framework,
which allows basic termination techniques to be combined in a flexible way. A
termination proof is then a tree where at each node a specific technique has been
applied. So instead of stating the precedence of some lexicographic path order
(LPO) or giving some polynomial interpretation, current termination provers
return proof trees which reach sizes of several megabytes. Hence, it would be too
much work to check by hand whether these trees really form a valid proof.
That we cannot blindly trust the output of termination provers is regularly
demonstrated: Every now and then some tool delivers a faulty proof for some
TRS. But most often this is only detected if there is some other prover giving
the opposite answer on the same TRS, i.e., that it is nonterminating. To solve
this problem, two systems have been developed in recent years which automatically
certify or reject a generated termination proof: CiME/Coccinelle [4,6]
and Rainbow/CoLoR [3] where Coccinelle and CoLoR are libraries on rewriting for
This research is supported by FWF (Austrian Science Fund) project P18763.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 452–468, 2009.
© Springer-Verlag Berlin Heidelberg 2009
Certification of Termination Proofs Using CeTA 453
Coq (http://coq.inria.fr), and CiME and Rainbow are used to convert proof
trees into Coq-proofs which heavily rely on the theorems within those libraries.
• In the other two systems, whenever a proof is not accepted, the user just
gets a Coq-error message that some step in the generated Coq-proof failed. In
contrast, our functions deliver error messages using notions of term rewriting.
• Since the analysis of the proof trees in IsaFoR is performed by executable
functions, we can just apply Isabelle’s code-generator [11] to create a certified
Haskell program [17], CeTA, leading to the following workflow.
Hence, to use our certifier CeTA (Certified Termination Analysis) you do not
have to install any theorem prover, but just execute some binary. Moreover,
the runtime of certification is reduced significantly. Whereas the other two
approaches take more than one hour to certify all (≤ 580) proofs during
the last certified termination competition, CeTA needs less than two minutes
for all (786) proofs that it can handle. Note that CeTA can also be used for
modular certification. Each single application of a termination technique can
be certified—just call the corresponding Haskell-function.
Concerning the techniques that have been formalized, the other two systems
offer techniques that are not present in IsaFoR, e.g., LPO or matrix interpreta-
tions. Nevertheless, we also feature one new technique that has not been certified
1
In the remainder of this paper we just write Isabelle instead of Isabelle/HOL.
454 R. Thiemann and C. Sternagel
so far. Whereas currently only the initial dependency graph estimation of [1] has
been certified, we integrated the most powerful estimation which does not re-
quire tree automata techniques and is based on a combination of [9,12] where
the function tcap is required. Initial problems in the formalization of tcap led to
the development of etcap, an equivalent but more efficient version of tcap which
is also beneficial for termination provers. Replacing tcap by etcap within the
termination prover TTT2 [14] reduced the time to estimate the dependency graph
by a factor of 2. We will also explain how to reduce the number of edges that
have to be inspected when checking graph decompositions.
Another benefit of our system is its robustness. Every proof which uses weaker
techniques than those formalized in IsaFoR is accepted. For example, termination
provers can use the graph estimation of [1], as it is subsumed by our estimation.
The paper is structured as follows. In Sect. 2 we recapitulate the required
notions and notations of term rewriting and the dependency pair framework
(DP framework). Here, we also introduce our formalization of term rewriting
within IsaFoR. In Sect. 3–6 we explain our certification of the four termination
techniques we currently support: dependency pairs (Sect. 3), dependency graph
(Sect. 4), reduction pairs (Sect. 5), and combination of proofs in the dependency
pair framework (Sect. 6). However, to increase readability we abstract from our
concrete Isabelle code and present the checks for the techniques on a higher
level. How we achieved readable error-messages while at the same time having
maintainable Isabelle proofs is the topic of Sect. 7. We conclude in Sect. 8 where
we show how CeTA is created from IsaFoR and where we give experimental data.
IsaFoR, CeTA, and all details about our experiments are available at CeTA’s
website http://cl-informatik.uibk.ac.at/software/ceta.
Example 1. As an example, consider the following TRS, encoding rules for sub-
traction and division on natural numbers.
Given a TRS R, (ℓ, r) ∈ R means that ℓ is the lhs and r the rhs of a rule in
R (usually written as ℓ → r ∈ R). The rewrite relation induced by a TRS R is
denoted by →R and has the following definition in IsaFoR:
Example 4. The dependency pairs for the TRS from Ex. 1 consist of the rules
Note that after switching to ‘♯’-terms, the derivation from above can be written
as t →∗R ℓσ →DP(R) u σ. Hence every nonterminating derivation starting at
a term t ∈ TR∞ can be transformed into an infinite derivation of the following
shape where all →DP(R) -steps are applied at the root.
t →∗R s1 →DP(R) t1 →∗R s2 →DP(R) t2 →∗R · · · (1)
for every input. The termination techniques that will be introduced in the fol-
lowing sections are all such (sound) processors.
So much for the underlying formalization. Now we will present how the check
in IsaFoR certifies a set of DPs P that was generated by some termination tool
for some TRS R. To this end, the function checkDPs is used.
checkDPs(P,R) = checkWfTRS(R) ∧ computeDPs(R) ⊆ P
Here checkWfTRS checks the two well-formedness properties mentioned above
(the difference between wf_trs and checkWfTRS is that only the latter is exe-
cutable) and computeDPs uses Def. 3, which is currently the strongest definition
of DPs. To have a robust system, the check does not require that exactly the set
of DPs w.r.t. Def. 3 is provided, but any superset is accepted. Hence we are
also able to accept proofs from termination tools that use a weaker definition of
DP(R). The soundness result of checkDPs is formulated as follows in IsaFoR.
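To make the shape of checkDPs concrete (this illustrates the check, not the IsaFoR soundness theorem), here is a small Python sketch. It rests on several assumptions that are not confirmed by the text above: terms are encoded as strings (variables) or tuples (function symbol followed by arguments), the two well-formedness properties are taken to be the usual ones (no variable left-hand side, and every variable of the right-hand side occurs in the left-hand side), and Def. 3 is taken to be the common refinement that skips subterms of the right-hand side which are proper subterms of the left-hand side. The example rules are the usual subtraction and division system.

```python
def is_var(t):
    return isinstance(t, str)


def subterms(t):
    yield t
    if not is_var(t):
        for arg in t[1:]:
            yield from subterms(arg)


def variables(t):
    return {s for s in subterms(t) if is_var(s)}


def check_wf_trs(rules):
    # Assumed well-formedness: lhs is not a variable, vars(rhs) <= vars(lhs).
    return all(not is_var(l) and variables(r) <= variables(l)
               for l, r in rules)


def sharp(t):
    # Mark the root symbol, e.g. div(...) becomes div#(...).
    return (t[0] + "#",) + t[1:]


def compute_dps(rules):
    # Expects well-formed rules (see check_wf_trs).
    defined = {l[0] for l, _ in rules}
    dps = set()
    for l, r in rules:
        proper = {s for arg in l[1:] for s in subterms(arg)}
        for u in subterms(r):
            if not is_var(u) and u[0] in defined and u not in proper:
                dps.add((sharp(l), sharp(u)))
    return dps


def check_dps(pairs, rules):
    # checkDPs(P, R) = checkWfTRS(R) and computeDPs(R) is a subset of P
    return check_wf_trs(rules) and compute_dps(rules) <= set(pairs)


zero = ("0",)


def s(t):
    return ("s", t)


R = [
    (("minus", "x", zero), "x"),
    (("minus", s("x"), s("y")), ("minus", "x", "y")),
    (("div", zero, s("y")), zero),
    (("div", s("x"), s("y")), s(("div", ("minus", "x", "y"), s("y")))),
]
```

Because the check is a subset test, any superset of the computed pairs is accepted, which is the robustness property described above.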
where the latter part of the disjunction is computed only on demand. Thus, only
those edges have to be computed which would contradict a valid decomposition.
Example 7. Consider the set of nodes P = {(DD), (DM), (MM)}. Suppose that
we have to check a decomposition of P into L = {(DD)}, {(DM)}, {(MM)} for
some graph G = (P, E). Then our check has to ensure that the dashed edges in
the following illustration do not belong to E.
It is easy to see that (2) is satisfied for every list of SCCs that is given
in topological order. What is even more important, whenever there is a valid
SCC decomposition of G, then (2) is also satisfied for every subgraph. Hence,
regardless of the dependency graph estimation a termination prover might have
used, we accept it, as long as our estimation delivers fewer edges.
However, the criterion is still too relaxed, since we might cheat in the input
by listing nodes twice. Consider P = {p1 , . . . , pm } where the corresponding
graph is arbitrary and L = {p1 }, . . . , {pm }, {p1}, . . . , {pm }. Then trivially (2)
is satisfied, because we can always take the source of edge (pi , pj ) from the first
part of L and the target from the second part of L. To prevent this kind of
problem, our criterion demands that the sets Ci in L are pairwise disjoint.
Before we formally state our theorem, there is one last step to consider, namely
the handling of singleton nodes which do not form an SCC on their own. Since
we cannot easily infer at what position these nodes have to be inserted in the
topologically sorted list (this would amount to performing an SCC decomposition
ourselves), we demand that they are contained in the list of components.5
To distinguish a singleton node without an edge to itself from a “real SCC”, we
require that the latter ones are marked. Then condition (2) is extended in a way
that unmarked components may have no edge to themselves. The advantage of
not marking a component is that our IsaFoR-theorem about graph decomposition
states that every infinite path will end in some marked component, i.e., here the
unmarked components can be ignored.
then there is some suffix β of α and some marked Ci such that all nodes of β
belong to Ci .
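The requirements collected in this section can be summarised in a short Python sketch. Condition (2) is not reproduced in this text, so the sketch assumes that it forbids edges from a later component back to an earlier one; the remaining checks (pairwise disjointness, every node listed, no self-edges inside unmarked components) follow the prose above. The function name and the encoding of L as a list of (marked, nodes) pairs are choices made for this illustration.

```python
def check_decomposition(edge, nodes, components):
    """edge(p, q) -> bool is only called for pairs that would be violations.

    components is the claimed topological list of (marked, node_list) pairs.
    """
    listed = [p for _, comp in components for p in comp]
    # Components must be pairwise disjoint and must mention every node.
    if len(listed) != len(set(listed)) or set(listed) != set(nodes):
        return False
    for i, (marked, comp) in enumerate(components):
        earlier = [q for _, c in components[:i] for q in c]
        # Assumed reading of (2): no edge back into an earlier component.
        if any(edge(p, q) for p in comp for q in earlier):
            return False
        # Unmarked components may not have an edge to themselves.
        if not marked and any(edge(p, q) for p in comp for q in comp):
            return False
    return True
```

Since edge is passed as a function, only those edges are computed that would contradict the decomposition, in the spirit of the on-demand computation mentioned earlier.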
5
Note that Tarjan’s SCC decomposition algorithm produces exactly this list.
To illustrate the difference between cap and tcap consider the TRS of Ex. 1
and t = div(0, 0). Then cap(t) = xfresh since div is a defined symbol. However,
tcap(t) = t since there is no division rule where the second argument is 0.
Apart from tree automata techniques, currently the most powerful estimation
is the one based on tcap looking both forward as in EDG and backward as in
EDG∗ . Hence, we aimed to implement and certify this estimation in IsaFoR.
Unfortunately, when doing so, we had a problem with the domain of variables.
The problem was that although we first implemented and certified the standard
unification algorithm of [15], we could not directly apply it to compute tcap. The
reason is that to generate fresh variables as well as to rename variables in rules
apart, we need a type of variables with an infinite domain. One solution would
have been to constrain the type of variables where there is a function which
delivers a fresh variable w.r.t. any given finite set of variables.
However, there is another and more efficient approach to deal with this problem
than the standard approach to rename and then do unification. Our solution
is to switch to another kind of terms where instead of variables there is just one
special constructor “□” representing an arbitrary fresh variable. In essence, this
data structure represents contexts which do not contain variables, but where
multiple holes are allowed. Therefore in the following we speak of ground-contexts
and use C, D, . . . to denote them.
Definition 11. Let ⟦C⟧ be the equivalence class of a ground-context C where the
holes are filled with arbitrary terms: ⟦C⟧ = {C[t1 , . . . , tn ] | t1 , . . . , tn ∈ T (F , V)}.
Obviously, every ground-context C can be turned into a term t which only contains
distinct fresh variables and vice-versa. Moreover, every unification problem
between t and ℓ can be formulated as a ground-context matching problem between
C and ℓ, which is satisfiable iff there is some μ such that ℓμ ∈ ⟦C⟧.
Since the result of tcap is always a term which only contains distinct fresh
variables, we can do the computation of tcap using the data structure of
ground-contexts; it only requires an algorithm for ground-context matching. To this end
we first generalize ground-context matching problems to multiple pairs (Ci , ℓi ).
merge(□, C) ⇒merge C
merge(C, □) ⇒merge C
merge(f (C1 , . . . , Cn ), g(D1 , . . . , Dk )) ⇒merge ⊥ if f ≠ g or n ≠ k
merge(f (C1 , . . . , Cn ), f (D1 , . . . , Dn )) ⇒merge f (merge(C1 , D1 ), . . . , merge(Cn , Dn ))
f (. . . , ⊥, . . .) ⇒merge ⊥
Note that our implementations of the matching algorithm and the merge func-
tion in IsaFoR are slightly different due to different data structures. For example
matching problems are represented as lists of pairs, so it may occur that we have
duplicates in M. The details of our implementation can be seen in IsaFoR (theory
Edg) or in the source of CeTA.
Soundness and completeness of our algorithms are proven in IsaFoR.
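The merge rules translate almost directly into a recursive function. The following Python sketch is an illustration only and differs from the IsaFoR implementation: ground-contexts are encoded as nested tuples (symbol followed by arguments), the hole is a sentinel string, and ⊥ is represented by None. These encodings are assumptions of the sketch.

```python
HOLE = "□"   # the single constructor standing for an arbitrary fresh variable


def merge(c, d):
    """Merge two ground-contexts; returns None for failure (bottom)."""
    if c == HOLE:
        return d
    if d == HOLE:
        return c
    # Different root symbols or different arities cannot be merged.
    if c[0] != d[0] or len(c) != len(d):
        return None
    args = [merge(a, b) for a, b in zip(c[1:], d[1:])]
    # A failure in any argument propagates to the whole context.
    if any(a is None for a in args):
        return None
    return (c[0], *args)
```

For instance, merging f(□, a) with f(b, □) yields f(b, a), whereas merging f(a) with f(b) fails.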
The check used to estimate the dependency graph, namely whether tcap(t) does
not unify with u, can also be reformulated in terms of etcap. It is the same requirement
as demanding (etcap(t), u) ⇒∗match ⊥. Again, the soundness of this estimation
has been proven in IsaFoR where the second part of the theorem is a direct
consequence of the first part by using the soundness of the matching algorithm.
checkDepGraphProc(P, L, R) = checkDecomposition(checkEdg(R), P, L)
In this way, for every new class of reduction pairs, we do not have to prove
transitivity of ≻ or ≿ anymore, as it would be required for Thm. 18. Currently, we
just support reduction pairs based on polynomial interpretations with negative
constants [13], but we plan to integrate other reduction pairs in the future.
For checking an application of a reduction pair processor we implemented
a generic function checkRedPairProc in Isabelle, which works as follows. It
takes as input two functions checkS and checkNS which have to approximate a
reduction pair, i.e., whenever checkS(s, t) is accepted, then s ≻ t must hold in
the corresponding reduction pair and similarly, checkNS has to guarantee s ≿ t.
Then checkRedPairProc(checkS, checkNS, P, P′, R) works as follows:
• iterate once over P to divide P into P≻ and P⊁ where the former set contains
all pairs of P where checkS is accepted
• ensure for all s → t ∈ R ∪ P⊁ that checkNS(s, t) is accepted, otherwise reject
• accept if P⊁ ⊆ P′, otherwise reject
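The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Isabelle function: rules and pairs are encoded as (s, t) tuples, and the argument named remaining plays the role of the set of dependency pairs that the proof claims are left over.

```python
def check_red_pair_proc(check_s, check_ns, pairs, remaining, rules):
    # Step 1: split the pairs by whether they are strictly decreasing.
    strict, non_strict = [], []
    for s, t in pairs:
        (strict if check_s(s, t) else non_strict).append((s, t))
    # Step 2: all rules and all non-strict pairs must be weakly decreasing.
    if not all(check_ns(s, t) for s, t in list(rules) + non_strict):
        return False
    # Step 3: every pair that was not strictly oriented must remain.
    return set(non_strict) <= set(remaining)
```

As a toy usage, integers can stand in for terms with check_s as > and check_ns as >=; the pair (3, 1) is then oriented strictly and may be dropped, while (2, 2) must be listed among the remaining pairs.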
From Sect. 3–5 we have basic checks for the three techniques of applying depen-
dency pairs (checkDPs), the dependency graph processor (checkDepGraphProc),
and the reduction pair processor (checkRedPairProc). For representing proof
trees within the DP framework we used the following data structures in IsaFoR.
datatype ’f RedPair = NegPolo "(’f × (cint × nat list))list"
datatype (’f,’v)DPProof = . . . 6
| PisEmpty
| RedPairProc "’f RedPair" "(’f,’v)trsL" "(’f,’v)DPProof"
| DepGraphProc "((’f,’v)DPProof option × (’f,’v)trsL)list"
datatype (’f,’v)TRSProof = . . . 6
| DPTrans "(’f shp,’v)trsL" "(’f shp,’v)DPProof"
6
CeTA supports even more techniques, cf. CeTA’s website for a complete list.
The first line fixes the format for reduction pairs, i.e., currently (linear)
polynomial interpretations where for every symbol there is one corresponding
entry. E.g., the list [(f, (−2, [0, 3]))] represents the interpretation where Pol(f )(x, y)
= max(−2 + 3y, 0) and Pol(g)(x1 , . . . , xn ) = 1 + Σ1≤i≤n xi for all g ≠ f.
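The example interpretation can be evaluated with a few lines of Python. The dictionary encoding and the function name pol are assumptions of this sketch; the arithmetic follows the formulas above, including the default interpretation for symbols without an entry.

```python
def pol(interpretation, symbol, args):
    """Evaluate Pol(symbol) on natural-number arguments."""
    if symbol in interpretation:
        constant, coefficients = interpretation[symbol]
        linear = constant + sum(a * x for a, x in zip(coefficients, args))
        return max(linear, 0)      # negative values are cut off at 0
    return 1 + sum(args)           # default for all other symbols


interp = {"f": (-2, [0, 3])}       # corresponds to [(f, (-2, [0, 3]))]
```

With this encoding, Pol(f)(5, 1) evaluates to max(−2 + 3, 0) = 1, and an unlisted ternary symbol g applied to 1, 2, 3 evaluates to 7.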
The datatype DPProof represents proof trees for DP problems. Then the check
for valid DPProofs gets as input a DP problem (P, R) and a proof tree and tries to
certify that (P, R) is finite. The most basic technique is the one called PisEmpty,
which demands that the set P is empty. Then (P, R) is trivially finite.
For an application of the reduction pair processor, three inputs are required.
First, the reduction pair redp, i.e., some polynomial interpretation. Second, the
dependency pairs P′ that remain after the application of the reduction pair
processor. Here, the datatype trsL is an abbreviation for lists of rules. And
third, a proof that the remaining DP problem (P′, R) is finite. Then the checker
just has to call createRedPairProc(redp, P, P′, R) and additionally calls itself
recursively on (P′, R). Here, createRedPairProc invokes checkRedPairProc
where checkS and checkNS are generated from redp.
The most complex structure is the one for decomposition of the (estimated)
dependency graph. Here, the topological list for the decomposition has to be
provided. Moreover, for each subproblem P′, there is an optional proof tree.
Subproblems where a proof is given are interpreted as “real SCCs” whereas the
ones without proof remain unmarked for the function checkDepGraphProc.
The overall function for checking proof trees for DP problems looks as follows.
checkDPProof(P,R,PisEmpty) = (P = [])
checkDPProof(P,R,(RedPairProc redp P′ prf)) =
  createRedPairProc(redp,P,P′,R) ∧ checkDPProof(P′,R,prf)
checkDPProof(P,R,DepGraphProc P′s) =
  checkDepGraphProc(P, map (λ(prfO,P′).(isSome prfO,P′)) P′s, R)
  ∧ ⋀(Some prf,P′)∈P′s checkDPProof(P′,R,prf)
XML, we get the same string as the input string for the TRS (modulo whitespace).
This is a major benefit in comparison to the two other approaches where
it can happen, and already has happened, that the uncertified components Rainbow/CiME
produced a wrong proof goal from the input TRS, i.e., they created a termination
proof within Coq for a different TRS than the input TRS.
7 Error Messages
To generate readable error messages, our checks do not have a Boolean return
type, but a monadic one (isomorphic to ’e option). Here, None represents an
accepted check whereas Some e represents a rejected check with error message e.
The theory ErrorMonad contains several basic operations like >> for conjunction
of checks, <- for changing the error message, and isOK for testing acceptance.
Using the error monad enables an easy integration of readable error messages.
For example, the real implementation of checkTRSProof looks as follows:
fun checkTRSProof where "checkTRSProof R (DPTrans P prf) = (
checkDPs R P
<- (λs. ’’error . . .’’ @ showTRS R @ ’’. . .’’ @ showTRS P @ s)
>> checkDPProof P (computeID R) prf
<- (λs. ’’error below switch to dependency pairs’’ @ s))"
However, since we do not want to adapt the proofs every time the error messages
are changed, we set up the Isabelle simplifier such that it hides the details of
the error monad, but directly removes all the error handling and turns monadic
checks via isOK(...) into Boolean ones using the following lemmas.
lemma "isOK(m >> n) = isOK(m) ∧ isOK(n)"
lemma "isOK(m <- s) = isOK(m)"
Then for example isOK(checkTRSProof R (DPTrans P prf)) directly simpli-
fies to isOK(checkDPs R P) ∧ isOK(checkDPProof P (computeID R) prf).
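The behaviour of the error monad and of the two simplification lemmas can be mirrored in Python, where None plays the role of an accepted check and a string plays the role of Some e. The function names both and with_message are stand-ins chosen for this sketch for the operators >> and <-; they are not taken from the theory ErrorMonad.

```python
def both(m, n):
    """Conjunction of checks (the role of >>): the first error wins."""
    return m if m is not None else n


def with_message(m, f):
    """Change the error message (the role of <-); accepted checks stay accepted."""
    return None if m is None else f(m)


def is_ok(m):
    return m is None


def check_nonempty(name, xs):
    # A toy check used only for demonstration.
    return None if xs else name + " is empty"


result = with_message(
    both(check_nonempty("R", [1]), check_nonempty("P", [])),
    lambda s: "error in proof: " + s,
)
```

Here result is the string "error in proof: P is empty". Unlike a lazy functional setting, Python evaluates both arguments of both before combining them, which does not affect the outcome of pure checks.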
The 10 proofs that CeTA rejected are all for nonterminating TRSs which do
not satisfy the variable condition. Since TC supports only polynomial orders as
reduction pairs, it can handle fewer TRSs than the other combinations. However, there
are 44 TRSs which are only solved by TC (and TC+ ), the reason being the time-
limit of 60 seconds (19 TRSs), the dependency graph estimation (8 TRSs), and
the polynomial order allowing negative constants (17 TRSs).
The second line clearly shows that TC+ (with nontermination and usable rules
support) currently is the most powerful combination with 786 certified proofs.
Moreover, TC+ can handle 214 nonterminating and 102 terminating TRSs where
none of ACC, CCC, ARC, and MRC were successful. The efficiency of CeTA is also
clearly visible: the average certification time in TC and TC+ for a single proof is
faster by a factor of 50 than in the other combinations.8
For more details on the experiments we refer to CeTA’s website.
To conclude, we presented a modular and competitive termination certifier,
CeTA, which is directly created from our Isabelle library on term rewriting, IsaFoR.
Its main features are its availability as a stand-alone binary, its efficiency,
the dependency graph estimation, nontermination and usable rules support, the
error handling, and its robustness.
As each sub-check for a termination technique can be called separately, and as
our check to certify a whole termination proof just invokes these sub-checks, it
seems possible to integrate other techniques (even if they are proved in a different
theorem prover) as long as they are available as executable code. However, we
will need a common proof format and a compatible definition.
As future work we plan to certify several other termination techniques where
we already made progress in the formalization of semantic labeling and the
subterm-criterion. We would further like to contribute to a common proof format.
Footnote 8: Note that in the experiments above, for each TRS, each combination might have
certified a different proof. In an experiment where the certifiers were run on the
same proofs for each TRS (using only techniques that are supported by all certifiers,
i.e., EDG and linear polynomials without negative constants), CeTA was even 190
times faster than the other approaches and could certify all 358 proofs, whereas
each of the other two approaches failed on more than 30 proofs due to timeouts.
468 R. Thiemann and C. Sternagel
References
1. Arts, T., Giesl, J.: Termination of term rewriting using dependency pairs. Theo-
retical Computer Science 236, 133–178 (2000)
2. Baader, F., Nipkow, T.: Term Rewriting and All That. Cambridge University Press,
Cambridge (1998)
3. Blanqui, F., Delobel, W., Coupet-Grimal, S., Hinderer, S., Koprowski, A.: CoLoR,
a Coq library on rewriting and termination. In: Proc. WST 2006, pp. 69–73 (2006)
4. Contejean, E., Courtieu, P., Forest, J., Pons, O., Urbain, X.: Certification of au-
tomated termination proofs. In: Konev, B., Wolter, F. (eds.) FroCos 2007. LNCS,
vol. 4720, pp. 148–162. Springer, Heidelberg (2007)
5. Contejean, E., Marché, C., Monate, B., Urbain, X.: CiME, http://cime.lri.fr
6. Courtieu, P., Forest, J., Urbain, X.: Certifying a termination criterion based on
graphs, without graphs. In: Mohamed, O.A., Muñoz, C., Tahar, S. (eds.) TPHOLs
2008. LNCS, vol. 5170, pp. 183–198. Springer, Heidelberg (2008)
7. Dershowitz, N.: Termination dependencies. In: Proc. WST 2003, pp. 27–30 (2003)
8. Giesl, J., Thiemann, R., Schneider-Kamp, P.: The dependency pair framework:
Combining techniques for automated termination proofs. In: Baader, F., Voronkov,
A. (eds.) LPAR 2004. LNCS (LNAI), vol. 3452, pp. 301–331. Springer, Heidelberg
(2005)
9. Giesl, J., Thiemann, R., Schneider-Kamp, P.: Proving and disproving termina-
tion of higher-order functions. In: Gramlich, B. (ed.) FroCos 2005. LNCS (LNAI),
vol. 3717, pp. 216–231. Springer, Heidelberg (2005)
10. Giesl, J., Schneider-Kamp, P., Thiemann, R.: AProVE 1.2: Automatic termination
proofs in the DP framework. In: Furbach, U., Shankar, N. (eds.) IJCAR 2006.
LNCS (LNAI), vol. 4130, pp. 281–286. Springer, Heidelberg (2006)
11. Haftmann, F.: Code generation from Isabelle/HOL theories (April 2009),
http://isabelle.in.tum.de/doc/codegen.pdf
12. Hirokawa, N., Middeldorp, A.: Automating the dependency pair method. Informa-
tion and Computation 199(1-2), 172–199 (2005)
13. Hirokawa, N., Middeldorp, A.: Tyrolean Termination Tool: Techniques and features.
Information and Computation 205(4), 474–511 (2007)
14. Korp, M., Sternagel, C., Zankl, H., Middeldorp, A.: Tyrolean Termination Tool 2.
In: Proc. RTA 2009. LNCS, vol. 5595, pp. 295–304 (2009)
15. Martelli, A., Montanari, U.: An efficient unification algorithm. ACM Transactions
on Programming Languages and Systems 4(2), 258–282 (1982)
16. Nipkow, T., Paulson, L.C., Wenzel, M.T.: Isabelle/HOL. LNCS, vol. 2283. Springer,
Heidelberg (2002)
17. Peyton Jones, S., et al.: The Haskell 98 language and libraries: The revised report.
Journal of Functional Programming 13(1), 0–255 (2003)
18. Waldmann, J.: Matchbox: A tool for match-bounded string rewriting. In: van Oost-
rom, V. (ed.) RTA 2004. LNCS, vol. 3091, pp. 85–94. Springer, Heidelberg (2004)
A Formalisation of Smallfoot in HOL
Thomas Tuerk
1 Motivation
Separation logic is an extension of Hoare logic that allows local reasoning [7, 9].
It is used to reason about mutable data structures in combination with low
level imperative programming languages that use pointers and explicit memory
management. Thanks to local reasoning, it scales better than classical Hoare
logic to the verification of large programs and can easily be used to reason
about parallelism. There are several implementations: Smallfoot [2], SLAyer (footnote 1)
and SpaceInvader [5] are probably some of the best known examples. Moreover,
there are formalisations inside theorem provers [1, 6, 10, 11].
The problem, as I see it, is that all these tools and formalisations focus on one
concrete setting. They fix the programming languages, their exact semantics, the
supported specifications etc. However, there are a lot of different possible design
choices and the tools differ in these. I’m therefore building a general framework
for separation logic in HOL that can be instantiated to a variety of different
separation logics. By building such a framework, I hope to be able to concentrate
on the essence of separation logic as well as keeping the formalisation clean and
easy.
In this paper, the results of these efforts to build a separation logic framework
in HOL are presented. The framework is based on Abstract Separation Logic [4],
instantiated to build Holfoot. The paper ends with a section about future work
and some conclusions.
Footnote 1: http://research.microsoft.com/SLAyer/
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 469–484, 2009.
© Springer-Verlag Berlin Heidelberg 2009
2 Formalisation of Smallfoot
Smallfoot [2] is one of the oldest and best documented separation logic tools.
It is able to automatically prove specifications about programs written in a
simple, low-level imperative language, which is designed to resemble C. This
language contains pointers, local and global variables, dynamic memory alloca-
tion/deallocation, conditional execution, while-loops and recursive procedures
with call-by-value and call-by-reference arguments. Moreover, there is support
for parallelism with conditional critical regions that synchronise the access to so-
called resources. Smallfoot-specifications are concerned with the shape of mem-
ory. Common specifications, for example, say that some stack-variable points to
a single linked list in memory. However, nothing is said about, e.g., the length of
the list or its data content.
Smallfoot comes with a selection of example specifications. There are com-
mon algorithms about single linked lists like copying, reversing or deallocating
them. Another set of examples contains similar algorithms for trees. There is an
implementation of mergesort, some code about queues, circular-lists, buffers and
similar examples. Holfoot (footnote 3) is able to parse Smallfoot-specifications and prove
most of the mentioned examples completely automatically inside the HOL the-
orem prover.
While some features like local variables or procedures with call-by-value ar-
guments took some effort, and while it turned out to be useful to use explicit
permission for stack-variables, it was nevertheless possible to formalise Small-
foot based on Abstract Separation Logic in a natural way. As far as I know,
this is the first time Abstract Separation Logic has been used to implement a
separation logic tool. The formalisation of Smallfoot illustrates that Abstract
Separation Logic is powerful and flexible enough to model languages and spec-
ifications used by well-known separation logic tools. Moreover, it demonstrates
that it is possible to automate reasoning in this framework. While Holfoot is
slower than Smallfoot, it provides the additional assurance of a formal proof
inside HOL. That this is really valuable is underlined by the fact that an error
in Smallfoot was detected while building Holfoot. Due to a bug in its implementation,
Smallfoot handles call-by-value parameters like call-by-reference ones.
However, besides a formal foundation and much higher trust in the tool, another
advantage of Holfoot is that it is straightforward to use all the libraries
and proof-tools HOL provides. Smallfoot specifications talk about the shape
of data-structures. The Smallfoot-specification of mergesort for example states
that mergesort returns a single linked list. It does not guarantee anything about
the content of this list, much less that mergesort really sorts lists. In fact, to
prove a fully functional specification of mergesort, substantial knowledge about
permutations of lists, orderings and sorted lists is needed. Here, the existing
infrastructure of HOL is very useful.
Footnote 3: Holfoot as well as a collection of examples can be found in the HOL-repository.
Once the formalisation of the features provided by Smallfoot was completed, it
was straight-forward to extend it with support for the content of data-structures.
This allows the verification of fully functional specifications. Holfoot is able to
automatically verify fully functional specifications of simple algorithms like list-
reversal, list-copy or list-length:
list_copy(z;c) [data_list(c,data)] {
  local x,y,w,d;
  if (c == NULL) {z=NULL;}
  else {
    z=new(); z->tl=NULL;
    x = c->dta; z->dta = x;
    w=z;
    y=c->tl;
    while (y != NULL) [
      data_lseg(c,
        ``_data1++[_cdate]``,y) *
      data_list(y,_data2) *
      data_lseg(z,_data1,w) *
      w |-> tl:0,dta:_cdate *
      ``data:num list =
        _data1 ++ _cdate::_data2``] {
      d=new(); d->tl=NULL;
      x=y->dta; d->dta=x;
      w->tl=d; w=d;
      y=y->tl;
    }
  }
} [data_list(c,data) * data_list(z,data)]

list_reverse(i;) [data_list(i,data)] {
  local p, x;
  p = NULL;
  while (i != NULL) [
    data_list(i,_idata) *
    data_list(p,_pdata) *
    ``(data:num list) =
      (REVERSE _pdata) ++ _idata``] {
    x = i->tl; i->tl = p; p = i; i = x;
  }
  i = p;
} [data_list(i,``REVERSE data``)]

list_length(r;c) [data_list(c,cdata)] {
  local t;
  if (c == NULL) {r = 0;} else {
    t = c->tl;
    list_length(r;t);
    r = r + 1;
  }
} [data_list(c,cdata) *
   r == ``LENGTH (cdata:num list)``]
The syntax of the pseudo-code used by Smallfoot and Holfoot is intended
to be close to C. However, there are some uncommon features: the arguments
of a procedure before the semicolon are call-by-reference arguments, the others
call-by-value ones. So the argument z of list_copy is a call-by-reference
argument, whereas c is a call-by-value argument. The pre- and postconditions of
procedures are denoted in brackets around the procedure’s body. Similarly, loops
are annotated with their invariant. In specifications, a variable name that starts
with an underscore denotes an existentially quantified variable. For example,
_data1, _data2 and _cdate are existentially quantified in the loop-invariant of
list_copy. This invariant requires that data can somehow be split into these three.
How it is split changes from iteration to iteration. Finally, everything within
quotation marks is regarded as a HOL term. So, REVERSE or LENGTH are not
part of the Smallfoot formalisation but functions from HOL’s list library.
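The role of the existentially quantified variables in a loop invariant can be made concrete with a small executable model. The following Python sketch mirrors only the data content of list_reverse; the heap shape (the part Holfoot handles automatically) is deliberately not modelled, and Python lists stand in for data_list. The invariant data = REVERSE _pdata ++ _idata is checked at every iteration.

```python
def list_reverse(data):
    """Content-level model of list_reverse with its loop invariant asserted."""
    i, p = list(data), []            # i: remaining list, p: reversed prefix
    while i:
        # Loop invariant: data = (REVERSE _pdata) ++ _idata
        assert data == list(reversed(p)) + i
        x = i[1:]                    # x = i->tl
        p = [i[0]] + p               # i->tl = p; p = i
        i = x                        # i = x
    # The invariant still holds on exit, now with _idata = []
    assert data == list(reversed(p)) + i
    return p                         # i = p
```

On exit the invariant with an empty _idata gives exactly the postcondition data_list(i, REVERSE data).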
While these simple algorithms can be handled completely automatically, more
complicated ones like the aforesaid mergesort need user interaction. However,
even in these interactive proofs, there is a clear distinction between reasoning
about the content and about the shape. While the shape can mostly be handled
automatically, the user is left to reason about properties of the content. Let’s
consider the following specification of parallel mergesort:
merge(r;p,q) [data_list(p,pdata) *
              data_list(q,qdata) *
              ``(SORTED $<= pdata) /\
                (SORTED $<= qdata)``] {
  local t, q_date, p_date;
  if (q == NULL) r = p;
  else if(p == NULL) r = q;
  else {
    p_date = p->dta;
    q_date = q->dta;
After parsing and preprocessing the specification stored in the given file,
verification conditions are generated using SMALLFOOT_VC_TAC. This single call is
sufficient to eliminate the whole program structure and leave just the described
verification conditions. The next line calls some proof-tools for permutations and
sorted lists and is able to discharge most of the verification conditions. The rest
of the proof-script handles the remaining verification conditions which are all of
the aforesaid form.
As this example illustrates, human interaction is often only needed to reason
about the essence of an algorithm, and HOL provides powerful tools to aid this
reasoning. This shows the power of Holfoot and with it the flexibility and power
of the whole framework.
In the previous section, a high-level view of Holfoot and with it of the framework
and its capabilities was presented. In this section its semantic foundations –
Abstract Separation Logic [4] – will be explained. This explanation follows closely
the HOL formalisation.
Abstract Separation Logic abstracts from both the concrete states and the
concrete programming language. Instead of using a concrete model of memory
consisting usually of a stack and a heap, Abstract Separation Logic uses an
abstract set of states Σ. A partial function ◦, called separation combinator, is
used to combine states.
– ◦ is partially associative
– ◦ is partially commutative
– ◦ is cancellative, i. e.
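  $\forall s_1, s_2, s_3.\ \mathit{Defined}(s_1 \circ s_2) \land (s_1 \circ s_2 = s_1 \circ s_3) \Longrightarrow (s_2 = s_3)$ holds
– for all states $s$ there exists a neutral element $u_s$ with $u_s \circ s = s$
\[ s_1 \mathbin{\#} s_2 \text{ iff } s_1 \circ s_2 \text{ is defined} \qquad s_1 \preceq s_3 \text{ iff } \exists s_2.\ s_3 = s_1 \circ s_2 \]
\[ P * Q := \{ s \mid \exists p, q.\ (p \circ q = s) \land p \in P \land q \in Q \} \]
\[ \mathit{emp} := \{ u \mid \exists s.\ u \circ s = s \} \]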
Example 4. Heaps, modelled as finite partial functions, are commonly used with
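separation logic. In this model, Σ is the set of all heaps and ◦ is given by
\[
h_1 \circ h_2 =
\begin{cases}
h_1 \uplus h_2 & \text{if } \mathrm{dom}(h_1) \cap \mathrm{dom}(h_2) = \emptyset \\
\text{undefined} & \text{otherwise}
\end{cases}
\]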
In this setting, two heaps are disjoint (h1 # h2 ) iff their domains are disjoint.
The combination of two separate heaps (h1 ◦ h2 ) is their disjoint union. The
empty heap is the neutral element for all heaps.
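The heap model of Example 4 is small enough to execute. The following Python sketch represents heaps as dictionaries and encodes "undefined" as None; the function names combine, star and substate are chosen here for illustration and do not come from the HOL formalisation.

```python
from itertools import product

def combine(h1, h2):
    """Separation combinator on heaps (dicts); None encodes 'undefined'."""
    if h1.keys() & h2.keys():        # overlapping domains: not separate
        return None
    return {**h1, **h2}              # disjoint union

def star(P, Q):
    """P * Q for predicates given as finite lists of heaps."""
    out = []
    for p, q in product(P, Q):
        s = combine(p, q)
        if s is not None and s not in out:
            out.append(s)
    return out

def substate(s1, s3):
    """s1 is a substate of s3 iff s3 = s1 o s2 for some heap s2."""
    return all(k in s3 and s3[k] == v for k, v in s1.items())
```

The empty dictionary plays the role of the neutral element, so emp is the singleton containing it in this model.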
3.1 Actions
The programming language used by Abstract Separation Logic is abstract as
well. Its elementary constructs are actions.
Definition 5 (Action). An action act : Σ → P(Σ) ∪ {⊤} is a function from a state
to a set of states or a special failure state ⊤.
If executing an action act in a state s results in ⊤, then an error may occur
during the execution of the action. Otherwise, if act(s) results in a set of states
S, no error can occur and executing the action will nondeterministically lead to
one of the states in S. The empty set can be used to model actions that do not
terminate. Actions can be combined to form new actions. The most common
combination is consecutive execution:
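\[
(\mathit{act}_1 ; \mathit{act}_2)(s) =
\begin{cases}
\top & \text{if } \mathit{act}_1(s) = \top \\
\top & \text{if } \exists s'.\ s' \in \mathit{act}_1(s) \land \mathit{act}_2(s') = \top \\
\bigcup_{s' \in \mathit{act}_1(s)} \mathit{act}_2(s') & \text{otherwise}
\end{cases}
\]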
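The case distinction for consecutive execution translates directly into code. In the following Python sketch, which is only an illustration and not part of the HOL development, an action maps a state to a frozenset of states or to the sentinel TOP standing for the failure state; the example actions inc, choose, fail_if_big and diverge are hypothetical.

```python
TOP = object()  # sentinel standing for the failure state

def seq(act1, act2):
    """Consecutive execution act1 ; act2."""
    def composed(s):
        r1 = act1(s)
        if r1 is TOP:                         # first action may fail
            return TOP
        results = [act2(s1) for s1 in r1]
        if any(r is TOP for r in results):    # second action may fail
            return TOP
        return frozenset().union(*results)    # union over all intermediate states
    return composed

def inc(s):
    return frozenset({s + 1})

def choose(s):
    return frozenset({s, s + 10})             # nondeterministic choice

def fail_if_big(s):
    return TOP if s > 5 else frozenset({s})

def diverge(s):
    return frozenset()                        # no result state: does not terminate
```

Note that a diverging first action hides a failing second one: the union over an empty set of intermediate states is empty.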
In order to provide local reasoning, only those actions are considered whose
specifications can be safely extended using this inference rule. These actions are
called local.
Definition 7 (Local Actions). An action act is called local, iff for all states
s, s1, s2 with s = s1 ◦ s2 and act(s1) ≠ ⊤ the evaluation of the action on the
extended state does not fail (act(s) ≠ ⊤) and act(s) ⊆ act(s1) ∗ {s2} holds.
The skip action defined by skip(s) := {s} is a simple example of a local action.
Other examples are diverge(s) := ∅ or fail(s) := ⊤. Sequential composition and
nondeterministic choice preserve locality. The set of local actions forms together
with the following order a complete lattice.
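On a finite state space, locality can be checked by brute force. The following Python sketch assumes a tiny heap model with two locations and two values (LOCS and VALS are arbitrary choices made here) and tests Definition 7 for a few actions; dispose1 and clear are hypothetical examples of a local and a non-local action.

```python
TOP = object()                  # sentinel standing for the failure state
LOCS, VALS = (1, 2), (0, 1)     # assumed tiny universe of locations and values

def all_heaps():
    """All heaps over LOCS/VALS, each a frozenset of (location, value) pairs."""
    heaps = [frozenset()]
    for loc in LOCS:
        heaps = heaps + [h | {(loc, v)} for h in heaps for v in VALS]
    return heaps

def combine(h1, h2):
    """Disjoint union of heaps, or None if the domains overlap."""
    if {l for l, _ in h1} & {l for l, _ in h2}:
        return None
    return h1 | h2

def is_local(act):
    """Brute-force check of the locality condition on the finite model."""
    for s1 in all_heaps():
        for s2 in all_heaps():
            s = combine(s1, s2)
            if s is None or act(s1) is TOP:
                continue
            if act(s) is TOP:                 # must not fail on the larger state
                return False
            framed = {combine(t, s2) for t in act(s1)} - {None}
            if not act(s) <= framed:          # act(s) must be within act(s1) * {s2}
                return False
    return True

skip = lambda s: {s}
diverge = lambda s: set()
fail = lambda s: TOP

def dispose1(s):
    """Deallocate location 1; fails if it is not allocated."""
    cell = {(l, v) for l, v in s if l == 1}
    return {s - cell} if cell else TOP

def clear(s):
    """Throw away the whole heap: touches state it was not given."""
    return {frozenset()}
```

The action clear is rejected because it also destroys the frame s2, which a local action must leave untouched.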
Definition 8 (Order of Actions). act1 ⊑ act2 iff act2 allows more behaviour
than act1, i. e. iff ∀s. (act2(s) = ⊤) ∨ (act1(s) ⊆ act2(s)) holds. Notice that this
is equivalent to ∀P, Q. ⟪P⟫ act2 ⟪Q⟫ =⇒ ⟪P⟫ act1 ⟪Q⟫.
This lattice of local actions is used to define a best local action as an infimum of
local actions in this lattice. The HOL formalisation contains the corresponding
definitions and theorems. However, here the discussion of this lattice is skipped.
Instead an equivalent, high level characterisation is used.
One common use of the best local action bla is the definition of the materialisation
and annihilation actions. materialise(P) := bla[emp, P] can be used to materialise some
new part of the state that satisfies the predicate P. Similarly, annihilate(P) :=
bla[P, emp] is used to annihilate some part of the state that satisfies P. Notice
that for certain P the annihilation annihilate(P) behaves unexpectedly. If
there is more than one substate that satisfies P, then annihilate(P) diverges.
Therefore, usually just precise predicates are used with annihilation:
Another useful local action is assume. Given a predicate, assume skips if the
predicate holds and diverges if it does not hold. In the next section, assume is
used in combination with nondeterministic choice and Kleene star to model conditional
execution and loops. For assume to be a local action, the predicate has to be
intuitionistic, though.
Definition 12 (Intuitionistic Predicate). A predicate P is called intuitionistic,
iff P ∗ true = P holds. This means that if P holds for a state s, then it
holds for all superstates s′ ⪰ s as well. The intuitionistic negation ¬i P holds in
a state s, if P holds for no superstate s′ ⪰ s. P is called decided in a
set of states S, iff ∀s ∈ S. s ∈ P ∨ s ∈ ¬i P holds.
For an intuitionistic predicate P the local action assume(P ) can be defined as
\[
\mathit{assume}(P)(s) =
\begin{cases}
\{s\} & \text{if } s \in P \\
\emptyset & \text{if } s \in \neg_i P \\
\top & \text{otherwise}
\end{cases}
\]
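The three cases of assume can be seen on a concrete intuitionistic predicate. The following Python sketch uses heaps as dictionaries and the hypothetical predicate "location 1 stores 0"; it is an illustration under these assumptions, not the HOL definition. The predicate is intuitionistic because extending a heap cannot change the content of location 1, and its intuitionistic negation holds exactly when location 1 is allocated with a different value.

```python
TOP = object()  # sentinel standing for the failure state

def holds_p(h):
    """P: location 1 stores the value 0 (stable under heap extension)."""
    return h.get(1) == 0

def holds_not_i_p(h):
    """Intuitionistic negation of P: no extension of h can satisfy P."""
    return 1 in h and h[1] != 0

def assume_p(h):
    """assume(P) on a single heap h, following the three cases above."""
    if holds_p(h):
        return [h]        # skip
    if holds_not_i_p(h):
        return []         # diverge
    return TOP            # P is not decided in h: fail
```

A heap that does not allocate location 1 satisfies neither P nor its intuitionistic negation, which is why the rule for assume requires the condition to be decided in the precondition.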
3.2 Programs
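\begin{align*}
T^n_{penv}(act) &= \{act\} \\
T^n_{penv}(pt_1 ; pt_2) &= \{ t_1 \cdot t_2 \mid t_1 \in T^n_{penv}(pt_1) \land t_2 \in T^n_{penv}(pt_2) \} \\
T^n_{penv}(pt_1 \parallel pt_2) &= \bigcup_{t_1 \in T^n_{penv}(pt_1),\ t_2 \in T^n_{penv}(pt_2)} t_1\ \mathrm{zip}\ t_2 \\
T^n_{penv}(\mathrm{proccall}(name, arg)) &=
\begin{cases}
\{\mathrm{fail}\} & \text{if } name \notin \mathrm{dom}(penv) \\
\emptyset & \text{if } name \in \mathrm{dom}(penv) \land n = 0 \\
\bigcup_{pt \in penv(name, arg)} T^{n-1}_{penv}(pt) & \text{otherwise}
\end{cases} \\
T^n_{penv}(l.pt) &= \{ \text{remove-locks}(l, t) \mid t \in T^n_{penv}(pt) \land t \text{ is } l\text{-synchronised} \} \\
T^n_{penv}(\mathbf{with}\ l\ \mathbf{do}\ pt) &= \{ P(l) \cdot t \cdot V(l) \mid t \in T^n_{penv}(pt) \}
\end{align*}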
Finally, the traces of a proto-trace pt and a program p with respect to penv are
defined as
\[
T_{penv}(pt) = \bigcup_{n \in \mathbb{N}} T^n_{penv}(pt)
\qquad
T_{penv}(p) = \bigcup_{pt \in p} T_{penv}(pt)
\]
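The bounded unfolding of procedure calls is easiest to see on a small example. The following Python sketch covers only atomic actions, sequential and parallel composition and procedure calls; locks and race checks are omitted, procedure arguments are ignored, and t1 zip t2 is read here as the set of interleavings of the two traces, which is an assumption of this sketch. Proto-traces are nested tuples and the environment LOOP is a hypothetical recursive procedure.

```python
def interleavings(t1, t2):
    """All interleavings of two traces (tuples of action names)."""
    if not t1:
        return {t2}
    if not t2:
        return {t1}
    return ({(t1[0],) + t for t in interleavings(t1[1:], t2)} |
            {(t2[0],) + t for t in interleavings(t1, t2[1:])})

def traces(pt, penv, n):
    """Traces of proto-trace pt with procedure calls unfolded at most n times."""
    kind = pt[0]
    if kind == "act":
        return {(pt[1],)}
    if kind == "seq":
        return {a + b for a in traces(pt[1], penv, n)
                      for b in traces(pt[2], penv, n)}
    if kind == "par":
        return set().union(*(interleavings(a, b)
                             for a in traces(pt[1], penv, n)
                             for b in traces(pt[2], penv, n)))
    if kind == "call":
        if pt[1] not in penv:
            return {("fail",)}               # unknown procedure
        if n == 0:
            return set()                     # unfolding budget exhausted
        return set().union(*(traces(body, penv, n - 1)
                             for body in penv[pt[1]]))
    raise ValueError(f"unknown proto-trace: {kind}")

# Hypothetical procedure: either stop, or take a step and call itself again.
LOOP = {"loop": [("act", "stop"),
                 ("seq", ("act", "step"), ("call", "loop"))]}
```

Increasing n adds longer traces but never removes one, so the union over all n collects every finite unfolding of the recursion.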
It remains to define the semantics of traces. Local actions in traces are just inter-
preted by themselves. Checks are added to enforce race-freedom. The semantics
of lock actions is however more complicated.
One central idea behind Concurrent Separation Logic is to split the state into
parts for each thread and each lock: a lock protects a part of the state. If a
thread holds a lock, it can access this state, otherwise it cannot. Therefore, a
precise predicate called lock invariant is associated with each lock. This invariant
abstracts the part of the state that is protected by the lock. materialise and
annihilate actions are used to make this abstracted state accessible/inaccessible.
Definition 19 (Semantics of Atomic Actions). The semantics of an atomic
action with respect to a lock-environment lenv : locks → P (Σ) is given by
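\[ [\![ \mathit{act} ]\!]_{lenv} = \mathit{act} \]
\[
[\![ \mathrm{check}(\mathit{act}_1, \mathit{act}_2) ]\!]_{lenv}(s) =
\begin{cases}
\{s\} & \text{if } \exists s_1, s_2.\ s = s_1 \circ s_2 \land \mathit{act}_1(s_1) \neq \top \land \mathit{act}_2(s_2) \neq \top \\
\top & \text{otherwise}
\end{cases}
\]
\[ [\![ P(l) ]\!]_{lenv} = \mathrm{materialise}(lenv(l)) \]
\[ [\![ V(l) ]\!]_{lenv} = \mathrm{annihilate}(lenv(l)) \]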
Notice that the semantics of a program is always a local action. This allows
concepts for actions to be easily lifted to programs:
Definition 21 (Hoare triple). A Hoare triple (penv,lenv) {P} prog {Q} holds,
iff ⟪P⟫ ⟦prog⟧_(penv,lenv) ⟪Q⟫ holds. If a Hoare triple holds for all
environments, it is written as {P} prog {Q}.
Definition 22 (Program Abstractions). A program p2 is an abstraction of
a program p1 with respect to some environment env (denoted as p1 ⊑_env p2), iff
⟦p1⟧_env ⊑ ⟦p2⟧_env holds.
the semantic foundations. Some important inference rules, that are valid in
Abstract Separation Logic are:
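\[
\frac{P_2 \Rightarrow P_1 \quad Q_1 \Rightarrow Q_2 \quad \mathit{env}\ \{P_1\}\ p\ \{Q_1\}}{\mathit{env}\ \{P_2\}\ p\ \{Q_2\}}
\qquad
\frac{p_1 \sqsubseteq_{\mathit{env}} p_2 \quad \mathit{env}\ \{P\}\ p_2\ \{Q\}}{\mathit{env}\ \{P\}\ p_1\ \{Q\}}
\]
\[
\frac{B \text{ is decided in } P}{\mathit{env}\ \{P\}\ \mathrm{assume}(B)\ \{P \land B\}}
\qquad
\frac{}{\mathit{env}\ \{P_{arg}\}\ \mathrm{qbla}[P_{(\cdot)}, Q_{(\cdot)}]\ \{Q_{arg}\}}
\]
\[
\frac{name \in \mathrm{dom}(penv) \quad (penv, lenv)\ \{P\}\ penv(name, arg)\ \{Q\}}{(penv, lenv)\ \{P\}\ \mathrm{proccall}(name, arg)\ \{Q\}}
\]
\[
\frac{B \text{ is decided in } P \quad \mathit{env}\ \{B \land P\}\ p_1\ \{Q\} \quad \mathit{env}\ \{\neg_i B \land P\}\ p_2\ \{Q\}}{\mathit{env}\ \{P\}\ \mathbf{if}\ B\ \mathbf{then}\ p_1\ \mathbf{else}\ p_2\ \{Q\}}
\]
\[
\frac{lenv(l) = r \quad (penv, lenv)\ \{P\}\ p\ \{Q\}}{(penv, lenv)\ \{P * r\}\ l.p\ \{Q * r\}}
\qquad
\frac{lenv(l) = r \quad (penv, lenv)\ \{P * r\}\ p\ \{Q * r\}}{(penv, lenv)\ \{P\}\ \mathbf{with}\ l\ \mathbf{do}\ p\ \{Q\}}
\]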
These inference rules are very useful. However, the reader might notice that
there is a problem with recursive functions. The inference rule that handles
procedure-calls replaces the call with the definition of the procedure. This is
fine for non-recursive functions. However, an implicit induction is needed for
recursive functions.
Definition 23 (Procedure Specification). A procedure specification consists
of a lock-environment lenv, a procedure-environment penv and specification func-
tions P(·,·,·) , Q(·,·,·). It holds, iff all procedures satisfy their specification in the
given environment:
3.5 Holfoot
the value found in the heap at that location indexed by tag t. Otherwise, i. e. if
the location is not in the heap, the action fails.
Similarly to actions, it is straightforward to define predicates. For example,
e1 |-> L is defined as:
val holfoot_ap_points_to_def = Define `
  holfoot_ap_points_to e1 L = \(st,h):holfoot_state.
    let loc_opt = (e1 st) in (IS_SOME (loc_opt) /\
    let (loc = THE loc_opt) in (~(loc = 0) /\ ((FDOM h)= {loc}) /\
    (FEVERY (\(tag,exp). IS_SOME (exp st) /\ (THE (exp st) = (h ' loc) tag)) L)))`;
This definition of |-> is used to define predicates for single linked lists. The data
content of these lists is represented by lists of natural numbers. Therefore HOL’s
list libraries can be used to reason about the data content.
Since actions and predicates are shallowly embedded, it is easy to extend
Holfoot with new actions and predicates. Moreover, the automation has been
designed with extensions in mind.
Acknowledgements
I would like to thank Matthew Parkinson, Mike Gordon, Alexey Gotsman, Mag-
nus Myreen and Viktor Vafeiadis for a lot of discussions, comments and criticism.
References
[1] Appel, A.W., Blazy, S.: Separation logic for small-step Cminor. In: Schneider, K.,
Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732, pp. 5–21. Springer, Heidelberg
(2007)
[2] Berdine, J., Calcagno, C., O’Hearn, P.W.: Smallfoot: Modular automatic assertion
checking with separation logic. In: de Boer, F.S., Bonsangue, M.M., Graf, S.,
de Roever, W.-P. (eds.) FMCO 2005. LNCS, vol. 4111, pp. 115–137. Springer,
Heidelberg (2006)
[3] Brookes, S.: A semantics for concurrent separation logic. Theor. Comput.
Sci. 375(1-3), 227–270 (2007)
[4] Calcagno, C., O’Hearn, P.W., Yang, H.: Local action and abstract separation logic.
In: LICS 2007: Proceedings of the 22nd Annual IEEE Symposium on Logic in
Computer Science, Washington, DC, USA, pp. 366–378. IEEE Computer Society,
Los Alamitos (2007)
[5] Distefano, D., O’Hearn, P.W., Yang, H.: A local shape analysis based on separation
logic. In: Hermanns, H., Palsberg, J. (eds.) TACAS 2006. LNCS, vol. 3920, pp.
287–302. Springer, Heidelberg (2006)
[6] Marti, N., Affeldt, R., Yonezawa, A.: Towards formal verification of memory prop-
erties using separation logic. In: 22nd Workshop of the Japan Society for Soft-
ware Science and Technology, Tohoku University, Sendai, Japan, September 13–15.
Japan Society for Software Science and Technology (2005)
[7] O’Hearn, P.W., Reynolds, J.C., Yang, H.: Local reasoning about programs that
alter data structures. In: Fribourg, L. (ed.) CSL 2001 and EACSL 2001. LNCS,
vol. 2142, pp. 1–19. Springer, Heidelberg (2001)
[8] Parkinson, M., Bornat, R., Calcagno, C.: Variables as resource in hoare logics.
In: LICS 2006: Proceedings of the 21st Annual IEEE Symposium on Logic in
Computer Science, Washington, DC, USA, pp. 137–146. IEEE Computer Society,
Los Alamitos (2006)
[9] Reynolds, J.C.: Separation logic: A logic for shared mutable data structures. In:
LICS 2002: Proceedings of the 17th Annual IEEE Symposium on Logic in Com-
puter Science, Washington, DC, USA, pp. 55–74. IEEE Computer Society, Los
Alamitos (2002)
[10] Tuch, H., Klein, G., Norrish, M.: Types, bytes, and separation logic. In: POPL
2007: Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on
Principles of programming languages, pp. 97–108. ACM, New York (2007)
[11] Weber, T.: Towards mechanized program verification with separation logic. In:
Marcinkowski, J., Tarlecki, A. (eds.) CSL 2004. LNCS, vol. 3210, pp. 250–264.
Springer, Heidelberg (2004)
Liveness Reasoning with Isabelle/HOL
1 Introduction
Paulson’s inductive approach has been used to verify safety properties of many secu-
rity protocols [1, 2]. The success gives incentives to extend this approach to a general
approach for protocol verification. To achieve this goal, a method for the verification
of liveness properties is needed. According to Manna and Pnueli [3], temporal proper-
ties are classified into three classes: safety properties, response properties and reactivity
properties. The original inductive approach only deals with safety properties. In this pa-
per, proof rules for liveness properties (both response and reactivity) are derived under
the same execution model as the original inductive approach. These liveness proof rules
can be used to reduce the proof of liveness properties to the proof of safety properties,
a task well solved by the original approach.
The proof rules are derived based on a new notion of fairness, parametric fairness,
which is an adaption of the α-fairness [4,5,6] to the setting of HOL. Parametric fairness
is properly stronger than standard fairness notions such as weak fairness and strong
fairness. We will explain why the use of parametric fairness can deliver more liveness
results through simpler proofs.
A probabilistic model is established to show the soundness of our new fairness no-
tion. It is proved that the set of all parametrically fair execution traces is measurable
and has probability 1. Accordingly, the definition of parametric fairness is reasonable.
This research was funded by 863 Program(2007AA01Z409) and NNSFC(60373068) of China.
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 485–499, 2009.
© Springer-Verlag Berlin Heidelberg 2009
486 J. Wang, H. Yang, and X. Zhang
The practicability of this liveness reasoning approach has been confirmed by exper-
iments of various sizes [7, 8, 9]. All the work has been formalized with Isabelle/HOL
using Isar [10, 11], although the general approach is not necessarily confined to this
particular system.
The paper is organized as follows: section 2 presents concurrent systems, the
execution model of the inductive approach; section 3 gives a shallow embedding of LTL
(Linear Temporal Logic); section 4 gives an informal explanation of parametric
fairness; section 5 explains the liveness proof rules; section 6 establishes a
probabilistic model for parametric fairness; section 7 describes liveness verification
examples; section 8 discusses related work; section 9 concludes.
2 Concurrent Systems
In the inductive approach, the system state changes only when an event happens.
Accordingly, it is natural to represent a system state by the list of events that have
happened so far, arranged in reverse order. A system is concurrent because its states are
nondeterministic, where a state is nondeterministic if more than one event is eligible to
happen under that state. The specification of a concurrent system is just a specification
of this eligibility relation. Based on this view, a formal definition of concurrent systems is
given in Fig. 1.
σ_i ≡ σ i
primrec [[σ]]_0 = []
        [[σ]]_(Suc i) = σ i # [[σ]]_i
τ [cs> e ≡ (τ, e) ∈ cs
The type of concurrent systems is ('a list × 'a) set, where 'a is the type of events.
Concurrent systems are written as cs. The expression (τ, e) ∈ cs means that event
e is eligible to happen under state τ in concurrent system cs. The notation (τ, e) ∈ cs
is abbreviated as τ [cs> e. The set of reachable states is written as vt cs and τ ∈ vt cs
is abbreviated as cs τ. An execution of a concurrent system is an infinite sequence of
events, represented as a function with type nat ⇒ 'a. The i-th event in execution σ is
abbreviated as σ_i. The prefix consisting of the first i events is abbreviated as [[σ]]_i. For σ
to be a valid execution of cs (written as cs σ), σ_i must be eligible to happen under [[σ]]_i.
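The representation of states as reversed event lists can be tried out directly. In the following Python sketch, which is an illustration and not the Isabelle formalisation, an execution is a function from natural numbers to events, the eligibility relation is given as a predicate instead of a set of pairs, and validity is only checked for a finite number of steps. The request/acknowledge system eligible and the execution ping_pong are hypothetical examples.

```python
def prefix(sigma, i):
    """[[sigma]]_i: the first i events of sigma, most recent event first."""
    tau = []
    for k in range(i):
        tau = [sigma(k)] + tau       # mirrors  sigma k # [[sigma]]_k
    return tau

def valid_upto(eligible, sigma, n):
    """Check that sigma(i) is eligible under [[sigma]]_i for all i < n."""
    return all(eligible(prefix(sigma, i), sigma(i)) for i in range(n))

def eligible(tau, e):
    """Hypothetical system: requests and acknowledgements must alternate."""
    last = tau[0] if tau else "ack"  # the head of the list is the latest event
    return (e == "req" and last == "ack") or (e == "ack" and last == "req")

def ping_pong(i):
    """Execution req, ack, req, ack, ..."""
    return "req" if i % 2 == 0 else "ack"
```

Because the newest event is the head of the list, the eligibility predicate can inspect recent history without traversing the whole state.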
3 Embedding LTL
LTL (Linear Temporal Logic) used to represent liveness properties in this paper is
defined in Fig. 2. LTL formulae are written as ϕ, ψ, κ etc. The type of LTL formulae is
defined as 'a tlf. The expression (σ, i) |= ϕ means that LTL formula ϕ is valid at
moment i of the infinite execution σ. The operator |= is overloaded, so that σ |= ϕ can be
defined as the abbreviation of (σ, 0) |= ϕ. The always operator □, eventual operator ♦,
next operator ○ and the until operator are defined literally. An operator - is defined to lift a
predicate on finite executions up to a LTL formula. The temporal operator !→ is the lift
of logical implication −→ up to LTL level. For an event e, the term (|e|) is a predicate
on finite executions stating that the last happened event is e. Therefore, the expression
(|e|) is an LTL formula saying that event e happens at the current moment.
□ϕ ≡ λ σ i. ∀ j. i ≤ j −→ (σ, j) |= ϕ
♦ϕ ≡ λ σ i. ∃ j. i ≤ j ∧ (σ, j) |= ϕ
constdefs EFα :: ('a list × 'a) set ⇒ ('a list ⇒ bool) ⇒ ('a list ⇒ 'a) ⇒ 'a seq ⇒ bool
EFα cs P E σ ≡ σ |= □♦(λ σ i. P [[σ]]_i ∧ [[σ]]_i [cs> E [[σ]]_i)
−→ σ |= □♦(λ σ i. P [[σ]]_i ∧ σ i = E [[σ]]_i)
datatype Evt = e0 | e1 | e2 | e3 | e4
Unfortunately, neither WF nor SF serves this purpose. Consider the execution (e2 . e1 .
e4 . e0 )ω , which satisfies both WF cs2 and SF cs2 while violating the conclusion of (1).
The deficiency of standard fairness notions such as SF and WF is their failure to
specify the association between helpful states and helpful events explicitly. For exam-
ple, SF only requires any infinitely enabled event be executed infinitely often. Even
though execution (e2 . e1 . e4 . e0 )ω satisfies SF, it is not intuitively random in that it is
still biased towards avoiding the helpful event e0 under the corresponding helpful state
1. Therefore, (1) is not valid if ?F is instantiated either to SF or WF. Extreme fairness
was proposed by Pnueli [12] to solve this problem. The EF is a direct expression of ex-
treme fairness in HOL, which requires that all expressible (P,E)-pairs be fairly treated.
Execution (e2 . e1 . e4 . e0 )ω does not satisfy EF, because the (P, E)-pair (λτ . F 2 τ = 1,
λτ . e1 ) is not fairly treated. In fact, (1) is valid if ?F is instantiated to EF.
Unfortunately, a direct translation of extreme fairness in HOL is problematic, be-
cause the universal quantification over P, E may accept any well-formed expression of
the right type. Given any nondeterministic execution σ, it is possible to construct a pair
(Pσ , Eσ ) which is not fairly treated by σ. Accordingly, any nondeterministic execution
σ is not EF, as confirmed by the following lemma:
treated, PF only requires (P, E)-pairs appearing in its parameter pel be fairly treated.
Since there are only finitely many (P, E)-pairs in pel, most executions of a concurrent
system are kept, even if every (P, E)-pair in pel rules out some measurement of them.
Section 6 will make this argument precise by establishing a probabilistic model for PF.
5 Liveness Rules
According to Manna and Pnueli [3], response properties are of the form σ |= □(P !→ ♦Q),
where P and Q are past formulae (in [3]’s terms), obtained by lifting predicates on
finite traces. The conclusion of (1) is of this form. Reactivity properties are of the form
σ |= (□♦P) !→ (□♦Q), meaning: if P holds infinitely often in σ, then Q holds
infinitely often in σ as well.
The proof rule for response properties is the theorem resp-rule:
[[RESP cs F E N P Q; cs σ; PF cs {|F, E, N|} σ]] =⇒ σ |= □(P → ♦Q)
and the proof rule for reactivity properties is the theorem react-rule:
[[REACT cs F E N P Q; cs σ; PF cs {|F, E, N|} σ]] =⇒ σ |= □♦P → □♦Q
)P −→¬Q ∗* ≡ λ τ . (∃ i ≤ |τ |. P )τ *i ∧ (∀ k. 0 < k ∧ k ≤ i −→ ¬ Q )τ *k ))
locale RESP =
fixes cs :: ('a list × 'a) set and F :: 'a list ⇒ nat and E :: 'a list ⇒ 'a
and N :: nat and P :: 'a list ⇒ bool and Q :: 'a list ⇒ bool
assumes mid: [[cs τ ; )P −→¬Q ∗* τ ; ¬ Q τ ]] =⇒ 0 < F τ ∧ F τ < N
assumes fd: [[cs τ ; 0 < F τ ]] =⇒ τ [cs> E τ ∧ F (E τ # τ ) < F τ
locale REACT =
fixes cs :: ('a list × 'a) set and F :: 'a list ⇒ nat and E :: 'a list ⇒ 'a
and N :: nat and P :: 'a list ⇒ bool and Q :: 'a list ⇒ bool
assumes init: [[cs τ ; P τ ]] =⇒ F τ < N
assumes mid: [[cs τ ; F τ < N; ¬ Q τ ]] =⇒ τ [cs> E τ ∧ F (E τ # τ ) < F τ
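Both locales revolve around the same obligation: as long as the measure F is positive, the helpful event E τ is enabled and taking it strictly decreases F. The sketch below checks the fd assumption of RESP by brute force on a toy system. The system (events "inc" and "noop", goal of three "inc" events), the bound and all function names are hypothetical, and traces are tuples with the most recent event first, mirroring E τ # τ.

```python
from itertools import product

EVENTS = ("inc", "noop")
TARGET = 3  # hypothetical goal: three "inc" events have happened


def enabled(trace, event):
    """Toy concurrent system: every event is enabled after every trace."""
    return event in EVENTS


def F(trace):
    """Measure: number of "inc" events still missing."""
    return max(0, TARGET - trace.count("inc"))


def E(trace):
    """Helpful event: the one that makes progress towards the goal."""
    return "inc"


def check_fd(max_len):
    """Check, for all traces up to max_len, that a positive measure implies
    the helpful event is enabled and strictly decreases the measure."""
    for n in range(max_len + 1):
        for trace in product(EVENTS, repeat=n):
            if F(trace) > 0:
                e = E(trace)
                if not enabled(trace, e):
                    return False
                if not F((e,) + trace) < F(trace):
                    return False
    return True
```

Such a bounded check is only a sanity test of the chosen F and E; the locale assumption itself has to be proved for all traces.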
inductive-set sigma :: ('a set × 'a set set) ⇒ 'a set set for M :: ('a set × 'a set set) where
basic: (let (U, A) = M in (a ∈ A)) =⇒ a ∈ sigma M |
empty: {} ∈ sigma M |
complement: a ∈ sigma M =⇒ (let (U, A) = M in U − a) ∈ sigma M |
union: (⋀i::nat. a i ∈ sigma M) =⇒ (⋃i. a i) ∈ sigma M
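On a finite universe the inductive definition of sigma can be computed as a closure, since countable unions collapse to finite ones. This is a minimal sketch under that finiteness assumption; the function name and the representation by frozensets are illustrative.

```python
from itertools import combinations


def generated_sigma(universe, generators):
    """Smallest family containing the generators and the empty set that is
    closed under complement (relative to universe) and union.
    The generators are assumed to be subsets of the universe."""
    u = frozenset(universe)
    family = {frozenset()} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Rule "complement": U - a is in the family whenever a is.
        for a in current:
            complement = u - a
            if complement not in family:
                family.add(complement)
                changed = True
        # Rule "union": pairwise unions, iterated until nothing new appears.
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family
```

For the universe {1, 2, 3} and the single generator {1}, the closure consists of the empty set, {1}, {2, 3} and the whole universe.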
locale RCS =
fixes R :: ('a list × 'a) ⇒ real
assumes Rrange: 0 ≤ R(τ, e) ∧ R(τ, e) ≤ 1
fixes CS :: ('a list × 'a) set
defines CS-def: CS ≡ {(τ, e). 0 < R(τ, e)}
fixes N :: 'a list ⇒ 'a set
defines N-def: N τ ≡ {e. 0 < R(τ, e)}
assumes Rsum1: CS τ =⇒ (∑ e ∈ (N τ). R(τ, e)) = 1
begin
fun π :: 'a list ⇒ real where
π [] = 1 |
π (e#τ) = R(τ, e) ∗ π τ
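The recursive function π multiplies the step probabilities along a finite trace. A minimal sketch of the same recursion follows; it assumes traces are sequences with the most recent event first (as in e#τ), and the fair-coin choice of R is a hypothetical example.

```python
def trace_probability(trace, R):
    """Probability of a finite trace: the product of R(history, event) over
    all steps, where the head of the trace is the most recent event."""
    prob = 1.0
    rest = list(trace)
    while rest:
        event, rest = rest[0], rest[1:]
        # rest is now the history before 'event' happened.
        prob *= R(tuple(rest), event)
    return prob


def fair_coin(history, event):
    """Hypothetical R: heads and tails are equally likely after any history."""
    return 0.5 if event in ("h", "t") else 0.0
```

With this R, every trace of three coin flips has probability 0.125, and the empty trace has probability 1.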
Theorem 1 shows almost all executions are fair in the sense of PF. It says that the set
of valid executions satisfying PF has probability 1. The underlying concurrent system
is {(τ ,e).0<R(τ ,e)}, which is the expansion of CS in RCS. The probability function
BTS.P R is the one derived from R using BTS. The set RCS.Path R contains all valid
infinite executions of the underlying CS.
to the last execution step of τ, which is denoted by hd(τ), and the state before the last
step is denoted by tl(τ). Label ι is said to be fairly treated by an infinite execution σ if ι
is taken infinitely many times whenever it is enabled infinitely many times in σ. An infinite
execution is fair in the sense of GF if all labels in the set L are fairly treated. By
instantiating the label set L with the set of elements of pel (denoted by set pel), and the
label function l with the function lf, where
lf (τ, e) = {(Q, E) | (Q, E) ∈ (set pel) ∧ Q τ ∧ e = (E τ)}, it
can be shown that PF is just an instance of GF:
The combination of (4) and (5) gives rise to Theorem 1. The intuition behind (4)
is that (P, E)-pairs in PF can be seen as labels in GF. To prove (5), it is sufficient to
prove that the probability of unfair executions equals 0. In turn, it is sufficient to prove
that the probability of unfair executions with respect to any one label ι equals 0, because
locale GF =
fixes cs :: ('a list × 'a) set
fixes L :: 'b set
assumes countable-L: countable L
and non-empty-L: L ≠ {}
fixes l :: 'a list × 'a ⇒ 'b set
assumes subset-l: l(τ, e) ⊆ L
begin
definition enabled :: 'b ⇒ 'a list ⇒ bool where enabled ι τ ≡ (∃ e. (ι ∈ l(τ, e) ∧ (τ, e) ∈ cs))
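The label function lf used to instantiate GF and the enabled predicate of the locale can be sketched as follows. This is an illustration under assumptions: labels are represented by the index of a (Q, E)-pair in pel rather than by the pair itself, the concurrent system is a Boolean function, and the candidate events are passed in explicitly as a finite list.

```python
def make_lf(pel):
    """Build the label function for a list pel of (Q, E) pairs, where
    Q maps a trace to a bool and E maps a trace to an event."""
    def lf(trace, event):
        # A step carries the label of every pair whose predicate holds on the
        # trace and whose helpful event is the event being taken.
        return {
            index
            for index, (q, e_fn) in enumerate(pel)
            if q(trace) and event == e_fn(trace)
        }
    return lf


def enabled(label, trace, events, cs, l):
    """A label is enabled after a trace if some legal event carries it."""
    return any(label in l(trace, e) and cs(trace, e) for e in events)


# Hypothetical pair: on traces of even length the helpful event is "a".
PEL = [(lambda trace: len(trace) % 2 == 0, lambda trace: "a")]
LF = make_lf(PEL)
```

In this toy instance label 0 is enabled after the empty trace and not after a trace of length one.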
Contents                                                      Nr. of lines   Nr. of lemmas   Nr. of working days
Definitions of the elevator control system, F, E, FT and ET   ≤ 200          –               *
Safety properties                                             579            12              2
First stage of the proof                                      2012           10              5
Second stage of the proof                                     1029           8               3
In mobile ad hoc networks, nodes keep moving around in a certain area while
communicating over wireless channels. Communication is peer-to-peer, with no
central control, which makes the network very susceptible to malicious attacks.
Security is therefore an important issue in mobile ad hoc networks, and a secure
routing protocol is chosen as our verification target.
Liveness Reasoning with Isabelle/HOL 497
8 Related Works
Approaches for verification of concurrent systems can roughly be divided into theorem
proving and model checking. The work in this paper belongs to the theorem proving
category, which can deal with infinite state systems.
The purpose of this paper is to extend Paulson’s inductive protocol verification ap-
proach [1, 2] to deal with general liveness properties, so that it can be used as a general
protocol verification approach. The notion of PF and the corresponding proof rules
were first proposed by Zhang in [13]. At the same time, Yang developed a benchmark
verification for elevator control system to show the practicality of the approach [7].
Later, Yang applied the approach in earnest to verify the liveness properties of Mobile Ad
Hoc network protocols [8, 9]. These works confirm the practicality of the extension.
The liveness rules in this paper rely on a novel fairness notion, PF, an adaptation
of α-fairness [4, 5, 16] to the setting of HOL. With PF, more liveness properties
can be derived, and the proofs are usually simpler than with the standard WF and
SF.
According to Baier’s work [5], PF should have a sensible probabilistic meaning. To
confirm this, Wang established a probabilistic model for PF to show that the measure
of PF executions equals 1 [17]. The formalization of this probabilistic model is deeply
influenced by Hurd [14] and Richter [15]. However, due to type restrictions that differ
from [14], and the need for an extension theorem, which is absent from [15], most proofs
had to be done from scratch. Wang’s work established the soundness of PF.
498 J. Wang, H. Yang, and X. Zhang
9 Conclusion
The advantage of theorem proving over model checking is its ability to deal with infinite
state systems. Unfortunately, relatively little work has been done on verifying liveness
properties using theorem proving. This paper improves the situation by proposing an extension
of Paulson’s inductive approach for liveness verification and showing the soundness,
feasibility and practicality of the approach.
The level of automation in our work is still low compared to model checking. One
direction for further research is to develop specialized tactics for liveness proofs.
Additionally, the verification described in this paper is still at the abstract model level.
Another important direction for further research is to extend it to code level verification.
References
1. Paulson, L.C.: The inductive approach to verifying cryptographic protocols. Journal of Com-
puter Security 6(1-2), 85–128 (1998)
2. Paulson, L.C.: Inductive analysis of the Internet protocol TLS. ACM Transactions on
Information and System Security 2(3), 332–351 (1999)
3. Manna, Z., Pnueli, A.: Completing the temporal picture. Theor. Comput. Sci. 83(1), 91–130
(1991)
4. Pnueli, A., Zuck, L.D.: Probabilistic verification. Information and Computation 103(1), 1–29
(1993)
5. Baier, C., Kwiatkowska, M.: On the verification of qualitative properties of probabilistic
processes under fairness constraints. Information Processing Letters 66(2), 71–79 (1998)
6. Jaeger, M.: Fairness, computable fairness and randomness. In: Proc. 2nd International Work-
shop on Probabilistic Methods in Verification (1999)
7. Yang, H., Zhang, X., Wang, Y.: Liveness proof of an elevator control system. In: The ‘Emerg-
ing Trend’ of TPHOLs, Oxford University Computing Lab. PRG-RR-05-02, pp. 190–204
(2005)
8. Yang, H., Zhang, X., Wang, Y.: A correctness proof of the SRP protocol. In: 20th International
Parallel and Distributed Processing Symposium (IPDPS 2006), Proceedings, Rhodes Island,
Greece, April 25-29 (2006)
9. Yang, H., Zhang, X., Wang, Y.: A correctness proof of the DSR protocol. In: Cao, J.,
Stojmenovic, I., Jia, X., Das, S.K. (eds.) MSN 2006. LNCS, vol. 4325, pp. 72–83. Springer,
Heidelberg (2006)
10. Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL — A Proof Assistant for Higher-Order
Logic. LNCS, vol. 2283. Springer, Heidelberg (2002)
11. Wenzel, M.: Isar - a generic interpretative approach to readable formal proof documents. In:
Nipkow, T., Paulson, L.C., Wenzel, M.T. (eds.) Isabelle/HOL. LNCS, vol. 2283. Springer,
Heidelberg (2002)
12. Pnueli, A.: On the extremely fair treatment of probabilistic algorithms. In: Proceedings of
the fifteenth annual ACM symposium on Theory of computing, pp. 278–290. ACM Press,
New York (1983)
13. Zhang, X., Yang, H., Wang, Y.: Liveness reasoning for inductive protocol verification. In:
The ‘Emerging Trend’ of TPHOLs, Oxford University Computing Lab. PRG-RR-05-02, pp.
221–235 (2005)
14. Hurd, J.: Formal Verification of Probabilistic Algorithms. Ph.D thesis, University of Cam-
bridge (2002)
15. Richter, S.: Formalizing integration theory with an application to probabilistic algorithms. In:
Slind, K., Bunker, A., Gopalakrishnan, G.C. (eds.) TPHOLs 2004. LNCS, vol. 3223, pp.
271–286. Springer, Heidelberg (2004)
16. Pnueli, A.: On the extremely fair treatment of probabilistic algorithms. In: ACM (ed.) Pro-
ceedings of the 15th annual ACM Symposium on Theory of Computing, Boston, Mas-
sachusetts, April 25–27, pp. 278–290. ACM Press, New York (1983)
17. Wang, J., Zhang, X., Zhang, Y., Yang, H.: A probabilistic model for parametric fairness
in Isabelle/HOL. Technical Report 364/07, Department of Computer Science, University of
Kaiserslautern (2007)
Mind the Gap
A Verification Framework for Low-Level C
1 NICTA, Australia
2 School of Computer Science and Engineering, UNSW, Sydney, Australia
3 Computer Sciences Laboratory, ANU, Canberra, Australia
{first-name.last-name}@nicta.com.au
1 Introduction
The seL4 kernel [10] is a high-performance microkernel in the L4 family [18], tar-
geted at secure, embedded devices. In verifying such a complex and large – 8,700
lines of C – piece of software, scalability and separation of concerns are of the ut-
most importance. We show how to achieve both for low-level, manually optimised,
real-world C code.
Fig. 1 shows the layers and proofs involved in the verification of seL4. The top
layer is an abstract, operational specification of
[Fig. 1 labels: Isabelle/HOL; Abstract Specification; RA]
S. Berghofer et al. (Eds.): TPHOLs 2009, LNCS 5674, pp. 500–515, 2009.
© Springer-Verlag Berlin Heidelberg 2009
Paper Structure. We begin with an example that sketches the details of a typ-
ical kernel function. We then explain how the components of the verification
framework fit together, summarising relevant details of our earlier work on
the monadic, executable specification [5], and on our C semantics and mem-
ory model [25,26,27]. In particular, we describe the issues involved in converting
the C implementation into Isabelle/HOL. The main part of the paper shows the
refinement framework with its fundamental definitions, rules, and automated
tactics. We demonstrate the framework’s performance by reporting on our expe-
rience so far in applying it to the verification of substantial parts of the seL4 C
implementation (474 out of 518 functions, 91%).
2 Example
The seL4 kernel [10] provides the following operating system kernel services: inter-
process communication, threads, virtual memory, access control, and interrupt
control. In this section we present a typical function, cteMove, with which we
will illustrate the verification framework.
Access control in seL4 is based on capabilities. A capability contains an object
reference along with access rights. A capability table entry (CTE) is a kernel
data structure with two fields: a capability and an mdbNode. The latter is book-
keeping information and contains a pair of pointers which form a doubly linked
list.
The cteMove operation, shown in Fig. 2, moves a capability table entry from
src to dest. The left-hand side of the figure shows the executable specification in
Isabelle/HOL, while the right-hand side shows the corresponding C code.
The first 6 lines in Fig. 2 initialise the destination entry and clear the source
entry; the remainder of the function updates the pointers in the doubly linked
list. During the move, the capability in the entry may be diminished in access
rights. Thus, the argument cap is this possibly diminished capability, previously
retrieved from the entry at src.
502 S. Winwood et al.
the monadic model. One feature of this calculus is that the refinement property
cannot hold if the failure flag is set by the executable specification, thus RA im-
plies non-failure of the executable level. In particular, this allows all assertions
in the executable specification to be taken as assumptions in the proof of RC .
4 Embedding C
In this section we describe our infrastructure for parsing C into Isabelle/HOL and
for reasoning about the result. The seL4 kernel is implemented almost entirely
in C99 [16]. Direct hardware accesses are encapsulated in machine interface func-
tions, some of which are implemented in ARMv6 assembly. In the verification,
we axiomatise the assembly functions using Hoare triples.
Fig. 3 gives an overview of the components involved. The right-hand side shows
our instantiation of SIMPL [23], a generic, imperative language inside Isabelle/HOL.
The SIMPL framework provides a program
[Fig. 3 labels: C Code; Parser; C-SIMPL; C expressions, guards; C memory model; SIMPL; Operational Semantics]
taken. As we translate all of the C source at once, the parser can determine ex-
actly which globals do have their addresses taken, and these variables are then
given addresses in the heap. Global variables that do not have their addresses
taken are, like locals, simply fields in the program state. The restriction on local
variables could be relaxed at the cost of higher reasoning overhead.
The other significant syntactic omissions in our C subset are union types, bit-
fields, goto statements, and switch statements that allow cases to fall-through.
We handle union types and bitfields with an automatic code generator [4], de-
scribed in Sect. 6, that implements these types with structs and casts. Further-
more, we do not allow function calls through function pointers and take care
not to introduce a more deterministic evaluation order than C prescribes. For
instance, we translate the side-effecting C expressions ++ and -- as statements.
5 Refinement
Our verification goal is to prove refinement between the executable specification
and the C implementation. Specifically, this means showing that the C kernel
entry points for interrupts, page faults, exceptions, and system calls refine the
executable specification’s top-level function callKernel. We show refinement us-
ing a variation of forward simulation [7] we call correspondence: evaluation of
corresponding functions takes related states to related states.
In previous work [5], while proving RA , we found it useful to divide the proof
along the syntactic structure of both programs as far as possible, and then prove
the resulting subgoals semantically. Splitting the proof has two main benefits:
firstly, it is a convenient unit of proof reuse, as the same pairing of abstract
and concrete functions recurs frequently for low-level functions; and secondly, it
facilitates proof development by multiple people. One important feature of this
approach is that preconditions are discovered lazily à la Dijkstra [9]. Rules for
showing correspondence typically build preconditions from those of the premises.
In this section we describe the set of tools and techniques we developed to
ease the task of proving correspondence in RC . First, we give our definition of
correspondence, followed by a discussion of the use of the VCG. We then de-
scribe techniques for reusing proofs from RA to solve proof obligations from the
implementation. Next, we present our approach for handling operations with no
corresponding analogue. Finally, we describe our splitting approach and sketch
the proof of the example.
local variables are fields in the state record, this is a function from the state. We
thus annotate the correspondence statement with an extraction function xf, a
function which extracts the return value from a program state.
The correspondence statement is illustrated in Fig. 4 and defined below:
ccorres r xf P P′ hs a c ≡
∀(s, t) ∈ S. ∀t′. s ∈ P ∧ t ∈ P′ ∧ ¬ mFailed (a s) ∧ Γ ⊢ ⟨c·hs, t⟩ ⇒ t′
−→ ∃(s′, rv) ∈ mResults (a s).
∃t′_N. t′ = Normal t′_N ∧ (s′, t′_N) ∈ S ∧ r rv (xf t′_N)
The definition can be read as follows: given related states s and t with the
preconditions P and P′ respectively, if the abstract specification a does not fail
when evaluated at state s, and the concrete statement c evaluates under handler
stack hs in extended state t to extended state t′, then the following must hold:
∀s. Γ ⊢ {t | s ∈ P ∧ t ∈ P′ ∧ (s, t) ∈ S}
c
{t′ | ∃(rv, s′) ∈ mResults (a s). (s′, t′) ∈ S ∧ r rv (xf t′)}
----------------------------------------------------------------
ccorres r xf P P′ hs a c
In essence, this rule states that to show correspondence between a and c, for a
given initial specification state s, it is sufficient to show that executing c results
in normal termination where the final state is related to the result of evaluating
a at s. The VCG precondition can assume that the initial states are related and
satisfy the correspondence preconditions.
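Over finite state spaces, the correspondence statement can be checked by enumeration. The sketch below is a simplified model and not the Isabelle definition: it assumes a deterministic, terminating concrete program (with None standing for abnormal termination), omits the handler stack, and represents a specification as a function returning a set of (return value, state) results together with a failure flag. The counter example and all names are hypothetical.

```python
def ccorres(r, xf, P, P_c, a, c, S):
    """Simplified correspondence check.

    S    : iterable of related (abstract state, concrete state) pairs
    a(s) : returns (set of (rv, s2) results, failed flag)
    c(t) : returns the final concrete state, or None on abnormal termination
    xf   : extracts the concrete return value from a concrete state
    r    : relates abstract and concrete return values
    """
    related = set(S)
    for s, t in related:
        if not (P(s) and P_c(t)):
            continue
        results, failed = a(s)
        if failed:
            continue  # nothing to show when the specification fails
        t2 = c(t)
        if t2 is None:
            return False  # the concrete program must terminate normally
        if not any((s2, t2) in related and r(rv, xf(t2)) for rv, s2 in results):
            return False
    return True


BOUND = 3
# Abstract state: a counter n. Concrete state: (counter, return slot).
STATE_REL = {(n, (n, ret)) for n in range(BOUND + 2) for ret in range(BOUND + 2)}


def spec_inc(s):
    """Increment and return the new value; fails (assertion) at the bound."""
    return {(s + 1, s + 1)}, s >= BOUND


def impl_inc(t):
    counter, _ = t
    return (counter + 1, counter + 1)


def impl_bad(t):
    counter, _ = t
    return (counter + 1, counter)  # wrong return value


def always(_):
    return True


def ret_slot(t):
    return t[1]


def equal(rv, rv_c):
    return rv == rv_c
```

Because failing specification runs are skipped, the implementation never has to be examined at the bound, which mirrors how assertions in the specification become assumptions of the proof.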
Use of this rule in verifying correspondence is limited by two factors. Firstly,
the verification conditions produced by the VCG may be excessively large or
complex. Our experience is that the output of a VCG step usually contains a
separate term for every possible path through the target code, and that the
complexity of these terms tends to increase with the path length. Secondly, the
specification return value and result state are existential, and thus outside the
range of our extensive automatic support for showing universal properties of
specification fragments. Fully expanding the specification is always possible, and
in the case of deterministic operations will yield a single state/return value pair,
but the resulting term structure may also be large.
In the case of our example, the goal produced by the VCG has 377 lines
before unfolding the specification and 800 lines afterward. Verifying such non-
trivial functions is made practical by the approach described in the remainder
of this section.
This states that the proof obligation introduced by MemGuard at the CTE
pointer p can be discharged, assuming that there exists a CTE object on the
specification side (denoted cte-at (ptr-val p)); this rule turns a proof obligation
from the implementation into an assumption of the specification. There is, how-
ever, one major problem: the pointer p cannot depend on the C state, because
it is also used on the specification side.
To see why this is such a problem, recall that local variables in C-SIMPL are
fields in the state record; any pointer, apart from constants, in the program will
always refer to the state, making the above rule inapplicable; in the example,
the first guard refers to the local variable ´srcSlot.
All is not lost, however: the values in local variables generally correspond to
some value available in the specification. We have developed an approach that
automatically replaces such local variables with new HOL variables representing
their value. Proof obligations which refer to the local variable can then be solved
by facts about the related value from the specification precondition. We call this
process lifting.
The pointer Ptr src no longer depends on the C state and is a value from the
specification side, so the MemGuard can be removed with the above rule.
Lifting is only sound if the behaviour of the lifted code fragment is
indistinguishable from that of the original code; the judgement d′ ∼ d[v/f] states that
replacing applications of the function f in statement d with value v results in
the equivalent statement d′. This condition is defined as follows
∀v. d′ v ∼ d[v/f]    ∀v. P v −→ ccorres r xf G G′ hs a (d′ v)
----------------------------------------------------------------
ccorres r xf G (G′ ∩ {s | P (f s)}) hs a d
Note that d′, the lifted fragment, appears only in the assumptions; proving the
first premise involves inventing a suitable candidate. We have developed tactic
support for automatically calculating the lifted fragment and discharging such
proof obligations, based on a set of syntax-directed proof rules.
5.5 Splitting
If we examine our example, there is a clear match between most lines. Split-
ting allows us to take advantage of this structural similarity by considering each
match in isolation; formally, given the specification fragment do rv ← a; b rv
od and the implementation fragment c; d, splitting entails proving a first corre-
spondence between a and c and a second between b and d.
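The two fragments that splitting decomposes can be modelled concretely. The sketch below assumes a nondeterministic state monad with a failure flag for the specification side (results as a set of (return value, state) pairs) and plain function composition for the implementation side, with None standing for abrupt termination; these representations and the names bind and seq are assumptions made for illustration.

```python
def bind(a, b):
    """Specification side: do rv <- a; b rv od.
    Runs b on every result of a and fails if any part fails."""
    def run(s):
        results, failed = a(s)
        combined = set()
        for rv, s2 in results:
            results2, failed2 = b(rv)(s2)
            combined |= results2
            failed = failed or failed2
        return combined, failed
    return run


def seq(c, d):
    """Implementation side: c; d. If c terminates abruptly, d is discarded."""
    def run(t):
        t2 = c(t)
        return None if t2 is None else d(t2)
    return run


def get(s):
    """Hypothetical monadic action: return the current state."""
    return {(s, s)}, False


def put_succ(rv):
    """Hypothetical monadic action: store rv + 1 and return nothing."""
    return lambda s: ({(None, rv + 1)}, False)


def fail(s):
    """Hypothetical failing action, as produced by a violated assertion."""
    return set(), True
```

Splitting then amounts to relating a with c and b with d separately, instead of reasoning about bind(a, b) and seq(c, d) as wholes.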
In the case where we can prove that c terminates abruptly, we discard d.
Otherwise, the following rule is used
ccorres r′ xf′ P P′ hs a c    ∀v. d′ v ∼ d[v/xf′]
∀rv rv′. r′ rv rv′ −→ ccorres r xf (Q rv) (Q′ rv rv′) hs (b rv) (d′ rv′)
Example. After lifting, moving the guard, and symbolically executing the getCTE
function, applying the above rule to the example proof statement in Fig. 5 gives
the following as the first proof obligation
This goal, proved using the VCG approach from Sect. 5.2, states that, apart
from the state correspondence, the return value from the specification side
(cteMDBNode cte) and implementation side (H &(Ptr src→[cteMDBNode-C])
stored in mdb) are related through cmdb-relation, that is, the linked list pointers
in the returned specification node are equal to those in the implementation.
The proof of the example, cteMove, is 25 lines of Isabelle/HOL tactic style proof.
The proof starts by weakening the preconditions (here abbreviated P and P′)
with new Isabelle schematic variables; this allows preconditions to be calculated
on demand in the correspondence proofs.
We then lift the function arguments and proceed to prove by splitting; the
leaf goals are proved as separate lemmas using the C-SIMPL VCG. Next,
the correspondence preconditions are moved back through the statements us-
ing the two VCGs on specification and implementation. The final step is to solve
the proof obligation generated by the initial precondition weakening: the stated
preconditions (our P and P′) must imply the calculated preconditions.
The lifting and splitting phase takes 9 lines, the VCG stage takes 1 line, using
tactic repetition, while the final step takes 15 lines and is typically the trickiest
part of any correspondence proof.
6 Experience
In this section we explore how our C subset influenced the kernel implementation
and performance. We then discuss our experience in applying the framework.
We chose to implement the C kernel manually, rather than synthesising it from
the executable specification. Initial investigations had shown that generated C
code would not meet the performance requirements of a real-world microkernel.
Message-passing (IPC) performance, even in the first hand-written version, com-
pleted after two person months, was slow, on the order of the Mach microkernel.
After optimisation, this operation is now comparable to that of the modern,
commercially deployed, OKL4 2.1 [22] microkernel: we measured 206 cycles for
OKL4’s hand-crafted assembly IPC path, and 756 cycles for its non-optimised C
version on the ARMv6 Freescale i.MX31 platform. On the same hardware, our C
kernel initially took over 3000 cycles, after optimisations 299. The fastest other
IPC implementation for ARMv6 in C we know of is 300 cycles.
The C subset and the implementation developed in parallel, influencing each
other. We extended the subset with new features such as multiple side-effect
free function calls in expressions, but we also needed to make trade-offs such
as for references to local variables. We avoided passing large structures on the
                           Lines                              Changes
                           Haskell/C   Isabelle   Proof       Bugs   Convenience
Executable specification   5,700       13,000     117,000     8      10
Implementation             8,700       15,000     50,000 (a)  97     34
(a) With 474 of 518 (91%) of the functions verified.
specification. For example, the encoding of Isabelle’s option type using a default
value in C (such as NULL) required us to show that these default values never
occurred as valid values.
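The obligation behind this encoding can be made concrete with a small sketch: an optional value stored as a machine word with a reserved default only round-trips if the default is never a valid value. The constant and function names are hypothetical, and 0 stands in for NULL.

```python
NULL = 0  # reserved default value standing in for "no value"


def encode(opt):
    """Encode an optional word: None becomes the reserved default."""
    return NULL if opt is None else opt


def decode(word):
    """Decode a word: the reserved default is read back as None."""
    return None if word == NULL else word


def round_trips(valid_values):
    """The encoding is faithful only if decoding undoes encoding for every
    valid value and for None."""
    candidates = list(valid_values) + [None]
    return all(decode(encode(v)) == v for v in candidates)
```

If 0 were itself a valid value, it would be decoded as None, which is exactly the case the proof has to rule out.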
We discovered that the difficulty of verifying any given function in RC was
determined by the degree of difference between the function in C and its exe-
cutable specification, arising either from the control structures of C or its impure
memory model. Unlike the proof of RA , the semantic complexity of the function
seems mostly irrelevant. For instance, the operation which deletes a capability —
by far the most semantically complex operation in seL4 — was straightforward
to verify in RC . On the other hand, a simpler operation which employs an indis-
criminate memset over a number of objects was comparatively difficult to verify.
It is interesting to note that, even here, proofs from RA were useful in proving
facts about the implementation.
An important consequence of the way we split up proofs is that local reasoning
becomes possible. No single person needed a full, global understanding of the
whole kernel implementation.
7 Related Work
Earlier work on OS verification includes PSOS [12] and UCLA Secure Unix [28].
Later, Bevier [3] describes verification of process isolation properties down to
object code level, but for an idealised kernel (KIT) far simpler than modern
microkernels. We use the same general approach — refinement — as KIT and
UCLA Secure Unix, however the scale, techniques for each refinement step, and
level of detail we treat are significantly different.
The Verisoft project [24] is working towards verifying a whole system stack, in-
cluding hardware, compiler, applications, and a simplified microkernel VAMOS.
The VFiasco [15] project is attempting to verify the Fiasco kernel, another vari-
ant of L4 directly on the C++ level. For a comprehensive overview of operating
system verification efforts, we refer to Klein [17].
Deductive techniques to prove annotated C programs at the source code level
include Key-C [20], VCC [6], and Caduceus [13], recently integrated into the
Frama-C framework [14]. Key-C only focuses on a type-safe subset of C. VCC,
which also supports concurrency, appears to be heavily dependent on large ax-
iomatisations; even the memory model [6] axiomatises a weaker version of what
Tuch proves [26]. Caduceus supports a large subset of C, with extensions to han-
dle certain kinds of unions and casts [1, 19]. These techniques are not directly
applicable to refinement, although Caduceus has at least been used [2] to extract
a formal Coq specification for verifying security and safety properties.
We directly use the SIMPL verification framework [23] from the Verisoft
project, but we instantiate it differently. While Verisoft’s main implementation
language is fully formally defined from the ground up, with well-defined Pascal-
like semantics and C-style syntax, we treat a true, large subset of C99 [16] on
ARMv6 with all the realism and ugliness this implies. Our motivation for this is
our desire to use standard tool-chains and compilers for real-world deployment
of the kernel. Verisoft instead uses its own non-optimising compiler, which in
exchange is formally verified. Another difference is the way we exploit structural
similarities between our executable specification and C implementation. Verisoft
uses the standard VCG-based methodology for implementation verification. Our
framework allows us to transport invariant properties and Hoare-triples from
our existing proof on the executable specification [5] down to the C level. This
allowed us to avoid invariants on the C level, speeding up the overall proof effort
significantly.
8 Conclusion
We have presented a formal framework for verifying the refinement of a
large, monadic, executable specification into a low-level, manually performance-
optimised C implementation. We have demonstrated that the framework
performs well by applying it to the verification of the seL4 microkernel in Is-
abelle/HOL, and by completing a large part of this verification in a short time.
The framework allows us to take advantage of the large number of invariants
proved on the specification level, thus saving significant amounts of work. We
were able to conduct the semantic reasoning on the more pleasant monadic, shal-
lowly embedded specification level, and leave essentially syntactic decomposition
to the C level.
We conclude that our C verification framework achieves both the scalability
in terms of size, as well as the separation of concerns that is important for
distributing such a large proof over multiple people.
References
1. Andronick, J.: Modélisation et Vérification Formelles de Systèmes Embarqués dans
les Cartes à Microprocesseur—Plate-Forme Java Card et Système d’Exploitation.
Ph.D thesis, Université Paris-Sud (March 2006)
2. Andronick, J., Chetali, B., Paulin-Mohring, C.: Formal verification of security prop-
erties of smart card embedded source code. In: Fitzgerald, J.S., Hayes, I.J., Tarlecki,
A. (eds.) FM 2005. LNCS, vol. 3582, pp. 302–317. Springer, Heidelberg (2005)
3. Bevier, W.R.: Kit: A study in operating system verification. IEEE Transactions on
Software Engineering 15(11), 1382–1396 (1989)
4. Cock, D.: Bitfields and tagged unions in C: Verification through automatic gen-
eration. In: Beckert, B., Klein, G. (eds.) Proc, 5th VERIFY, Sydney, Australia,
August 2008. CEUR Workshop Proceedings, vol. 372, pp. 44–55 (2008)
5. Cock, D., Klein, G., Sewell, T.: Secure microkernels, state monads and scalable
refinement. In: Mohamed, O.A., Muñoz, C., Tahar, S. (eds.) TPHOLs 2008. LNCS,
vol. 5170, pp. 167–182. Springer, Heidelberg (2008)
6. Cohen, E., Moskal, M., Schulte, W., Tobies, S.: A precise yet efficient memory
model for C (2008),
http://research.microsoft.com/apps/pubs/default.aspx?id=77174