Lecture Notes in Computer Science 6009
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Matthias Blume, Naoki Kobayashi, and Germán Vidal (Eds.)

Functional and Logic Programming

10th International Symposium, FLOPS 2010
Sendai, Japan, April 19-21, 2010
Proceedings
Volume Editors
Matthias Blume
Google
20 West Kinzie Street, Chicago, IL 60610, USA
E-mail: [email protected]
Naoki Kobayashi
Tohoku University, Graduate School of Information Sciences
6-3-9 Aoba, Aramaki, Aoba-ku, Sendai-shi, Miyagi 980-8579, Japan
E-mail: [email protected]
Germán Vidal
Universidad Politécnica de Valencia, DSIC, MiST
Camino de Vera, S/N, 46022 Valencia, Spain
E-mail: [email protected]
ISSN 0302-9743
ISBN-10 3-642-12250-7 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-12250-7 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2010
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper 06/3180
Symposium Organization
Program Chairs
Matthias Blume Google, Chicago, USA
Germán Vidal Universidad Politécnica de Valencia, Spain
Symposium Chair
Naoki Kobayashi Tohoku University, Sendai, Japan
Program Committee
Nick Benton Microsoft Research, Cambridge, UK
Manuel Chakravarty University of New South Wales, Australia
Michael Codish Ben-Gurion University of the Negev, Israel
Bart Demoen Katholieke Universiteit Leuven, Belgium
Agostino Dovier University of Udine, Italy
John P. Gallagher Roskilde University, Denmark
Maria Garcia de la Banda Monash University, Australia
Michael Hanus University of Kiel, Germany
Atsushi Igarashi Kyoto University, Japan
Patricia Johann Rutgers University, USA
Shin-ya Katsumata Kyoto University, Japan
Michael Leuschel University of Düsseldorf, Germany
Francisco López-Fraguas Complutense University of Madrid, Spain
Paqui Lucio University of the Basque Country, Spain
Yasuhiko Minamide University of Tsukuba, Japan
Frank Pfenning Carnegie Mellon University, USA
Francois Pottier INRIA, France
Tom Schrijvers Katholieke Universiteit Leuven, Belgium
Chung-chieh “Ken” Shan Rutgers University, USA
Zhong Shao Yale University, USA
Jan-Georg Smaus University of Freiburg, Germany
Nobuko Yoshida Imperial College London, UK
Local Chair
Eijiro Sumii Tohoku University, Sendai, Japan
External Reviewers
Andreas Abel Javier Álvez
Kenichi Asai Demis Ballis
Dariusz Biernacki Bernd Braßel
Francisco Bueno Dario Campagna
James Cheney Markus Degen
Michael Elhadad Andrzej Filinski
Sebastian Fischer Marc Fontaine
Jacques Garrigue Raffaela Gentilini
Neil Ghani Silvia Ghilezan
Mayer Goldberg Clemens Grelck
Kevin Hammond Hugo Herbelin
Montserrat Hermo Petra Hofstedt
Andrew Kennedy Neelakantan Krishnaswami
Barbara König Sean Lee
Roman Leshchinskiy Pedro López-García
Michael Maher Mircea Marin
Koji Nakazawa Marisa Navarro
Monica Nesi Christopher Okasaki
Dominic Orchard Matthieu Petit
Carla Piazza Benjamin Pierce
Morten Rhiger Adrián Riesco
Claudio Russo Fernando Sáenz-Pérez
Pietro Sala Alan Schmitt
Peter Schneider-Kamp Antonis Stampoulis
Christian Sternagel Don Stewart
Peter Stuckey Eijiro Sumii
Alexander Summers Jaime Sánchez-Hernández
Akihiko Tozawa Janis Voigtländer
Mark Wallace Hongwei Xi
Toshiyuki Yamada Noam Zeilberger
Table of Contents

Invited Talks

Beluga: Programming with Dependent Types, Contextual Data, and Contexts
Brigitte Pientka

Using Static Analysis to Detect Type Errors and Concurrency Defects in Erlang Programs
Konstantinos Sagonas

Solving Constraint Satisfaction Problems with SAT Technology
Naoyuki Tamura, Tomoya Tanjo, and Mutsunori Banbara

Refereed Papers

Types

A Church-Style Intermediate Language for MLF
Didier Rémy and Boris Yakobowski

ΠΣ: Dependent Types without the Sugar
Thorsten Altenkirch, Nils Anders Danielsson, Andres Löh, and Nicolas Oury

Haskell Type Constraints Unleashed
Dominic Orchard and Tom Schrijvers

Foundations

A Complete Axiomatization of Strict Equality
Javier Álvez and Francisco J. López-Fraguas

Logic Programming

A Pearl on SAT Solving in Prolog
Jacob M. Howe and Andy King

Term Rewriting

Complexity Analysis by Graph Rewriting
Martin Avanzini and Georg Moser
Beluga: Programming with Dependent Types, Contextual Data, and Contexts
Brigitte Pientka
1 Introduction
Formal systems given via axioms and inference rules play a central role in de-
scribing and verifying guarantees about the runtime behavior of programs. While
we have made a lot of progress in statically checking a variety of formal guaran-
tees such as type or memory safety, programmers typically cannot define their
own safety policy and reason about it within the programming language itself.
This paper presents an overview of a novel programming and reasoning frame-
work, called Beluga [Pie08, PD08]. Beluga uses a two-level approach: on the data-
level, it supports specifications of formal systems within the logical framework
LF [HHP93]. The strength and elegance of LF comes from supporting encodings
based on higher-order abstract syntax (HOAS), in which binders in the object
language are represented as binders in LF’s meta-language. As a consequence,
users can avoid implementing common and tricky routines dealing with variables,
such as capture-avoiding substitution, renaming and fresh name generation. Be-
cause of this, one can think of HOAS encodings as the most advanced technology
for specifying and prototyping formal systems which leads to very concise and
elegant encodings and provides the most support for such an endeavor.
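To make this concrete, here is a minimal sketch of the contrast in OCaml (the language Beluga itself is implemented in; the type and function names are invented for illustration, not taken from Beluga or LF). With a first-order representation, binders carry names and substitution must be programmed carefully; with a HOAS-style representation, the body of a binder is a meta-level function, so capture-avoiding substitution is inherited from the meta-language:

    (* First-order syntax: binders are named, so substitution is manual. *)
    type fo_term =
      | FVar of string
      | FLam of string * fo_term
      | FApp of fo_term * fo_term

    (* A naive substitution: it is capture-avoiding only if the bound names
       in the term are distinct from the free names of s -- precisely the
       renaming bookkeeping that HOAS encodings make unnecessary. *)
    let rec fo_subst x s = function
      | FVar y -> if y = x then s else FVar y
      | FLam (y, b) -> if y = x then FLam (y, b) else FLam (y, fo_subst x s b)
      | FApp (m, n) -> FApp (fo_subst x s m, fo_subst x s n)

    (* HOAS-style syntax: an object-level binder is a meta-level function. *)
    type ho_term =
      | HLam of (ho_term -> ho_term)
      | HApp of ho_term * ho_term

    (* Substitution comes for free: beta-reduction is just application. *)
    let beta = function
      | HApp (HLam f, arg) -> f arg
      | t -> t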
On top of LF, we provide a dependently typed functional language that sup-
ports analyzing and manipulating LF data via pattern matching. A distinct
feature of Beluga is its explicit support for contexts to keep track of hypotheses,
and contextual objects to describe objects which may depend on them. Contex-
tual objects are characterized by contextual types. For example, A[Ψ ] describes
a contextual object Ψ.M where M has type A in the context Ψ and hence may
refer to the variables declared in the context Ψ . These contextual objects are
analyzed and manipulated naturally by pattern matching.
Furthermore, Beluga supports context variables which allow us to write generic
functions that abstract over contexts. As types classify terms, context schemas
classify contexts. Contexts whose schemas are superficially incompatible can be
reasoned with via context weakening and context subsumption.
The main application of Beluga at the moment is to prototype formal systems
together with their meta-theory. Formal systems given via axioms and inference
rules are common in the design and implementation of programming languages,
type systems, authorization and security logics, etc. Contextual objects concisely
characterize hypothetical and parametric derivations. Inductive proofs about a
given formal system can be implemented as recursive functions that case-analyze
some given (possibly hypothetical) derivation. Hence, Beluga serves as a proof
checking framework. At the same time, Beluga provides an experimental frame-
work for programming with proof objects. Due to its powerful type system, the
programmer can not only enforce strong invariants about programs statically,
but also create, manipulate, and analyze certificates (= proofs) which guaran-
tee that a program satisfies a user-defined safety property. Therefore, Beluga is
ideally suited for applications such as certified programming and proof-carrying
code [Nec97].
Beluga is an implementation in OCaml based on our earlier work [Pie08, PD08].
It provides a re-implementation of LF [HHP93], including type reconstruction,
constraint-based higher-order unification and type checking. On top of LF, we de-
signed and implemented a dependently typed functional language that supports
explicit contexts and pattern matching over contextual objects. To support rea-
soning with contexts, we support context weakening and subsumption. A key
step towards a palatable, practical source-level language was the design and imple-
mentation of a bidirectional type reconstruction algorithm for dependently typed
Beluga functions. While type reconstruction for LF and Beluga is in general un-
decidable, in practice, the performance is competitive. Beluga also provides an
interpreter to execute programs using an environment-based semantics.
Our test suite includes many specifications from the Twelf repository [PS99].
We also implemented a broad range of proofs as recursive Beluga functions,
including proofs of the Church-Rosser theorem, proofs about compiler transfor-
mations, subject reduction, and a translation from natural deduction to Hilbert
style proofs. To illustrate the expressive power of Beluga, our test suite also in-
cludes simple theorems about structural relationships between expressions and
proofs about the paths in expressions. These latter theorems are interesting since
they require nested quantifiers and implications, placing them outside the frag-
ment of propositions expressible in systems such as Twelf. The Beluga system,
including source code, examples, and a tutorial discussing key features of Beluga,
is available from
http://complogic.cs.mcgill.ca/beluga/.
Context Γ ::= · | Γ, x

if x ∈ Γ, then Γ ⊢ x −→ x
if Γ, x ⊢ M −→ N, then Γ ⊢ lam x. M −→ lam x. N
if Γ ⊢ M1 −→ N1 and Γ ⊢ M2 −→ N2 and N1 ≠ lam x. M,
then Γ ⊢ app M1 M2 −→ app N1 N2
Finally, we show how to represent terms and types in the logical framework
LF, and implement the normalization algorithm as a recursive program in
Beluga.
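For reference, the following OCaml sketch implements the normalization judgment above over a first-order representation (the paper's own implementation, walked through below, is a Beluga program over an LF encoding; names here are illustrative, and the hand-written substitution assumes no variable capture can occur):

    (* Untyped lambda-terms in first-order style. *)
    type tm =
      | Var of string
      | Lam of string * tm
      | App of tm * tm

    (* Naive substitution; correct only when no capture can occur. *)
    let rec subst x s = function
      | Var y -> if y = x then s else Var y
      | Lam (y, b) -> if y = x then Lam (y, b) else Lam (y, subst x s b)
      | App (m, n) -> App (subst x s m, subst x s n)

    (* The three rules above: a variable normalizes to itself, normalization
       goes under lam, and an application is rebuilt only when the normalized
       function part is not a lam (the side condition N1 <> lam x. M);
       otherwise we substitute and continue. Terminates on normalizing terms. *)
    let rec norm = function
      | Var x -> Var x
      | Lam (x, m) -> Lam (x, norm m)
      | App (m1, m2) ->
          (match norm m1 with
           | Lam (x, m) -> norm (subst x m2 m)
           | n1 -> App (n1, norm m2))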
The Beluga syntax follows ideas from ML-like languages with a few extensions.
For example, Λg ⇒... introduces abstraction over the context variable g cor-
responding to quantification over the context variable g in the type of norm. We
then split on the object e which has contextual type (exp T)[g]. As in the defini-
tion we gave earlier, there are three cases to consider for e: either it is a variable
from the context, it is a lambda-abstraction, or it is an application. Each pattern
is written as a contextual object, i.e. the object itself together with its context.
For the variable case, we use a parameter variable, written as #p ... and write
[g] #p ... . Operationally, it will match any declaration from the context g once
g is concrete. The parameter variable #p is associated with the identity substi-
tution (written in concrete syntax with ... ) to explicitly state its dependency on
the context g.
The pattern [g] lam λx. M... x describes the case where the object e is a lambda-
abstraction. We write M... x for the body of the lambda-abstraction which may
refer to all the variables from the context g (written as ... ) and the variable x.
Technically, ... x describes the identity substitution which maps all the variables
from g, x:exp T to themselves. We now recursively normalize the contextual object
[g,x] M... x. To accomplish this, we pass to the recursive call the extended context
g, x:exp _ together with the contextual object [g,x] M... x. We write an underscore
for the type of x in the context g, x:exp _ and let type reconstruction determine
it. Note that we cannot write x:exp T1 since T1 would be free. Hence, supporting
holes is crucial to be able to write the program compactly and avoid unnecessary
type annotations. The result of the recursive call is a contextual object [g,x] N... x
which we will use to assemble the result. In the case for applications, we recursively
normalize the contextual object [g] M1... and then pattern match on its result. If it
returned a lambda-abstraction lam λx. M’... x, we simply replace x with M2... . Sub-
stitution is inherently supported in Beluga and... (M2... ) describes the substitution
which maps all variables in g to themselves (written as ... ) and x is mapped to
M2... . In the case where normalizing [g] M1... does not return a lambda-abstraction,
we continue normalizing [g] M2... and reassemble the final result. In conclusion,
our implementation yields a natural, elegant, and very direct encoding of the for-
mal description of normalization.
Beluga supports a two-level approach for programming with and reasoning about
HOAS encodings. The data-level supports specifications of formal systems in the
logical framework LF. On top of it, we provide an expressive computation lan-
guage which supports dependent types and recursion over HOAS encodings. A
key challenge is that we must traverse λ-abstractions and manipulate objects
which may contain bound variables. In Beluga, we solve this problem by using
contextual types which characterize contextual objects and by introducing con-
text variables to abstract over concrete contexts and parameterize computation
with them. By design, variables occurring in contextual objects can never es-
cape their scope, a problem which often arises in other approaches. While in
our previous example, all contextual types and objects shared the same context
variable, our framework allows the introduction of different context variables, if
we wish to do so.
Beluga’s theoretical basis is contextual modal type theory which has been
described in [NPP08]. We later extended contextual modal type theory with
context variables which allow us to abstract over concrete contexts and param-
eter variables which allow us to talk abstractly about elements of contexts. The
foundation for programming with contextual data objects and contexts was first
described in [Pie08] and subsequently extended with dependent types in [PD08].
3 Implementation
Type reconstruction for LF. We illustrate briefly the problem. Consider the expres-
sion lam x. lam y. app x y, which is represented as lam λx. lam λy. app x y in LF.
However, since expressions are indexed by their types to ensure that we only
represent well-typed expressions, the constructors lam and app also take two
index arguments describing the type of the expression. Type reconstruction
needs to infer them. The reconstructed, fully explicit representation is
lam (arr T S) (arr T S) λx. lam S T λy. app T S x y
We adapted the general principle also found in [Pfe91a]: We may omit an index
argument when a constant is used, if it was implicit when the type of the constant
was declared. An argument is implicit to a type if it either occurs as a free variable
in the type or it is an index argument in the type. Following the ideas in the
Twelf system, we do not allow the user to supply an implicit argument explicitly.
Type reconstruction is, in general, undecidable for LF. Our algorithm,
similar to its implementation in the Twelf system, reports a principal type, a
type error, or that the source term needs more type information.
Type reconstruction for the computation level is undecidable. For our com-
putation language, we check functions against a given type and either succeed,
report a type error, or fail by asking for more type information if no ground in-
stantiation can be found for an omitted argument or if we cannot infer the type
of meta-variables occurring in patterns. It is always possible to make typing
unambiguous by adding more annotations.
4 Related Work
In our discussion on related work, we will concentrate on programming languages
supporting HOAS specifications and reasoning about them. Most closely related
to our work is the Twelf system [PS99], a proof checking environment based
on the logical framework LF. Its design has strongly influenced the design of
Beluga. While both Twelf and Beluga support specifying formal systems using
HOAS in LF, Twelf supports implementing proofs as relations. To verify that
the relation indeed constitutes a proof, one needs to prove separately that it
is a total function. Twelf is a mature system providing termination checking
as well as an implementation of coverage checking. Both features are under
development for Beluga. One main difference between Twelf and Beluga lies in
the treatment of contexts. In Twelf, the actual context of hypotheses remains
implicit. As a consequence, instead of a generic base case, base cases in proofs
are handled whenever an assumption is introduced. This may lead to scattering
of base cases and adds some redundancy. World declarations, similar to context
schema declarations, check that assumptions introduced are of the expected form
and that appropriate base cases are indeed present. Because worlds in the Twelf
system also carry information about base cases, manual weakening is required
more often when assembling larger proofs using lemmas. Explicit contexts in
Beluga make the meta-theoretic reasoning about contexts, which is hidden in
Twelf, explicit. We give a systematic comparison and discuss the trade-offs of
this decision together with illustrative examples in [FP10].
Another important difference between Twelf and Beluga lies in their expressive
power. To illustrate, consider the following simple statement about lambda-
terms: If for all N , N is a subterm of K implies that N is a subterm of M , then K
must be a subterm of M . Because this statement requires nested quantification
and implication, especially in a negative position, it is outside Twelf’s meta-logic
which is used to verify that a given relation constitutes a proof. While this has
been known, we hope that this simple theorem illustrates this point vividly.
More recently, we see a push towards incorporating logical framework technol-
ogy into mainstream programming languages to support the tight integration of
specifying program properties with proofs that these properties hold. The Del-
phin language [PS08] is most closely related to Beluga. Both support writing
recursive functions over LF specifications, but differ in the theoretical founda-
tion. In particular, contexts to keep track of assumptions are implicit in Delphin.
It hence lacks the ability to distinguish between closed objects and objects de-
pending on bound variables on the type-level. Delphin’s implementation utilizes
as much of the Twelf infrastructure as possible.
In the future, we plan to explore how to connect our framework to a theorem prover
which can fill in parts of the function (= proof) automatically and where the
user can interactively develop functions in collaboration with a theorem prover.
References

[Pfe91a] Pfenning, F.: Logic programming in the LF logical framework. In: Huet, G., Plotkin, G. (eds.) Logical Frameworks, pp. 149–181. Cambridge University Press, Cambridge (1991)
[Pfe91b] Pfenning, F.: Unification and anti-unification in the Calculus of Constructions. In: Sixth Annual IEEE Symposium on Logic in Computer Science, Amsterdam, The Netherlands, July 1991, pp. 74–85 (1991)
[Pfe97] Pfenning, F.: Computation and deduction (1997)
[Pie03] Pientka, B.: Tabled higher-order logic programming. PhD thesis, Department of Computer Science, Carnegie Mellon University, CMU-CS-03-185 (2003)
[Pie05] Pientka, B.: Verifying termination and reduction properties about higher-order logic programs. Journal of Automated Reasoning 34(2), 179–207 (2005)
[Pie08] Pientka, B.: A type-theoretic foundation for programming with higher-order abstract syntax and first-class substitutions. In: 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL 2008), pp. 371–382. ACM Press, New York (2008)
[Pot07] Pottier, F.: Static name control for FreshML. In: 22nd IEEE Symposium on Logic in Computer Science (LICS 2007), pp. 356–365. IEEE Computer Society, Los Alamitos (2007)
[PS99] Pfenning, F., Schürmann, C.: System description: Twelf — a meta-logical framework for deductive systems. In: Ganzinger, H. (ed.) CADE 1999. LNCS (LNAI), vol. 1632, pp. 202–206. Springer, Heidelberg (1999)
[PS08] Poswolsky, A.B., Schürmann, C.: Practical programming with higher-order encodings and dependent types. In: Drossopoulou, S. (ed.) ESOP 2008. LNCS, vol. 4960, pp. 93–107. Springer, Heidelberg (2008)
[RP96] Rohwedder, E., Pfenning, F.: Mode and termination checking for higher-order logic programs. In: Riis Nielson, H. (ed.) ESOP 1996. LNCS, vol. 1058, pp. 296–310. Springer, Heidelberg (1996)
[SPG03] Shinwell, M.R., Pitts, A.M., Gabbay, M.J.: FreshML: programming with binders made simple. In: 8th International Conference on Functional Programming (ICFP 2003), pp. 263–274. ACM Press, New York (2003)
[Twe09] The Twelf wiki (2009), http://twelf.plparty.org/wiki/Main_Page
Using Static Analysis to Detect Type Errors and
Concurrency Defects in Erlang Programs
Konstantinos Sagonas
Abstract. This invited talk will present the key ideas in the design and
implementation of Dialyzer, a static analysis tool for Erlang programs.
Dialyzer started as a defect detection tool using a rather ad hoc dataflow
analysis to detect type errors in Erlang programs, but relatively early in
its development it adopted a more disciplined approach to detecting def-
inite type clashes in dynamically typed languages. Namely, an approach
based on using a constraint-based analysis to infer success typings which
are also enhanced with optional contracts supplied by the programmer.
In the first part of the talk, we will describe this constraint-based
approach to type inference and explain how it differs with past and recent
attempts to type check programs written in dynamic languages. In the
second part of the talk, we will present important recent additions to
Dialyzer, namely analyses that detect concurrency defects (such as race
conditions) in Erlang programs. For a number of years now, Dialyzer
has been part of the Erlang/OTP system and has been actively used by
its community. Based on this experience, we will also critically examine
Dialyzer’s design choices, show interesting cases of Dialyzer’s use, and
distill the main lessons learned from using static analysis in open source
as well as commercial code bases of significant size.
1 Introduction
these works, both in the techniques that were used but more importantly in the
fact that Dialyzer aimed to detect definite type errors instead of possible ones.
To detect this kind of error, Dialyzer initially used a relatively ad hoc path-
insensitive dataflow analysis [3]. Even though this approach was quite weak in
principle, it turned out surprisingly effective in practice. Dialyzer managed to
detect a significant number of type errors in heavily used libraries and appli-
cations, errors which remained hidden during many years of testing and use of
the code. The analysis was fast and scalable, allowing it to be used on programs
of significant size (hundreds of thousands of lines of code). More importantly, the
analysis was sound for defect detection: it modelled the operational semantics
of Erlang programs accurately and never reported a false alarm. We believe this
was a key factor in Dialyzer’s adoption by the Erlang community, even before
the tool was included in the Erlang/OTP distribution. Early experiences using
this analysis were reported in Bugs’05 [6].
Despite Dialyzer’s success, there was a clear limit to the kind of type errors
that could be detected using a purely dataflow-based analysis. To ameliorate
the situation, Dialyzer’s analysis was redesigned, pretty much from scratch. We
opted for a more disciplined and considerably more powerful analysis that in-
fers success typings of functions using a constraint-based inference algorithm [4].
Informally, a success typing is a type signature that over-approximates the func-
tion’s dynamic semantics. More concretely, it over-approximates the set of terms
for which the function can evaluate to a value. The domain of the type signature
includes a type-level representation of all possible terms that the function could
accept as parameters, and its range includes all possible return values for this
domain. In effect, success typings approach the type inference problem from a
direction opposite to that of type systems for statically typed languages. For ex-
ample, while most type systems have to restrict the set of terms that a function
can accept in order to prove type safety, success typings, which are intended to
locate definite type clashes, only need to be careful not to exclude some term
for which the function can be used without raising some runtime exception. The
analogy can be taken further. The well-known slogan that “well-typed programs
never go wrong” has its analogue “ill-typed programs always fail”, meaning that
use of a function in a way which is incompatible with its success typing will
surely raise a runtime exception if encountered.
Slogans aside, success typings allow for compositional, bottom-up, constraint-
based type inference which appears to scale well in practice. Moreover, by
taking control and data flow into account and by exploiting properties of the
language such as its module system, which allows for specializing success typ-
ings of module-local functions based on their actual instead of their possible uses,
success typings can be refined using a top-down algorithm [4]. This refinement
process often makes success typings as accurate as the types inferred by stati-
cally typed functional languages. Given these so called refined success typings,
Dialyzer employs a function-local dataflow analysis that locates type clashes and
other errors (e.g., case clauses that can never match, guards that will always fail,
etc.) in programs.
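To illustrate the direction of the over-approximation, here is a deliberately tiny OCaml sketch (the names and the tag-level domain are invented for illustration; this is not Dialyzer's actual algorithm): the success typing of a set of clauses is simply the union of the input shapes its clauses accept and the union of the results they produce, so any call outside that domain must fail at runtime:

    module Tags = Set.Make (String)

    (* One clause of an Erlang-style function: the shape of argument it
       accepts and the shape of result it returns (a toy, tag-level view). *)
    type clause = { accepts : string; returns : string }

    (* A success typing over-approximates behaviour: the domain is the
       union of everything some clause accepts, the range the union of
       everything some clause returns. Uses outside the domain must fail. *)
    let success_typing clauses =
      List.fold_left
        (fun (dom, rng) c -> (Tags.add c.accepts dom, Tags.add c.returns rng))
        (Tags.empty, Tags.empty)
        clauses

    (* A boolean negation function yields domain {bool}, range {bool}, so
       a call on an integer is a definite type clash, never a false alarm. *)
    let _ = success_typing [ { accepts = "bool"; returns = "bool" } ]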
Once there was a solid basis for detecting discrepancies between allowed and
actual uses of functions, the obvious next step was to design a specification
language for Erlang. This language is nowadays a reality and allows Erlang
programmers to express their intentions about how certain functions are to be
used, thereby serving both as important documentation for the source code and
providing additional constraints to the analysis in the form of type contracts [9].
These contracts are either success typings which are automatically generated
and inserted into the source code or user-specified refinements of the inferred
refined success typings. In many respects, this approach of adding contracts is
similar to that pioneered by other dynamically typed functional languages such
as PLT Scheme [10]. Nowadays, many of Erlang/OTP's libraries as well as open
source applications written in Erlang come with type contracts for functions,
especially for those that are part of a module's API.
The presence of such information has allowed Dialyzer to detect even more
type errors and subtle interface abuses in key functions of Erlang/OTP. A paper
describing in detail the approach we advocate and experiences from applying it
in one non-trivial case study was published in the 2008 Erlang workshop [11].
Relatively recently, Dialyzer was enhanced with a precise and scalable analysis
that automatically detects data races. In pure Erlang code, data races are impos-
sible: the language does not provide any constructs for processes to create and
modify shared memory. However, the Erlang/OTP implementation comes with
key libraries and built-ins, implemented in C as part of the runtime system, that
do create and manipulate data structures which are shared between processes.
Unrestricted uses of these built-ins in code run by different processes may lead
to data races.
To detect such situations, we have designed a static analysis that detects
some of the most common kinds of data races in Erlang programs: races in the
process registry, in the Erlang Term Storage (ETS) facility, and in the mnesia
database management system. This analysis integrates smoothly with the rest of
Dialyzer's analyses targeting the sequential part of the language. The analysis
is non-trivial, as the built-ins accessing shared data structures may be spatially
far apart; they may even be located in the code of different modules or, even
worse, hidden in the code of higher-order functions. For this reason, the analysis
takes control flow into account. Also, it has to be able to reason about data flow:
if at some program point the analysis locates a call to a built-in reading a shared
data structure and from that point on control reaches a program point where a
call to a built-in writing the same data structure appears, the analysis needs to
determine whether the two calls may possibly be performed on the same data
item or not. If they may, it has detected a race condition, otherwise there is none.
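A much-simplified OCaml sketch of this read-then-write check follows. The event type and the single, already-linearized trace are assumptions of this sketch; the actual analysis reasons over control and data flow across modules rather than over a flat event list:

    (* An access to a shared table (e.g. the process registry or ETS). *)
    type access =
      | Read of string * string   (* (table, key) *)
      | Write of string * string

    (* Flag read-then-write pairs on the same item: once control can flow
       from a read of (t, k) to a write of (t, k), another process may have
       written in between -- a check-then-act race. *)
    let rec races = function
      | [] -> []
      | Read (t, k) :: rest ->
          let racy =
            List.exists
              (function Write (t', k') -> t' = t && k' = k | _ -> false)
              rest
          in
          (if racy then [ (t, k) ] else []) @ races rest
      | Write _ :: rest -> races rest

    (* e.g. races [Read ("registry", "name"); Write ("registry", "name")]
       flags the item: the whereis-then-register idiom is racy. *)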
Because data races are subtle and difficult to locate, Dialyzer departs from the
“report only definite errors” principle: for the first time its user can opt for an
analysis that is sound either for defect detection or for correctness. The former
4 Concluding Remarks
Although many of the ideas of Dialyzer can be traced as far back as 2003,
this line of research is active and far from complete. We are currently working
on various extensions and additions to the core of Dialyzer’s analysis that will
enable it to detect many more kinds of errors in programs. Chief among them
are those related to detecting defects in concurrent programs that use message-
passing for concurrency, which arguably is Erlang’s most salient feature. No
matter how many analyses one designs and employs, programmers somehow
seem to be continuously stumbling upon interesting new cases of bugs which are
beyond the reach of these analyses. Although it is clearly impossible to catch all
software bugs, it’s certainly fun to try!
Acknowledgements
This research has been supported in part by grant #621-2006-4669 from the
Swedish Research Council. Tobias Lindahl and Maria Christakis have contributed
enormously to Dialyzer, both in the design of the analyses employed by the tool
and in fine-tuning their actual implementation, making Dialyzer not only effec-
tive in discovering bugs, but also efficient and scalable.
The author wishes to thank both the Program Committee of FLOPS 2010 for
their invitation and the program chairs of the Symposium for their patience in
receiving answers to their e-mails and the camera-ready version of this paper.
References
1. Armstrong, J.: Programming Erlang: Software for a Concurrent World. The Prag-
matic Bookshelf, Raleigh (2007)
2. Nagy, T., Nagyné Vı́g, A.: Erlang testing and tools survey. In: Proceedings of the
7th ACM SIGPLAN Workshop on Erlang, pp. 21–28. ACM, New York (2008)
Solving Constraint Satisfaction Problems with SAT Technology
Naoyuki Tamura, Tomoya Tanjo, and Mutsunori Banbara
Log encoding uses a Boolean variable p(x^(i)) for the i-th bit of each CSP
variable x [6,7]. A constraint is encoded by enumerating its conflict points as
SAT clauses, as in the direct encoding.
Log-support encoding also uses a Boolean variable p(x^(i)) for the i-th bit
of each CSP variable x, as in the log encoding [8]. A constraint is encoded by
considering its support points.
Order encoding uses a Boolean variable p(x ≤ i) for each integer variable x
and domain value i, where p(x ≤ i) is defined as true if and only if the CSP
variable x is less than or equal to i [9]. This encoding was first used to encode
job-shop scheduling problems by Crawford and Baker [10] and was studied by
Inoue et al. [11,12]. It shows good performance on a wide variety of problems,
and it succeeded in solving previously undecided problems, such as open-shop
scheduling problems [9] and two-dimensional strip packing problems [13].
Let x be an integer variable whose domain is {l..u}. Boolean variables p(x ≤ l),
p(x ≤ l + 1), . . . , p(x ≤ u − 1) and the following SAT clauses are introduced to
encode each integer variable x:

¬p(x ≤ i − 1) ∨ p(x ≤ i)    (l < i ≤ u − 1)

Please note that p(x ≤ u) is unnecessary because x ≤ u is always true.
For example, let x and y be integer variables, each with domain {2..6}, and
consider the constraint x + y ≤ 7 (Fig. 1 depicts the region of points satisfying
this constraint). The following six SAT clauses are used to encode the variables
x and y:

¬p(x ≤ 2) ∨ p(x ≤ 3)    ¬p(x ≤ 3) ∨ p(x ≤ 4)    ¬p(x ≤ 4) ∨ p(x ≤ 5)
¬p(y ≤ 2) ∨ p(y ≤ 3)    ¬p(y ≤ 3) ∨ p(y ≤ 4)    ¬p(y ≤ 4) ∨ p(y ≤ 5)

The following five SAT clauses are used to encode the constraint x + y ≤ 7:

p(y ≤ 5)    p(x ≤ 2) ∨ p(y ≤ 4)    p(x ≤ 3) ∨ p(y ≤ 3)    p(x ≤ 4) ∨ p(y ≤ 2)    p(x ≤ 5)
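This example can be mechanized. Below is a small OCaml sketch (the function and type names are invented): var_clauses emits the axiom clauses for one variable, and sum_clauses emits the clauses for a constraint x + y ≤ c, dropping clauses that contain a true literal and eliding false literals at the domain boundaries. For x, y ∈ {2..6} and c = 7 it reproduces exactly the six plus five clauses shown above:

    (* A literal p(v <= i) or its negation. *)
    type lit = Pos of string * int | Neg of string * int

    (* Axiom clauses for variable v with domain {l..u}:
       not p(v <= i-1) \/ p(v <= i), for l < i <= u-1. *)
    let var_clauses v l u =
      List.init (u - l - 1) (fun j ->
          let i = l + 1 + j in
          [ Neg (v, i - 1); Pos (v, i) ])

    (* p(v <= i), folded through the bounds: constantly false below l,
       constantly true at u and above. *)
    let le v i l u =
      if i < l then `Const false
      else if i >= u then `Const true
      else `Lit (Pos (v, i))

    (* x + y <= c holds iff, for every a: (x <= a) \/ (y <= c - a - 1). *)
    let sum_clauses x y l u c =
      List.filter_map
        (fun a ->
           match (le x a l u, le y (c - a - 1) l u) with
           | (`Const true, _) | (_, `Const true) -> None
           | (`Const false, `Const false) -> Some []   (* unsatisfiable *)
           | (`Const false, `Lit q) -> Some [ q ]
           | (`Lit p, `Const false) -> Some [ p ]
           | (`Lit p, `Lit q) -> Some [ p; q ])
        (List.init (u - l + 1) (fun j -> l - 1 + j))

    let _ = var_clauses "x" 2 6 @ var_clauses "y" 2 6 @ sum_clauses "x" "y" 2 6 7

The final expression yields the eleven clauses listed above: three axiom clauses per variable plus the five constraint clauses.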
instances in these categories by Sugar and other top-ranked solvers: Mistral
(http://www.4c.ucc.ie/~ehebrard/Software.html), Choco (http://choco.emn.fr), and
bpsolver (http://www.probp.com).
Through those results, it is shown that a SAT-based solver can be compet-
itive with other state-of-the-art solvers on difficult CSP benchmark instances.
We hope that the order encoding used in Sugar will become popular in other
SAT-based systems.
This research was partially supported by JSPS (Japan Society for the Promo-
tion of Science), Grant-in-Aid for Scientific Research (A), 2008–2011, 20240003.
References
1. Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.): Handbook of Satisfiability.
Frontiers in Artificial Intelligence and Applications (FAIA), vol. 185. IOS Press,
Amsterdam (2009)
2. de Kleer, J.: A comparison of ATMS and CSP techniques. In: Proceedings of the
11th International Joint Conference on Artificial Intelligence (IJCAI 1989), pp.
290–296 (1989)
3. Walsh, T.: SAT v CSP. In: Dechter, R. (ed.) CP 2000. LNCS, vol. 1894, pp. 441–
456. Springer, Heidelberg (2000)
4. Kasif, S.: On the parallel complexity of discrete relaxation in constraint satisfaction
networks. Artificial Intelligence 45, 275–286 (1990)
5. Gent, I.P.: Arc consistency in SAT. In: Proceedings of the 15th European Confer-
ence on Artificial Intelligence (ECAI 2002), pp. 121–125 (2002)
6. Iwama, K., Miyazaki, S.: SAT-variable complexity of hard combinatorial problems.
In: Proceedings of the IFIP 13th World Computer Congress, pp. 253–258 (1994)
7. Gelder, A.V.: Another look at graph coloring via propositional satisfiability. Dis-
crete Applied Mathematics 156, 230–243 (2008)
8. Gavanelli, M.: The log-support encoding of CSP into SAT. In: Bessière, C. (ed.)
CP 2007. LNCS, vol. 4741, pp. 815–822. Springer, Heidelberg (2007)
9. Tamura, N., Taga, A., Kitagawa, S., Banbara, M.: Compiling finite linear CSP into
SAT. Constraints 14, 254–272 (2009)
10. Crawford, J.M., Baker, A.B.: Experimental results on the application of satisfi-
ability algorithms to scheduling problems. In: Proceedings of the 12th National
Conference on Artificial Intelligence (AAAI 1994), pp. 1092–1097 (1994)
11. Inoue, K., Soh, T., Ueda, S., Sasaura, Y., Banbara, M., Tamura, N.: A competitive
and cooperative approach to propositional satisfiability. Discrete Applied Mathe-
matics 154, 2291–2306 (2006)
12. Nabeshima, H., Soh, T., Inoue, K., Iwanuma, K.: Lemma reusing for SAT based
planning and scheduling. In: Proceedings of the International Conference on Au-
tomated Planning and Scheduling 2006 (ICAPS 2006), pp. 103–112 (2006)
13. Soh, T., Inoue, K., Tamura, N., Banbara, M., Nabeshima, H.: A SAT-based method
for solving the two-dimensional strip packing problem. Journal of Algorithms in
Cognition, Informatics and Logic (2009) (to appear)
14. Tamura, N., Banbara, M.: Sugar: a CSP to SAT translator based on order encoding.
In: Proceedings of the 2nd International CSP Solver Competition, pp. 65–69 (2008)
15. Tamura, N., Tanjo, T., Banbara, M.: System description of a SAT-based CSP solver
Sugar. In: Proceedings of the 3rd International CSP Solver Competition, pp. 71–75
(2008)
16. Tanjo, T., Tamura, N., Banbara, M.: Sugar++: a SAT-based Max-CSP/COP
solver. In: Proceedings of the 3rd International CSP Solver Competition, pp. 77–82
(2008)
17. Eén, N., Sörensson, N.: An extensible SAT-solver. In: Giunchiglia, E., Tacchella,
A. (eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518. Springer, Heidelberg (2004)
18. Biere, A.: PicoSAT essentials. Journal on Satisfiability, Boolean Modeling and
Computation 4, 75–97 (2008)
A Church-Style Intermediate Language for MLF
Didier Rémy and Boris Yakobowski

Introduction
MLF (Le Botlan and Rémy 2003, 2007; Rémy and Yakobowski 2008b) is a type
system that seamlessly merges ML-style implicit but second-class polymor-
phism with System-F explicit first-class polymorphism. This is done by enriching
System-F types. Indeed, System F is not well-suited for partial type inference, as
illustrated by the following example. Assume that a function, say choice, of type
∀ (α) α → α → α and the identity function id, of type ∀ (β) β → β, have been
defined. How can the application of choice to id be typed in System F? Should
choice be applied to the type ∀ (β) β → β of the identity that is itself kept
polymorphic? Or should it be applied to the monomorphic type γ → γ, with
the identity being applied to γ (where γ is bound in a type abstraction in front
of the application)? Unfortunately, these alternatives have incompatible types,
respectively (∀ (α) α → α) → (∀ (α) α → α) and ∀ (γ) (γ → γ) → (γ → γ):
none is an instance of the other. Hence, in System F, one is forced to irreversibly
choose between one of the two explicitly typed terms.
However, a type inference system cannot choose between the two, as this would
sacrifice completeness and be somehow arbitrary. This is why MLF enriches types
with instance-bounded polymorphism, which makes it possible to write more
expressive types that factor out in a single type all typechecking alternatives in
such cases as the example of choice. Now, the type ∀ (α ≥ τ) α → α, which should
be read "α → α where α is any instance of τ", can be assigned to choice id, and
the two previous alternatives can be recovered a posteriori by choosing different
instances for α.
Currently, the language MLF comes with a Curry-style version iMLF where no
type information is needed and a type-inference version eMLF that requires par-
tial type information (Le Botlan and Rémy 2007). However, eMLF is not quite in
Church’s style, since a large amount of type information is still implicit and par-
tial type information cannot be easily maintained during reduction. Hence, while
eMLF is a good surface language, it is not a good candidate for use as an internal
language during the compilation process, where some program transformations,
and perhaps some reduction steps, are being performed. This has been a problem
for the adoption of MLF in the Haskell community (Peyton Jones 2003), as the
Haskell compilation chain uses an explicitly-typed internal language.
This is also an obstacle to proving subject reduction, which does not hold
in eMLF. In a way, this is unavoidable in a language with non-trivial partial
type inference. Indeed, type annotations cannot be completely dropped, but
must at least be transformed and reorganized during reduction. Still, one could
expect that eMLF be equipped with reduction rules for type annotations. This
has actually been considered in the original presentation of MLF, but only with
limited success. The reduction kept track of annotation sites during reduction;
this showed, in particular, that no new annotation site needs to be introduced
during reduction. Unfortunately, the exact form of annotations could not be
maintained during reduction, by lack of an appropriate language to describe their
computation. As a result, it has only been shown that some type derivation can
be rebuilt after the reduction of a well-typed program, but no algorithm
to compute it during reduction was exhibited.
Independently, Rémy and Yakobowski (2008b) have introduced graphic con-
straints, both to simplify the presentation of MLF and to improve its type infer-
ence algorithm. This also led to a simpler, more expressive definition of MLF.
In this paper, we present xMLF, a Church-style version of MLF that contains
full type information. In fact, type checking becomes a simple and local ver-
ification process—by contrast with type inference in eMLF, which is based on
unification. In xMLF, type abstraction, type instantiation, and all parameters
of functions are explicit, as in System F. However, type instantiation is more
general and more atomic than type application in System F: we use explicit type
instantiation expressions that serve as proof evidence for the type instance relation.
In addition to the usual β-reduction, we give a series of reduction rules for
simplifying type instantiations. These rules are confluent when allowed in any
context. Moreover, reduction preserves typings, and is sufficient to reduce all
typable expressions to a value when used in either a call-by-value or call-by-name
setting. This establishes the soundness of MLF for a call-by-name semantics for
the first time. Notably, xMLF is a conservative extension of System F.
The paper is organized as follows. We present xMLF, its syntax, and its static
and dynamic semantics in §1. We study its main properties, including type
soundness for different evaluation strategies, in §2. We discuss possible varia-
tions, as well as related and future work, in §3. All proofs are omitted, but can
be found in (Yakobowski 2008, Chapters 14 & 15).
1 The Calculus
Inst-Bot:     Γ ⊢ τ : ⊥ ≤ τ
Inst-Abstr:   if (α ≥ τ) ∈ Γ, then Γ ⊢ !α : τ ≤ α
Inst-Under:   if Γ, α ≥ τ ⊢ φ : τ1 ≤ τ2,
              then Γ ⊢ ∀ (α ≥) φ : ∀ (α ≥ τ) τ1 ≤ ∀ (α ≥ τ) τ2
Inst-Inside:  if Γ ⊢ φ : τ1 ≤ τ2,
              then Γ ⊢ ∀ (≥ φ) : ∀ (α ≥ τ1) τ ≤ ∀ (α ≥ τ2) τ
Inst-Intro:   if α ∉ ftv(τ), then Γ ⊢ ⟜ : τ ≤ ∀ (α ≥ ⊥) τ
Inst-Elim:    Γ ⊢ ⊸ : ∀ (α ≥ τ′) τ ≤ τ{α ← τ′}
Inst-Comp:    if Γ ⊢ φ1 : τ1 ≤ τ2 and Γ ⊢ φ2 : τ2 ≤ τ3,
              then Γ ⊢ φ1; φ2 : τ1 ≤ τ3
Inst-Id:      Γ ⊢ 1 : τ ≤ τ

1
The choice of ⟜ (the symbol used here for the introduction witness) is only by
symmetry with the elimination form ⊸ described next, and has no connection at
all with linear logic.
τ (!α) = α                        τ ⟜ = ∀ (α ≥ ⊥) τ    (α ∉ ftv(τ))
⊥ τ = τ                           (∀ (α ≥ τ′) τ) ⊸ = τ{α ← τ′}
τ 1 = τ                           (∀ (α ≥ τ′) τ) (∀ (≥ φ)) = ∀ (α ≥ τ′ φ) τ
τ (φ1; φ2) = (τ φ1) φ2            (∀ (α ≥ τ′) τ) (∀ (α ≥) φ) = ∀ (α ≥ τ′) (τ φ)
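To make these equations concrete, here is an OCaml sketch of instantiation application (all constructor names are invented; Intro and Elim stand for the introduction and elimination witnesses rendered ⟜ and ⊸ above; substitution is capture-naive and freshness is faked with a fixed name, so this is an illustration of the equations rather than a faithful implementation):

    type ty =
      | Bot
      | TVar of string
      | Arrow of ty * ty
      | Forall of string * ty * ty    (* Forall (a, bound, body) *)

    type inst =
      | Bottom of ty            (* the instantiation tau : bot <= tau *)
      | Abstract of string      (* !a : tau <= a, when a >= tau is in scope *)
      | Inside of inst          (* forall (>= phi): instantiate the bound *)
      | Under of string * inst  (* forall (a >=) phi: instantiate the body *)
      | Intro                   (* tau <= forall (a >= bot) tau *)
      | Elim                    (* forall (a >= tau') tau <= tau{a <- tau'} *)
      | Seq of inst * inst      (* composition phi1; phi2 *)
      | Id

    (* Capture-naive substitution of a type for a type variable. *)
    let rec subst a s = function
      | Bot -> Bot
      | TVar b -> if b = a then s else TVar b
      | Arrow (t1, t2) -> Arrow (subst a s t1, subst a s t2)
      | Forall (b, bound, body) ->
          Forall (b, subst a s bound, if b = a then body else subst a s body)

    (* apply t phi computes the type "t phi" of the figure above. *)
    let rec apply t phi =
      match t, phi with
      | _, Id -> t
      | _, Seq (p1, p2) -> apply (apply t p1) p2
      | Bot, Bottom t' -> t'
      | _, Abstract a -> TVar a                 (* valid when a >= t in scope *)
      | _, Intro -> Forall ("_fresh", Bot, t)   (* "_fresh" fakes freshness *)
      | Forall (a, bound, body), Elim -> subst a bound body
      | Forall (a, bound, body), Inside p -> Forall (a, apply bound p, body)
      | Forall (a, bound, body), Under (_, p) -> Forall (a, bound, apply body p)
      | _ -> invalid_arg "apply: instantiation does not match the type"

For instance, with ∀ (α ≥ ⊥) α → α rendered as Forall ("a", Bot, Arrow (TVar "a", TVar "a")) and TVar "bool" standing in for a base type, apply on Seq (Inside (Bottom (TVar "bool")), Elim) yields Arrow (TVar "bool", TVar "bool"), mirroring the instantiation φ of the example that follows.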
Example. Let τmin, τcmp, and τand be the types of the parametric minimum and
comparison functions and of the conjunction of boolean formulas:

τmin ≜ ∀ (α ≥ ⊥) α → α → α        τcmp ≜ ∀ (α ≥ ⊥) α → α → bool
τand ≜ bool → bool → bool

Let φ be the instantiation ∀ (≥ bool); ⊸. Then, ⊢ φ : τmin ≤ τand and ⊢ φ :
τcmp ≤ τand hold. Let τK be the type ∀ (α ≥ ⊥) ∀ (β ≥ ⊥) α → β → α (e.g. of
the λ-term λ(x) λ(y) x) and φ′ be the instantiation ∀ (α ≥) (∀ (≥ α); ⊸). Then,
⊢ φ′ : τK ≤ τmin.

We say that types τ and τ′ are equivalent in Γ if there exist φ and φ′ such that
Γ ⊢ φ : τ ≤ τ′ and Γ ⊢ φ′ : τ′ ≤ τ. Although the types of xMLF are syntactically
the same as the types of iMLF—the Curry-style version of MLF (Le Botlan and
Rémy 2007)—they are richer, because type equivalence in xMLF is finer than type
equivalence in iMLF, as will be explained in §3.
Typing rules for xMLF. Typing rules are defined in Figure 4. Compared with
System F, the novelties are, unsurprisingly, type abstraction and type instanti-
ation. The typing of a type abstraction Λ(α ≥ τ) a extends the typing environ-
ment with the type variable α bound by τ. The typing of a type instantiation
a φ resembles the typing of a coercion, as it just requires the instantiation φ to
transform the type of a into the type of the result. Of course, it has the full
power of the type application rule of System F. For example, the type instan-
tiation a τ′ has type τ{α ← τ′} provided the term a has type ∀ (α) τ. As in
System F, a well-typed closed term has a unique type—in fact, a unique typing
derivation.
A let-binding let x = a1 in a2 cannot entirely be treated as an abbreviation for
an immediate application (λ(x : τ1) a2) a1, because the former does not require a
type annotation on x whereas the latter does. This is nothing new, and the same
as in System F extended with let-bindings. (Notice, however, that τ1, which is the
type of a1, is fully determined by a1 and could be synthesized by a typechecker.)
Example. Let id stand for the identity Λ(α ≥ ⊥) λ(x : α) x and τid for the
type ∀ (α ≥ ⊥) α → α. We have ⊢ id : τid. The function choice mentioned
in the introduction may be defined as Λ(β ≥ ⊥) λ(x : β) λ(y : β) x. It has
type ∀ (β ≥ ⊥) β → β → β. The application of choice to id, which we refer to
below as choice id, may be defined as Λ(β ≥ τid) choice β (id (!β)) and has type
∀ (β ≥ τid) β → β. The term choice id may also be given weaker types by type
instantiation. For example, choice id ⊸ has type (∀ (α ≥ ⊥) α → α) → (∀ (α ≥ ⊥)
α → α) as in System F, while choice id (⟜; ∀ (γ ≥) (∀ (≥ γ); ⊸)) has the ML
type ∀ (γ ≥ ⊥) (γ → γ) → γ → γ.
Reduction. The semantics of the calculus is given by a small-step reduction
semantics. We let reduction occur in any context, including under abstractions.
That is, the evaluation contexts are single-hole contexts, given by the grammar:
E ::= [·] | E φ | λ(x : τ) E | Λ(α ≥ τ) E
    | E a | a E | let x = E in a | let x = a in E
(λ(x : τ) a1) a2 −→ a1{x ← a2}                                  (β)
let x = a2 in a1 −→ a1{x ← a2}                                  (βlet)

a 1 −→ a                                                        (ι-Id)
a (φ; φ′) −→ (a φ) φ′                                           (ι-Seq)
a ⟜ −→ Λ(α ≥ ⊥) a        (α ∉ ftv(a))                           (ι-Intro)
(Λ(α ≥ τ) a) ⊸ −→ a{!α ← 1}{α ← τ}                              (ι-Elim)
(Λ(α ≥ τ) a) (∀ (α ≥) φ) −→ Λ(α ≥ τ) (a φ)                      (ι-Under)
(Λ(α ≥ τ) a) (∀ (≥ φ)) −→ Λ(α ≥ τ φ) a{!α ← (φ; !α)}            (ι-Inside)

E[a] −→ E[a′]    if a −→ a′                                     (Context)
The reduction rules are described in Figure 5. As usual, basic reduction steps
contain β-reduction, with the two variants (β) and (βlet ). Other basic reduc-
tion rules, related to the reduction of type instantiations and called ι-steps, are
described below. The one-step reduction is closed under the context rule. We
write −→β and −→ι for the two subrelations of −→ that contain only Con-
text and β-steps or ι-steps, respectively. Finally, the reduction is the reflexive
and transitive closure −→→ of the one-step reduction relation.
Reduction of type instantiation. Type instantiation redexes are all of the form
a φ. The first three rules do not constrain the form of a. The identity type
instantiation is just dropped (Rule ι-Id). A type instantiation composition is
replaced by the successive corresponding type instantiations (Rule ι-Seq). Rule
ι-Intro introduces a new type abstraction in front of a; we assume that the
bound variable α is fresh in a. The other three rules require the type instantiation
to be applied to a type abstraction Λ(α ≥ τ) a. Rule ι-Under propagates the
type instantiation under the abstraction, inside the body a. By contrast, Rule ι-
Inside propagates the type instantiation φ inside the bound, replacing τ by τ φ.
However, as the bound of α has changed, the domain of the type instantiation
!α is no longer τ, but τ φ. Hence, in order to maintain well-typedness, all the
occurrences of the instantiation !α in a must be simultaneously replaced by
the instantiation (φ; !α). Here, the instantiation !α is seen as atomic, i.e. all
occurrences of !α are substituted, but other occurrences of α are left unchanged
(see the appendix for the formal definition). For instance, if a is the term

Λ(α ≥ τ) λ(x : α → α) λ(y : ⊥) y (α → α) (z (!α))

then the type instantiation a (∀ (≥ φ)) reduces to:

Λ(α ≥ τ φ) λ(x : α → α) λ(y : ⊥) y (α → α) (z (φ; !α))

Rule ι-Elim eliminates the type abstraction, replacing all the occurrences of α
inside a by the bound τ. All the occurrences of !α inside a (used to instantiate τ
into α) become vacuous and must be replaced by the identity instantiation. For
example, reusing the term a above, a ⊸ reduces to λ(x : τ → τ) λ(y : ⊥) y (τ →
τ) (z 1). Notice that the type instantiations a τ and a (!α) are irreducible.
2 Properties of Reduction
The reduction has been defined so that the type erasure of a reduction sequence
in xMLF is a reduction sequence in the untyped λ-calculus. Formally, the type
erasure of a term a of xMLF is the untyped λ-term ⌈a⌉ defined inductively by

⌈x⌉ = x                      ⌈let x = a1 in a2⌉ = let x = ⌈a1⌉ in ⌈a2⌉
⌈a φ⌉ = ⌈a⌉                  ⌈λ(x : τ) a⌉ = λ(x) ⌈a⌉
⌈a1 a2⌉ = ⌈a1⌉ ⌈a2⌉          ⌈Λ(α ≥ τ) a⌉ = ⌈a⌉
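Type erasure is directly implementable. The following self-contained OCaml sketch (datatype names invented; types and instantiations kept abstract as type parameters) transcribes the six equations:

    (* Terms of xMLF, with types and instantiations left abstract. *)
    type ('ty, 'inst) term =
      | Var of string
      | Lam of string * 'ty * ('ty, 'inst) term
      | App of ('ty, 'inst) term * ('ty, 'inst) term
      | TAbs of string * 'ty * ('ty, 'inst) term    (* Lambda (a >= ty) t *)
      | Inst of ('ty, 'inst) term * 'inst
      | Let of string * ('ty, 'inst) term * ('ty, 'inst) term

    (* Untyped lambda-terms. *)
    type untyped =
      | UVar of string
      | ULam of string * untyped
      | UApp of untyped * untyped
      | ULet of string * untyped * untyped

    (* Erasure drops type instantiations and type abstractions. *)
    let rec erase = function
      | Var x -> UVar x
      | Lam (x, _, t) -> ULam (x, erase t)
      | App (t1, t2) -> UApp (erase t1, erase t2)
      | TAbs (_, _, t) -> erase t
      | Inst (t, _) -> erase t
      | Let (x, t1, t2) -> ULet (x, erase t1, erase t2)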
It is immediate to verify that two terms related by ι-reduction have the same
type erasure. Moreover, if a β-reduces to a′, then the type erasure of a β-reduces
to the type erasure of a′ in one step in the untyped λ-calculus.
2.2 Confluence
Theorem 2. The relation −→β is confluent. The relations −→ι and −→ are
confluent on terms that are well-typed in some context.
We conjecture, but have not proved, that all reduction sequences are finite.
In order to show that the calculus may also be used as the core of a programming
language, we now introduce constants and restrict the semantics to a weak
evaluation strategy.
We let the letter c range over constants. Each constant comes with its arity |c|.
The dynamic semantics of constants must be provided by primitive reduction
rules, called δ-rules. However, these are usually of a certain form. To characterize
δ-rules (and values), we partition constants into constructors and primitives,
ranged over by letters C and f , respectively. The difference between the two lies
in their semantics: primitives (such as +) are reduced when fully applied, while
constructors (such as cons) are irreducible and typically eliminated when passed
as argument to primitives.
In order to classify constructed values, we assume given a collection of type
constructors κ, together with their arities |κ|. We extend types with constructed
types κ(τ1, . . . , τ|κ|). We write ᾱ for a sequence of variables α1, . . . , αk and ∀ (ᾱ) τ
for the type ∀ (α1) . . . ∀ (αk) τ. The static semantics of constants is given by
an initial typing environment Γ0 that assigns to every constant c a type τ of
the form ∀ (α) τ1 → . . . τn → τ0 , where τ0 is a constructed type whenever the
constant c is a constructor.
We distinguish a subset of terms, called values and written v. Values are term
abstractions, type abstractions, full or partial applications of constructors, or
partial applications of primitives. We use an auxiliary letter w to character-
ize the arguments of functions, which differ for call-by-value and call-by-name
strategies. In values, an application of a constant c can involve a series of type
instantiations, but only evaluated ones and placed before all other arguments.
Moreover, the application may only be partial whenever c is a primitive. Eval-
uated instantiations θ may be quantifier eliminations or either inside or under
34 D. Rémy and B. Yakobowski
(general) instantiations. In particular, a τ and a (!α) are never values. The gram-
mar for values and evaluated instantiations is as follows:
v ::= λ(x : τ ) a
| Λ(α : τ ) a
| C θ1 . . . θk w1 . . . wn n ≤ |C|
| f θ1 . . . θk w1 . . . wn n < |f |
θ ::= ∀ ( φ) | ∀ (α ) φ |
Finally, we assume that δ-rules are of the form f θ1 . . . θk w1 . . . w|f | −→f a (that
is, δ-rules may only reduce fully applied primitives).
In addition to this general setting, we make further assumptions to relate the
static and dynamic semantics of constants.
Subject reduction: δ-reduction preserves typings, i.e., for any typing context
Γ such that Γ ⊢ a : τ and a −→f a′, the judgment Γ ⊢ a′ : τ holds.
Progress: Well-typed, full applications of primitives can be reduced, i.e., for
any term a of the form f θ1 . . . θk w1 . . . wn verifying Γ0 ⊢ a : τ, there
exists a term a′ such that a −→f a′.
It is then routine work to extend the semantics with a global store to model side
effects and verify type soundness for this extension.
3 Discussion
Expressiveness of xMLF. The translation of eMLF into xMLF shows that xMLF
is at least as expressive as eMLF. However, and perhaps surprisingly, the converse
is not true: there exist programs of xMLF that cannot be typed in MLF.
While this is mostly irrelevant when using xMLF as an internal language for eMLF,
the question is still interesting from a theoretical point of view, as understanding
xMLF on its own, i.e. independently of the type inference constraints of eMLF,
could perhaps suggest other useful extensions of xMLF.
4
So far, type soundness has only been proved for the original, but slightly
weaker, variant of MLF (Le Botlan 2004) and for the shallow, recast version of
MLF (Le Botlan and Rémy 2007).
For the sake of simplicity, we explain the difference between xMLF and iMLF, the Curry-style version of MLF (which has the same expressiveness as eMLF). Although syntactically identical, the types of xMLF and of syntactic iMLF differ in their interpretation of alias bounds, i.e. quantifications of the form ∀(β ≥ α) τ. Consider, for example, the two types τ0 and τid defined as ∀(α ≥ τ) ∀(β ≥ α) β → α and ∀(α ≥ τ) α → α. In iMLF, alias bounds can be expanded, and τ0 and τid are equivalent. Roughly, the set of their instances (stripped of toplevel quantifiers) is {τ′ → τ′ | τ ≤ τ′}. In contrast, the set of instances of τ0 is larger in xMLF: it is at least a superset of {τ2 → τ1 | τ ≤ τ1 ≤ τ2}. This level of generality cannot be expressed in iMLF.
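As a concrete illustration (our example, writing ≤ for the instance relation and assuming a base type int): take τ = ⊥. The instances of τid are then the types τ′ → τ′, such as (int → int) → (int → int). By contrast, choosing τ1 = ∀(γ) γ → γ and τ2 = int → int (so that ⊥ ≤ τ1 ≤ τ2), the type τ0 also admits the instance (int → int) → ∀(γ) γ → γ, a function that returns a polymorphic result from a monomorphic argument; no instance of τid has this shape.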
The current treatment of alias bounds in xMLF is quite natural in a Church-style presentation. Surprisingly, it is also simpler than treating them as in eMLF. A restriction of xMLF without alias bounds that is closed under reduction and in closer correspondence with iMLF can still be defined a posteriori, by constraining the formation of terms, but the definition is contrived and unnatural. Instead of restricting xMLF to match the expressiveness of iMLF, a question worth further investigation is whether the treatment of alias bounds could be enhanced in iMLF and eMLF to match the one in xMLF without compromising type inference.
Related works. A strong difference between eMLF and xMLF is the use of explicit coercions to trace the derivation of type instantiation judgments. A similar approach has already been used in a language with subtyping and intersection types, proposed as a target for the compilation of bounded polymorphism (Crary 2000). In both cases, coercions are used to make typechecking a trivial process. In our case, they are also exploited to make subject reduction easy, by introducing a language that describes how type instance derivations must be transformed during reduction. (We believe that the use of explicit coercions for simplifying subject-reduction proofs has been neglected.) In both approaches, reduction is split into a standard notion of β-reduction and a new form of reduction (which we call ι-reduction) that only deals with coercions, preserves type-erasures, and is (conjectured to be) strongly normalizing. There are also important differences. While both coercion languages have common forms, our coercions deliberately keep the instance-bounded polymorphism form ∀(α ≥ τ) τ′. By contrast, coercions in (Crary 2000) are used to eliminate the subtype-bounded polymorphism form ∀(α ≤ τ) τ′, using intersection types and contravariant arrow coercions instead, which we do not need. It would be worth checking whether union types, which are proposed as an extension in (Crary 2000), could be used to encode away our instance-bounded polymorphism form.
Besides this work and the several papers that describe variants of MLF, there are actually few other related works. Both Leijen and Löh (2005) and Leijen (2007) have studied the extension of MLF with qualified types and, as a subcase, the translation of MLF without qualified types into System F. However, in order to handle type instantiations, a term a of type ∀(α ≥ τ) τ′ is elaborated as a function of type ∀(α) (τ∗ → α) → τ′, where τ∗ is a runtime representation of τ. The first argument is a runtime coercion, which bears strong similarities with our instantiations. However, an important difference is that their coercions are at
the level of terms, while our instantiations are at the level of types. In particular, although coercion functions should not change the semantics, this critical result has not been proved so far, whereas in our setting the type-erasure semantics comes for free, by construction. The impact of coercion functions in a call-by-value language with side effects is also unclear. Perhaps a closer connection between their coercion functions and our instantiations could be established and used to actually prove that their coercions do not alter the semantics. However, even if such a result could be proved, coercions should preferably remain at the type level, as in our setting, rather than be intermixed with terms, as in their proposal.
Future works. The demand for an internal language for MLF first arose in the context of using the eMLF type system for the Haskell language. We expect xMLF to accommodate qualified types better than eMLF, since no evidence function would be needed for flexible polymorphism, but this remains to be verified.
Type instantiation, which changes the type of an expression without changing its meaning, goes far beyond type application in System F and resembles retyping functions in System Fη, the closure of F by η-conversion (Mitchell 1988). Those functions can be seen either at the level of terms, as expressions of System F that βη-reduce to the identity, or at the level of types, as type conversions. A loose parallel can be drawn between the encoding of MLF into System F by Leijen and Löh (2005), which uses term-level coercions, and xMLF, which uses type-level instantiations. Additionally, perhaps Fη could be extended with a form of abstraction over retyping functions, much as type abstraction ∀(α ≥ τ) in xMLF amounts to abstracting over the instantiation !α of type τ → α. (Or perhaps, as suggested by the work of Crary (2000), intersection and union types could be added to Fη to avoid the need for abstracting over coercion functions.)
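To make retyping functions concrete (our example, standard for Fη, assuming a base type int): the System F term λ(f : ∀(α) α → α) f [int] has type (∀(α) α → α) → int → int, and its type erasure λf. f is the identity; it therefore coerces the polymorphic type ∀(α) α → α to its instance int → int without changing the underlying untyped term.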
Regarding type soundness, it is also worth noticing that the proof of subject reduction in xMLF does not subsume, but complements, the one in the original presentation of MLF. The latter does not explain how to transform type annotations, but shows that annotation sites need not be introduced (only transformed) during reduction. Because xMLF has full type information, it cannot say anything about type information that could be left implicit and inferred. Given a term in xMLF, can we rebuild a term in iMLF with minimal type annotations? While this should be easy if we require that corresponding subterms have identical types in xMLF and iMLF, the answer is unclear if we allow subterms to have different types.
The semantics of xMLF allows reduction (and elimination) of type instantiations a φ through ι-reduction, but does not perform reduction (and simplification) of instantiations φ alone. It would be possible to define a notion of reduction on instantiations, φ −→ φ′ (such that, for instance, ∀(≥ φ1; φ2) −→ ∀(≥ φ1); ∀(≥ φ2), or conversely?), and to extend the reduction of terms with a context rule a φ −→ a φ′ whenever φ −→ φ′. This might be interesting for more economical representations of instantiations. However, it is unclear whether there exists an interesting form of reduction that is both Church-Rosser and large enough for optimization purposes. Perhaps one should rather consider instantiation transformations that preserve observational equivalence, which would leave more freedom in the way one instantiation could be replaced by another.
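For instance (our reading of the rule suggested above): both sides of ∀(≥ φ1; φ2) −→ ∀(≥ φ1); ∀(≥ φ2) turn a type ∀(α ≥ τ) σ into ∀(α ≥ τ2) σ whenever φ1 takes τ to τ1 and φ2 takes τ1 to τ2, so the two instantiations agree on types; the open question is which orientation, if either, yields a confluent and sufficiently large rewrite system.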
References
Barendregt, H.P.: The Lambda Calculus: Its Syntax and Semantics. North-Holland,
Amsterdam (1984), ISBN: 0-444-86748-1
Crary, K.: Typed compilation of inclusive subtyping. In: ICFP 2000: Proceedings of
the fifth ACM SIGPLAN international conference on Functional programming, pp.
68–81. ACM, New York (2000)
Herms, P.: Partial Type Inference with Higher-Order Types. Master’s thesis, University
of Pisa and INRIA (2009) (to appear)
Le Botlan, D.: MLF: An extension of ML with second-order polymorphism and implicit instantiation. PhD thesis, Ecole Polytechnique (June 2004) (English version)
Le Botlan, D., Rémy, D.: MLF: Raising ML to the power of System-F. In: Proceedings
of the Eighth ACM SIGPLAN International Conference on Functional Programming,
August 2003, pp. 27–38 (2003)
Le Botlan, D., Rémy, D.: Recasting MLF. Research Report 6228, INRIA, Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France (June 2007)
Leijen, D.: A type directed translation of MLF to System F. In: The International
Conference on Functional Programming (ICFP 2007). ACM Press, New York (2007)
Leijen, D.: Flexible types: robust type inference for first-class polymorphism. In: Proceedings of the 36th annual ACM Symposium on Principles of Programming Languages (POPL 2009), pp. 66–77. ACM, New York (2009)
Leijen, D., Löh, A.: Qualified types for MLF. In: ICFP 2005: Proceedings of the tenth
ACM SIGPLAN international conference on Functional programming, pp. 144–155.
ACM Press, New York (2005)
Mitchell, J.C.: Polymorphic type inference and containment. Information and Computation 76(2–3), 211–249 (1988)
Peyton Jones, S.: Haskell 98 Language and Libraries: The Revised Report. Cambridge University Press, Cambridge (2003), ISBN: 0521826144
Rémy, D., Yakobowski, B.: A Church-style intermediate language for MLF (extended version) (September 2008a), http://gallium.inria.fr/~remy/mlf/xmlf.pdf
Rémy, D., Yakobowski, B.: From ML to MLF: Graphic type constraints with efficient type inference. In: The 13th ACM SIGPLAN International Conference on Functional Programming (ICFP 2008), Victoria, BC, Canada, September 2008, pp. 63–74 (2008b)
Yakobowski, B.: Graphical types and constraints: second-order polymorphism and inference. PhD thesis, University of Paris 7 (December 2008)
ΠΣ: Dependent Types without the Sugar

Abstract. The recent success of languages like Agda and Coq demonstrates the potential of using dependent types for programming. These systems rely on many high-level features like datatype definitions, pattern matching and implicit arguments to facilitate the use of the languages. However, these features complicate the metatheoretical study and are a potential source of bugs.
To address these issues we introduce ΠΣ, a dependently typed core language. It is small enough for metatheoretical study and the type checker is small enough to be formally verified. In this language there is only one mechanism for recursion (used for types, functions and infinite objects) and an explicit mechanism to control unfolding, based on lifted types. Furthermore, structural equality is used consistently for values and types; this is achieved by a new notion of α-equality for recursive definitions. We show, by translating several high-level constructions, that ΠΣ is suitable as a core language for dependently typed programming.
1 Introduction
Dependent types offer programmers a flexible path towards formally verified programs and, at the same time, opportunities for increased productivity through new ways of structuring programs (Altenkirch et al. 2005). Dependently typed programming languages like Agda (Norell 2007) are gaining in popularity, and dependently typed programming is also becoming more popular in the Coq community (Coq Development Team 2009), for instance through the use of some recent extensions (Sozeau 2008). An alternative to moving to full-blown dependent types as present in Agda and Coq is to add dependently typed features without giving up a traditional view of the distinction between values and types. This is exemplified by the presence of GADTs in Haskell, and by more experimental systems like Ωmega (Sheard 2005), ATS (Cui et al. 2005), and the Strathclyde Haskell Enhancement (McBride 2009).
Dependently typed languages tend to offer a number of high-level features for reducing the complexity of programming in such a rich type discipline while, at the same time, improving the readability of the code. These features include:
Datatype definitions. A convenient syntax for defining dependently typed
families inductively and/or coinductively.
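For instance (our illustration, approximating such an inductive family with Haskell GADTs; the names Nat, Vec and vhead are ours, not the paper's):

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Natural numbers, promoted to the kind level by DataKinds.
data Nat = Zero | Suc Nat

-- Length-indexed vectors: the type index n records the length.
data Vec (a :: *) (n :: Nat) where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- The index makes this head function total: the type
-- Vec a ('Suc n) rules out the empty vector at compile time.
vhead :: Vec a ('Suc n) -> a
vhead (Cons x _) = x

A full dependently typed language generalizes this idea, letting types be indexed by arbitrary values rather than only by promoted constructors.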
These features, while important for the usability of dependently typed languages,
complicate the metatheoretic study and can be the source of subtle bugs in the
type checker. To address such problems, we can use a core language which is
small enough to allow metatheoretic study. A verified type checker for the core
language can also provide a trusted core in the implementation of a full language.
Coq makes use of a core language, the Calculus of (Co)Inductive Constructions
(CCIC, Giménez 1996). However, this calculus is quite complex: it includes the
schemes for strictly positive datatype definitions and the accompanying recursion
principles. Furthermore it is unclear whether some of the advanced features of
Agda, such as dependently typed pattern matching, the flexible use of mixed
induction/coinduction, and induction-recursion, can be easily translated into
CCIC or a similar calculus. (One can argue that a core language is less useful if
the translation from the full language is difficult to understand.)
In the present paper we suggest a different approach: we propose a core language that is designed in such a way that we can easily translate the high-level features mentioned above; on the other hand, we postpone the question of totality. Totality is important for dependently typed programs, partly because non-terminating proofs are not very useful, and partly for reasons of efficiency: if a certain type has at most one total value, then total code of that type does not need to be run at all. However, we believe that it can be beneficial to separate the verification of totality from the functional specification of the code. A future version of our core language may have support for independent certificates of totality (and the related notions of positivity and stratification); such certificates could be produced manually, or through the use of a termination checker.
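For example (our example): in a total setting, a proof of a propositional equality has at most one value, so a compiler may erase such proofs and never run them; separating totality checking from type checking postpones, but does not forfeit, this optimization.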
The core language proposed in this paper is called ΠΣ and is based on a small
collection of basic features:1