Substantial discounts on bulk quantities of Jones & Bartlett Learning publications are available to corporations,
professional associations, and other qualified organizations. For details and specific discount information, contact
the special sales department at Jones & Bartlett Learning via the above contact information or send an email to
[email protected].
Copyright © 2023 by Jones & Bartlett Learning, LLC, an Ascend Learning Company
All rights reserved. No part of the material protected by this copyright may be reproduced or utilized in any form,
electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system,
without written permission from the copyright owner.
The content, statements, views, and opinions herein are the sole expression of the respective authors and not that of
Jones & Bartlett Learning, LLC. Reference herein to any specific commercial product, process, or service by trade
name, trademark, manufacturer, or otherwise does not constitute or imply its endorsement or recommendation
by Jones & Bartlett Learning, LLC and such reference shall not be used for advertising or product endorsement
purposes. All trademarks displayed are the trademarks of the parties noted herein. Programming Languages: Concepts
and Implementation is an independent publication and has not been authorized, sponsored, or otherwise approved by
the owners of the trademarks or service marks referenced in this product.
There may be images in this book that feature models; these models do not necessarily endorse, represent, or participate
in the activities represented in the images. Any screenshots in this product are for educational and instructive purposes
only. Any individuals and scenarios featured in the case studies throughout this product may be real or fictitious but
are used for instructional purposes only.
23862-4
Production Credits
VP, Content Strategy and Implementation: Christine Emerton
Product Manager: Ned Hinman
Content Strategist: Melissa Duffy
Project Manager: Jessica deMartin
Senior Project Specialist: Jennifer Risden
Digital Project Specialist: Rachel DiMaggio
Marketing Manager: Suzy Balk
Product Fulfillment Manager: Wendy Kilborn
Composition: S4Carlisle Publishing Services
Cover Design: Michael O’Donnell
Media Development Editor: Faith Brosnan
Rights Specialist: James Fortney
Cover Image: © javarman/Shutterstock
Printing and Binding: McNaughton & Gunn
Library of Congress Cataloging-in-Publication Data
Names: Perugini, Saverio, author.
Title: Programming languages : concepts and implementation / Saverio
Perugini, Department of Computer Science, University of Dayton.
Description: First edition. | Burlington, MA : Jones & Bartlett Learning,
[2023] | Includes bibliographical references and index.
Identifiers: LCCN 2021022692 | ISBN 9781284222722 (paperback)
Subjects: LCSH: Computer programming. | Programming languages (Electronic
computers)
Classification: LCC QA76.6 .P47235 2023 | DDC 005.13–dc23
LC record available at https://lccn.loc.gov/2021022692
Printed in the United States of America
♰ JMJ ♰
Ad majorem Dei gloriam.
Omnia in Christo.
In loving memory of
George Daloia,
Nicola and Giuseppina Perugini, and
Bob Twarek.
Requiem aeternam dona eis, Domine, et lux perpetua luceat eis.
Requiescant in pace. Amen.
Contents
Preface xvii
Part I Fundamentals 1
1 Introduction 3
1.1 Text Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Chapter Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 The World of Programming Languages . . . . . . . . . . . . . . . . . . 4
1.3.1 Fundamental Questions . . . . . . . . . . . . . . . . . . . . . . . 4
1.3.2 Bindings: Static and Dynamic . . . . . . . . . . . . . . . . . . . 6
1.3.3 Programming Language Concepts . . . . . . . . . . . . . . . . 7
1.4 Styles of Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4.1 Imperative Programming . . . . . . . . . . . . . . . . . . . . . . 8
1.4.2 Functional Programming . . . . . . . . . . . . . . . . . . . . . . 11
1.4.3 Object-Oriented Programming . . . . . . . . . . . . . . . . . . 12
1.4.4 Logic/Declarative Programming . . . . . . . . . . . . . . . . . 13
1.4.5 Bottom-up Programming . . . . . . . . . . . . . . . . . . . . . . 15
1.4.6 Synthesis: Beyond Paradigms . . . . . . . . . . . . . . . . . . . 16
1.4.7 Language Evaluation Criteria . . . . . . . . . . . . . . . . . . . 19
1.4.8 Thought Process for Problem Solving . . . . . . . . . . . . . . 20
1.5 Factors Influencing Language Development . . . . . . . . . . . . . . . 21
1.6 Recurring Themes in the Study of Languages . . . . . . . . . . . . . . 25
1.7 What You Will Learn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.8 Learning Outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.9 Thematic Takeaways . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.10 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.11 Notes and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . 32
15 Conclusion 713
15.1 Language Themes Revisited . . . . . . . . . . . . . . . . . . . . . . . . . 714
15.2 Relationship of Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . 714
15.3 More Advanced Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . 716
15.4 Bottom-up Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . 716
15.5 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 719
Bibliography B-1
Index I-1
Preface
Chapter Dependencies
The following figure depicts the dependencies between the chapters of this
text.
[Figure: chapter dependency graph relating Part I: Fundamentals (Chapters 1–6), Part II: Types (Chapters 7–9), and Part III: Interpreter Implementation (Chapters 10–12), with online Appendix B (ML) and Appendix C (Haskell).]
Instructors can take multiple pathways through this text to customize their
languages course. Within each of the following tracks, instructors can add or
subtract material based on these chapter dependencies to suit the needs of their
students.
Multiple Pathways
Since the content in this text is arranged in a modular fashion, the pathways
through it are customizable.
Concepts-Based Approach
The following figure demonstrates the concepts-based approach through the text.
[Figure: concepts-based pathway—Part I: Fundamentals (Chapters 1–6), Chapters 7–8 and Sections 9.1–9.5 of Part II: Types, the conceptual parts of Chapter 12 (Section 12.3, parameter-passing mechanisms, and Section 12.5, lazy evaluation), and Chapters 13–14, with online Appendix B (ML) and Appendix C (Haskell).]
The path through the text modeled here focuses solely on the conceptual parts
of Chapters 9 and 10–12, and omits the “Interpreter Implementation” module in
favor of the “Other Styles of Programming” module.
Interpreter-Based Approach
The following figure demonstrates the interpreter-based approach using Python.
[Figure: interpreter-based pathway—Part I: Fundamentals (Chapters 1–6), Chapter 9 of Part II: Types, and Part III: Interpreter Implementation (Chapters 10–12), with online Appendix A (Python Primer) and Appendix D (Camille).]
[Figure: a combined pathway—Part I: Fundamentals (Chapters 1–6), Part II: Types (Chapters 7–9), Chapters 10–11 with only the conceptual parts of Chapter 12 (Section 12.3, parameter-passing mechanisms, and Section 12.5, lazy evaluation), and Chapters 13–14, with online Appendix A (Python Primer), Appendix B (ML), Appendix C (Haskell), and Appendix D (Camille).]
The pathway modeled here retains the entirety of each of the “Types” and
“Other Styles of Programming” modules, but omits Chapter 12 of the “Interpreter
Implementation” module, except for the conceptual parts (i.e., the survey of
parameter-passing mechanisms, including lazy evaluation).
Note to Readers
Establishing an understanding of the organization and concepts of programming
languages and the elegant programming abstractions/techniques enabled by a
mastery of those concepts requires work. This text encourages its reader to learn
language concepts much as one learns to swim or drive a car—not just by reading
about it, but by doing it—and within that space lies the joy. A key theme of this text
is the emphasis on implementation. The programming exercises afford the reader
ample opportunities to implement the language concepts we discuss and require
a fair amount of critical thought and design.

Competency                                                              Chapter(s)

A. Present the design and implementation of a class considering         10–12
   object-oriented encapsulation mechanisms (e.g., class
   hierarchies, interfaces, and private members).
B. Produce a brief report on the implementation of a basic              5
   algorithm considering control flow in a program using
   dynamic dispatch that avoids assigning to a mutable state
   (or considering reference equality) for two different
   languages.
C. Present the implementation of a useful function that takes and       5–6, 8–9
   returns other functions considering variables and lexical
   scope in a program as well as functional encapsulation
   mechanisms.
D. Use iterators and other operations on aggregates (including          5, 8, 12–13
   operations that take functions as arguments) in two
   programming languages and present to a group of
   professionals some ways of selecting the most natural
   idioms for each language.
E. Contrast and present to peers
   (1) the procedural/functional approach (defining a function          8–9
   for each operation with the function body providing a case
   for each data variant) and
   (2) the object-oriented approach (defining a class for each data     10–12
   variant with the class definition providing a method for
   each operation).
F. Write event handlers for a web developer for use in reactive         13
   systems such as GUIs.
G. Demonstrate program pieces (such as functions, classes,              7–13
   methods) that use generic or compound types, including for
   collections to write programs.
H. Write a program for a client to process a representation of code     5, 10–13
   that illustrates the incorporation of an interpreter, an
   expression optimizer, and a documentation generator.
I. Use type-error messages, memory leaks, and dangling pointers         6–7
   to debug a program for an engineering firm.

Table 1 Mapping from the ACM/IEEE Computing Curricula 2020 to Chapters of This
Text
Supplemental Material
Supplemental material for this text, including presentation slides and other
instructor-related resources, is available online.
Acknowledgments
With a goal of nurturing students, and with an abiding respect for the craft of
teaching and professors who strive to teach well, I have sought to produce a text
that both illuminates language concepts that are enlightening to the mind and is
faithful and complete as well as useful and practical. Doing so has been a labor of
love. This text would not have been possible without the support and inspiration
from a variety of sources.
I owe a debt of gratitude to the computer scientists with expertise in languages
who, through authoring the beautifully crafted textbooks from which I originally
learned this material, have broken new ground in the pedagogy of programming
languages: Abelson and Sussman (1996); Friedman, Wand, and Haynes (2001);
and Friedman and Felleisen (1996a, 1996b). I am particularly grateful to the
scholars and educators who originally explored the language landscape and how
to most effectively present the concepts therein. They shared their results with
the world through the elegant and innovative books they wrote with precision
and flair. You are truly inspirational. My view of programming languages and
how best to teach languages has been informed and influenced by these seminal
books. In writing this text, I was particularly inspired by Essentials of Programming
Languages (Friedman, Wand, and Haynes 2001). Chapters 10–11 and Sections 12.2,
12.4, 12.6, and 12.7 of this text are inspired by their Chapter 3. Our contribution is
the use of Python to build EOPL-style interpreters. The Little Schemer (Friedman and
Felleisen 1996a) and The Seasoned Schemer (Friedman and Felleisen 1996b) were a
delight to read and work through, and The Structure and Interpretation of Computer
Programs (Abelson and Sussman 1996) will always be a classic. These books are
gifts to our field.
Other books have also been inspiring and influential in forming my approach
to teaching and presenting language concepts, including Dybvig 2009, Graham
(2004b, 1993), Kamin (1990), Hutton (2007), Krishnamurthi (2003, 2017), Thompson
(2007), and Ullman (1997). Readers familiar with these books will observe their
imprint here. I have attempted to weave a new tapestry here from the palette
set forth in these books through my synthesis of a conceptual/principles-based
approach with an interpreter-based approach. I also thank James D. Arthur, Naren
Ramakrishnan, and Stephen H. Edwards at Virginia Tech, who first shared this
material with me.
I have also been blessed with bright, generous, and humble students who have
helped me with the development of this text in innumerable ways. Their help is
heartfelt and very much appreciated. In particular, Jack Watkin, Brandon Williams,
and Zachary Rowland have contributed significant time and effort. I am forever
thankful to and for you. I also thank other University of Dayton students and
alumni of the computer science program for helping in various ways, including
Travis Suel, Patrick Marsee, John Cresencia, Anna Duricy, Masood Firoozabadi,
Adam Volk, Stephen Korenewych, Joshua Buck, Tyler Masthay, Jonathon Reinhart,
Howard Poston, and Philip Bohun.
I thank my colleagues Phu Phung and Xin Chen for using preliminary editions
of this text in their courses. I also thank the students at the University of Dayton
who used early manuscripts of this text in their programming languages courses
and provided helpful feedback.
Thanks to John Lewis at Virginia Tech for putting me in contact with Jones
& Bartlett Learning and providing guidance throughout the process of bringing
this text to production. I thank Simon Thompson at the University of Kent (in
the United Kingdom) for reviewing a draft of this manuscript and providing
helpful feedback. I am grateful to Doug Hodson at the Air Force Institute of
Technology and Kim Conde at the University of Dayton for providing helpful
feedback.
Saverio Perugini
April 2021
About the Author

List of Figures

13.1 The general call/cc continuation capture and invocation process. 553
13.2 Example of call/cc continuation capture and invocation process. 554
D.1 The grammar in EBNF for the Camille programming language. . . . 812
List of Tables
5.1 Examples of Shortening car-cdr Call Chains with Syntactic Sugar 151
5.2 Binding Approaches Used in let and let* Expressions . . . . . . . 157
5.3 Reducing let to lambda. . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
5.4 Reducing let* to lambda. . . . . . . . . . . . . . . . . . . . . . . . . . . 161
5.5 Reducing letrec to lambda. . . . . . . . . . . . . . . . . . . . . . . . . 162
5.6 Semantics of let, let*, and letrec . . . . . . . . . . . . . . . . . . . 163
5.7 Functional Programming Design Guidelines . . . . . . . . . . . . . . . 181
9.1 Support for C/C++ Style structs and unions in ML, Haskell,
Python, and Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
9.2 Support for Composition and Decomposition of Variant Records in
a Variety of Programming Languages. . . . . . . . . . . . . . . . . . . . 354
9.3 Summary of the Programming Exercises in This Chapter Involving
the Implementation of a Variety of Representations for an
Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
9.4 The Variety of Representations of Environments in Racket Scheme
and Python Developed in This Chapter . . . . . . . . . . . . . . . . . . 375
9.5 List-of-Lists/Vectors Representations of an Environment Used in
Programming Exercise 9.8.4. . . . . . . . . . . . . . . . . . . . . . . . . . 377
9.6 List-of-Lists Representations of an Environment Used in Program-
ming Exercise 9.8.5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
9.7 Comparison of the Main Concepts and Features of ML and Haskell 384
Introduction
A language that doesn’t affect the way you think about programming,
is not worth knowing.
— Alan Perlis (1982)
Since language concepts are the building blocks from which all languages are
constructed and organized, an understanding of the concepts implies that, given a
(new) language, one can:
• Deconstruct it into its essential concepts and determine the implementation
options for these concepts.
• Focus on the big picture (i.e., core concepts/features and options) and not
language nuisances or minutiae (e.g., syntax).
• Discern in which contexts (e.g., application domains) it is an appropriate or
ideal language of choice.
• In turn, learn to use, assimilate, and harness the strengths of the language
more quickly.
1. Language definition time (e.g., the keyword int bound to the meaning of
integer)
2. Language implementation time (e.g., int data type bound to a storage size such
as four bytes)
3. Compile time (e.g., identifier x bound to an integer variable)
4. Link time (e.g., printf is bound to a definition from a library of routines)
5. Load time (e.g., variable x bound to memory cell at address 0x7cd7—can
happen at run-time as well; consider a variable local to a function)
(Bindings established toward the top of this list are static; those established toward the bottom are dynamic.)
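To make the distinction concrete, consider a sketch of ours (not an example from the text): in Python, the binding of a name to a type is dynamic, so the same name may be bound to objects of different types as the program runs.

```python
# In Python the binding of a name to a type happens at run-time and can
# change: the same name may refer to objects of different types.
x = 5                          # x bound to an int object at run-time
kind_before = type(x).__name__
x = "five"                     # rebinding occurs at run-time
kind_after = type(x).__name__
```

In C, by contrast, a declaration such as `int x;` binds `x` to the type `int` at compile time, and that binding remains fixed during run-time.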
Language definition time involves defining the syntax (i.e., form) and semantics
(i.e., meaning) of a programming language. (Language definition and description
methods are the primary topic of Chapter 2.) Language implementation time is
the time at which a compiler or interpreter for the language is built. (Building
language interpreters is the focus of Chapters 10–12.) At this time some of the
semantics of the implemented language are bound/defined as well. The examples
given in the preceding list are not always performed at the particular time in
which they are classified. For instance, binding the variable x to the memory cell
at address 0x7cd7 can also happen at run-time in cases where x is a variable local
to a function or block.
The aforementioned bindings are often broadly categorized as either static or
dynamic (Table 1.1). A static binding happens before run-time (usually at compile
time) and often remains unchangeable during run-time. A dynamic binding
happens at run-time and can be changed at run-time. Dynamic binding is also
called late binding.
Static bindings occur before run-time and are fixed during run-time.
Dynamic bindings occur at run-time and are changeable during run-time.
and the granularity of those options often vary from language to language and
depend on factors such as the application domain targeted by the language and
the particular problem to be solved. Some concepts, including control abstraction,
are omitted in certain languages.
Beyond these fundamental/universal language concepts, an exploration of a
variety of programming styles and language support for these styles leads to
a host of other important principles of programming languages and language
constructs/abstractions (e.g., closures, higher-order functions, currying, and first-
class continuations).
1. In C, such statements return the value of i after the assignment takes place.
x = int(input())
print(x + x)
If the input stream contains the integer 1 followed by the integer 2, readers
accustomed to imperative programming might predict the output of this
program to be 2 because the input function executes only once, reads the
value 1,2 and stores it in the variable x. However, one might interpret the
line print (x + x) as print (int(input()) + int(input())), since x
stands for int(input()). With this interpretation, one might predict the output
of the program to be 3, where the first and second invocations to input() read
1 and 2, respectively. While mathematics involves binding (e.g., let x = 1 in . . . ),
mathematics does not involve assignment.3
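The two readings can be simulated without an actual input stream (a sketch of ours; the helper `read` is our stand-in for `input`, so the example is self-contained):

```python
# Simulate an input stream containing 1 followed by 2.
stream = iter(["1", "2"])
def read():              # stands in for input()
    return next(stream)

# Imperative reading: read() executes once; x names the stored value 1.
x = int(read())
imperative_output = x + x                          # 1 + 1

# Substitution reading: x stands for int(read()) itself, so evaluating
# x + x performs two reads.
stream = iter(["1", "2"])
substitution_output = int(read()) + int(read())    # 1 + 2
```

The first reading yields 2; the second yields 3, matching the two predictions discussed above.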
The aforementioned interpretation of the statement print (x + x) as
print (int(input()) + int(input())) might seem unnatural to most
readers. For those readers who are largely familiar with the imperative style of
programming, describing computation through side effect is so fundamental to
and ingrained into their view of programming and so unconsciously integrated
into their programming activities that the prior interpretation is viewed as
entirely foreign. However, that interpretation might seem entirely natural to a
mathematician or someone who has no experience with programming.
Side effects also make a program difficult to understand. For instance, consider
the following Python program:
def f():
    global x
    x = 2
    return x

# main program
x = 1
print(x + f())
Function f has a side effect: After f is called, the global variable x has value
2, which is different than the value it had prior to the call to f. As a result,
the output of this program depends on the order in which the operands to the
+ operator are evaluated: evaluating x first yields 3, while evaluating f() first yields 4.
2. The Python int function used here converts the string read with the input function to an integer.
3. The common programming idiom x=x+1 can be confusing to nonprogrammers because it appears
to convey that two entities are equal that are clearly not equal.
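The dependence of x + f() on evaluation order can be made explicit (a sketch of ours) by forcing each order by hand:

```python
def f():
    global x
    x = 2
    return x

# Left operand first (Python's actual order): x is read before f() runs.
x = 1
left_first = x + f()      # 1 + 2

# Right operand first: f() runs and mutates x before x is read.
x = 1
tmp = f()
right_first = x + tmp     # 2 + 2
```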
4. Ironically, John Backus, the recipient of the 1977 ACM A. M. Turing Award for contributions
to the primarily imperative programming language Fortran, titled his Turing Award paper “Can
Programming Be Liberated from the von Neumann Style?: A Functional Style and Its Algebra of
Programs.” This paper introduced the functional programming language FP through which Backus
(1978) cast his argument. While FP was never fully embraced by the industrial programming
community, it ignited both debate and interest in functional programming and subsequently
influenced multiple languages supporting a functional style of programming (Interview with Simon
Peyton-Jones 2017).
5. Computers have been designed for these inherently non-imperative styles as well (e.g., Lisp
machine and Warren Abstract Machine).
Objects are program entities that encapsulate data and functionality. An object-
oriented style of programming typically unifies the concepts of data and
procedural abstraction through the constructs of classes and objects. The object-
oriented style of programming was pioneered in the Smalltalk programming
language, designed by Alan Kay and colleagues in the early 1970s at Xerox PARC.
8. Erlang is a language supporting concurrent and functional programming that was developed by
the telecommunications company Ericsson.
Since the logical conjunction operator (∧) is commutative, these two propositions
are semantically equivalent and, thus, it should not matter which of the two forms
we use in a program. However, since computers are deterministic systems, the
interpreter for a language supporting declarative programming typically evaluates
the terms on the left-hand side of these propositions (i.e., R and W) in a left-to-
right or right-to-left order. Thus, the desired result of the program can—due to side
effect and other factors—depend on that evaluation order, akin to the evaluation
order of the terms in the Python expression x + f() described earlier. Languages
supporting logic/declarative programming as the primary mode of performing
computation often equip the programmer with facilities to impart control over
the search strategy used by the system (e.g., the cut operator in Prolog). These
control facilities violate a defining principle of a declarative style—that is, the
programmer need only be concerned with the logic and can leave the control
(i.e., the inference methods used to produce program output) up to the system.
Unlike Prolog, the Mercury programming language is nearly pure in its support
for declarative programming because it does not support control facilities intended
to circumvent or direct the search strategy built into the system (Somogyi,
Henderson, and Conway 1996). Moreover, the form of the specification of the facts
and rules in a logic/declarative program should have no bearing on the output
of the program. Unfortunately, it often does. Mercury is the closest to a language
supporting a pure form of logic/declarative programming.
9. http://arclanguage.org
10. A paradigm is a worldview—a model. A model is a simplified view of some entity in the real world
(e.g., a model airplane) that is simpler to interact with. A programming language paradigm refers to a
style of performing computation from which programming in a language adhering to the tenets of that
style proceeds. A language paradigm can be thought of as a family of natural languages, such as the
Romance languages or the Germanic languages.
11. In the past, even the classical functional and logic/declarative paradigms, and specifically the
languages Lisp and Prolog, respectively, were considered paradigms primarily for artificial intelligence
applications even though the emacs text editor for UNIX and Autocad are two non-AI applications that
are more than 30 years old and were developed in Lisp. Now there are Lisp and Prolog applications
in a variety of other domains (e.g., Orbitz). We refer the reader to Graham (1993, p. 1) for the details of
the origin of the (accidental) association between Lisp and AI. Nevertheless, certain languages are still
ideally suited to solve problems in a particular niche application domain. For instance, C is a language
for systems programming and continues to be the language of choice for building operating systems.
15. For instance, the object-relational impedance mismatch between relational database systems (e.g.,
PostgreSQL or MySQL) and languages supporting object-oriented programming—which refers to the
challenge in mapping relational schemas and database tables (which are set-, bag-, or list-oriented) in
a relational database system to class definitions and objects—is more a reflection of differing levels
of granularity in the various data modeling support structures than one fundamental to describing
computation.
16. Some have stated that Lisp stands for Lisp Is Superfluous Parentheses.
[Figure 1.2 depicts interpreters operationalizing the universal concepts—syntax, semantics, bindings, scope, parameter passing, types, and control—across a wide range of languages (e.g., C, Fortran, BASIC, Scheme, Common Lisp, ML, Haskell, Smalltalk, C++, Java, Python, JavaScript, Ruby, Perl, Scala, Clojure, Erlang, Elixir, Go, Rust, Swift, Kotlin, TypeScript, Lua, Julia, Eiffel, Dylan, Factor, MATLAB, R, SQL, C#, Mercury, and Prolog), grouped by the styles of programming they support.]
Figure 1.2 Within the context of their support for a variety of programming styles,
all languages involve a core set of universal concepts that are operationalized
through an interpreter and provide a basis for (comparative) evaluation. Asterisks
indicate (near-)purity with respect to programming style.
one style (e.g., imperative thought) into another (e.g., functional constructs), and
vice versa.
An advantageous outcome of learning to solve problems using an unfamiliar
style of programming (e.g., functional, declarative) is that it involves a
fundamental shift in one’s thought process toward problem decomposition and
solving. Learning to think and program in alternative styles typically entails
unlearning bad habits acquired unconsciously through the use of other languages
to accommodate the lack of support for that style in those languages. Consider
how a programmer might implement an inherently recursive algorithm such as
mergesort using a language without support for recursion:
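One common workaround—sketched here in Python as our own illustration, not the text's—is a bottom-up formulation that replaces the recursive calls with an explicit worklist of sorted runs:

```python
def merge(a, b):
    # Standard merge of two sorted lists, itself written without recursion.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def mergesort(xs):
    # Bottom-up mergesort: the recursion tree is simulated by repeatedly
    # merging adjacent runs, so no recursive calls are needed.
    runs = [[x] for x in xs] or [[]]
    while len(runs) > 1:
        runs = [merge(runs[k], runs[k + 1]) if k + 1 < len(runs) else runs[k]
                for k in range(0, len(runs), 2)]
    return runs[0]
```

The contortion is instructive: the explicit worklist re-creates, by hand, the bookkeeping that a language with recursion provides for free.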
Paul Graham (2004b, p. 242) describes the effect languages have on thought
as the Blub Paradox.17 Programming languages and the use thereof are—
perhaps, so far—the only conduit into the science of computing experienced
by students. Because language influences thought and capacity for thought, an
improved understanding of programming languages and the different styles of
programming supported by that understanding result in a more holistic view of
computation.18 Indeed, a covert goal of this text or side effect of this course of
study is to broaden the reader’s understanding of computation by developing
additional avenues through which to both experience and describe/effect
computation in a computer program (Figure 1.3). An understanding of Latin—
even an elementary understanding—not only helps one learn new languages
but also improves one’s use and command over their native language. Similarly,
an understanding of both Lisp and the linguistic ideas central to it—and, more
generally, the concepts of languages—will help you more easily learn new
programming languages and make you a better programmer in your language
of choice. “[L]earning Lisp will teach you more than just a new language—it will
teach you new and more powerful ways of thinking about programs” (Graham
1996, p. 2).
17. Notice use of the phrase “thinking in” instead of “programming in.”
18. The study of formal languages leads to the concept of a Turing machine; thus, language is integral
to the theory of computation.
[Figure 1.3 depicts the imperative, object-oriented, functional, and logic/declarative styles of programming as conduits into computation; broadening those conduits is a goal of this text.]
Figure 1.3 Programming languages and the styles of programming therein are
conduits into computation.
writing generates and clarifies thoughts (Graham 1993, p. 2). For instance,
the process of enumerating a list of groceries typically leads to thoughts
of additional items that need to be purchased, which are then listed, and
so on. An alternative to structured programming is literate programming, a
notion introduced by Donald Knuth. Literate programming involves crafting
a program as a representation of one’s thoughts in natural language rather
than based on constraints imposed by computer architecture and, therefore,
programming languages.20 Moreover, in the 1980s the discussion around
the ideas of object-oriented design emerged through the development of
Smalltalk—an interpreted language. Advances in computer hardware, and
particularly Moore’s Law,21 also helped reduce the emphasis on speed of
program execution as the overriding criterion in the design of programming
languages.
While fewer interpreted languages emerged in the 1980s compared to compiled
ones, the confluence of literate programming, object-oriented design, and Moore’s
Law sparked discussion of speed of development as a criterion for designing
programming languages.
The advent of the World Wide Web in the late 1990s and early 2000s
and the new interactive and networked computing platform on which it runs
certainly influenced language design. Language designers had to address the
challenges of developing software that was intended to run on a variety of
hardware platforms and was to be delivered or interacted with over a network.
Moreover, they had to deal with issues of maintaining state—so fundamental to
imperative programming—over a stateless (HTTP) network protocol. For all these
reasons, programming for the web presented a fertile landscape for the practical
exploration of issues of language design. Programming languages tended toward
the inclusion of more dynamic bindings, so more interpreted languages emerged
at this time (e.g., JavaScript).
On the one hand, the need to rapidly develop applications with ever-evolving
requirements has attracted attention to speed of development as
a more prominent criterion in the design of programming languages and has
continued to nourish the development of languages adopting more dynamic
bindings (e.g., Python). The ability, or lack thereof, to delay bindings until run-
time affects flexibility of program development. The more dynamic bindings
a language supports, the fewer commitments the programmer
must make during program development. Thus, dynamic bindings provide
for convenient debugging, maintenance, and redesign when dealing with
errors or evolving program requirements. For instance, run-time binding of
messages to methods in Python allows programs to be more easily designed
during their initial development and then subsequently extended during their
maintenance.
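For instance, the following Python sketch (the class names are hypothetical, chosen only for illustration) shows run-time binding of messages to methods: the method invoked by s.area() is not determined until the call executes, so new classes can be added during maintenance without modifying existing code.

```python
# Run-time binding of messages to methods in Python: the method that
# s.area() invokes is resolved per object at the moment of the call.

class Square:
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side ** 2

class Circle:
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return 3.14159 * self.radius ** 2

def total_area(shapes):
    # area is bound at run time (duck typing); total_area works for any
    # object that answers the area message, including classes defined
    # long after this function was written.
    return sum(s.area() for s in shapes)

print(total_area([Square(2), Circle(1)]))  # prints 7.14159
```

Because total_area resolves area per object at run time, a class written years later works with it unchanged.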
20. While a novel concept, embraced by tools (e.g., Noweb) and languages (e.g., the proprietary
language Miranda, which is a predecessor of Haskell and similarly supports a pure form of functional
programming), the idea of literate programming never fully caught on.
21. Moore’s Law states that the number of transistors that can be placed inexpensively on an integrated
circuit doubles approximately every two years and describes the evolution of computer hardware.
24 CHAPTER 1. INTRODUCTION
[Figure 1.4: a timeline (1960–2020) of language evolution, bifurcating into interpreted languages with dynamic bindings (pioneered by the (meta-)languages Lisp and Smalltalk; later Python, JavaScript, Ruby, Lua, Clojure, Dart, and Hack) and compiled languages with static bindings supporting imperative programming (Fortran, COBOL, Ada, C, C++) or strongly typed functional programming (ML, Haskell), with later statically bound languages supporting multiple styles of programming (Java, C#, Go, Rust, Swift, Scala, Kotlin, TypeScript); influences noted include computer architecture, safety, speed of execution, and the advent of the WWW.]
[Figure 1.5: a summary of factors influencing language design: the software crisis, structured programming, literate programming, and object-oriented programming raised awareness of speed of development as a language design criterion; Moore's Law (faster processors) and the need for portability increased emphasis on dynamic bindings; the advent of the WWW and mobile/web apps raised awareness of safety and security as a language design criterion, renewing emphasis on static bindings.]
Languages reconciling the need for both safety and flexibility are also starting to
emerge (e.g., Hack and Dart). Figure 1.5 summarizes the factors influencing
language design discussed here.
With the computing power available today and the time-to-market demands
placed on software development, speed of execution is now less emphasized as a
design criterion than it once was.22 Software development methodologies
have evolved commensurately and embrace this trend.
Agile methods such as extreme programming involve repeated and rapid tours
through the software development cycle, implying that speed of development is
highly valued.
22. In some engineering applications, speed of execution is still the overriding design criterion.
23. Architect Christopher Alexander and colleagues (1977) explored the relationship between
(architectural) patterns and languages and, as a result, inspired design patterns in software (Gamma
et al. 1995).
• an ability to focus on the big picture (i.e., core concepts/features and options)
and not the minutiae (e.g., syntax)
• an ability to (more rapidly) understand (new or unfamiliar) programming
languages
• an improved background and richer context for discerning appropriate
languages for particular programming problems or application domains
• an understanding of and experience with a variety of programming styles or,
in other words, an increased capacity to describe computational ideas
• a larger and richer arsenal of programming techniques to bring to bear
upon problem-solving and programming tasks, which will make you a better
programmer, in any language
• an increased ability to design and implement new languages
• an improved understanding of the (historical) context in which languages
exist and evolve
• a more holistic view of computer science
Exercise 1.3 There are many binding times in the study of programming languages.
For example, variables are bound to types in C at compile time, which means that they
remain fixed to their type for the lifetime of the program. In contrast, variables
are bound to values at run-time, which means that a variable's value is not bound
until run-time and can change at any time during run-time. In total, there are six
(classic) binding times in the study of programming languages, of which compile time
and run-time are two. Give an alternative binding time in the study of programming
languages, and an example of something in C that is bound at that time.
Exercise 1.5 Explain how first-class functions can be simulated in C or C++. Write a
C or C++ program to demonstrate.
Exercise 1.6 For each of the following entities, give all languages from the set
{C++, ML, Prolog, Scheme, Smalltalk} in which the entity is considered first-class:
(a) Function
(b) Continuation
(c) Object
(d) Class
Exercise 1.8 Are all functions without side effect referentially transparent? If not, give
a function without a side effect that is not referentially transparent.
Exercise 1.9 Are all referentially transparent functions without side effect? If not, give
a function that is referentially transparent, but has a side effect.
This function cannot modify its parameters because it has none. Moreover, it does
not modify its external environment because it does not access any global data or
perform any I/O. Therefore, the function does not have a side effect. However,
the assignment statement on line 3 does have a side effect. How can this be? The
function does not have a side effect, yet it contains a statement with a side effect—
which seems like a contradiction. Does f have a side effect or not, and why?
Exercise 1.11 Identify two language evaluation criteria other than those discussed
in this chapter.
Exercise 1.12 List two language evaluation criteria that conflict with each other.
Provide two conflicts not discussed in this chapter. Give a specific example of each
to illustrate the conflict.
Exercise 1.13 Fill in the blanks in the expressions in the following table with terms
from the set:
{Dylan, garbage collection, Haskell,
lazy evaluation, Prolog, Smalltalk, static typing}
Go = C + _______
Curry = _______ + Prolog
_______ = Lisp + Smalltalk
Objective-C = C + _______
TypeScript = JavaScript + _______
Mercury = _______ − impurities
Haskell = ML + _______
Exercise 1.15 Explore the Linda programming language. What styles of program-
ming does it support? For which applications is it intended? What is Linda-calculus
and how does it differ conceptually from λ-calculus?
Exercise 1.16 Identify a programming language with which you are unfamiliar—
perhaps even a language mentioned in this chapter. Try to describe the language
through its most defining characteristics.
Exercise 1.17 Read M. Swaine’s 2009 article “It’s Time to Get Good at Functional
Programming” in Dr. Dobb’s Journal and write a 250-word commentary on it.
Exercise 1.18 Read N. Savage’s 2018 article “Using Functions for Easier Program-
ming” in Communications of the ACM, available at https://doi.acm.org/10.1145
/3193776, and write a 100-word commentary on it.
bifurcated into languages involving primarily static binding and those involving
primarily dynamic bindings (Figure 1.4).
Since language concepts are the building blocks from which all languages are
constructed/organized, an understanding of the concepts implies that one can
focus on the core language principles (e.g., parameter passing) and the particular
options (e.g., pass-by-reference) used for those principles in (new or unfamiliar)
languages rather than fixating on the details (e.g., syntax), which results in an
improved dexterity in learning, assimilating, and using programming languages.
Moreover, an understanding of and experience with a variety of programming styles
and exotic ways of performing computation establish an increased capacity for
describing computation in a program, a richer toolbox of techniques with which
to solve problems, and a more well-rounded picture of computing.
lexically valid because all of its words are valid and syntactically valid because the
arrangement of those words conforms to the subject–verb–article–object structure
of English sentences, but it is not semantically valid because the sentence does
not make sense. The fourth candidate sentence is lexically, syntactically, and
semantically valid. Notice that these types of sentence validity are progressive.
Once a candidate sentence fails any test for validity, it automatically fails a more
stringent test for validity. In other words, if a candidate sentence does not even
have valid words, those words can never be arranged correctly. Similarly, if
the words of a candidate sentence are not arranged correctly, that sentence can
never make semantic sense. For instance, the second sentence in Table 2.1 is not
syntactically valid so it can never be semantically valid.
Recall that validating a string as a sentence is a set-membership problem. We
saw previously that the first step to determining if a string of words, where a
word is a string of non-whitespace characters, is a sentence is to determine if each
individual word is a sentence (in a simpler language). Only after the validity of
every individual word in the entire string is established can we examine whether
the words are arranged in a proper order according to the particular language in
which this particular, entire string is a candidate sentence. Notice that these steps
are similar to the steps an interpreter or compiler must execute to determine the
validity of a program (i.e., to determine if the program has any syntax errors).
Table 2.2 illustrates these steps of determining program expression validity. Next,
we examine those steps through a formal lens.
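These steps can be sketched concretely. The fragment below is a minimal sketch assuming the toy English lexicon and sentence structure used in this chapter: it first checks that every individual word is valid, and only then checks whether the arrangement of the words conforms to the expected structure.

```python
# Two-stage validation: lexical validity first, then syntactic validity.
# The lexicon and sentence structure are assumptions for illustration.

LEXICON = {
    "a": "article", "an": "article", "the": "article",
    "apple": "noun", "rose": "noun", "umbrella": "noun",
    "is": "verb", "appears": "verb",
    "here": "adverb", "there": "adverb",
}

STRUCTURE = ["article", "noun", "verb", "adverb"]

def is_sentence(candidate):
    words = candidate.rstrip(".").split()
    # Step 1: lexical validity -- every word must be in the lexicon.
    if any(w not in LEXICON for w in words):
        return False
    # Step 2: syntactic validity -- the word categories, in order, must
    # match the article-noun-verb-adverb structure.
    return [LEXICON[w] for w in words] == STRUCTURE

print(is_sentence("the apple is there."))   # prints True
print(is_sentence("the apple are there."))  # prints False ('are' fails lexically)
```

Note that the second candidate fails at step 1, so its word arrangement is never examined, mirroring the progressive nature of the validity tests described above.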
opus(1+2+3+4+5+6+7+8+9)(0+1+2+3+4+5+6+7+8+9)*
1. Sometimes some of the characters in the set of metacharacters are also in the alphabet of the
language being defined (i.e., Σ_RE ∩ Σ ≠ ∅). In these cases, there must be a way to disambiguate the
meaning of the overloaded character. For example, a \ is used in UNIX to escape the special meaning of
the metacharacter following it.
and
(0+1+···+8+9)(0+1+···+8+9)(0+1+···+8+9)-(0+1+···+8+9)(0+1+···+8+9)-(0+1+···+8+9)(0+1+···+8+9)(0+1+···+8+9)(0+1+···+8+9)
either natively in the case of scripting languages (e.g., Perl and Tcl) or through
a library or package (e.g., Python, Java, Go).2
[Figure content: a start state with a transition on _ or alphabetic to accepting state 2, which loops on _, alphabetic, or digit; and a transition on a non-zero digit to accepting state 3, which loops on digit.]
Figure 2.1 A finite-state automaton for a legal identifier and positive integer in the
C programming language.
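In a library-based setting, the automaton of Figure 2.1 can be approximated with regular expressions. The two patterns below are my rendering of the figure's identifier and positive-integer paths using Python's re package:

```python
import re

# Two patterns corresponding to the two accepting paths of the automaton
# in Figure 2.1 (my rendering, an assumption based on C's lexical rules):
# an identifier is a letter or underscore followed by letters, digits, or
# underscores; a positive integer is a non-zero digit followed by digits.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")
POSITIVE_INT = re.compile(r"[1-9][0-9]*\Z")

for lexeme in ["rate", "_tmp1", "2fast", "45469", "045"]:
    kind = ("identifier" if IDENTIFIER.match(lexeme)
            else "positive integer" if POSITIVE_INT.match(lexeme)
            else "neither")
    print(lexeme, "->", kind)
```

Here re.match anchors the pattern at the start of the lexeme and the \Z anchor forces it to consume the entire lexeme, so a partial match such as 2fast is rejected rather than matched from its second character.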
Exercise 2.3.2 Give a regular expression that denotes the language of five-digit zip
codes (e.g., 45469) with an optional four-digit extension (e.g., 45469-0280).
Exercise 2.3.4 Give a regular expression that denotes the language of decimals
representing ASCII characters (i.e., integers between 0 and 127, without
leading 0s for any integer except 0 itself). Thus, the strings 0, 2, 25, and 127 are
in the language, but 00, 02, 000, 025, and 255 are not.
Exercise 2.3.5 Give a regular expression for the language of zero or more nested,
matched parentheses, where every opening and closing parenthesis has a match
of the other type, with the matching opening parentheses appearing before the
matching closing parentheses in the sentence, but where the parentheses are
never nested more than three levels deep (i.e., no character in the string is ever
within more than three levels of nesting). To avoid confusion between parentheses
in the string and parentheses used for grouping in the regular expression, use
the “l” and “r” characters to denote left (i.e., opening) and right (i.e., closing)
parentheses in the string, respectively.
Exercise 2.3.6 Since all finite languages are regular, we can construct an FSA for
any finite language. Describe how an FSA for a finite language can be constructed.
S → aS
S → ε
By applying the production rules, beginning with the start symbol, a grammar
can be used to generate a sentence from the language it defines. For instance, the
following is a derivation of the sentence aaaa:
S ⇒(r1) aS ⇒(r1) aaS ⇒(r1) aaaS ⇒(r1) aaaaS ⇒(r2) aaaa
Note that every application of a production rule involves replacing the non-terminal
on the left-hand side of the rule with the entire right-hand side of the
rule. The semantics of the symbol ⇒ is "derives" and the symbol indicates a one-step
derivation relation. The (rn) annotation on each ⇒ symbol indicates which
production rule is used in the substitution. The ⇒* symbol indicates a zero-or-more-step
derivation relation. Thus, S ⇒* aaaa.
A formal grammar is a generative construct for a formal language. In other
words, a grammar generates sentences from the language it defines. Formally, if
G = (V, Σ, S, P), then the language generated by G is L(G) = {w | w ∈ Σ* and
S ⇒* w}. A grammar for the language denoted by the regular expression opus*
is ({S, W}, {o, p, u, s}, S, {S → opuW, W → sW, W → ε}), which generates the
language {opu, opus, opuss, . . . }.
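This generative reading of a grammar can be sketched directly. The following fragment (a minimal sketch; the dictionary encoding of the production rules is my own, with ε encoded as the empty string) repeatedly rewrites the leftmost non-terminal until only terminals remain, producing sentences of the language {opu, opus, opuss, . . . }:

```python
import random

# Right-linear grammar ({S, W}, {o, p, u, s}, S, {S -> opuW, W -> sW, W -> ε})
# encoded as a mapping from non-terminals to alternative right-hand sides.
RULES = {"S": ["opuW"], "W": ["sW", ""]}  # "" encodes ε

def generate(start="S", rng=random):
    sentential_form = start
    # Rewrite the leftmost non-terminal until the form contains only terminals.
    while any(ch in RULES for ch in sentential_form):
        for i, ch in enumerate(sentential_form):
            if ch in RULES:
                replacement = rng.choice(RULES[ch])
                sentential_form = (sentential_form[:i] + replacement
                                   + sentential_form[i + 1:])
                break
    return sentential_form

random.seed(0)
print(generate())
```

Every string this sketch produces is opu followed by zero or more occurrences of s, exactly as the derivation S ⇒ opuW ⇒* opus…s requires.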
X → zY
X → z
where X ∈ V, Y ∈ V, and z ∈ Σ*. A grammar whose production rules conform to
these patterns is called a right-linear grammar. Grammars whose production rules
conform to the following pattern are called left-linear grammars:
X → Yz
X → z
Left-linear grammars also generate regular languages. Notice the one-for-one
replacement of a non-terminal for a non-terminal in V in the rules of a right- or
left-linear grammar. Thus, a regular grammar is also referred to as a linear grammar.
Regular grammars define a class of languages known as regular languages.
A regular grammar is a generative device for a regular language. In other words,
it generates sentences from the regular language it defines. However, a grammar
does not have to be regular to generate a regular language. We leave it as an
exercise to define a non-regular grammar that defines a regular language (i.e., one
that can be denoted by a regular expression; Conceptual Exercise 2.10.7).
In summary, a regular language (which is the most restrictive type of formal
language) is:
X → γ
definition is less restrictive than that of a regular grammar, every regular grammar
is also a context-free grammar, but the reverse is not true.
Context-free grammars define a class of formal languages called context-free
languages. The concept of balanced pairs of syntactic entities—the essence of a
Dyck language—is at the heart of context-free languages. This single syntactic
feature (and its variations) distinguishes regular languages from context-free
languages; the capability of expressing balanced pairs is the essence of a
context-free grammar.
As briefly shown here, grammars are used to generate sentences from the
language they define. Beginning with the start symbol and repeatedly applying the
production rules until the string contains no non-terminals results in a derivation—
a sequence of applications of the production rules of a grammar beginning with
the start symbol and ending with a sentence (i.e., a string of all terminals arranged
according to the rules of the grammar). For example, consider deriving the
sentence “the apple is there.” from the preceding grammar. The rn parenthesized
annotation on the right-hand side of each application indicates which production
rule was used in the substitution:
⟨sentence⟩ ⇒ ⟨article⟩⟨noun⟩⟨verb⟩⟨adverb⟩. (r1)
⇒ ⟨article⟩ ⟨noun⟩ ⟨verb⟩ there. (r11)
⇒ ⟨article⟩ ⟨noun⟩ is there. (r8)
⇒ ⟨article⟩ apple is there. (r5)
⇒ the apple is there. (r4)
The result (on the right-hand side of the ⇒ symbol) of each step is a string
containing terminals and non-terminals that is called a sentential form. A sentence is
a sentential form containing only terminals.
Peter Naur extended BNF for ALGOL 60 to make the definition of the
production rules in a grammar more concise. While we discuss the details of
the extension, called Extended Backus–Naur Form (EBNF), later (in Section 2.10),
we cover one element of the extension, alternation, here since we use it in the
following examples. Alternation allows us to consolidate various production rules
whose left-hand sides match into a single rule whose right-hand side consists of
the right-hand sides of each of the individual rules separated by the | symbol.
Therefore, alternation is syntactic sugar, in that any grammar using it can be
rewritten without it. Syntactic sugar is a term coined by Peter Landin that refers
to special, typically terse syntax in a language that serves only as a convenient
method for expressing syntactic structures that are traditionally represented in the
language through uniform and often long-winded syntax. With alternation, we
can define the preceding grammar, which contains 11 production rules, using only
5 rules:
(r1) ⟨sentence⟩ → ⟨article⟩ ⟨noun⟩ ⟨verb⟩ ⟨adverb⟩.
(r2) ⟨article⟩ → a | an | the
(r3) ⟨noun⟩ → apple | rose | umbrella
(r4) ⟨verb⟩ → is | appears
(r5) ⟨adverb⟩ → here | there
4. Interestingly, Chomsky and Backus/Naur developed their notations for defining grammars
independently. Thus, the two notations have some minor differences: Chomsky used uppercase letters
for non-terminals, the → symbol in production rules, and ε as the empty string; Backus/Naur used
words in any case enclosed in ⟨ ⟩ symbols, ::=, and ⟨empty⟩, respectively.
⇒ ⟨digit⟩⟨digit⟩⟨digit⟩ (r10)
⇒ 1⟨digit⟩⟨digit⟩ (r11)
⇒ 13⟨digit⟩ (r11)
⇒ 132 (r11)
Some derivations, such as the next two derivations, are neither leftmost nor
rightmost:
Figure 2.2 The dual nature of grammars as generative and recognition devices.
(left) A language generator that accepts a grammar and a start symbol and
generates a sentence from the language defined by the grammar. (right) A
language parser that accepts a grammar and a string and determines if the string
is in the language.
A generator applies the production rules of a grammar forward. A parser applies the
rules backward.5
Consider parsing the string x + y * z. In the following parse, . denotes "top of
the stack":
1 . x + y * z (shift)
2 x . + y * z (reduce r6)
3 ⟨id⟩ . + y * z (reduce r5)
4 ⟨expr⟩ . + y * z (shift)
5 ⟨expr⟩ + . y * z (shift)
6 ⟨expr⟩ + y . * z (reduce r6)
5. Another class of parsers applies production rules in a top-down fashion (Section 3.4).
The left-hand side of the . represents a stack and the right-hand side of the . (i.e.,
the top of the stack) represents the remainder of the string to be parsed, called the
handle. At each step, we either shift or reduce. To determine which, we examine
the stack. If the items at the top of the stack match the right-hand side of any
production rule, replace those items with the non-terminal on the left-hand side of
that rule. This is known as reducing. If the items at the top of the stack do not match
the right-hand side of any production rule, shift the next lexeme on the right-hand
side of the . to the stack. If the stack contains only the start symbol when the input
string is entirely consumed (i.e., shifted), then the string is a sentence; otherwise,
it is not.
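This procedure can be sketched in Python. The rules below are my rendering of the four-function calculator grammar assumed from Section 2.6 (only the rules used in the preceding parse are encoded), and the check that defers reducing ⟨expr⟩ + ⟨expr⟩ when the next lexeme is * is an assumption anticipating the discussion of precedence later in this chapter:

```python
# A minimal shift-reduce recognizer. Assumed rules (my rendering):
#   (r1) expr -> expr + expr    (r3) expr -> expr * expr
#   (r5) expr -> id             (r6) id -> x | y | z
RULES = [
    (["expr", "+", "expr"], "expr"),              # r1
    (["expr", "*", "expr"], "expr"),              # r3
    (["id"], "expr"),                             # r5
    (["x"], "id"), (["y"], "id"), (["z"], "id"),  # r6
]

def parse(tokens):
    stack, tokens = [], list(tokens)
    while True:
        reduced = False
        for rhs, lhs in RULES:
            if stack[-len(rhs):] == rhs:
                # Conflict resolution (assumption): keep expr + expr on the
                # stack when a higher-precedence '*' is about to be shifted.
                if rhs == ["expr", "+", "expr"] and tokens[:1] == ["*"]:
                    continue
                stack[-len(rhs):] = [lhs]  # reduce
                reduced = True
                break
        if not reduced:
            if not tokens:
                break
            stack.append(tokens.pop(0))  # shift
    return stack == ["expr"]

print(parse("x+y*z"))  # prints True: x + y * z is a sentence
print(parse("x+*z"))   # prints False
```

On x + y * z the sketch shifts at the conflict point and therefore reproduces the first parse; dropping the conflict check makes it reduce there instead, reproducing the second.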
This process is called shift-reduce or bottom-up parsing because it starts with
the string or, in other words, the terminals, and works back through the non-
terminals to the start symbol. A bottom-up parse of an input string constructs a
rightmost derivation of the string in reverse (i.e., bottom-up). For instance, notice
that reading the lines of the rightmost derivation in Section 2.6 in reverse (i.e., from
the bottom line up to the top line) corresponds to the shift-reduce parsing method
discussed here. In particular, the production rules in the preceding shift-reduce
parse of the string x + y * z are applied in the reverse order of those in the rightmost
derivation of the same string in Section 2.6. Later, in Chapter 3, we contrast this
method of parsing with top-down or recursive-descent parsing. The preceding parse
proves that x + y * z is a sentence.
1 . x + y * z (shift)
2 x . + y * z (reduce r6)
3 ⟨id⟩ . + y * z (reduce r5)
4 ⟨expr⟩ . + y * z (shift)
5 ⟨expr⟩ + . y * z (shift)
6 ⟨expr⟩ + y . * z (reduce r6)
7 ⟨expr⟩ + ⟨id⟩ . * z (reduce r5)
8 ⟨expr⟩ + ⟨expr⟩ . * z (reduce r1; emit addition;
why not shift here instead?)
9 ⟨expr⟩ . * z (shift)
10 ⟨expr⟩ * . z (shift)
11 ⟨expr⟩ * z . (reduce r6)
12 ⟨expr⟩ * ⟨id⟩ . (reduce r5)
13 ⟨expr⟩ * ⟨expr⟩ . (reduce r3; emit multiplication)
14 ⟨expr⟩ . (start symbol; this is a sentence)
Which of these two parses is preferred? How can we evaluate which is preferred?
On what criteria should we evaluate them? The short answer to these questions
is: It does not matter. The objective of language recognition and parsing is to
determine if the input string is a sentence (i.e., whether its structure conforms to the
grammar). Both of these parses meet that objective; thus, with respect to syntax,
they both equally meet the objective. Here, we are only concerned with the
syntactic validity of the string, not whether it makes sense (i.e., semantic validity).
Parsing deals with syntax rather than semantics.
However, parsers often address issues of semantics with techniques originally
intended only for addressing syntactic validity. One reason for this is that,
unfortunately, unlike for syntax, we do not have formal models of semantics that
are easily implemented in a computer system. Another reason is that addressing
semantics while parsing can obviate the need to make multiple passes through
the input string. While formal systems help us reason about concepts such as
syntax and semantics, programming language systems implemented based on
these formalisms must address practical issues such as efficiency. (Certain types
of parsers require the production rules of the grammar of the language of the
sentences they parse to be in a particular form, even though the same language
can be defined using production rules in multiple forms. We discuss this concept
in Chapter 3.) Therefore, although this approach is considered impure from a
formal perspective, sometimes we address syntax and semantics at the same time
(Table 2.6).
identifier names that imply the meaning of the variable to which they refer (e.g.,
rate and index vis-à-vis x and y).
Here we would like to infuse semantics into parsing in an identifiable way.
Specifically, we would like to evaluate the expression while parsing it. This helps
us avoid making unnecessary passes over the string if it is a sentence. Again, it
is important to realize we are shifting from the realm of syntactic validity into
interpretation. The two should not be confused, as they serve different purposes.
Determining if a string is a sentence is completely independent of evaluating it
for a return value. We often subconsciously impart semantics onto an expression
such as x + y * z because without any mention of meaning we presume it is a
mathematical expression. However, it is simply a string conforming to a syntax
(i.e., form) and can have any interpretation or meaning we impart to it. Indeed,
the meaning of the expression x + y * z could be a list of five elements.
Thus, in evaluating an expression while parsing it, we are imparting
knowledge of how to interpret the expression (i.e., semantics). Here, we interpret
these sentences as standard mathematical expressions. However, to evaluate these
mathematical expressions, we must adopt even more semantics beyond the simple
interpretation of them as mathematical expressions. If they are mathematical
expressions, to evaluate them we must determine which operators have precedence
over each other [i.e., is x + y * z interpreted as (x + y) * z or x + (y * z)?] as well
as the order in which each operator associates [i.e., is 6 − 3 − 2 interpreted as
(6 − 3) − 2 or 6 − (3 − 2)?]. Precedence deals with the order of distinct operators
(e.g., * computes before +), while associativity deals with the order of operators
with the same precedence (e.g., − associates left-to-right).
Formally, a binary operator ⊕ on a set S is associative if (a ⊕ b) ⊕ c =
a ⊕ (b ⊕ c) ∀ a, b, c ∈ S. Intuitively, associativity means that the value of an
expression containing more than one instance of a single binary associative
operator is independent of evaluation order as long as the sequence of the
operands is unchanged. In other words, parentheses are unnecessary, and
rearranging the parentheses in such an expression does not change its value.
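A quick check of this definition in Python: + on the integers is associative, while − is not, which is why the two parenthesizations of 6 − 3 − 2 disagree:

```python
a, b, c = 6, 3, 2

# + is associative: both groupings give the same value.
assert (a + b) + c == a + (b + c)

# - is not associative: the two groupings disagree.
print((a - b) - c)  # prints 1
print(a - (b - c))  # prints 5
```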
Notice that both parses of the expression x + y * z are the same until line 8,
where a decision must be made to shift or reduce. The first parse shifts while
the second reduces. Both lead to successful parses. However, if we evaluate the
expression while parsing it, each parse leads to different results. One way to
evaluate a mathematical expression while parsing it is to emit the mathematical
operation when reducing. For instance, in step 12 of the first parse, when we
reduce ⟨expr⟩ * ⟨expr⟩ to ⟨expr⟩, we can compute y * z. Similarly, in
step 13 of that same parse, when we reduce ⟨expr⟩ + ⟨expr⟩ to ⟨expr⟩, we
can compute x + ⟨the result computed in step 12⟩. This interpretation [i.e.,
x + (y * z)] is desired because in mathematics multiplication has higher precedence
than addition. Now consider the second parse. In step 8 of that parse, when we
(prematurely) reduce ⟨expr⟩ + ⟨expr⟩ to ⟨expr⟩, we compute x + y.
Then in step 13, when we reduce ⟨expr⟩ * ⟨expr⟩ to ⟨expr⟩, we compute
⟨the result computed in step 8⟩ * z. This interpretation [i.e., (x + y) * z] is
obviously not desired. These two parses exhibit a shift-reduce conflict. If we
shift at step 8, then multiplication has higher precedence than addition (which
is the desired semantics). If we reduce at step 8, then addition has higher
precedence than multiplication (which is the undesired semantics). Therefore,
we prefer the first parse.
The possibility of a reduce-reduce conflict also exists. Consider the following
grammar:
(r1) ⟨expr⟩ ::= ⟨term⟩
(r2) ⟨expr⟩ ::= ⟨id⟩
(r3) ⟨term⟩ ::= ⟨id⟩
(r4) ⟨id⟩ ::= x | y | z
and a bottom-up parse of the expression x:
. x (shift)
x . (reduce r4)
⟨id⟩ . (reduce r2 or r3 here?)
[Figure content omitted: parse trees for x + y * z, and the two parse trees for the expression x (Figure 2.4): one deriving x through ⟨id⟩ alone and one through ⟨term⟩ and ⟨id⟩.]
ambiguous; a proof of ambiguity exists in Figure 2.4, which contains two parse
trees for the expression x.
Ambiguity is a term used to describe a grammar, whereas a shift-reduce
conflict and a reduce-reduce conflict are phrases used to describe a particular
parse. However, each concept is a different side of the same coin. If a grammar is
ambiguous, a bottom-up parse of a sentence in the language the grammar defines
will exhibit either a shift-reduce or reduce-reduce conflict, and vice versa.
Thus, proving a grammar is ambiguous is a straightforward process. All we
need to do is build two parse trees for the same expression. Much more difficult,
by comparison, is proving that a grammar is unambiguous.
It is important to note that a parse tree is not a derivation, nor is a derivation a parse tree.
A derivation illustrates how to generate a sentence. A parse tree illustrates the
opposite—how to recognize a sentence. However, both prove a sentence is in
a language (Table 2.7). Moreover, while multiple derivations of a sentence (as
illustrated in Section 2.6) are not a problem, having multiple parse trees for a
sentence is a problem—not from a recognition standpoint, but rather from an
interpretation (i.e., meaning) perspective. Consider Table 2.8, which contains four
sentences from the four-function calculator grammar in Section 2.6. While the
Table 2.7 The Dual Use of Grammars: For Generation (Constructing a Derivation)
and Recognition (Constructing a Parse Tree)
[Figure 2.5: the lone parse tree for the sentence 132, deriving it through ⟨expr⟩, ⟨number⟩, and ⟨digit⟩ nodes; Figure 2.6: parse trees for 1 + 3 + 2.]
first sentence 132 has multiple derivations, it has only one parse tree (Figure 2.5)
and, therefore, only one meaning. The second expression, 1 + 3 + 2, in contrast,
has multiple derivations and multiple parse trees. However, those parse trees
(Figure 2.6) all convey the same meaning (i.e., 6). The third expression, 1 + 3 * 2,
also has multiple derivations and parse trees (Figure 2.7). However, its parse trees
each convey a different meaning (i.e., 7 or 8). Similarly, the fourth expression,
6 − 3 − 2, has multiple derivations and parse trees (Figure 2.8), and those parse
trees each have different interpretations (i.e., 1 or 5). The last three rows of Table 2.8
show the grammar to be ambiguous even though the ambiguity manifested in the
expression 1 + 3 + 2 is of no consequence to interpretation. The third expression
demonstrates the need for rules establishing precedence among operators, and
the fourth expression illustrates the need for rules establishing how each operator
associates (left-to-right or right-to-left).
Bear in mind that we are addressing semantics using a formalism intended for
syntax. We are addressing semantics using formalisms and techniques reserved
for syntax primarily because we do not have easily implementable methods
[Figures 2.7 and 2.8: the parse trees for 1 + 3 * 2 and for 6 − 3 − 2, respectively.]
6. In the programming language APL, addition has higher precedence than multiplication.
in the case here with precedence and associativity, we can modify the grammar so
that the ambiguity is removed, making the meaning (or semantics) determinable
from the grammar (syntax). When ambiguity is dependent on context, grammar
disambiguation to force one interpretation is not possible because you actually
want more than one interpretation, though only one per context. For instance,
the English sentence "Time flies like an arrow" can be parsed multiple ways. It
can be parsed to indicate that there are creatures called "time flies," which really
like arrows (i.e., ⟨adjective⟩ ⟨noun⟩ ⟨verb⟩ ⟨article⟩ ⟨noun⟩), or
metaphorically (i.e., ⟨noun⟩ ⟨verb⟩ ⟨preposition⟩ ⟨article⟩ ⟨noun⟩).
English is a language with an ambiguous grammar. How can we determine
intended meaning? We need the surrounding context provided by the sentences
before and after this sentence. Consider parsing the sentence “Mary saw the
man on the mountain with a telescope.”, which also has multiple interpretations
corresponding to the different parses of it. This sentence has syntactic ambiguity,
meaning that the same sentence can be diagrammed (or parsed) in multiple ways
(i.e., it has multiple syntactic structures). “They are moving pictures.” and “The
duke yet lives that Henry shall depose.”7 are other examples of sentences with
multiple interpretations.
English sentences can also exhibit semantic ambiguity, where there is only
one syntactic structure (i.e., parse), but the individual words can be interpreted
differently. An underlying source of these ambiguities is the presence of
polysemes—a word with one spelling and pronunciation, but different meanings
(e.g., book, flies, or rush). Polysemes are the opposite of synonyms—different words
with one meaning (e.g., peaceful and serene). Polysemes that are different parts of
speech (e.g., book, flies, or rush) can cause syntactic ambiguity, whereas polysemes
that are the same part of speech (e.g., mouse) can cause semantic ambiguity. Note
that not all sentences with syntactic ambiguity contain a polyseme (e.g., “They are
moving pictures.”). For summaries of these concepts, see Tables 2.9 and 2.10.
Similarly, in programming languages, the source of a semantic ambiguity is not
always a syntactic ambiguity. For instance, consider the expression (Integer)-a
on line 5 of the following Java program:
1 class SemanticAmbiguity {
2   public static void main(String args[]) {
3     int a = 1;
4     int Integer = 5;
5     int b = (Integer)-a;
6     System.out.println(b); // prints 4, not -1
7     b = (Integer)(-a);
8     System.out.println(b); // prints -1, not 4
9   }
10 }
The expression (Integer)-a (line 5) has only one parse tree given the grammar
of the four-function calculator presented in this section (assuming Integer is an
⟨id⟩) and, therefore, is syntactically unambiguous. However, that expression
has multiple interpretations in Java: (1) as a subtraction—the variable Integer
minus the variable a, which is 4, and (2) as a type cast—type casting the value -a
(or -1) to a value of type Integer, which is -1. Table 2.11 contains sentences
from both natural and programming languages with various types of ambiguity,
and demonstrates the interplay between those types. For example, a sentence
without syntactic ambiguity can have semantic ambiguity; and a sentence without
semantic ambiguity can have syntactic ambiguity.
We have two options for dealing with an ambiguous grammar, but both have
disadvantages. First, we can state disambiguation rules in English (i.e., attach
notes to the grammar), which means we do not have to alter (i.e., lengthen)
the grammar, but this comes at the expense of being less formal (by the use of
English). Alternatively, we can disambiguate the grammar by revising it, which
is a more formal approach than the use of English, but this inflates the number
of production rules in the grammar. Disambiguating a grammar is not always
possible. The existence of context-free languages for which no unambiguous
context-free grammar exists has been proven (in 1961 with Parikh’s theorem). These
languages are called inherently ambiguous languages.
Table 2.11 Types of ambiguity (lexical, syntactic, and semantic) exhibited by
sample sentences from natural and programming languages, including flies;
"Time flies like an arrow."; "They are moving pictures."; 1+3+2; 1+3*2; and
(Integer)-a.
Figure 2.9 Parse trees for the sentence if (a < 2) if (b > 3) x else y.
(left) Parse tree for an if–(if)–else construction. (right) Parse tree for an
if–(if–else) construction.
In C, the semantic rule is that an else associates with the closest unmatched if
and, therefore, the first interpretation is used.
Consider the following grammar for generating if–else statements:
Using this grammar, we can generate the following statement (save for the
comment):
if (a < 2)
   if (b > 3)
      x = 4;
   else /* associates with which if above? */
      y = 5;
for which we can construct two parse trees (Figure 2.9) proving that the grammar
is ambiguous. Again, since formal methods for modeling semantics are not easily
implementable, we need to revise the grammar (i.e., syntax) to imply the desired
meaning (i.e., semantics). We can do that by disambiguating this grammar so
that it is capable of generating if sentences that can only be parsed to imply
that any else associates with the nearest unmatched if (i.e., parse trees of the
form shown on the right side of Figure 2.9). We leave it as an exercise to develop
an unambiguous grammar to solve the dangling else problem (Conceptual
Exercise 2.10.25).
Notice that while semantics (e.g., precedence and associativity) can sometimes
be reasonably modeled using context-free grammars, which are devices for
modeling the syntactic structure of language, context-free grammars can always
be used to model the lexical structure (or lexics) of language, since any regular
language can be modeled by a context-free grammar. For instance, embedded into
the first grammar of a four-function calculator presented in this section is the lexics
of the numbers:
60 CHAPTER 2. FORMAL LANGUAGES AND GRAMMARS
<symbol-expr> ::= x
<symbol-expr> ::= y
<symbol-expr> ::= z
<symbol-expr> ::= ( <s-list> )
<s-list> ::= <s-list> , <symbol-expr>
<s-list> ::= <symbol-expr>
which can be used to derive the following sentences: x, (x, y, z), ((x)), and (((x)),
((y), (z))). We can reexpress this grammar in EBNF using alternation as follows:
or expressed alternatively as
These extensions are intended for ease of grammar definition. Any grammar
defined in EBNF can be expressed in BNF. Thus, these shortcuts are simply syntactic
sugar. In summary, a context-free language (which is a type of formal language)
is generated by a context-free grammar (which is a type of formal grammar) and
recognized by a pushdown automaton (which is a model of computation).
Exercise 2.10.2 Define a regular grammar in EBNF for the language of Conceptual
Exercise 2.3.1.
Exercise 2.10.3 Define a regular grammar in BNF for the language of Conceptual
Exercise 2.3.3.
Exercise 2.10.4 Define a regular grammar in EBNF for the language of Conceptual
Exercise 2.3.3.
Exercise 2.10.5 Define a regular grammar in BNF for the language of Conceptual
Exercise 2.3.4.
Exercise 2.10.6 Define a regular grammar in EBNF for the language of Conceptual
Exercise 2.3.4.
Exercise 2.10.7 Define a grammar G, where G is not regular but defines a regular
language (i.e., one that can be denoted by a regular expression).
0s (e.g., 001 and 0001931), which four-function calculators are typically unable to
produce. Revise this grammar so that it is unable to generate numbers with leading
zeros, save for 0 itself.
Exercise 2.10.11 Reduce the number of production rules in the grammar of a four-
function calculator presented in Section 2.6. In particular, consolidate rules r1–r4
into two rules by adding a new non-terminal <operator>.
where <expr> and <term> are non-terminals and +, *, and id are terminals.
Exercise 2.10.15 Prove that the following grammar defined in EBNF is ambiguous:
Exercise 2.10.17 Define a grammar for a language L consisting of strings that have
n copies of the letter a followed by the same number of copies of the letter b, where
n > 0. Formally, L = { aⁿbⁿ | n > 0 and Σ = {a, b} }, where aⁿ means "n copies of
a." For instance, the strings ab, aaaabbbb, and aaaaaaaabbbbbbbb are sentences in
the language, but the strings a, abb, ba, and aaabb are not. Is this language regular?
Explain.
Exercise 2.10.24 The following grammar for if–else statements has been
proposed to eliminate the dangling else ambiguity (Aho, Sethi, and Ullman 1999,
Exercise 4.5, p. 268):
Exercise 2.10.28 Can a language whose sentences are all sets from an infinite
universe of items be defined with a context-free grammar? Explain.
Exercise 2.10.29 Can a language whose sentences are all sets from a finite universe
of items be defined with a context-free grammar? Explain.
Exercise 2.10.30 Consider the language L of binary strings where the first half
of the string is identical to the second half (i.e., all sentences have even length).
For instance, the strings 11, 0000, 0101, 1010, 010010, 101101, and 11111111
are sentences in the language, but the strings 0110 and 1100 are not. Formally,
L = { ww | w ∈ {0, 1}* }. Is this language context-free? If so, give a context-free
grammar for it. If not, state why not.
8. Note that the use of the words -free and -sensitive in the names of formal grammars is inconsistent.
The -free in context-free grammar indicates what such a grammar is unable to model—namely, context.
In contrast, the -sensitive in context-sensitive grammar indicates what such a grammar can model.
αXβ → αγβ
int main() {
   int x;
   y = 1;
}
Even if all referenced variables are declared, context may still be necessary to
identify type mismatches. For instance, consider the following C++ program:
1 int main() {
2    int x;
3    bool y;
4
5    x = 1;
6    y = false;
7    x = y;
8 }
a non-terminal on the left-hand side appears; hence, the rules are called
context-free.
• Disambiguate semantic validity. Another example of context-sensitivity in
programming languages is the ‹ operator in C. Its meaning is dependent
upon the context in which it is used. It can be used (1) as the multiplication
operator (e.g., x*3); (2) as the pointer dereferencing operator (e.g., *ptr);
and (3) in the declaration of pointer types (e.g., int* ptr). Without context,
the semantics of the expression x* y are ambiguous. If we see the declara-
tions int x=1, y=2; immediately preceding this expression, the meaning
of the * is multiplication. However, if the statement typedef int x; precedes
the expression x* y, then x names a type and x* y declares y as a pointer to an int.
9. Both approaches—use of context-sensitive grammar and use of a context-free grammar with many
rules modeling the context—model context in a purely syntactic way (i.e., without ascribing meaning
to the language). For instance, with a context-sensitive grammar or a context-free grammar with many
rules to enforce semantic rules for C, it is impossible to generate a program referencing an undeclared
variable, and a program referencing an undeclared variable would be syntactically invalid.
Exercise 2.11.3 We stated in this section that sometimes we can infuse context
into a context-free grammar (often by adding more production rules) even though
a context-free grammar has no provisions for representing context. Express the
context-sensitive grammar given in Section 2.11 enforcing the capitalization of the
first character of an English sentence using a context-free grammar.
Exercise 2.11.4 Define a context-free grammar for the language whose sentences
correspond to sets of the elements a, b, and c. For instance, the sentences {},
{a, b}, and {a, b, c} are in the language, but the sentences {a, a}, {b, a, b}, and
{a, b, c, a} are not.
the rule that a variable must be declared before it is used. However, we can model
some semantic properties, including operator precedence and associativity, with
a context-free grammar. Thus, not all formal grammars have the same expressive
power; likewise, not all automata have the same power to decide if a string is
a sentence in a language. (The corollary is that there are limits to computation.)
While most programming languages are context-sensitive (because variables often
must be declared before they are used), context-free grammars are the theoretical
basis for the syntax of programming languages (in both language definition and
implementation, as we see in Chapters 3 and 4).
Table 2.13 summarizes each of the four progressive types of formal grammars
in the Chomsky Hierarchy; the class of formal language each grammar generates;
the type of automaton that recognizes each member of each class of those formal
languages; and the constraints on the production rules of the grammars. Regular
and context-free grammars are fundamental topics in the study of formal
languages. In our course of study, they are useful for both describing the syntax
of and parsing programming languages. In particular, regular and context-free
grammars are essential ingredients in scanners and parsers, respectively, which
are discussed in Chapter 3.
3.2 Scanning
For purposes of scanning, the valid lexical units of a program are called lexemes
(e.g., +, main, int, x, h, hw, hww). The first step of scanning (also referred to
as lexical analysis) is to parcel the characters (from the alphabet Σ) of the string
representing the line of code into lexemes. Lexemes can be formally described
by regular expressions and regular grammars. Lexical analysis is the process of
determining if a string (typically of a programming language) is lexically valid—
that is, if all of the lexical units of the string are lexemes.
Programming languages must specify how the lexical units of a program are
delimited. There are a variety of methods that languages use to determine where
lexical units begin and end. Most programming languages delimit lexical units
using whitespace (i.e., spaces and tabs) and other characters. In C, lexical units
are delimited by whitespace and other characters, including arithmetic operators.
As an example, consider parceling the characters from the line int i = 20 ; of
C code into lexemes (Table 3.1). The lexemes are int, i, =, 20, and ;. The lines
of code int i=20;, int i = 20;, and int i = 20 ; have this same set of
lexemes.
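This parceling step can be sketched with a regular expression. The pattern below is illustrative rather than C's actual lexical grammar: it recognizes identifiers and reserved words, integer constants, and single-character special symbols, and treats whitespace purely as a delimiter.

```python
import re

def parcel(line):
    """Split a line of C-like code into candidate lexical units.

    Identifiers/reserved words, integer constants, and single
    special characters each become one unit; whitespace only
    separates units and is discarded.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_0-9]", line)

# All three formattings yield the same list of lexemes.
print(parcel("int i=20;"))      # ['int', 'i', '=', '20', ';']
print(parcel("int i = 20 ;"))   # ['int', 'i', '=', '20', ';']
```

Note that this sketch splits multi-character operators such as == into two units; a real C scanner matches the longest lexeme at each position.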
Free-format languages are languages where formatting has no effect on program
structure—of course, other than use of some delimiter to determine where
lexical units begin and end. Most languages, including C, C++, and Java,
are free-format languages. However, some languages impose restrictions on
formatting. Languages where formatting has an effect on program structure,
and where lexemes must occur in predetermined areas, are called fixed-format
languages. Early versions of Fortran were fixed-format. Other languages, including
Python, Haskell, Miranda, and occam, use layout-based syntactic grouping (i.e.,
indentation).
Once we have a list of lexical units, we must determine whether each is a
lexeme (i.e., lexically valid). This can be done by checking them against the lexemes
of the language (i.e., a lexicon), or by running each through a finite-state automaton
that can recognize the lexemes of the language. Most programming languages
have reserved words that cannot be used as an identifier (e.g., int in C). Reserved
words are not the same as keywords, which are only special in certain contexts (e.g.,
main in C).
Lexeme Token
int reserved word
i identifier
= special symbol
20 constant
; special symbol
Table 3.1 Parceling Lexemes into Tokens in the Sentence int i = 20;
Figure 3.1 Simplified view of scanning and parsing: the front end. The scanner
(based on a regular grammar) converts the source program (a string or list of
lexemes; concrete representation) into a list of tokens, which the parser (based
on a context-free grammar) converts into an abstract-syntax tree.
[Figure: The scanner converts the source program n=x*y+z (a string; concrete
representation) into tokens, from which the parser constructs an abstract-syntax
tree in which id1 is assigned the sum of the product of id2 and id3 with id4.]
[State diagram: from state 1, an underscore or alphabetic character leads to
state 2, which loops on underscores, alphabetic characters, and digits; a
non-zero digit leads from state 1 to state 3, which loops on digits.]
Figure 3.3 A finite-state automaton for a legal identifier and positive integer in C.
3.3 Parsing
Parsing (or syntactic analysis) is the process of determining whether a string
is a sentence (in some language) and, if so, (typically) converting the concrete
representation of it into an abstract representation, which generally facilitates the
intended subsequent processing of it. A concrete-syntax representation of a program
is typically a string (or a parse tree as shown in Chapter 2, where the terminals
along the fringe of the tree from left-to-right constitute the input string). Since
a program in concrete syntax is not readily processable, it must be parsed into
an abstract representation, where the details of the concrete-syntax representation
                           current state
input character            1        2        3
_                          2        2        ERROR
a, b, ..., y, z            2        2        ERROR
A, B, ..., Y, Z            2        2        ERROR
0                          ERROR    2        3
1, 2, ..., 8, 9            3        2        3
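The transition table above translates directly into code. The following sketch encodes it and drives the automaton to classify a string as a legal identifier (ending in state 2), a positive integer (ending in state 3), or neither:

```python
def classify(s):
    """Run the finite-state automaton of Figure 3.3 (a sketch).

    Returns 'identifier' (accepting state 2), 'positive integer'
    (accepting state 3), or 'error'.
    """
    def kind(c):
        # map a character onto the input classes of the table
        if c == '_' or c.isalpha():
            return 'alpha'
        if c == '0':
            return 'zero'
        if c.isdigit():
            return 'nonzero'
        return 'other'

    # delta[state][input class] -> next state (None means ERROR)
    delta = {
        1: {'alpha': 2, 'zero': None, 'nonzero': 3},
        2: {'alpha': 2, 'zero': 2,    'nonzero': 2},
        3: {'alpha': None, 'zero': 3, 'nonzero': 3},
    }
    state = 1
    for c in s:
        state = delta[state].get(kind(c))
        if state is None:
            return 'error'
    return {2: 'identifier', 3: 'positive integer'}.get(state, 'error')

print(classify("hw"))    # identifier
print(classify("20"))    # positive integer
print(classify("001"))   # error (leading zero)
```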
            lexics           syntax
concrete    lexeme           parse tree
            ↓ scanning       ↓ parsing
abstract    token            abstract-syntax tree
Table 3.3 (Concrete) Lexemes and Parse Trees Vis-à-Vis (Abstract) Tokens and
Abstract-Syntax Trees, Respectively
that are irrelevant to the subsequent processing are abstracted away. A parse
tree and abstract-syntax tree are the syntactic analogs of a lexeme and token from
lexics, respectively (Table 3.3). (See Section 9.5 for more details on abstract-syntax
representations.) A parser (or syntactic analyzer) is the component of an interpreter
or compiler that also typically converts the source program, once syntactically
validated, into an abstract, or more easily manipulable, representation.
Often lexical and syntactic analysis are combined into a single phase (and
referred to jointly as syntactic analysis) to obviate making multiple passes through
the string representing the program. Furthermore, the syntactic validation of a
program and the construction of an abstract-syntax tree for it can proceed in
parallel. Note that parsing is independent of the subsequent processing planned
on the tree: interpretation or compilation (i.e., translation) into another, typically
lower-level, representation (e.g., x86 assembly code).
Parsers can be generally classified as one of two types: top-down or bottom-
up. A top-down parser develops a parse tree starting at the root (or start symbol of
the grammar), while a bottom-up parser starts from the leaves. (In Section 2.7, we
implicitly conducted top-down parsing when we intuitively proved the validity
of a string by building a parse tree for it beginning with the start symbol of the
grammar.) There are two types of top-down parsers: table-driven and recursive
descent. A table-driven, top-down parser uses a two-dimensional parsing table
and a programmer-defined stack data structure to parse the input string. The
parsing table is used to determine which move to apply given the non-terminal
on the top of the stack and the next terminal in the input string. Thus, use of a
table requires looking one token ahead in the input string without consuming it.
The moves in the table are derived from production rules of the grammar. The
2. clang is a unified front end for the C family of languages (i.e., C, Objective C, C++, and Objective
C++).
Type of Top-down Parser   Parse Table Used                      Parse Stack Used
Table-driven              explicit 2-D array data structure     explicit stack object in program
Recursive-descent         implicit/embedded in the code         implicit call stack of program
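As a sketch of the table-driven scheme, the following parser recognizes the symbol-expression language used in this chapter (<symbol_expr> ::= ( <s_list> ) | x | y | z and <s_list> ::= <symbol_expr> [ , <s_list> ]). The parse table here is hand-derived for illustration, with the optional tail of <s_list> factored into a helper non-terminal L'; a production-quality table would be computed from the grammar's FIRST and FOLLOW sets.

```python
def parse(tokens):
    """Table-driven, top-down (LL(1)) recognition sketch.

    S = <symbol_expr>, L = <s_list>, L' = optional ", <s_list>" tail.
    Returns True iff tokens form a sentence.
    """
    table = {
        # (non-terminal, lookahead) -> right-hand side to push
        ('S', '('): ['(', 'L', ')'],
        ('S', 'x'): ['x'], ('S', 'y'): ['y'], ('S', 'z'): ['z'],
        ('L', '('): ['S', "L'"], ('L', 'x'): ['S', "L'"],
        ('L', 'y'): ['S', "L'"], ('L', 'z'): ['S', "L'"],
        ("L'", ','): [',', 'L'],
        ("L'", ')'): [],                    # epsilon
    }
    stack = ['$', 'S']                      # explicit parse stack
    tokens = list(tokens) + ['$']           # end-of-input marker
    i = 0
    while stack:
        top = stack.pop()
        if top == '$' or top in ('(', ')', 'x', 'y', 'z', ','):
            if top != tokens[i]:            # terminal must match input
                return False
            i += 1
        else:                               # non-terminal: consult table
            rhs = table.get((top, tokens[i]))
            if rhs is None:                 # no move: not a sentence
                return False
            stack.extend(reversed(rhs))     # push right-hand side
    return i == len(tokens)

print(parse(['(', 'x', ',', 'y', ')']))   # True
print(parse(['(', '(']))                  # False
```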
1 import sys
2
3 # scanner
4 def validate_lexemes():
5    global sentence
6    for lexeme in sentence:
7       if (not valid_lexeme(lexeme)):
8          return False
9    return True
10
11 def valid_lexeme(lexeme):
12    return lexeme in ["(", ")", "x", "y", "z", ","]
13
14 def getNextLexeme():
15    global lexeme
16    global lexeme_index
17    global sentence
18    global num_lexemes
19    global error
20
21    lexeme_index = lexeme_index + 1
22    if (lexeme_index < num_lexemes):
23       lexeme = sentence[lexeme_index]
24    else:
25       lexeme = " "
26
27 # parser
28
29 # <symbol_expr> ::= ( <s_list> ) | x | y | z
30 def symbol_expr():
31    global lexeme
32    global lexeme_index
33    global num_lexemes
34    global error
35    if (lexeme == "("):
36       getNextLexeme()
37       s_list()
38       if (lexeme != ")"):
39          error = True
40    elif lexeme not in ["x", "y", "z"]:
41       error = True
42    getNextLexeme()
43
44 # <s_list> ::= <symbol_expr> [ , <s_list> ]
45 def s_list():
46    global lexeme
47    symbol_expr()
48    # optional part
49    if lexeme == ',':
50       getNextLexeme()
51       s_list()
52
53 # main program
54 # read in the input sentences
55 for line in sys.stdin:
56    line = line[:-1] # remove trailing newline
57    sentence = line.split()
58
59    num_lexemes = len(sentence)
60
61    lexeme_index = -1
62    error = False
63
64    if (validate_lexemes()):
65       getNextLexeme()
66       symbol_expr()
67
68       # Either an error occurred or
69       # the input sentence is not entirely parsed.
70       if (error or lexeme_index < num_lexemes):
71          print('"{}" is not a sentence.'.format(line))
72       else:
73          print('"{}" is a sentence.'.format(line))
74    else:
75       print('"{}" contains invalid lexemes and, thus, '
76             'is not a sentence.'.format(line))
where \n is a terminal.
Note that the program is factored into a scanner (lines 3–25) and recursive-
descent parser (lines 27–51), as shown in Figure 3.1.
Notice that this program recognizes two distinct error conditions. First, if a given
string does not consist of lexemes, it responds with this message: "..." contains
invalid lexemes and, thus, is not a sentence. Second, if a given
string consists of lexemes but is not a sentence according to the grammar, the
parser responds with the message: "..." is not a sentence. Note that
the lexical error message takes priority over the parse error message. In other
words, the parse error message is issued only if the input string consists entirely
of lexemes. Only one line of output is written to standard output per line of input.
The following is a sample interactive session with the parser (> is simply the
prompt for input):
> ( x )
"( x )" is a sentence.
> ( (
"( (" is not a sentence.
> ( a )
"( a )" contains invalid lexemes and, thus, is not a sentence.
The scanner is invoked on line 64. The parser is invoked on line 66 by calling
the function symbol_expr corresponding to the start symbol <symbol-expr>.
As functions are called while the string is being parsed, the run-time stack of
activation records keeps track of the current state of the parse. If the stack is empty
when the entire string is consumed, the string is a sentence; otherwise, it is not.
1 import sys
2 import random
3
4 # <symbol_expr> ::= ( <s_list> ) | x | y | z
5 def symbol_expr():
6    global num_tokens
7    global max_tokens
8
9    if (num_tokens < max_tokens):
10       if (random.randint(0, 1) == 0):
11          print("( ", end="")
12          num_tokens = num_tokens + 1
13          s_list()
14          print(") ", end="")
15          num_tokens = num_tokens + 1
16       else:
17          xyz = random.randint(0, 2)
18          if (xyz == 0):
19             print("x ", end="")
20          elif (xyz == 1):
21             print("y ", end="")
22          elif (xyz == 2):
The generator accepts a positive integer on the command line and writes that
many sentences from the language to standard output, one per line. Notice
that this generator, like the recursive-descent parser given in Section 3.4.1, has
one procedure per non-terminal, where each such procedure is responsible for
generating sentences from the sub-language rooted at that non-terminal.
Notice also that the generator produces sentences from the language in a
random fashion. When several alternatives exist on the right-hand side of a
production rule, the generator determines which non-terminal to follow randomly.
The generator also generates sentences with a random number of lexemes. Each
time it generates a sentence, it first generates a random number between the
minimum number of lexemes necessary in a sentence and a maximum number
that keeps the generated string within the character limit of the input strings to
the parser (i.e., ... characters). This random number serves as the maximum
number of lexemes in the generated sentence. Every time the generator encounters
an optional non-terminal (i.e., one enclosed in brackets), it flips a coin to determine
whether it should pursue that path through the grammar. It pursues the path only
if the flip indicates it should and if the number of lexemes generated so far is less
than the random number of maximum lexemes generated.
bottom-up) toward the start symbol of the grammar. In other words, a bottom-up
parse of a string attempts to construct a rightmost derivation of the string in
reverse (i.e., bottom-up). While parsing a string in this bottom-up fashion, we can
also construct a parse tree for the sentence, if desired, by allocating nodes of the
tree as we shift and setting pointers to pre-allocated nodes in the newly created
internal nodes as we reduce. (We need not always build a parse tree; sometimes a
traversal is enough, especially if semantic analysis or code generation phases will
not follow the syntactic phase.)
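As an illustration, the following hand-written recognizer applies shift and reduce moves to the symbol-expression grammar. The decision of when to reduce S to L is guided by one token of lookahead and is hand-derived for this tiny grammar; generated shift-reduce parsers compute such decisions automatically from the grammar.

```python
def shift_reduce(tokens):
    """Bottom-up (shift-reduce) recognition sketch for:
       S ::= ( L ) | x | y | z
       L ::= S | S , L
    Returns True iff tokens form a sentence.
    """
    tokens = list(tokens) + ['$']           # '$' marks end of input
    stack, i = [], 0
    while True:
        la = tokens[i]                      # one-token lookahead
        if stack and stack[-1] in ('x', 'y', 'z'):
            stack[-1] = 'S'                 # reduce S ::= x | y | z
        elif stack[-3:] == ['(', 'L', ')']:
            stack[-3:] = ['S']              # reduce S ::= ( L )
        elif stack[-3:] == ['S', ',', 'L']:
            stack[-3:] = ['L']              # reduce L ::= S , L
        elif stack[-1:] == ['S'] and la == ')':
            stack[-1] = 'L'                 # reduce L ::= S
        elif la != '$':
            stack.append(la)                # shift the lookahead
            i += 1
        else:
            break                           # no move remains
    return stack == ['S']                   # accept iff only S remains

print(shift_reduce(['(', 'x', ',', 'y', ')']))  # True
print(shift_reduce(['(', '(']))                 # False
```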
Shift-reduce parsers, unlike recursive-descent parsers, are typically not written
by hand. Like the construction of a scanner, the implementation of a shift-
reduce parser is well grounded in theoretical formalisms and, therefore, can be
automated. A parser generator is a program that accepts a syntactic specification of
a language in the form of a grammar and automatically generates a parser from
it. Parser generators are available for a wide variety of programming languages,
including Python (PLY) and Scheme (SLLGEN). ANTLR (ANother Tool for Language
Recognition) is a parser generator for a variety of target languages, including Java.
Scanner and parser generators are typically used in concert with each other to
automatically generate a front end for a language implementation (i.e., a scanner
and parser).
The field of parser generation has its genesis in the classical UNIX tool yacc
(yet another compiler compiler). The yacc parser generator accepts a context-free
grammar in EBNF (in a .y file) as input and generates a shift-reduce parser in C for
the language defined by the input grammar. At any point in a parse, the parsers
generated by yacc always take the action (i.e., a shift or reduce) that leads to a
successful parse, if one exists. To determine which action to take when more than
one action will lead to a successful parse, yacc follows its default actions. (When
yacc encounters a shift-reduce conflict, it shifts by default; when yacc encounters
a reduce-reduce conflict, it reduces based on the rule that appears first in
the .y grammar file.) The tools lex and yacc together constitute a scanner/parser
generator system.3
The yacc language describes the rules of a context-free grammar and the
actions to take when reducing based on those rules, rather than describing
computation explicitly. Very high-level languages such as yacc are referred to
as fourth-generation languages because three levels of language abstraction precede
them: machine code, assembly language, and high-level language.
Recall (as we noted in Chapter 2) that while semantics can sometimes be
reasonably modeled using a context-free grammar, which is a device for modeling
the syntactic structure of language, a context-free grammar can always be used to
model the lexical structure of language, since any regular language can be modeled
by a context-free grammar. Thus, where scanning (i.e., lexical analysis) ends
and parsing (i.e., syntactic analysis) begins is often blurred from both language
design and implementation perspectives. Addressing semantics while parsing can
3. The GNU implementations of lex and yacc, which are commonly used in Linux, are named flex
and bison, respectively.
obviate the need to make multiple passes through the input string. Likewise,4
addressing lexics while parsing can obviate the need to make multiple passes
through the input string.
1 /* symexpr.l */
2 %{
3 #include <string.h>
4 extern char *temp;
5 extern int lexerror;
6 int yyerror(char *errmsg);
7 extern char *errmsg;
8 %}
9 %%
10 [xyz,()] { strcat(temp,yytext); return *yytext; }
11 \n { return *yytext; }
12 [ \t] { strcat(temp,yytext); } /* ignore whitespace */
13 . { strcat(temp,yytext);
14     sprintf(errmsg, "Invalid lexeme: '%c'.", *yytext);
15     yyerror(errmsg);
16     lexerror = 1; return *yytext; }
17 %%
18 int yywrap(void) {
19    return 1;
20 }
The pattern-action rules for the relevant lexemes are defined using UNIX-style
regular expressions on lines 10–16. A pattern with outer square brackets matches
exactly one of any of the characters within the brackets (lines 10 and 12) and . (line
13) matches any single character except a newline, which is matched on line 11.
1 /* symexpr.y */
2 %{
3 #include <stdio.h>
4 #include <string.h>
5 int yylineno;
6 int yydebug=0;
7 char *temp;
8 char *errmsg;
9 int lexerror = 0;
10 int yyerror(char *errmsg);
11 int yylex(void);
12 %}
13 %%
14 input: input sym_expr '\n' { printf("\"%s\" is an expression.\n", temp);
15                              *temp = '\0'; }
16 | sym_expr '\n' { printf("\"%s\" is an expression.\n", temp);
17                   *temp = '\0'; }
18 | error '\n' { if (lexerror) {
19                   printf("\"%s\" contains invalid ", temp);
20                   printf("lexemes and, thus, ");
21                   printf("is not a sentence.\n");
22                   lexerror = 0;
23                } else {
24                   printf("\"%s\" is not an ", temp);
25                   printf("expression.\n");
26                }
27                *temp = '\0';
28                yyclearin; /* discard lookahead */
29                yyerrok; }
30 ;
31 sym_expr: '(' s_list ')' { /* no action */ }
32 | 'x' { /* no action */ }
33 | 'y' { /* no action */ }
34 | 'z' { /* no action */ }
35 ;
36 s_list: sym_expr { /* no action */ }
37 | sym_expr ',' s_list { /* no action */ }
38 ;
39 %%
40 int yyerror(char *errmsg) {
41    fprintf(stderr, "%s\n", errmsg);
42    return 0;
43 }
44 int main(void) {
45    temp = malloc(sizeof(*temp)*255);
46    errmsg = malloc(sizeof(*errmsg)*255);
47    yyparse();
48    free(temp);
49    return 0;
50 }
The shift-reduce pattern-action rules for the symbolic expression language are
defined on lines 14–38. The patterns are the production rules of the grammar
and are given to the left of the opening curly brace. Each action associated with a
production rule is given between the opening and closing curly braces to the right
of the rule and represented as C code. The action associated with a production rule
takes place when the parser uses that rule to reduce the symbols on the top of the
stack as demonstrated in Section 2.7.
Note that the actions in the second and third pattern-action rules (lines 31–38)
are empty. In other words, there are no actions associated with the sym_expr and
s_list production rules. (If we were building a parse or abstract-syntax tree, the
C code to allocate the nodes of the tree would be included in the actions blocks of
the second and third rules.) The first rule (lines 14–30) has associated actions and is
used to accept one or more lines of input. If a line of input is a sym_expr, then the
parser prints a message indicating that the string is a sentence. If the line of input
does not parse as a sym_expr, it contains an error and the parser prints a mes-
sage indicating that the string is not a sentence. The parser is invoked on line 47.
These scanner and parser specification files are compiled into an executable
parser as follows:
$ ls
symexpr.l symexpr.y
$ flex symexpr.l # produces the scanner in lex.yy.c
$ ls
lex.yy.c symexpr.l symexpr.y
$ bison -t symexpr.y # produces the parser in symexpr.tab.c
$ ls
Table 3.5 later in this chapter compares the top-down and bottom-up methods of
parsing.
26
27 t_ignore = ' \t'
28
29 def t_error(t):
30    raise ValueError("Invalid lexeme '{}'.".format(t.value[0]))
31    t.scanner.skip(1)
32
33 # PARSER
34
35 def p_symexpr(p):
36    """symexpr : LPAREN slist RPAREN
37               | X
38               | Y
39               | Z """
40    p[0] = True
41
42 def p_slist(p):
43    """slist : symexpr
44             | symexpr COMMA slist"""
45    p[0] = True
46
47 def p_error(p):
48    raise SyntaxError("Parse error.")
49
50 # main program
51 scanner = lex.lex()
52 parser = yacc.yacc()
53
54 for line in stdin:
55    line = line[:-1] # remove trailing newline
56    try:
57       if parser.parse(line):
58          print('"{}" is a sentence.'.format(line))
59       else:
60          print('"{}" is not a sentence.'.format(line))
61    except ValueError as e:
62       print(e.args[0])
63       print('"{}" contains invalid lexemes and, thus, '
64             'is not a sentence.'.format(line))
65    except SyntaxError:
66       print('"{}" is not a sentence.'.format(line))
The tokens for the symbolic expression language are defined on lines 11–31 and
the shift-reduce pattern-action rules are defined on lines 35–48. Notice that the
syntax of the pattern-action rules in PLY differs from that in yacc. In PLY, the
pattern-action rules are supplied in the form of a function definition. The
docstring at the top of the function definition (i.e., the text between the two """
delimiters) specifies the production rule, and the code after the closing """ indicates the action
to be taken. The scanner and parser are invoked on lines 51 and 52, respectively.
Strings are read from standard input (line 54) with the newline character removed
(line 55) and passed to the parser (line 57). The string is then tokenized and parsed.
If the string is a sentence, the parser.parse function returns True; otherwise, it
returns False. The parser is generated and run as follows:
$ ls
symexpr.py
$ python3.8 symexpr.py
Generating LALR tables
( x )
"( x )" is a sentence.
( (
"( (" is not a sentence.
( a )
Invalid lexeme 'a'.
"( a )" contains invalid lexemes and, thus, is not a sentence.
$ ls
parser.out parsetab.py symexpr.py
$ python3.8 symexpr.py
( y )
"( y )" is a sentence.
1 import re
2 import sys
3 import operator
4 import ply.lex as lex
5 import ply.yacc as yacc
6 from collections import defaultdict
7
8 # begin lexical specification #
9 tokens = ('NUMBER', 'PLUS', 'WORD', 'MINUS', 'MULT', 'DEC1',
10 'INC1', 'ZERO', 'LPAREN', 'RPAREN', 'COMMA',
11 'IDENTIFIER', 'LET', 'EQ',
12 'IN', 'IF', 'ELSE', 'EQV', 'COMMENT')
13
14 keywords = ('if', 'else', 'inc1', 'dec1',
15 'in', 'let', 'zero?', 'eqv?')
16
17 keyword_lookup = {'if' : 'IF', 'else' : 'ELSE',
18 'inc1' : 'INC1', 'dec1' : 'DEC1', 'in' : 'IN',
19 'let' : 'LET', 'zero?' : 'ZERO',
3.6. PLY: PYTHON LEX-YACC 87
20 'eqv?' : 'EQV' }
21
22 t_PLUS = r'\+'
23 t_MINUS = r'-'
24 t_MULT = r'\*'
25 t_LPAREN = r'\('
26 t_RPAREN = r'\)'
27 t_COMMA = r','
28 t_EQ = r'='
29 t_ignore = " \t"
30
31 def t_WORD(t):
32     r'[A-Za-z_][A-Za-z_0-9*?!]*'
33     pattern = re.compile("^[A-Za-z_][A-Za-z_0-9?!]*$")
34
35     # if the identifier is a keyword, parse it as such
36     if t.value in keywords:
37         t.type = keyword_lookup[t.value]
38     # otherwise it might be a variable so check that
39     elif pattern.match(t.value):
40         t.type = 'IDENTIFIER'
41     # otherwise it is a syntax error
42     else:
43         print("Runtime error: Unknown word %s %d" %
44               (t.value[0], t.lexer.lineno))
45         sys.exit(-1)
46     return t
47
48 def t_NUMBER(t):
49     r'-?\d+'
50     # try to convert the string to an int, flag overflows
51     try:
52         t.value = int(t.value)
53     except ValueError:
54         print("Runtime error: number too large %s %d" %
55               (t.value[0], t.lexer.lineno))
56         sys.exit(-1)
57     return t
58
59 def t_COMMENT(t):
60     r'---.*'
61     pass
62
63 def t_newline(t):
64     r'\n'
65     # continue to next line
66     t.lexer.lineno = t.lexer.lineno + 1
67
68 def t_error(t):
69     print("Unrecognized token %s on line %d." % (t.value.rstrip(),
70           t.lexer.lineno))
71 lexer = lex.lex()
72 # end lexical specification #
The following code is a PLY parser specification for the Camille language defined
by this grammar:
73 class ParserException(Exception):
74     def __init__(self, message):
75         self.message = message
76
77 def p_error(t):
78     if (t != None):
79         raise ParserException("Syntax error: Line %d " % (t.lineno))
80     else:
81         raise ParserException("Syntax error near: Line %d" %
82               (lexer.lineno - (lexer.lineno > 1)))
83
84 # begin syntactic specification #
85 def p_program_expr(t):
86 '''programs : program programs
87 | program'''
88 #do nothing
89
90 def p_line_expr(t):
91 '''program : expression'''
92
93 def p_primitive_op(t):
94 '''expression : primitive LPAREN expressions RPAREN'''
95
96 def p_primitive(t):
97 '''primitive : PLUS
98 | MINUS
99 | INC1
100 | MULT
101 | DEC1
102 | ZERO
103 | EQV'''
104
105 def p_expression_number(t):
106 '''expression : NUMBER'''
107
108 def p_expression_identifier(t):
109 '''expression : IDENTIFIER'''
110
111 def p_expression_let(t):
112 '''expression : LET let_statement IN expression'''
113
114 def p_expression_let_star(t):
115 '''expression : LETSTAR letstar_statement IN expression'''
116
117 def p_expression_let_rec(t):
118 '''expression : LETREC letrec_statement IN expression'''
119
120 def p_expression_condition(t):
121 '''expression : IF expression expression ELSE expression'''
122
123 def p_expression_function_decl(t):
124 '''expression : FUN LPAREN parameters RPAREN expression
125 | FUN LPAREN RPAREN expression'''
126
127 def p_expression_function_call(t):
128 '''expression : LPAREN expression arguments RPAREN
129 | LPAREN expression RPAREN '''
130
131 def p_expression_rec_func_decl(t):
132 '''rec_func_decl : FUN LPAREN parameters RPAREN expression
133 | FUN LPAREN RPAREN expression'''
134
135 def p_parameters(t):
136 '''parameters : IDENTIFIER
137 | IDENTIFIER COMMA parameters'''
138
3.7. TOP-DOWN VIS-À-VIS BOTTOM-UP PARSING 89
Notice that the action part of each pattern-action rule is empty. Thus, this parser
does not build an abstract-syntax tree. For a parser generator that builds an
abstract-syntax tree (used later for interpretation in Chapters 10–11), see the listing
at the beginning of Section 9.6.2 (see footnote 5). For the details of PLY, see
https://www.dabeaz.com/ply/.
5. These specifications have been tested and run in PLY 3.11. The scanner and parser generated by
PLY from these specifications have been tested and run in Python 3.8.5.
(a) Implement a recursive-descent parser in any language that accepts strings from
standard input (one per line) until EOF and determines whether each string is
in the language defined by this grammar. Thus, it is helpful to think of this
language using ⟨npt⟩ as the start symbol and the rule:
where \n is a terminal.
Factor your program into a scanner and recursive-descent parser, as shown in
Figure 3.1.
You may not assume that each lexical unit will be valid and separated
by exactly one space, or that each line will contain no leading or trailing
whitespace. There are two distinct error conditions that your program must
recognize. First, if a given string does not consist of lexemes, respond
with this message: "..." contains lexical units which are not
lexemes and, thus, is not an expression., where ... is replaced
with the input string, as shown in the interactive session following. Second,
if a given string consists of lexemes but is not an expression according to
the grammar, respond with this message: "..." is not an expression.,
where ... is replaced with the input string, as shown in the interactive session
following. Note that the “invalid lexemes” message takes priority over the “not
an expression” message; that is, the “not an expression” message can be issued
only if the input string consists entirely of valid lexemes.
You may assume that whitespace is ignored; that no line of input will exceed
4096 characters; that each line of input will end with a newline; and that no
string will contain more than 200 lexical units.
Print only one line of output to standard output per line of input, and do not
prompt for input. The following is a sample interactive session with the parser
(> is simply the prompt for input and will be the empty string in your system):
> ()
"()" is a sentence.
> ()()
"()()" is a sentence.
> (())
"(())" is a sentence.
> (()())()
"(()())()" is a sentence.
> ((()())())
"((()())())" is a sentence.
> (a)
"(a)" contains lexical units which are not lexemes and, thus,
is not a sentence.
> )(
")(" is not a sentence.
> )()
")()" is not a sentence.
> )()(
")()(" is not a sentence.
> (()()
"(()()" is not a sentence.
> ())((
"())((" is not a sentence.
> ((()())
"((()())" is not a sentence.
(c) Implement a generator of sentences from the language defined by the grammar
in this exercise as an efficient approach to test-case generation. In other words,
write a program to output sentences from this language. A simple way to build
your generator is to follow the theme of recursive-descent parser construction.
In other words, develop one procedure per non-terminal, where each such
procedure is responsible for generating sentences from the sub-language rooted
at that non-terminal. You can develop this generator from your recursive-
descent parser by inverting each procedure to perform generation rather than
recognition.
Before generating each sentence, randomly choose a maximum number of
lexemes in the generated sentence. Every time you encounter an optional non-terminal
(i.e., one enclosed in brackets), flip a coin to determine whether you
should pursue that path through the grammar. Then pursue the path only if the
flip indicates you should and if the number of lexemes generated so far is less
than the random maximum number of lexemes you generated. Your generator
must read a positive integer given at the command line and write that many
sentences from the language to standard output, one per line.
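Following the recursive-descent theme, the generator might be sketched as follows (the grammar ⟨s⟩ ::= ( ⟨s⟩ ) ⟨s⟩ | ε, the coin-flip probability, and the budget range are assumptions for illustration):

```python
import random

# One generator procedure per non-terminal; the assumed grammar has a
# single non-terminal <s> ::= ( <s> ) <s> | empty.
def gen_s(budget):
    # flip a coin to decide whether to pursue the optional expansion, and
    # pursue it only if the lexeme budget has not been exhausted
    if budget[0] >= 2 and random.random() < 0.5:
        budget[0] -= 2          # a ( and its matching ) cost two lexemes
        return "(" + gen_s(budget) + ")" + gen_s(budget)
    return ""

def generate(n):
    # write n sentences, one per line; a command-line driver would read
    # n from sys.argv[1]
    for _ in range(n):
        budget = [random.randint(2, 20)]   # random maximum number of lexemes
        print(gen_s(budget))
```

Because every expansion is drawn from the grammar itself, every generated string is guaranteed to be a sentence, which is what makes this an efficient approach to test-case generation.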
Testing any program on various representative data sets is an important aspect
of software development, and this exercise will help you test your parsers for
this language.
> 2+3*4
"2+3*4" is an expression.
> 2+3*-4
"2+3*-4" is an expression.
> 2+3*a
"2+3*a" contains lexical units which are not lexemes and, thus,
is not an expression.
> 2+*3*4
"2+*3*4" is not an expression.
(d) At some point in your education, you may have encountered the concept of
diagramming sentences. A diagram of a sentence (or expression) is a parse-
tree-like drawing representing the grammatical (or syntactic) structure of the
sentence, including parts of speech such as subject, verb, and object.
Complete Programming Exercise 3.4.a, but this time build a recursive-descent
parser that writes a diagrammed version of the input string. Specifically, the
output must be the input with parentheses around each non-terminal in the
input string.
Do not build a parse tree to solve this problem. Instead, implement
your recursive-descent parser to construct the diagrammed sentence as it parses
the input string.
Print only one line of output to standard output per line of input as follows.
Consider the following sample interactive session with the parser diagrammar
(> is the prompt for input and is the empty string in your system):
> 2+3*4
"((2)+((3)*(4)))" is an expression.
> 2+3*-4
"((2)+((3)*(-(4))))" is an expression.
> 2+3*a
"2+3*a" contains lexical units which are not lexemes and, thus,
is not an expression.
> 2+*3*4
"2+*3*4" is not an expression.
char *string_representation_of_an_integer =
    malloc(10 * sizeof(*string_representation_of_an_integer));
(f) Complete Programming Exercise 3.5.d, but this time build a parse tree in
memory and traverse it to output the diagrammed sentence.
Complete Programming Exercise 3.4 (parts a, b, and c) using this grammar, subject
to all of the requirements given in that exercise.
The following is a sample interactive session with the parser:
> ( 6 )
"( 6 )" is an expression.
> a
"a" is an expression.
> ( i) )
"( i) )" is not an expression.
> ,a - 1
",a - 1" contains lexical units which are not lexemes and, thus,
is not an expression.
> ( ( a ) )
"( ( a ) )" is an expression.
> id * index - rate * 1001 - (r - 32) * key
"id * index - rate * 1001 - (r - 32) * key" is not an expression.
> ( ( ( a ) ) )
"( ( ( a ) ) )" is an expression.
> ;10 - 10
";10 - 10" contains lexical units which are not lexemes and, thus,
is not an expression.
> 01 - 10
"01 - 10" is not an expression.
> a * b - c
"a * b - c" is not an expression.
> ( ( ( a a ) ) )
"( ( ( a a ) ) )" is an expression.
> ( a ( a ) )
"( a ( a ) )" is not an expression.
> 2 * 3
"2 * 3" is an expression.
> ( )
"( )" is not an expression.
> 2 * rate - (((3)))
"2 * rate - (((3)))" is not an expression.
> (
"(" is not an expression.
> ( f ( t ) ) )
"( f ( t ) ) )" is not an expression.
> f!a+u
"f!a+u" contains lexical units which are not lexemes and, thus,
is not an expression.
> a*
"a*" is not an expression.
> _aaa+1
"_aaa+1" is an expression.
> ____aa+y
"____aa+y" is an expression.
Complete Programming Exercise 3.5 (parts a–g) using this grammar, subject to all
of the requirements given in that exercise.
> f | t & f | ~t
"f | t & f | ~t" is an expression.
> ~t | t | ~f & ~f & t & ~t | f
"~t | t | ~f & ~f & t & ~t | f" is an expression.
> f | t ; f | ~t
"f | t ; f | ~t" contains lexical units which are not lexemes and, thus,
is not an expression.
> f | t & & f | ~t
"f | t & & f | ~t" is not an expression.
> f | t & f | ~t
"(((f) | ((t) & (f))) | (~(t)))" is a diagrammed expression.
> ~t | t | ~f & ~f & t & ~t | f
"((((~(t)) | (t)) | ((((~(f)) & (~(f))) & (t)) & (~(t)))) | (f))"
is a diagrammed expression.
> f | t ; f
"f | t ; f" contains lexical units which are not lexemes and, thus,
is not an expression.
> f | | t & ~t
"f | | t & ~t" is not an expression.
Complete Programming Exercise 3.4 (parts a, b, and c) using this grammar, subject
to all of the requirements given in that exercise.
Exercise 3.9 Consider the following grammar in EBNF for some simple English
sentences:
static void adj() {
    if (lexeme.equals("humble") || lexeme.equals("patient") ||
        lexeme.equals("prudent")) {
        diagrammedSentence += "\"" + lexeme + "\"";
        getNextLexeme();
    } else {
        error = true;
    }
}
def adj():
    global diagrammedSentence
    global lexeme
    global error

    if lexeme in ["humble", "patient", "prudent"]:
        diagrammedSentence += "\"" + lexeme + "\""
        getNextLexeme()
    else:
        error = True
Exercise 3.11 Build a graphical user interface, akin to that shown here, for the
postfix expression evaluator developed in Programming Exercise 3.10.
Exercise 3.12 Augment the PLY parser specification for Camille given in
Section 3.6.2 with a read-eval-print loop ( REPL) that accepts strings until EOF
and indicates whether the string is a Camille sentence. Do not modify the code
presented in lines 78–166 in the parser specification. Only add a function or
functions at the end of the specification to implement the REPL.
Examples:
$ python3.8 camilleparse.py
Camille> +(-(35,33), inc1(8))
"+(-(35,33), inc1(8))" is a Camille sentence.
Camille> +(-(35,33), inc(8))
"+(-(35,33), inc(8))" is not a Camille sentence.
Camille> let a = 9 in a
"let a = 9 in a" is a Camille sentence.
Camille> let a = 9 in
"let a = 9 in" is not a Camille sentence.
Programming Language
Implementation
[Figure: the front end of an interpreter. A source program (a string or list of
lexemes; the concrete representation) is processed by a scanner (specified with a
regular grammar) into a list of tokens, which a parser (specified with a
context-free grammar) converts into an abstract-syntax tree that is passed to the
interpreter.]
1. This is not always true. For instance, the Java compiler javac outputs Java bytecode.
4.2. INTERPRETATION VIS-À-VIS COMPILATION 105
[Figure: a compiler. The same front end (scanner and parser) converts the source
program into an abstract-syntax tree, which the compiler's semantic analyzer and
code generator/translator turn into a translated program (e.g., object code);
alternatively, an interpreter may consume the abstract-syntax tree directly.]
2. https://dart.dev
#include <limits.h>
#include <stdio.h>
#include <ctype.h>

int main() {
    char string[LINE_MAX];

    /* sentences have exactly three non-whitespace characters */
    char program[4];

    /* purge whitespace */
    while (string[i] != '\0') {
        if (!isspace(string[i])) {
            program[j] = string[i];
            j++;
            /* syntactic analysis */
            if (j == 4) {
                fprintf(stderr, "Program is invalid.\n");
                return -1;
            }
        }
        i++;
    }
    program[3] = '\0';

    return 0;
    } else { /* invalid lexeme */
        fprintf(stderr, "Program is invalid.\n");
        return -2;
    }
}
[Figure: the interpreter for the language simple (a C program compiled into
object code) reads the string "2 + 3" (i.e., a simple program) and produces 5
(i.e., program output); internally, program = "2+3", num1 = 2, num2 = 3, and
sum = 5.]
Figure 4.3 Interpreter for the language simple, illustrating that the simple program
becomes part of the running interpreter process.
4.3. RUN-TIME SYSTEMS: METHODS OF EXECUTIONS 109
[Figure: compilation of the source program "n = x * y + z;" (a string, the
concrete representation of a mathematical expression). The front end
(preprocessor, scanner, and parser) produces the abstract-syntax tree
(= id1 (+ (* id2 id3) id4)); the compiler's semantic analyzer and code
generator emit the assembly code "load id2; mul id3; add id4; store id1",
which an assembler translates into object code.]
5. The Java bytecode interpreter (i.e., java) is typically referred to as the Java Virtual Machine or JVM
by itself. However, it really is a virtual machine for Java bytecode rather than Java. Therefore, it is more
accurate to say that the Java compiler and Java bytecode interpreter (traditionally, though somewhat
inaccurately, called a JVM) together provide a virtual machine for Java.
[Figure: interpretation. The front end (scanner and parser) converts the source
program into an abstract-syntax tree; a software interpreter (itself compiled to
object code and executed by a processor or virtual machine) evaluates the tree
with the program input (i.e., the input to the interpreter) to produce the
program output.]
[Figure: compilation vis-à-vis interpretation. An interpreted program is
executed by a software interpreter, which is in turn executed by the hardware
interpreter; software interpreters can be stacked, each executed by the
interpreter beneath it, with the hardware interpreter at the bottom. A compiler
produces target code that can run on the hardware interpreter (if compiled to
object code) or on a software interpreter, and the compiler itself, if written
in an interpreted language, can run on a stack of interpreters.]
int i = 0;
int result = 2;
for (i = 1; i < 1000000; i++)
    result *= 2;
6. This code will not actually compute 2^1,000,000 because attempting to do so will overflow the
integer variable. This code is purely for purposes of discussion.
4.4. COMPARISON OF INTERPRETERS AND COMPILERS 115
space requirements (Chapter 9). Often the internal representation of the source
program accessed and manipulated by an interpreter is an abstract-syntax tree. An
abstract-syntax tree, like a parse tree, depicts the structure of a program. However,
unlike a parse tree, it does not contain non-terminals. It also structures the program
in a way that facilitates interpretation (Chapters 10–12).
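For intuition, here is a sketch (in Python, with trees as nested tuples; the grammar and node shapes are illustrative assumptions) contrasting a parse tree for 2+3*4, which records non-terminals such as expr, term, and factor, with the corresponding abstract-syntax tree, which records only operators and operands and is therefore straightforward to interpret:

```python
# A parse tree retains every non-terminal used in the derivation.
parse_tree = ("expr",
              ("expr", ("term", ("factor", "2"))),
              "+",
              ("term",
               ("term", ("factor", "3")),
               "*",
               ("factor", "4")))

# The abstract-syntax tree keeps only operators and operands.
ast = ("+", 2, ("*", 3, 4))

def eval_ast(node):
    # interpreting the abstract-syntax tree is a simple recursive walk
    if isinstance(node, int):
        return node
    op, left, right = node
    l, r = eval_ast(left), eval_ast(right)
    return l + r if op == "+" else l * r
```

The interpreter never needs the non-terminal scaffolding, which is why the abstract representation is the one typically stored and traversed.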
The advantages of a pure interpreter and the disadvantages of a traditional
compiler are complements of each other. At a core level, program development
using a compiled language is inconvenient because every time the program
is modified, it must be recompiled to be tested and often the programmer
cycles through a program-compile-debug-recompile loop ad nauseam. Program
development with an interpreter, by comparison, involves one less step.
Moreover, if provided with an interpreter, a read-eval-print loop ( REPL)
facilitates testing and debugging program units (e.g., functions) in isolation of the
rest of the program, where possible.
Since an interpreter does not translate a program into another representation
(other than an abstract-syntax representation), it does not obfuscate the original
source program. Therefore, an interpreter can more accurately identify source-
level (i.e., syntactic) origins (e.g., the name of an array whose index is out-of-
bounds) of run-time errors and refer directly to lines of code in error messages
with more precision than is possible in a compiled language. A compiler, due to
translation, may not be able to accurately identify the origin of a compile-time error
in the original source program by the time the error is detected. Run-time errors
in compiled programs are similarly difficult to trace back to the source program
because the target program has no knowledge of the original source program.
Such run-time feedback can be invaluable to debugging a program. Therefore,
the mechanics of testing and debugging are streamlined and cleaner using an
interpreted, as opposed to a compiled, language.
Also, consider that a compiler involves three languages: the source and target
languages, and the language in which the compiler is written. By contrast, an
interpreter involves only two languages: the source language and the language
in which the interpreter is written—sometimes called the defining programming
language or the host language.
Exercise 4.1 Reconsider the following context-free grammar defined in EBNF from
Programming Exercise 3.5:
Table 4.3 Features of the Parsers Used in Each Subpart of the Programming
Exercises in This Chapter (Key: R-D = recursive-descent; S-R = shift-reduce.)
(a) Extend your program from Programming Exercise 3.5.a to interpret programs.
Normal precedence rules hold: unary - has the highest, * has the second highest,
and + has the lowest. Assume left-to-right associativity. The following is sample
input and output for the expression evaluator (> is simply the prompt for input
and will be the empty string in your system):
> 2+3*4
14
> 2+3*-4
-10
> 2+3*a
"2+3*a" contains invalid lexemes and, thus, is not an expression.
> 2+*3*4
"2+*3*4" is not an expression.
Do not build a parse tree to solve this problem. Factor your program into a
recursive-descent parser (i.e., solution to Programming Exercise 3.5.a) and an
interpreter as shown in Figure 4.1.
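A sketch of the part (a) evaluator, reusing the recursive-descent structure (no parse tree is built; the grammar, expressions over +, *, and unary - with the stated precedence, is assumed from the sample session):

```python
def evaluate(s):
    """Recursive-descent evaluator: each procedure returns the value of its phrase."""
    tokens = list(s.replace(" ", ""))
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else None
    def expr():                      # lowest precedence: +
        nonlocal pos
        v = term()
        while peek() == "+":
            pos += 1
            v = v + term()           # left-to-right accumulation
        return v
    def term():                      # next precedence: *
        nonlocal pos
        v = factor()
        while peek() == "*":
            pos += 1
            v = v * factor()
        return v
    def factor():                    # highest precedence: unary -
        nonlocal pos
        if peek() == "-":
            pos += 1
            return -factor()
        start = pos
        while peek() and peek().isdigit():
            pos += 1
        if start == pos:
            raise SyntaxError(s)
        return int("".join(tokens[start:pos]))
    v = expr()
    if pos != len(tokens):
        raise SyntaxError(s)
    return v
```

Evaluation happens during parsing: each non-terminal procedure returns the value of the phrase it recognized rather than a diagram or a tree node.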
(b) Extend your program from Programming Exercise 3.5.b to interpret expressions
as shown in Programming Exercise 4.1.a. Do not build a parse tree to solve
this problem. Factor your program into a shift-reduce parser (solution to
Programming Exercise 3.5.b) and an interpreter as shown in Figure 4.1.
(c) Complete Programming Exercise 4.1.a, but this time build a parse tree and
traverse it to evaluate the expression.
(d) Complete Programming Exercise 4.1.b, but this time build a parse tree and
traverse it to evaluate the expression.
(e) Extend your program from Programming Exercise 3.5.d to interpret expressions
as shown here:
> 2+3*4
((2)+((3)*(4))) = 14
> 2+3*-4
((2)+((3)*(-(4)))) = -10
> 2+3*a
"2+3*a" contains invalid lexemes and, thus, is not an expression.
> 2+*3*4
"2+*3*4" is not an expression.
(f) Extend your program from Programming Exercise 3.5.e to interpret expressions
as shown in Programming Exercise 4.1.e.
(g) Extend your program from Programming Exercise 3.5.f to interpret expressions
as shown in Programming Exercise 4.1.e.
4.5. INFLUENCE OF LANGUAGE GOALS ON IMPLEMENTATION 119
(h) Extend your program from Programming Exercise 3.5.g to interpret expressions
as shown in Programming Exercise 4.1.e.
(i) Complete Programming Exercise 4.1.e, but this time, rather than diagramming
the expression, decorate each expression with parentheses to indicate the order
of operator application and interpret expressions as shown here:
> 2+3*4
(2+(3*4)) = 14
> 2+3*-4
(2+(3*(-4))) = -10
> 2+3*a
"2+3*a" contains invalid lexemes and, thus, is not an expression.
> 2+*3*4
"2+*3*4" is not an expression.
(j) Complete Programming Exercise 4.1.f with the same addendum noted in part i.
(k) Complete Programming Exercise 4.1.g with the same addendum noted in part i.
(l) Complete Programming Exercise 4.1.h with the same addendum noted in part i.
Exercise 4.2 Reconsider the following context-free grammar defined in BNF (not
EBNF ) from Programming Exercise 3.7:
where t, f, |, &, and ~ are terminals that represent true, false, or, and, and not,
respectively. Thus, sentences in the language defined by this grammar represent
logical expressions that evaluate to true or false.
Complete Programming Exercise 4.1 (parts a–l) using this grammar, subject to
all of the requirements given in that exercise. Specifically, build a parser and an
interpreter to evaluate and determine the order in which operators of a logical
expression are evaluated. Normal precedence rules hold: ~ has the highest, & has
the second highest, and | has the lowest. Assume left-to-right associativity.
The following is a sample interactive session with the pure interpreter:
> f | t & f | ~t
false
> ~t | t | ~f & ~f & t & ~t | f
true
> f | t ; f | ~t
"f | t ; f | ~t" contains invalid lexemes and, thus, is not an expression.
> f | t & & f | ~t
"f | t & & f | ~t" is not an expression.
> f | t & f | ~t
(((f) | ((t) & (f))) | (~(t))) is false.
> ~t | t | ~f & ~f & t & ~t | f
((((~(t)) | (t)) | ((((~(f)) & (~(f))) & (t)) & (~(t)))) | (f)) is true.
> f | t ; f
"f | t ; f" contains invalid lexemes and, thus, is not an expression.
> f | | t & ~t
"f | | t & ~t" is not an expression.
The following is a sample interactive session with the decorating (i.e., parentheses-
for-operator-precedence) interpreter:
> f | t & f | ~t
((f | (t & f)) | (~t)) is false.
> ~t | t | ~f & ~f & t & ~t | f
((((~t) | t) | ((((~f) & (~f)) & t) & (~t))) | f) is true.
> f | t ; f
"f | t ; f" contains invalid lexemes and, thus, is not an expression.
> f | | t & ~t
"f | | t & ~t" is not an expression.
Exercise 4.3 Reconsider the following context-free grammar defined in BNF (not
EBNF ) from Programming Exercise 3.8:
where t, f, |, &, and ~ are terminals that represent true, false, or, and, and
not, respectively, and all lowercase letters except for f and t are terminals, each
representing a variable. Each variable in the variable list is bound to true in the
expression. Any variable used in any expression not contained in the variable list
is assumed to be false. Thus, programs in the language defined by this grammar
represent logical expressions, which can contain variables, that can evaluate to true
or false.
4.6. THEMATIC TAKEAWAYS 121
Complete Programming Exercise 4.1 (parts a–d and i–l) using this grammar,
subject to all of the requirements given in that exercise.
Specifically, build a parser and an interpreter to evaluate and determine the order
in which operators of a logical expression with variables are evaluated. Normal
precedence rules hold: ~ has the highest, & has the second highest, and | has the
lowest. Assume left-to-right associativity.
Functional Programming in
Scheme
[L]earning Lisp will teach you more than just a new language—it will
teach you new and more powerful ways of thinking about programs.
— Paul Graham, ANSI Common Lisp (1996)
1. This distinction may be a remnant of the Pascal programming language, which used the function
and procedure lexemes in the definition of a function and a procedure, respectively.
2. Alonzo Church was Alan Turing’s PhD advisor at Princeton University from 1936 to 1938.
5.2. INTRODUCTION TO FUNCTIONAL PROGRAMMING 127
The semantics of an expression generated using this rule in Q are as follows: If the
value of the first expression (on the right-hand side) is true, return the value of
the second expression (on the right-hand side). Otherwise, return the value of the
third expression (on the right-hand side). In other words, the third expression on
the right-hand side (the “else” part) is mandatory.
Why does language Q not permit the third expression on the right-hand side to
be optional? In other words, why is the following production rule absent from the
grammar of Q?
Exercise 5.2.2 Notice that there is no direct provision in the λ-calculus grammar
for integers. Investigate the concept of Church Numerals and define the integers
0, 1, and 2 in λ-calculus. When done, define an increment function in λ-calculus,
which adds one to its only argument and returns the result. Also, define addition
and multiplication functions in λ-calculus, which add and multiply their two
arguments and return the result, respectively. You may only use the three
production rules in λ-calculus to construct these numbers and functions.
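For orientation (not a solution to the exercise), the Church-numeral encodings can be transcribed into Python, since Python's lambda mirrors λ-abstraction; to_int is a helper outside the λ-calculus, used only for inspection:

```python
zero = lambda f: lambda x: x                  # apply f zero times
one  = lambda f: lambda x: f(x)               # apply f once
two  = lambda f: lambda x: f(f(x))            # apply f twice

inc  = lambda n: lambda f: lambda x: f(n(f)(x))              # one more f
add  = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x)) # m + n applications
mul  = lambda m: lambda n: lambda f: m(n(f))                 # m copies of n applications

to_int = lambda n: n(lambda k: k + 1)(0)      # inspection helper, not λ-calculus
```

A numeral n is the function that applies its first argument n times to its second, which is why incrementing is just one extra application of f.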
Exercise 5.2.3 Write a simple expression in λ-calculus that creates an infinite loop.
5.3 Lisp
5.3.1 Introduction
Lisp (List processing)3 was developed by John McCarthy and his students at MIT
in 1958 for artificial intelligence (McCarthy 1960). (Lisp is, along with Fortran, one
of the two oldest programming languages still in use.) An understanding of Lisp
will both improve your ability to learn new languages with ease and help you
become a more proficient programmer in your language of choice. In this sense,
Lisp is the Latin of programming languages.
There are two dialects of Lisp: Scheme and Common Lisp. Scheme can be
used for teaching language concepts; Common Lisp is more robust and often
preferred for developing industrial applications. Scheme is an ideal programming
language for exploring language semantics and implementing language concepts,
and we use it in that capacity particularly in Chapters 6, 8, 12, and 13. In this text,
we use the Racket programming language, which is based on Scheme, for learning
Lisp. Racket is a dialect of Scheme well suited for this course of study.
Much of the power of Lisp can be attributed to its uniform representation
of Lisp program code and data as lists. A Lisp program is expressed as a
Lisp list. Recall that lists are the fundamental and only primitive Lisp data
structure. Because the ability to leverage the power of Lisp derives from this uniform
representation, we must first introduce Lisp lists (i.e., data).
(1 2 3)
(x y z)
(1 (2 3))
((x) y z)
Here, 1, 2, 3, x, y, and z are atoms from which these lists are constructed. The lists
(1 (2 3)) and ((x) y z) each contain a sublist.
3. Some jokingly say Lisp stands for Lots of Irritating Superfluous Parentheses.
(1 2 3)
(x 1 y 2 3 z)
((((Nothing))) ((will) (()()) (come ()) (of nothing)))
5.4 Scheme
The Scheme programming language was developed at the MIT AI Lab by Guy L.
Steele and Gerald Jay Sussman between 1975 and 1980. Scheme predates Common
Lisp and influenced its development.
1 > 1
2 1
3 > 2
4 2
5 > 3
6 3
7 > +
8 #<procedure:+>
9 > #t
10 #t
11 > #f
12 #f
4. We use the Racket language implementation in this text when working with Scheme code. See
https://racket-lang.org.
13 > (+ 1 2)
14 3
15 > (+ 1 2 3)
16 6
17 > (lambda (x) (+ x 1))
18 #<procedure>
19 > ((lambda (x) (+ x 1)) 2)
20 3
21 > (define increment (lambda (x) (+ x 1)))
22 > increment
23 #<procedure:increment>
24 > (increment 2)
25 3
26 ;;; a power function
27 > (define pow
28 > (lambda (x n)
29 > (cond
30 > ((zero? n) 1)
31 > (else (* x (pow x (- n 1)))))))
binds (in the environment) the identifier immediately following it with the result
of the evaluation of the expression immediately following the identifier. Thus, line
21 associates (in the environment) the identifier increment with the function
defined on line 21. Lines 22–25 confirm that the function is bound to the identifier
increment. Line 24 invokes the increment function by name; that is, now that
the function name is in the environment, it need not be used literally.
Lines 27–31 define a function pow that, given a base x and non-negative
exponent n, returns the base raised to the exponent (i.e., x^n). This function
definition introduces the control construct cond, which works as follows. It
accepts a series of lists and evaluates the first element of each list (from top to
bottom). As soon as the interpreter finds a first element that evaluates to true,
it evaluates the tail of that list and returns the result. In the context of cond,
else always evaluates to true. The built-in Scheme function zero? returns #t
if its argument is equal to zero and #f otherwise. Functions with a boolean
return type (i.e., those that return either #t or #f) are called predicates. Built-
in predicates in Scheme typically end with a question mark (?); we recommend
that the programmer follow this convention when naming user-defined functions
as well.
Two types of parameters exist: actual and formal. Formal parameters (also
known as bound variables or simply parameters) are used in the declaration and
definition of a function. Consider the following function definition:
The identifiers x and y on line 2 are formal parameters. Actual parameters (or
arguments) are passed to a function in an invocation of a function. For instance,
when invoking the preceding function as add(a,b), the identifiers a and b are
actual parameters. Throughout this text, we refer to identifiers in the declaration
of a function as parameters (of the function) and values passed in a function call as
arguments (to the function).
Notice that the pow function uses recursion for repetition. A recursive solution
often naturally mirrors the specification of the problem. Cultivating the habit of
thinking recursively can take time, especially for those readers from an imperative
or object-oriented background. Therefore, we recommend you follow these two
steps to develop a recursive solution to any problem.
1. Identify the smallest instance of the problem—the base case—and solve the
problem for that case only.
2. Assume you already have a solution to the penultimate (in size) instance of
the problem named n - 1. Do not try to solve the problem for that instance.
Remember, you are assuming it is already solved for that instance. Now,
given the solution for this n - 1 case, extend that solution for the case n.
This extension is much easier to conceive than an original solution to the
problem for the n - 1 or n cases.
For instance,
1. The base case of the pow function is n = 0, for which the solution is 1.
2. Assuming we have the solution for the case n - 1, all we have to do is
multiply that solution by x to obtain the solution for the case n.
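The two steps, transcribed into Python for readers coming from an imperative background (pow_rec is an illustrative name; the Scheme definition appears on lines 27–31 of the transcript):

```python
def pow_rec(x, n):
    # step 1: the base case n == 0, solved directly
    if n == 0:
        return 1
    # step 2: assume pow_rec(x, n - 1) is already solved, and extend that
    # solution to the case n with one multiplication
    return x * pow_rec(x, n - 1)
```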
This is the crux of recursion (see Design Guideline 1: General Pattern of Recursion in
Table 5.7 at the end of the chapter). With time and practice, you will master this
technique for recursive-function definition and no longer need to explicitly follow
these two steps because they will become automatic to you. Eventually, you will
become like those who learned Scheme as a first programming language, and find
iterative thinking and iterative solutions to problems more difficult to conceive
than recursive ones.
At this point, a cautionary note is necessary. We advise against solving
problems iteratively and attempting a translation into a recursive style. Such an
approach is unsustainable. (Anyone who speaks a foreign natural language knows
that it is impossible to hold a synchronous and effortlessly flowing conversation
in that language while thinking of how to respond in your native language
and translating the response into the foreign language while your conversation
partner is speaking.) Recursive conception of problems and recursive thinking are
fundamental prerequisites for functional programming.
It is also important to note that in Lisp and Scheme, values (not identifiers)
have types. In a sense, Lisp is a typeless language—any value can be bound to
any identifier. For instance, in the pow function, the base x has not been declared
to be of any specific type, as is typically required in the signature of a function
declaration or definition. The identifier x can be bound to a value of any type at
run-time. However, only a binding to an integer or a real number will produce a
meaningful result due to the nature of the multiplication (*) function. The ability
to bind any identifier to a value of any type at run-time—a concept called latent typing—
relieves the programmer from having to declare types of variables, requires less
planning and design, and provides a more flexible, malleable implementation.
(Manifest typing is a feature that supports the oil painting metaphor discussed
in Chapter 1.)
Notice there are no side effects in the session with the Scheme interpreter.
Notice also that a semicolon (;) introduces a comment that extends until the
end of the line (line 26). The short interactive session demonstrates the crux
of functional programming: evaluation of expressions that involve storing and
retrieving items from the environment, defining functions, and applying them to
arguments.
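The session itself is not reproduced here; a transcript along these lines (a hypothetical example) illustrates the same three activities:

```scheme
> (define base 2)          ; store an item in the environment
> base                     ; retrieve it from the environment
2
> (define square           ; define a function
    (lambda (n) (* n n)))
> (square base)            ; apply the function to an argument
4
```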
Notice that the λ-calculus grammar, given in Section 5.2.2, does not have
a provision for a lambda expression with more than one argument. (Functions
that take one, two, three, and n arguments are called unary, binary, ternary,
and n-ary functions, respectively.) That is because λ-calculus is designed to
provide the minimum constructs necessary for describing computation. In other
words, λ-calculus is a mathematical model of computation, not a practical
implementation. Any lambda expression in Scheme with more than one argument, for example,
> ((lambda (x y)
    (+ x y)) 1 2)
3
is semantically equivalent to a nesting of unary lambda expressions:
> (((lambda (x)
     (lambda (y)
       (+ x y))) 1) 2)
3
Thus, syntax for defining a function with more than one argument is syntactic
sugar. Recall that syntactic sugar is special, typically terse, syntax in a language
that serves only as a convenient method for expressing syntactic structures that are
traditionally represented in the language through uniform and often long-winded
syntax. (To help avoid syntax errors, we recommend using an editor that matches
parentheses [e.g., vi or emacs] while programming in Scheme.)
5. The words homo and icon are of Greek origin and mean same and representation, respectively.
The quote function prevents the interpreter from evaluating an S-expression; that is, adding quotes protects expressions from
evaluation. Consider the following transcript of a session with Scheme:
1 > (quote a)
2 a
3 > 'b
4 b
5 > '(a b c d)
6 (a b c d)
7 > (quote (1 2 3 4))
8 (1 2 3 4)
The ' symbol (lines 3 and 5) is a shorthand notation for quote—the two can be used
interchangeably. For purposes of terseness of exposition, we exclusively use '
throughout this text. If the a and b (on lines 1 and 3, respectively) were not
quoted, the interpreter would attempt to retrieve a value for them in the language
environment. Similarly, if the lists on lines 5 and 7 were not quoted, the interpreter
would attempt to evaluate those S-expressions as functional applications (e.g., the
function a applied to the arguments b, c, and d). Thus, you should use the quote
function if you want an S-expression to be treated as data and not code; do not use
the quote function if you want an S-expression to be evaluated as program code
and not to be treated as data. Symbols do not evaluate to themselves unless they
are preceded with a quote. Literals (e.g., 1, 2.1, "hello") need not be quoted.
(define square
(lambda (n)
(* n n)))
(define square
(lambda (n)
(cond
((eqv? 1 n) 1)
(else (* (* n n) (square 1))))))
To be recursive, a function must not only call itself, but must do so in a way such
that each successive recursive call reduces the problem to a smaller problem.
Exercise 5.4.3 Define a recursive Scheme function cube that accepts only an integer
x and returns x³. Do not use any user-defined auxiliary (helper) functions. Use only
three lines of code. Hint: Define a recursive squaring function first (Programming
Exercise 5.4.2).
Exercise 5.4.4 Define a Scheme function applytoall that accepts two argu-
ments, a function and a list, applies the function to every element of the list, and
returns a list of the results.
Examples:
the structure of the data” (Friedman, Wand, and Haynes 2001, p. 12). We move
onward, bearing these two themes in mind.
learn and use. English is a difficult language to learn because of the numerous
exceptions to the voluminous set of rules (e.g., i before e except after c).
Similarly, many programming languages are inconsistent in a variety of aspects.
For instance, all objects in Java must be accessed through a reference (i.e., you
cannot have a direct handle to an object in Java); moreover, Java uses implicit
dereferencing. However, Java is not entirely uniform in this respect because only
objects—not primitives such as ints—are accessed through references. This is not
the case in C++, where a programmer can access an object directly or through a
reference.
Understanding how dynamic memory structures are represented through
list-box diagrams is the precursor to building and manipulating abstract data
structures. Figures 5.5–5.8 depict the list-boxes for the following lists:
’((a) (b) ((c)))
’(((a) b) c)
’((a b) c)
’((a . b) . c)
Note that Figures 5.6 and 5.8 depict improper lists. The following transcript
illustrates how the Scheme interpreter treats these lists. The car function returns
the value pointed to by the left side of the list-box, and the cdr function returns
the value pointed to by the right side of the list-box.
> '(a b)
(a b)
> '(a . (b))
(a b)
>
> '(a b c)
(a b c)
>
> (car '(a b c))
a
> (cdr '(a b c))
(b c)
>
> '(a . (b c))
(a b c)
>
> (car '(a . (b c)))
a
> (cdr '(a . (b c)))
(b c)
>
> '(a . (b . (c)))
(a b c)
>
> (car '(a . (b . (c))))
a
> (cdr '(a . (b . (c))))
(b c)
>
> '(a . b)
(a . b)
>
> (car '(a . b))
a
> (cdr '(a . b))
b
>
> '((a) (b) ((c)))
((a) (b) ((c)))
>
> (car '((a) (b) ((c))))
(a)
> (cdr '((a) (b) ((c))))
((b) ((c)))
>
> '((a) . ((b) ((c))))
((a) (b) ((c)))
>
> (car '((a) . ((b) ((c)))))
(a)
When working with lists, always follow The Laws of car, cdr, and
cons (Friedman and Felleisen 1996a):
The Law of car: The primitive car is defined only for non-empty lists
(p. 5).
The Law of cdr: The primitive cdr is only defined for non-empty lists.
The cdr of a non-empty list is always another list (p. 7).
The Law of cons: The primitive cons accepts two arguments. The
second argument to cons must be a list [(so to construct only proper
lists)]. The result is a list (p. 9).
(define length1
(lambda (l)
(cond
((null? l) 0)
(else (+ 1 (length1 (cdr l)))))))
The built-in Scheme predicate null? returns true if its argument is an empty list
and false otherwise. The built-in Scheme predicate empty? can be used for this
purpose as well.
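For instance, applying length1 at the read-eval-print loop:

```scheme
> (length1 '(a b c))
3
> (length1 '())
0
```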
Notice that the pattern of the recursion in the preceding function is similar
to that used in the pow function in Section 5.4.1. Defining functions in Lisp can
be viewed as pattern application—recognizing the pattern to which a problem
fits, and then adapting that pattern to the details of the problem (Friedman and
Felleisen 1996a).
8. When defining a function in Scheme with the same name as a built-in function (e.g., length), we
use the name of the built-in function with a 1 appended to the end of it as the name of the user-defined
function (e.g., length1), where appropriate, to avoid any confusion and/or clashes (in the interpreter)
with the built-in function.
9. The function append is built into Scheme and accepts an arbitrary number of arguments, all of
which must be proper lists. The version we define is named append1.
1 (define append1
2 (lambda (x y)
3 (cond
4 ((null? x) y)
5 (else (cons (car x) (append1 (cdr x) y))))))
Intuitively, append works by recursing through the first list and consing the
car of each progressively smaller, first list to the appendage of the cdr of each
progressively smaller list with the second list. Recall that the cons function is a
constant operation—it allocates space for two pointers and copies the pointers of
its two arguments into those fields—and recursion is not involved. The append
function works differently: It deconstructs the first list and creates a new cons
cell for each element. In other words, append makes a complete copy of its first
argument. Therefore, the run-time complexity of append is linear [or O(n)] in the
size of the first list. Unlike the first list, which is not contained in the resulting list
(i.e., it is automatically garbage collected), the cons cell of the second list remains
intact and is present in the resulting appended list—it is the cdr of the list whose
car is the last element of the first list. To reiterate, cons and append are not the
same function. To construct a proper list, cons accepts an atom and a list. To do
the same, append accepts a list and a list.
While the running time of append is not constant like that of cons, it is also
not polynomial [e.g., O(n²)]. However, the effect of the less efficient append
function is compounded in functions that use append where the use of cons
would otherwise suffice. For instance, consider the following reverse function,
which accepts a list and returns the list reversed:
(define reverse1
(lambda (l)
(cond
((null? l) '())
(else (append (reverse1 (cdr l)) (cons (car l) '()))))))
10. The function reverse is built into Scheme. The version we define is named reverse1.
Notice that rotating this expansion 90 degrees left forms a parabola showing how
the run-time stack grows until it reaches the base case of the recursion (line 6) and
then shrinks. This is called recursive-control behavior and is discussed in more detail
in Chapter 13.
As this expansion illustrates, reversing a list of n items requires n − 1 calls
to append. Recall that the running time of append is linear, O(n). Therefore, the
run-time complexity of this definition of reverse1 is O(n²), which is unsettling.
Intuitively, to reverse a list, we need to pass through it only once; thus, the upper
bound on the running time should be no worse than O(n). The difference in
running time between cons and append is magnified when append is employed
in a function like reverse1, where cons would suffice. This suggests that we
should never use append where cons will suffice (see Design Guideline 3: Efficient
List Construction). We rewrite reverse1 using only cons and no appends in a
later example. Before doing so, however, we make some instructional observations
on this initial version of the reverse1 function.
(define reverse1
(lambda (l)
(cond
((null? l) '())
(else (append (reverse1 (cdr l)) (list (car l)))))))
The function append accepts only arguments that are proper lists. In
contrast, the function list accepts any values as arguments (atoms or lists).
The list function is not to be confused with the built-in Scheme predicate
list?, which returns true if its argument is a proper list and false otherwise:
> (list? 'a)
#f
> (list? 3)
#f
> (list? '(a . b))
#f
Notice that we followed our guidelines for developing recursive algorithms in defining it. Improving the
run-time complexity of reverse1 involves obviating the use of append through
a method called the difference lists technique (see Design Guideline 7: Difference
Lists Technique). (We revisit the difference lists technique in Section 13.7, where
we introduce the concept of tail recursion.) Using the difference lists technique
compromises the natural correspondence between the recursive specification of
a problem and the recursive solution to it. Compromising this correspondence
and, typically, the readability of the function, which follows from this break
in symmetry, for the purposes of efficiency of execution is a theme that recurs
throughout this text. We address this trade-off in more detail in Chapter 13, where
a reasonable solution to the problem is presented.
In the absence of side effects, which are contrary to the spirit of functional
programming, the only ways for successive calls to a recursive function to
share and communicate data are through return values (as is the case in the
reverse1 function) or parameters. The difference lists technique involves using
an additional parameter that represents the solution (e.g., the reversed list)
computed thus far. A solution to the problem of reversing a list using the difference
lists technique is presented here:
1 (define reverse1
2 (lambda (l)
3 (cond
4 ((null? l) '())
5 (else (rev l '())))))
6
7 (define rev
8 (lambda (l rl)
9 (cond
10 ((null? l) rl)
11 (else (rev (cdr l) (cons (car l) rl))))))
Notice that this solution involves the use of a helper function rev, which ensures
that the signature of the original function reverse1 remains unchanged. The
additional parameter is rl, which stands for reversed list. When rev is first called
on line 5, the reversed list is empty. On line 11, we grow that reversed list by
consing each element of the original list into rl until the original list l is empty
(i.e., the base case on line 10), at which point we simply return rl because it is the
completely reversed list at that point. Thus, the reversed list is built as the original
list is traversed. Notice that append is no longer used.
Conducting a similar run-time analysis of this version of reverse1 as we did
with the prior version, we see:
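For instance, for a three-element list, the expansion unfolds along these lines (a sketch):

```scheme
; Each step makes one tail call to rev; no pending work accumulates.
(reverse1 '(1 2 3))
; = (rev '(1 2 3) '())
; = (rev '(2 3) '(1))
; = (rev '(3) '(2 1))
; = (rev '() '(3 2 1))
; = (3 2 1)
```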
Now the running time of the function is linear [i.e., O(n)] in the size of the list to
be reversed. Notice also that, unlike in the original function, when the expansion
is rotated 90 degrees left, a rectangle is formed, rather than a parabola. Thus,
the improved version of reverse1 is more efficient not only in time, but also
in space. An unbounded amount of memory (i.e., stack) is required for the first
version of reverse1. Specifically, we require as many frames on the run-time
stack as there are elements in the list to be reversed. Unbounded memory is
required for the first version because each function call in the first version must
wait (on the stack) for the recursive call it invokes to return so that it can complete
the computation by appending (cons (car l) ’()) to the intermediate result
that is returned:
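For the first version, the expansion for a three-element list looks like this (a sketch), where each call must wait for its recursive call to return:

```scheme
; Pending appends pile up on the stack until the base case is reached.
(reverse1 '(1 2 3))
; = (append (reverse1 '(2 3)) '(1))
; = (append (append (reverse1 '(3)) '(2)) '(1))
; = (append (append (append (reverse1 '()) '(3)) '(2)) '(1))
; = (append (append (append '() '(3)) '(2)) '(1))
; = (append (append '(3) '(2)) '(1))
; = (append '(3 2) '(1))
; = (3 2 1)
```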
The same is not true for the second version. The second version only requires
a constant memory size because no pending computations are waiting for the
recursive call to return:
Formally, this is because the recursive call to rev is in tail position or is a tail
call, and the difference lists version of reverse1 is said to use tail recursion
(Section 13.7).
While working through these examples in the Racket interpreter, notice
that the functions can be easily tested in isolation (i.e., independently of the
rest of the program) with the read-eval-print loop. For instance, we can test
rev independently of reverse1. This fosters a convenient environment for
debugging, and facilitates a process known as interactive or incremental testing.
Compiled languages, such as C, in contrast, require test drivers in main (which
clutter the program) to achieve the same.
Exercise 5.6.2 Define a Scheme function remove that accepts only a list and an
integer i as arguments and returns another list that is the same as the input list,
but with the ith element of the input list removed. If the length of the input list is
less than i, return the same list. Assume that i = 1 refers to the first element of the
list.
Examples:
Exercise 5.6.3 Define a Scheme function called makeset that accepts only a list of
integers as input and returns the list with any repeating elements removed. The
order in which the elements appear in the returned list does not matter, as long as
there are no duplicate elements. Do not use any user-defined auxiliary functions,
except the built-in Scheme member function.
Examples:
> (makeset '(1 3 4 1 3 9))
'(4 1 3 9)
> (makeset '(1 3 4 9))
'(1 3 4 9)
> (makeset '("apple" "orange" "apple"))
'("orange" "apple")
Exercise 5.6.4 Define a Scheme function cycle that accepts only an integer i and
a list as arguments and cycles the list i times. Do not use any user-defined
auxiliary functions and do not use the difference lists technique (i.e., you may use
append).
Examples:
> (cycle 0 '(1 4 5 2))
'(1 4 5 2)
> (cycle 1 '(1 4 5 2))
'(4 5 2 1)
> (cycle 2 '(1 4 5 2))
'(5 2 1 4)
> (cycle 4 '(1 4 5 2))
'(1 4 5 2)
> (cycle 6 '(1 4 5 2))
'(5 2 1 4)
> (cycle 10 '(1))
'(1)
> (cycle 9 '(1 4))
'(4 1)
Exercise 5.6.7 Define a Scheme function oddevensum that accepts only a list of
integers as an argument and returns a pair consisting of the sum of the odd and
even positions of the list. Do not use any user-defined auxiliary functions.
Examples:
Exercise 5.6.8 Define a Scheme function intersect that returns the set
intersection of two sets represented as lists. Do not use any built-in Scheme
functions or syntactic forms other than cons, car, cdr, or, null?, and member.
Examples:
> (intersect '(a b) '(a b))
(a b)
> (intersect '(a b) '(c d))
()
> (intersect '(a b c) '(e d c))
(c)
> (intersect '(a b c) '(b d c))
(b c)
> (intersect '(a c b d e f) '(c e d))
(c d e)
> (intersect '(a b c d e f) '(a b c d e f))
(a b c d e f)
> (reverse* '())
()
> (reverse* '((((Nothing))) ((will) (()())
(come ()) (of nothing))))
'(((nothing of) (() come) (() ()) (will)) (((Nothing))))
> (reverse* '(((1 2 3) (4 5)) ((6)) (7 8) (9 10)
((11 12 (13 14 (15 16))))))
'(((((16 15) 14 13) 12 11)) (10 9) (8 7) ((6)) ((5 4) (3 2 1)))
The following sentences in the language defined by this grammar represent binary
trees:
111
32
(opus 111 32)
(sonata 1820 (opus 111 32))
(Beethoven (sonata 32 (opus 110 31)) (sonata 33 (opus 111 32)))
The following function accepts a binary tree as an argument and returns the
number of internal and leaf nodes in the tree:
1 (define bintree-size
2 (lambda (s)
3 (cond
4 ((number? s) 1)
5 (else (+ (bintree-size (car (cdr s)))
6 (bintree-size (car (cdr (cdr s))))
7 1))))) ; count self
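For instance, applying bintree-size to two of the sentences above:

```scheme
> (bintree-size '(opus 111 32))
3
> (bintree-size '(sonata 1820 (opus 111 32)))
5
```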
Table 5.1 Examples of Shortening car-cdr Call Chains with Syntactic Sugar
Moreover, with a similar pattern of recursion, and the help of these abbreviated
call chains, we can define a variety of binary tree traversals:
(define preorder
(lambda (bintree)
(cond
((number? bintree) (cons bintree '()))
(else
(cons (car bintree)
(append (preorder (cadr bintree))
(preorder (caddr bintree))))))))
Using the definitions of the following three functions, we can make the definitions
of the traversals more readable (see the definition of preorder on lines 13–19):
1 (define root
2 (lambda (bintree)
3 (car bintree)))
4
5 (define left
6 (lambda (bintree)
7 (cadr bintree)))
8
9 (define right
10 (lambda (bintree)
11 (caddr bintree)))
12
13 (define preorder
14 (lambda (bintree)
15 (cond
16 ((number? bintree) (cons bintree '()))
17 (else (cons (root bintree)
18 (append (preorder (left bintree))
19 (preorder (right bintree))))))))
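For instance, a preorder traversal of one of the trees given earlier:

```scheme
> (preorder '(sonata 1820 (opus 111 32)))
(sonata 1820 opus 111 32)
```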
This context-free grammar does not define the semantic property of a binary search
tree (i.e., that the nodes are arranged in an order rendering the tree amenable to an
efficient search), which is an example of context.
Exercise 5.7.2 (Friedman, Wand, and Haynes 2001, Exercise 1.17.1, p. 27) Consider
the following BNF specification of a binary search tree.
<binarysearchtree> ::= ()
<binarysearchtree> ::= (<integer> <binarysearchtree> <binarysearchtree>)
Define a Scheme function path that accepts only an integer n and a list bst
representing a binary search tree, in that order, and returns a list of lefts and
rights indicating how to locate the vertex containing n. You may assume that the
integer is always found in the tree.
Examples:
Exercise 5.7.3 Complete Programming Exercise 5.7.2, but this time do not assume
that the integer is always found in the tree. If the integer is not found, return the
atom ’notfound.
Examples:
Exercise 5.7.4 Complete Programming Exercise 5.7.3, but this time do not assume
that the binary tree is a binary search tree.
Examples:
(define atom?
(lambda (x)
(and (not (pair? x)) (not (null? x)))))
We can extend this idea by trying to recognize a list of atoms—in other words, by
trying to determine whether a list is composed only of atoms:
(define list-of-atoms?
(lambda (lst)
(or (null? lst)
(and (pair? lst)
(atom? (car lst))
(list-of-atoms? (cdr lst))))))
Notice also that the definition of this function is a reflection of the two production
rules given previously. The pattern used to recognize the list of atoms can be
manually reused to recognize a list of numbers:
(define list-of-numbers?
(lambda (lst)
(or (null? lst)
(and (pair? lst)
(number? (car lst))
(list-of-numbers? (cdr lst))))))
(define list-of
(lambda (predicate lst)
(or (null? lst)
(and (pair? lst)
(predicate (car lst))
(list-of predicate (cdr lst))))))
In this way, the list-of function abstracts the details of the predicate from the
pattern of recursion used in the original definition of list-of-numbers?:
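With the two-argument list-of above, list-of-numbers? can be redefined along these lines (a sketch):

```scheme
; The predicate is now a parameter; the pattern of recursion
; lives in list-of alone.
(define list-of-numbers?
  (lambda (lst)
    (list-of number? lst)))

(define list-of-symbols?
  (lambda (lst)
    (list-of symbol? lst)))
```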
Recall that the first-class nature of functions also supports the definition of a
function that returns a function as a value. Thus, we can refine the list-of
function further by also abstracting away the list to be parsed, which further
generalizes the pattern of recursion. Specifically, we can redefine the list-of
function to accept a predicate as its only argument and to return a predicate that
calls this input predicate on the elements of a list to determine whether all elements
are of the given type (Friedman, Wand, and Haynes 2001, p. 45):
(define list-of
(lambda (predicate)
(lambda (lst)
(or (null? lst)
(and (pair? lst)
(predicate (car lst))
((list-of predicate) (cdr lst)))))))
Examples:
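A session along these lines illustrates the curried list-of (a sketch):

```scheme
> ((list-of number?) '(1 2 3))
#t
> ((list-of number?) '(1 b 3))
#f
> (define list-of-symbols? (list-of symbol?))
> (list-of-symbols? '(a b c))
#t
```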
The semantics of a let expression are as follows. Bindings are created in the
list of lists immediately following let [e.g., ((a 1) (b 2))] and are only
bound during the evaluation of the second S-expression [e.g., (+ a b)]. Use of
let does not violate the spirit of functional programming for two reasons: (1)
let creates bindings, not assignments, and (2) let is syntactic sugar used to
improve the readability of a program; any let expression can be rewritten as
an equivalent lambda expression. To make the leap from a let expression to
a lambda expression, we must recognize that functional application is the only
mechanism through which to create a binding in λ-calculus; that is, the argument
to the function is bound to the formal parameter. Moreover, once an identifier is
bound to a value, it cannot be rebound to a different value within the same scope:
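For instance, a let expression and its equivalent functional application:

```scheme
(let ((a 1) (b 2))
  (+ a b))

; is equivalent to binding through functional application:

((lambda (a b)
   (+ a b)) 1 2)
```

Both expressions evaluate to 3.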
Scheme provides syntactic sugar for this style of nesting with a let* expression,
in which bindings are evaluated in sequence (Table 5.2):
Thus, just as let is syntactic sugar for lambda, let* is syntactic sugar for let.
Therefore, any let* expression can be reduced to a lambda expression as well:
Never use let* when there are no dependencies in the list of bindings [e.g.,
((a 1) (b 2) (c 3))].
Evaluation of this expression results in an error because length1 is not yet bound
on line 4—it is not bound until line 5. Notice the issue here is not one of parallel
vis-à-vis sequential bindings since there is only one binding (i.e., length1).
Rather, the issue is that a binding cannot refer to itself until it is bound. Scheme
has the letrec expression to make bindings visible while they are being created:
(define reverse1
(letrec ((rev
(lambda (lst rl)
(cond
((null? lst) rl)
(else (rev (cdr lst) (cons (car lst) rl)))))))
(lambda (l)
(cond
((null? l) '())
(else (rev l '()))))))
Just as let* is syntactic sugar for let, letrec is also syntactic sugar
for let (and, therefore, both are syntactic sugar for lambda through let). In
demonstrating how a letrec expression can be reduced to a lambda expression,
we witness the power of first-class functions and λ-calculus supporting the use
of mathematical techniques such as recursion, even in a language with no native
support for recursion. We start by reducing the preceding letrec expression for
length1 to a let expression. Functions only know about what is passed to them,
and what is in their local environment. Here, we need the length1 function to
know about itself—so it can call itself recursively. Thus, we pass length1 to
length1 itself!
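The resulting let expression might look like this (a sketch of the self-application idea just described; the list '(a b c) is an assumed test argument):

```scheme
; length1 receives itself as its first argument so that it
; can call itself recursively without built-in recursion.
(let ((length1
       (lambda (length1 l)
         (cond
           ((null? l) 0)
           (else (+ 1 (length1 length1 (cdr l))))))))
  (length1 length1 '(a b c)))
```

This expression evaluates to 3.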
Reducing this let expression to a lambda expression involves the same idea
and technique used in Section 5.9.1—bind a function to an identifier length1 by
passing a literal function to another function that accepts length1 as a parameter:
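That reduction might look like this (a sketch; again '(a b c) is an assumed test argument):

```scheme
; The let binding becomes a functional application: the literal
; function is passed to a function whose parameter is length1.
((lambda (length1)
   (length1 length1 '(a b c)))
 (lambda (length1 l)
   (cond
     ((null? l) 0)
     (else (+ 1 (length1 length1 (cdr l)))))))
```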
From here, we simply need to make one more transformation to the code so that it
conforms to λ-calculus, where only unary functions can be defined:
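Currying the two-parameter function yields a version built entirely from unary functions (a sketch):

```scheme
; (length1 length1) returns the unary list function, so each
; recursive step first self-applies, then applies to the list.
((lambda (length1)
   ((length1 length1) '(a b c)))
 (lambda (length1)
   (lambda (l)
     (cond
       ((null? l) 0)
       (else (+ 1 ((length1 length1) (cdr l))))))))
```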
We have just demonstrated how to define a recursive function from first prin-
ciples (i.e., assuming the programming language being used to define the function
does not support recursion). The pattern used to define the length1 function
recursively is integrated (i.e., tightly woven) into the length1 function itself. If
we want to implement additional functions recursively (e.g., reverse1), without
using the define syntactic form (i.e., the built-in support for recursion in Scheme),
we would have to embed the pattern of code used in the definition of the function
length1 into the definitions of any other functions we desire to define recursively.
Just as with the list-of-atoms? function, it is helpful to abstract the approach
to recursion presented previously from the actual function we desire to define
recursively. This is done with a λ-expression called the (normal-order) Y combinator,
which expresses the essence of recursion in a non-recursive way in the λ-calculus:
λf.(λx.f (x x)) (λx.f (x x))
The Y combinator expression in the λ-calculus was invented by Haskell Curry.
Some have hypothesized a connection between the Y combinator and the double
helix structure in human DNA, which consists of two copies of the same strand
adjacent to each other and is the key to the self-replication of DNA. Similarly,
the structure of the Y combinator λ-expression consists of two copies of the
same subexpression [i.e., (λx.f (x x))] adjacent to each other and is the key
to recursion—a kind of self-replication—in the λ-calculus or a programming
language. Programming Exercise 6.10.15 explores the Y combinator.
These transformations demonstrate that Scheme is an attractive language
through which to explore and implement concepts of programming languages.
We continue to use Scheme in this capacity in this text. For instance, we
explore binding, and implement lazy evaluation—an alternative parameter-passing
mechanism—and a variety of control abstractions, including coroutines, in Scheme
in Chapters 6, 12, and 13, respectively.
Since lambda is primitive, any let, let*, and letrec expression can be
reduced to a lambda expression (Figure 5.9). Thus, λ-calculus is sufficient to create
programming abstractions.
Again, the grammar rules for λ-calculus, given in Section 5.2.2, have no provi-
sion for defining a function accepting more than one argument. However, here, we
have defined multiple functions accepting more than one argument. Any function
accepting more than one argument can be rewritten as an expression in λ-calculus
by nesting λ-expressions. For instance, the function definition and invocation
> (lambda (a b)
(+ a b))
#<procedure>
> ((lambda (a b)
(+ a b)) 1 2)
3
Table 5.3 Reducing let to lambda (All rows of each column are semantically
equivalent.)
Table 5.4 Reducing let* to lambda (All rows of each column are semantically
equivalent.)
Tables 5.3, 5.4, and 5.5 summarize the reductions from let, let*, and letrec,
respectively, into λ-calculus. Table 5.6 provides a summary of all three syntactic
forms.
Table 5.5 Reducing letrec to lambda (All rows of each column are semantically equivalent.)
General Pattern and Instance of Pattern

let (parallel bindings):
(let ((sym1 val1) (sym2 val2) ··· (symn valn))
  ; sym1 ... symn are only visible here in body
  body)
Instance:
(let ((a 1) (b 2))
  ; a and b are only visible here
  (+ a b))

let* (sequential bindings):
(let* ((sym1 val1) (sym2 sym1) ··· (symn sym2))
  ; sym1 is visible here and beyond
  ; sym2 is visible here and beyond
  ; sym1 sym2 ··· symn are visible here in body
  body)
Instance:
(let* ((a 1) (b (+ a 1)))
  ; a is visible here and beyond
  ; a and b are visible here in body
  (+ a b))

letrec (recursive bindings):
(letrec ((f (lambda (sym1 sym2 ··· symn)
  ; f is visible here and in body
  ...)))
  body)
Instance:
(letrec ((length1 (lambda (l)
  ; length1 is visible here and in body
  (cond
  ...))))
  ...)
Exercise 5.9.2 Read Paul Graham’s essay “Beating the Averages” from the book
Hackers and Painters (2004a, Chapter 12), available at http://www.paulgraham.com/avg.html, and write a 250-word commentary on it.
Exercise 5.9.4 Using letrec, define mutually recursive odd? and even?
predicates to demonstrate that bindings are available for use within and before
the blocks for definitions in the letrec are evaluated.
Exercise 5.9.5 Define a Scheme function reverse1 that accepts only an S-list s
as an argument and reverses the elements of s in linear time (i.e., time directly
proportional to the size of s), O(n). You may use only define, lambda, let, cond,
null?, cons, car, and cdr in reverse1. Do not use append or letrec in your
definition. Define only one function.
Examples:
> (reverse1 '(1 2 3 4 5))
(5 4 3 2 1)
> (reverse1 '(1))
(1)
> (reverse1 '(2 1))
(1 2)
> (reverse1 '(Twelfth Night and day))
(day and Night Twelfth)
> (reverse1 '(1 (2 (3)) (4 5)))
((4 5) (2 (3)) 1)
(let ((a 1))
(let ((b (+ a 1)))
(+ a b)))
(let ((sum (lambda (s l)
(cond
((null? l) 0)
(else (+ (car l) (s s (cdr l))))))))
(sum sum '(1 2 3 4 5)))
Exercise 5.9.9 Rewrite the following Scheme member1? function without a let
expression (and without side effect) while maintaining the binding of head to
(car lat) and tail to (cdr lat). Only define one function. Do not use
let*, letrec, set!, or any imperative features, and do not compute any single
subexpression more than once.
(define member1?
(lambda (a lat)
(let ((head (car lat)) (tail (cdr lat)))
(cond
((null? lat) #f)
((eqv? a head) #t)
(else (member1? a tail))))))
Exercise 5.9.10 Complete Programming Exercise 5.9.9 without the use of define.
((lambda (a b) (+ a b)) 1 2)
(let* ((x 1) (y (+ x 1)))
((lambda (a b) (+ a b)) x y))
1 (define remove_first
2   (lambda (a lat)
3     (cond
4       ((null? lat) '())
5       ((eqv? a (car lat)) (cdr lat))
6       (else (cons (car lat) (remove_first a (cdr lat)))))))
Here the eqv? predicate returns true if its two arguments are equal and false
otherwise. The function remove_all extends remove_first by removing
all occurrences of an atom a from a list of atoms lat by simply returning
(remove_all a (cdr lat)) in line 5 rather than (cdr lat):
(define remove_all
(lambda (a lat)
(cond
((null? lat) '())
((eqv? a (car lat)) (remove_all a (cdr lat)))
(else (cons (car lat) (remove_all a (cdr lat)))))))
1 (define remove_all*
2 (lambda (a l)
3 (cond
4 ((null? l) '())
5 ((atom? (car l))
6 (cond
7 ((eqv? a (car l)) (remove_all* a (cdr l)))
8 (else (cons (car l) (remove_all* a (cdr l))))))
9 (else (cons (remove_all* a (car l))
10 (remove_all* a (cdr l)))))))
11. A Scheme convention followed in this text is to use a * as the last character of any function name
that recurses on an S-expression (e.g., remove_all*), whenever a corresponding function operating
on a list of atoms is also defined (Friedman and Felleisen 1996a, Chapter 5).
Notice that in developing these functions, the pattern of recursion strictly follows
Design Guideline 2.
1 (define remove_all*
2 (lambda (a l)
3 (cond
4 ((null? l) '())
5 (else (let ((head (car l)))
6 (cond
7 ((atom? head)
8 (cond
9 ((eqv? a head) (remove_all* a (cdr l)))
10 (else (cons head (remove_all* a (cdr l))))))
11 (else (cons (remove_all* a head)
12 (remove_all* a (cdr l))))))))))
Notice that binding the result of the evaluation of the expression (cdr l) to
the mnemonic tail, while improving readability, would not actually improve
performance. While the expression (cdr l) appears more than once in this
definition (lines 9, 10, and 12), only one branch of the cond is taken per
invocation, so (cdr l) is computed only once per function invocation.
1 (define remove_all*
2 (lambda (a l)
3 (letrec ((remove_all_helper*
4 (lambda (l)
5 (cond
6 ((null? l) '())
7 (else (let ((head (car l)))
8 (cond
9 ((atom? head)
10 (cond
11 ((eqv? a head)
12 (remove_all_helper* (cdr l)))
13 (else
14 (cons head
15 (remove_all_helper*
16 (cdr l))))))
17 (else
18 (cons
19 (remove_all_helper* head)
20 (remove_all_helper*
21 (cdr l)))))))))))
22 (remove_all_helper* l))))
This distinction is important. If the nested function f must access one or more of
the parameters of the outer function (i.e., Design Guideline 6), which is the case
with remove_all*, then the letrec must be nested within the lambda (the style
illustrated in lines 1–11). Conversely, if one or more of the parameters to the outer
function should be hidden from the nested function, which is the case with
reverse1, then the lambda must be nested within the letrec (the style used on
lines 13–28). If we apply these guidelines to improve the last definition of
list-of, we determine
apply these guidelines to improve the last definition of list-of, we determine
that while the nested function list-of-helper does need to know about the
predicate argument to the outer function, predicate does not change—so it
need not be passed through each successive recursive call. Therefore, we should
nest the letrec within the lambda:
(define list-of
(lambda (predicate)
(letrec ((list-of-helper
(lambda (lst)
(or (null? lst)
(and (pair? lst)
(predicate (car lst))
(list-of-helper (cdr lst)))))))
list-of-helper)))
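The list-of function above returns a closure over predicate, so it can be used to manufacture specialized predicates. For instance (an illustrative use, not from the text):

```scheme
;; (list-of number?) returns a predicate that tests whether every
;; element of a list satisfies number?.
(define list-of-numbers? (list-of number?))

(list-of-numbers? '(1 2 3))   ; => #t
(list-of-numbers? '(1 two 3)) ; => #f
((list-of symbol?) '(a b c))  ; => #t
```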
While the choice of which of the two styles is most appropriate for a program
depends on the context of the problem, in some cases in functional programming
it is a matter of preference. Consider the following two letrec expressions, both
of which yield the same result:
1 > (letrec ((length1 (lambda (l)
2 >           (cond
3 >             ((null? l) 0)
4 >             (else (+ 1 (length1 (cdr l))))))))
5 >     (length1 '(1 2 3 4 5)))
6 5
7
8 > ((letrec ((length1 (lambda (l)
9 >            (cond
10 >             ((null? l) 0)
11 >             (else (+ 1 (length1 (cdr l))))))))
12 >    length1) '(1 2 3 4 5))
13 5
While these two expressions are functionally equivalent (i.e., they have the same
denotational semantics), they differ in operational semantics. The first expression
(lines 1–5) calls the local function length1 in the body of the letrec (line 5). The
second expression (lines 8–12) first returns the local function length1 in the body
of the letrec (line 12) and then calls it—notice the double parentheses to the left
of letrec on line 8. The former expression uses binding to invoke the function
length1, while the latter uses binding to return the function length1.
Exercise 5.10.2 Redefine the member1? function from Programming Exercise 5.6.1
so that it follows Design Guidelines 5 and 6.
Exercise 5.10.3 Define a Scheme function member*? that accepts only an atom
and an S-list (i.e., a list possibly nested to an arbitrary depth), in that order, and
returns #t if the atom is an element found anywhere in the S-list and #f otherwise.
Examples:
Exercise 5.10.5 Redefine the makeset function from Programming Exercise 5.6.3
so that it follows Design Guideline 4.
Exercise 5.10.6 Redefine the cycle function from Programming Exercise 5.6.5 so
that it follows Design Guideline 5.
Exercise 5.10.9 Define a Scheme function count-atoms that accepts only an S-list
as an argument and returns the number of atoms that occur in that S-list at all
levels. You may use the atom? function given in Section 5.8.1. Follow Design
Guideline 4.
Examples:
Exercise 5.10.10 Define a Scheme function flatten1 that accepts only an S-list as
an argument and returns it flattened as a list of atoms.
Examples:
Exercise 5.10.12 Define a function samefringe that accepts an integer n and two
S-expressions, and returns #t if the first n non-null atoms in each S-expression are
equal and in the same order and #f otherwise.
Examples:
Exercise 5.10.14 Define a Scheme function permutations that accepts only a list
representing a set as an argument and returns a list of all permutations of that list
as a list of lists. You will need to define some nested auxiliary functions. Pass a
λ-function to map where applicable in the bodies of the functions to simplify their
definitions. Follow Design Guideline 5. Hint: This solution requires approximately
20 lines of code.
Examples:
Exercise 5.10.15 Define a function sort1 that accepts only a list of numbers as an
argument and returns the list of numbers sorted in increasing order. Follow Design
Guidelines 4, 5, and 6 completely.
Examples:
(1 2 3)
> (sort1 '(9 8 7 6 5 4 3 2 1))
(1 2 3 4 5 6 7 8 9)
> (sort1 '(1 4 6 3 2))
(1 2 3 4 6)
Exercise 5.10.17 Define a function sort1 that accepts only a numeric comparison
predicate and a list of numbers as arguments, in that order, and returns the list of
numbers sorted by the predicate. Follow Design Guidelines 4, 5, and 6 completely.
Examples:
Exercise 5.10.19 Rewrite the final version of the remove_all* function presented
in this section without the use of any letrec or let expressions, without
the use of define, and without the use of any function accepting more than
one argument, while maintaining the bindings to the identifiers remove_all*,
remove_all_helper*, and head. In other words, redefine the final version of
the remove_all* function in λ-calculus.
A language should help the programmer make programs shorter and more
abstract: “The easiest program to change is one that’s very short” (Graham 2004b,
p. 27). [While Lisp is a programming language, it pioneered the idea of language
support for abstractions (Sinclair and Moon 1991).] A language should support
not only the primary activities of the programmer (Graham 1996, p. 27), but should
also support those higher-order activities.
In programming, an original design or prototype is typically sketched and
used primarily for generating thoughts and discovering the parameters of the
design space. For this reason, it is sometimes called a throwaway prototype.
However, “[a] prototype doesn’t have to be just a model; you can refine it into
the finished product. . . . It lets you take advantage of new insights you have
along the way” (Graham 2004b, p. 221). Program design can then be informed
by an invaluable source of practical insight: “the experience of implementing
it” (Graham 1996, p. 5). Like the use of oil in painting, we would like to discover
a medium (in this case, a language and its associated tools) that reduces the cost
of mistakes, not only tolerates, but even encourages second (and third and so on)
thoughts, and, thus, favors exploration rather than planning.
Thus, a programming language and the tools available for use with it should
dampen, rather than amplify, the effects of the constraints of the environment in
which a programmer must work (e.g., changing specifications, incremental testing,
routine maintenance, and major redesigns), and should also foster design
exploration, creativity, and discovery without the (typical) associated fear of risk.
The tenets of functional programming combined with a language supporting
abstractions and dynamic bindings support these aspects of software development
and empower programmers to embark on more ambitious projects (Graham 1996,
p. 6). The organic, improvised style of functional programming demonstrated in
this chapter is a natural fit. We did little to no design of the programs we developed
here. As we journey deeper into functional programming, we encounter more
general and, thus, powerful patterns, techniques, and abstractions.
Macros
5.13 Concurrency
As we conclude this chapter, we leave readers with a thought to ponder. We know
from the study of operating systems that when two or more concurrent threads
share a resource, we must synchronize their activities to ensure that the integrity
of the resource is maintained and the system is never left in an inconsistent state—
we must synchronize to avoid data races. Therefore, in the absence of side effects
and, thus, any shared state and/or mutable data, functional programs are natural
candidates for parallelization:
You can’t change the state of anything, and no function can have side
effects, which is the reason why [functional programming] is ideal for
distributing algorithms over multiple cores. You never have to worry
about some other thread modifying a memory location where you’ve
stored some value. You don’t have to bother with locks and deadlocks
and race conditions and all that mess. (Swaine 2009, p. 14)
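This idea can be sketched in Racket (an illustrative example, not from the text; future and touch are Racket-specific forms, and the workloads here are arbitrary):

```scheme
#lang racket
(require racket/future)

;; sum-of-squares is pure: it has no side effects and shares no mutable
;; state, so it can safely be evaluated on another core.
(define (sum-of-squares lst)
  (for/sum ([x lst]) (* x x)))

;; Start one computation in parallel while another runs here;
;; no locks are needed because nothing is mutated.
(define f (future (lambda () (sum-of-squares (range 0 1000)))))
(define a (sum-of-squares (range 1000 2000)))
(+ a (touch f))
```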
You may define one or more helper functions. Keep your program to
approximately 120 lines of code. Use of the pattern-matching facility in Racket will
significantly reduce the size of the evaluator to approximately 30 lines of code.
See https://docs.racket-lang.org/guide/match.html for the details of pattern
matching in Racket. (Try building a graphical user interface for this expression
evaluator in Racket; see https://docs.racket-lang.org/gui/.)
1. General Pattern of Recursion. Solve the problem for the smallest instance of the problem
(called the base case; e.g., n = 0 for n!, which is 0! = 1). Assume the penultimate [i.e.,
(n − 1)th, e.g., (n − 1)!] instance of the problem is solved and demonstrate how you
can extend that solution to the nth instance of the problem [e.g., multiply it by n; i.e.,
n * (n − 1)!].
2. Specific Patterns of Recursion. When recurring on a list of atoms, lat, the base case
is an empty list [i.e., (null? lat)] and the recursive step is handled in the else
clause. Similarly, when recurring on a number, n, the base case is, typically, n = 0 [i.e.,
(zero? n)] and the recursive step is handled in the else clause.
When recurring on a list of S-expressions, l, the base case is an empty list [i.e.,
(null? l)] and the recursive step involves two cases: (1) where the car of the list is
an atom [i.e., (atom? (car l))] and (2) where the car of the list is itself a list (handled
in the else clause, or vice versa).
3. Efficient List Construction. Use cons to build lists.
4. Name Recomputed Subexpressions. Use (let (...) ...) to name the values of
repeated expressions in a function definition if they may be evaluated more than once
for one and the same use of the function. Moreover, use (let (...) ...) to name the
values of the expressions in the body of the let that are reevaluated every time a function
is used.
5. Nest Local Functions. Use (letrec (...) ...) to hide and protect recursive functions
and (let (...) ...) or (let* (...) ...) to hide and protect non-recursive functions.
Nest a lambda expression within a letrec (or let or let*) expression:
(define f
  (letrec ((g (lambda (...) ...))) ; or let or let*
    (lambda (...) ...)))
6. Factor out Constant Parameters. Use letrec to factor out parameters whose arguments
are constant (i.e., never change) across successive recursive applications. Nest a letrec
(or let or let*) expression within a lambda expression:
(define member1
(lambda (a lat)
(letrec ((M (lambda (lat) ...)))
(M lat))))
7. Difference Lists Technique. Use an additional argument representing the return value of
the function that is built up across the successive recursive applications of the function
when that information would otherwise be lost across successive recursive calls.
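Design Guideline 7 can be sketched as follows (an illustrative function, not from the text): the additional argument rl accumulates the reversed list as recursion proceeds, so the elements already seen are not lost across successive recursive calls.

```scheme
;; rl accumulates the result built so far across the recursive calls.
(define reverse-acc
  (lambda (l rl)
    (cond
      ((null? l) rl)
      (else (reverse-acc (cdr l) (cons (car l) rl))))))

(reverse-acc '(1 2 3) '()) ; => (3 2 1)
```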
8. Correctness First, Simplification Second. Simplify a function or program, by nesting
functions, naming recomputed values, and factoring out constant arguments, only after
the function or program is thoroughly tested and correct.
concepts though they may not have been aware of it. Binding is the topic of
Chapter 6.
We also demonstrated how, within a small language (we focused on the
λ-calculus as the substrate of Scheme), lies the core of computation through which
powerful programming abstractions can be created and leveraged. We introduced
the compelling implications of the properties of functional programming (and
Lisp) for software development, such as prototypes evolving into deployable
software, speed of program development vis-à-vis speed of program execution,
bottom-up programming, and concurrency. While Lisp has a simple and uniform
syntax, it is a powerful language that can be used to create advanced data
structures and sophisticated abstractions in a few lines of code. Ultimately, we
demonstrated that functional programming unites beauty with utility.
Scheme was the first Lisp dialect to use lexical scoping, which is discussed
in Chapter 6. The language also required its implementations to perform tail-call
optimization, which is discussed in Chapter 13. Scheme was also the first
language to support first-class continuations, which are an important ingredient
for the creation of user-defined control structures and are also discussed in
Chapter 13.
Chapter 6
1. In this text we refer to subprograms and subroutines as procedures and to procedures that return a
value as functions.
6.2 Preliminaries
6.2.1 What Is a Closure?
An understanding of lexical closures is fundamental not only to this chapter, but
more broadly to the study of programming languages. A closure is a function
that remembers the lexical environment in which it was created. A closure can be
thought of as a pair of pointers: one to a block of code (defining the function)
and one to an environment (in which function was created). The bindings in the
environment are used to evaluate the expressions in the code. Thus, a closure
encapsulates data and operations and thus, bears a resemblance to an object as used
in object-oriented programming. Closures are powerful constructs in functional
programming (as we see throughout this text), and an essential element in the
study of binding and scope.
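For instance (an illustrative example, not from the text), the function returned by make-adder below remembers the binding of n from the environment in which it was created:

```scheme
;; make-adder returns a closure: the code (lambda (x) (+ x n))
;; paired with an environment in which n is bound.
(define make-adder
  (lambda (n)
    (lambda (x) (+ x n))))

(define add5 (make-adder 5))
(add5 10) ; => 15
(add5 2)  ; => 7
```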
6.3 Introduction
Implicit in the study of let, let*, and letrec expressions is the concept of
scope. Scope is a concept that programmers encounter in every language. Since
scope is often so tightly woven into the semantics of a language, we unconsciously
understand it and rarely ever give it a second thought. In this chapter, we examine
the details more closely.
In a program, variables appear as either references or declarations—even in
dynamically typed languages like Lisp that use latent typing. The value named by a variable
is called its denotation. Consider the following Scheme expression:
The denotations of x, a, and b are 5, 1, and 2, respectively. The x on line 1 and the
a and b on line 3 are declarations, while the a, b, and x on line 4 are references. A
reference to a variable (e.g., the a on line 4) is bound to a declaration of a variable
(e.g., the a on line 3).
Declarations have limited scope. The scope of a variable declaration in a program
is the region of that program (i.e., a range of lines of code) within which references
to that variable refer to the declaration (Friedman, Wand, and Haynes 2001). For
instance, the scope of the declaration of a in the preceding example is line 4—the
same as for b. The scope of the declaration of x is lines 2–4. Thus, the same
identifier can be used in different parts of a program for different purposes. For
instance, the identifier i is often used as the loop control variable in a variety of
different loops in a program, and multiple functions can have a parameter x. In
each case, the scope of the declaration is limited to the body of the loop or function,
respectively.
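For instance (an illustrative fragment, not from the text), the parameter x in each of the following functions is a distinct declaration whose scope is limited to the body of its own lambda:

```scheme
(define square (lambda (x) (* x x)))
(define double (lambda (x) (+ x x)))

(square 4) ; => 16 ; this x denotes 4 only within the body of square
(double 3) ; => 6  ; an entirely separate declaration of x
```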
The scope rules of a programming language indicate to which declaration
a reference is bound. Languages where that binding can be determined by
examining the text of the program before run-time use static scoping. Languages
where the determination of that binding requires information available at run-
time use dynamic scoping. In the earlier example, we determined the declarations
to which references are bound as well as the scope of declarations based on
our knowledge of the Scheme programming language—in other words, without
consulting any formal rules.
This entire expression (lines 1–6) is a block, which contains a nested block (lines
2–6), which itself contains another block (lines 3–6), and so on. Lines 5–6 are the
innermost block and lines 1–6 constitute the outermost block; lines 3–6 make up an
intervening block. The spatial nesting of the blocks of a program is depicted in a
lexical graph:
Figure 6.1 depicts the run-time stack at the time the expression (+ a b x) is
evaluated.
Top of stack

(+ a b x)
lambda (c a)
lambda (a b)
lambda (x)

Figure 6.1 Run-time call stack at the time the expression (+ a b x) is evaluated.
The arrows indicate to which declarations the references to a, b, and x are bound.

Design Guideline 6: Factor out Constant Parameters in Table 5.7 indicates that we
should nest a letrec within a lambda only when the body of the letrec must
know about arguments to the outer function. For instance, as recursion progresses
in the reverse1 function, the list to be reversed changes (i.e., it gets smaller). In
turn, in Section 5.9.3 we defined the reverse1 function (i.e., the lambda) in the
body block of the letrec expression. For purposes of illustrating a scope hole, we
will do the opposite here; that is, we will nest the letrec within the lambda. (We
are not implying that this is an improvement over the other definition.)
1 (define reverse1
2 (lambda (l)
3 (letrec ((rev
4 (lambda (lst rl)
5 (cond
6 ((null? lst) rl)
7 (else (rev (cdr lst)
8 (cons (car lst) rl)))))))
9 (cond
10 ((null? l) '())
11 (else (rev l '()))))))
(define reverse1
(lambda (l)
(letrec ((rev
(lambda (l rl)
(cond
((null? l) rl)
(else (rev (cdr l) (cons (car l) rl)))))))
(cond
((null? l) '())
(else (rev l '()))))))
The set of declarations associated with the innermost block in which a reference
is contained differs from the referencing environment, which is typically much
larger because it contains bindings for nonlocal references, at the program point
where that reference is made. For instance, the referencing environment at line 6
in the expression given at the beginning of this section is {(a, 4), (b, 2),
(c, 3), (x, 5)} while the declarations associated with the innermost block
containing line 6 is ((c 3) (a 4)).
There are two perspectives from which we can study scope (i.e., the
determination of the declaration to which a reference is bound): the programmer
and the interpreter. The programmer, or a human, follows the innermost-
to-outermost search process described previously. (Programmers typically do
not think through the referencing environment.) Internally, that process is
operationalized by the interpreter as a search of the environment. In turn, (static
or dynamic) scoping (and the scope rules of a language) involves how and when
the referencing environment is searched in the interpreter.
In a statically scoped language, that determination can be made before
run-time (often by a human). In contrast, in a statically scoped, interpreted
language, the interpreter makes that determination at run-time because that is
the only time during which the interpreter is in operation. Thus, an interpreter
progressively constructs a referencing environment for a computer program
during execution.
While the specific structure of an environment is an implementation issue
extraneous to the discussion at hand (though covered in Chapter 9), some
cursory remarks are necessary. For now, we simply recognize that we want to
represent and structure the environment in a manner that renders searching it
efficient with respect to the scope rules of a language. Therefore, if the human
process involves an innermost-to-outermost search, we would like to structure
the environment so that bindings of the declarations of the innermost block
are encountered before those in any ancestor block. One way to represent and
structure an environment in this way is as a list of lists, where each list contains
a list of name–value pairs representing bindings, and where the lists containing
the bindings are ordered such that the bindings from the innermost block
appear in the car position (the head) of the list and the declarations from the
ancestor blocks constitute the cdr (the tail) of the list organized in innermost-
to-outermost order. Using this structure, the referencing environment at line 6
is represented as (((c 3) (a 4)) ((a 1) (b 2)) ((x 5))). These are the
scoping semantics with which most of us are familiar. Representation options for
the structure of an environment (e.g., flat list, nested list, tree) as well as how an
environment is progressively constructed are the topic of Section 9.8.
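A minimal sketch of such a lookup (illustrative only; the name apply-env and the error message are assumptions, not the implementation covered in Section 9.8):

```scheme
;; env is a list of frames, innermost first; each frame is a list of
;; (name value) pairs.  Frames are searched from the car (innermost)
;; outward, so shadowing falls out of the search order.
(define apply-env
  (lambda (env name)
    (cond
      ((null? env) (error "unbound identifier" name))
      (else (let ((binding (assv name (car env))))
              (if binding
                  (cadr binding)
                  (apply-env (cdr env) name)))))))

(define env '(((c 3) (a 4)) ((a 1) (b 2)) ((x 5))))
(apply-env env 'a) ; => 4 (the innermost declaration of a shadows the outer one)
(apply-env env 'x) ; => 5
```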
Exercise 6.4.1
Exercise 6.4.2
(lambda (f)
((lambda (x)
(f (lambda (y) ((x x) y))))
(lambda (x)
(f (lambda (y) ((x x) y))))))
1 x = 10
2 def f():
3 x = 11
4 f()
1 def f():
2 x = 10
3 def g():
4 x = 11
5 g()
6 return x
7 print(f())
Investigate the semantics of the keywords global and nonlocal in Python. How
do they address the problem of discerning whether a line of code is a declaration
or a reference? What are the semantics of global x? What are the semantics of
nonlocal x?
depth:       0               1               2
position:    0  1            0  1            0
environment: ( ((c 3) (a 4)) ((a 1) (b 2))  ((x 5)) )
Given only a lexical address (i.e., lexical depth and declaration position), we can
(efficiently) look up the binding associated with the identifier in a reference—a step that
is necessary to evaluate the expression containing that reference. Lexically scoped
identifiers are useful for writing and understanding programs, but are superfluous
and unnecessary for evaluating expressions and executing programs. Therefore,
we can purge the identifiers from each lexical address:
With identifiers omitted from the lexical address, the formal parameter lists
following each lambda are unnecessary and, therefore, can be replaced with their
length:
Thus, lexical addressing renders variable names and formal parameter lists
unnecessary. These progressive layers of translation constitute a mechanical
process, which can be automated by a computer program called a compiler. A
symbol table is an instance of an environment often used to associate variable names
with lexical address information.
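For instance (a worked example in the notation just described; the (: depth position) form is one common convention for writing lexical addresses):

```scheme
;; Original expression:
(lambda (x)
  (lambda (y)
    (x y)))

;; With each reference replaced by its lexical address and each
;; parameter list replaced by its length: x is declared one lambda
;; out (depth 1, position 0) and y in the nearest enclosing lambda
;; (depth 0, position 0).
(lambda 1
  (lambda 1
    ((: 1 0) (: 0 0))))
```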
1 (lambda (x y)
2 ((lambda (z)
3 (x (y z)))
4 y))
This expression has two lexical depths: 0 and 1. Indicate at which lexical depth
each of the four references in this expression resides. Refer to the references by line
number.
Exercise 6.5.2 Purge each identifier from the following Scheme expression and
replace it with its lexical address. Replace each parameter list with its length.
Replace any free variable with (: free).
((lambda (x y)
((lambda (proc2)
((lambda (proc1)
(cond
((zero? (read)) (proc1 5 20))
(else (proc2))))
(lambda (x y) (cons x (proc2)))))
(lambda () (cons x (cons y (cons (+ x y) '()))))))
10 11)
For instance, in the expression ((lambda (x) x) y), the x in the body of
the lambda expression occurs bound to the declaration of x in the formal
parameter list, while the argument y occurs free because it is unbound by any
declaration in this expression. A variable bound in the nearest enclosing λ-
expression corresponds to a slot in the current activation record.
A variable may occur free in one context but bound in another enclosing
context. For instance, in the expression
1 (lambda (y)
2 ((lambda (x) x) y))
the reference to y on line 2 occurs bound by the declaration of the formal parameter
y on line 1.
The value of an expression e depends only on the values to which the
free variables within the expression e are bound in an expression enclosing
e. For instance, the value of the body (line 2) of the lambda expression
in the preceding example depends only on the denotation of its single free
variable y on line 1; therefore, the value of y comes from the argument to the
function. The value of an expression e does not depend on the values bound
to variables within the expression e. For instance, the value of the expression
((lambda (x) x) y) is independent of the denotation of x at the time when
the entire expression is evaluated. By the time the free occurrence of x in the
body of (lambda (x) x) is evaluated, it is bound to the value associated
with y.
The semantics of an expression without any free variables is fixed. Consider
the identity function: (lambda (x) x). It has no free variables and its meaning is
always fixed as “return the value that is passed to it.” As another example, consider
the following expression:
(lambda (x)
(lambda (f)
(f x)))
Table 6.4 Definitions of Free and Bound Variables in λ-Calculus (Friedman, Wand,
and Haynes 2001, Definition 1.3.3, p. 31)
The semantics of this expression, which also has no free variables, is always
“a function that accepts a value x and returns ‘a function that accepts a
function f and returns the result of applying the function f to the value
x.”’ Expressions in λ-calculus not containing any free variables are referred
to as combinators; they include the identity function (lambda (x) x) and
the application combinator (lambda (f) (lambda (x) (f x))), which are
helpful programming elements. We saw combinators in Chapter 5 and encounter
combinators further in subsequent chapters.
The definitions of free and bound variables given here are general and
formulated for any programming language. The definitions shown in Table 6.4
apply specifically to the language of λ-calculus expressions. Notice that the
cases of each definition correspond to the three types of λ-calculus expressions,
except there is no symbol case in the definition of a bound variable—a variable
cannot occur bound in a λ-calculus expression consisting of just a single
symbol.
Using these definitions, we can define recursive Scheme functions
occurs-free? and occurs-bound? that each accept a variable var and
a λ-calculus expression expr and return #t if var occurs free or bound,
respectively, in expr and #f otherwise. These functions, which process
expressions, are shown in Listing 6.1. The three cases of the cond expression
in the definition of each function correspond to the three types of λ-calculus
expressions.
The occurrences of the functions caadr and caddr make these occurs-free?
and occurs-bound? functions unreadable because it is not salient that the
(define occurs-bound?
(lambda (var expr)
(cond
((symbol? expr) #f)
((eqv? (car expr) 'lambda)
(or (occurs-bound? var (caddr expr))
(and (eqv? (caadr expr) var)
(occurs-free? var (caddr expr)))))
(else (or (occurs-bound? var (car expr))
(occurs-bound? var (cadr expr)))))))
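The occurs-bound? function above calls occurs-free?, which is not reproduced here. A sketch consistent with the definitions in Table 6.4 and the structure of occurs-bound? (following Friedman, Wand, and Haynes 2001):

```scheme
(define occurs-free?
  (lambda (var expr)
    (cond
      ;; A lone symbol occurs free iff it is var itself.
      ((symbol? expr) (eqv? var expr))
      ;; In (lambda (x) body), var occurs free iff var is not x
      ;; and var occurs free in body.
      ((eqv? (car expr) 'lambda)
       (and (not (eqv? (caadr expr) var))
            (occurs-free? var (caddr expr))))
      ;; In an application, var occurs free iff it occurs free in
      ;; the operator or in the operand.
      (else (or (occurs-free? var (car expr))
                (occurs-free? var (cadr expr)))))))

(occurs-free? 'y '((lambda (x) x) y)) ; => #t
(occurs-free? 'x '(lambda (x) x))     ; => #f
```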
Exercise 6.6.2 (Friedman, Wand, and Haynes 2001, Exercise 1.19, p. 31) Define
a function bound-symbols in Scheme that accepts only a list representing a
λ-calculus expression and returns a list representing a set (not a bag) of all the
symbols that occur bound in the expression.
Examples:
1 ((lambda (x y)
2 (let ((proc2 (lambda () (cons x (cons y (cons (+ x y) '()))))))
3 (let ((proc1 (lambda (x y) (cons x (proc2)))))
4 (cond
5 ((zero? (read)) (proc1 5 20))
6 (else (proc2))))))
7 10 11)
lambda
proc1 proc2
Figure 6.2 Static call graph of the program used to illustrate dynamic scoping in
Section 6.7.
Figure 6.3 The two run-time call stacks possible from the program used to illustrate
dynamic scoping in Section 6.7. The stack on the left corresponds to call chain
lambda(x y) → proc1(x y) → proc2. The stack on the right corresponds to call
chain lambda(x y) → proc2.
would appear on the run-time call stack. From the static call graph in Figure 6.2
we can derive three possible run-time call chains:
lambda(x y) → proc1(x y)
lambda(x y) → proc1(x y) → proc2
lambda(x y) → proc2
Since proc2 is the function containing the nonlocal references, we only need
to consider the two call chains ending in proc2. Figure 6.3 depicts the two
possible run-time stacks at the time the cons expression on line 2 is evaluated
(corresponding to these two call chains). The left side of Figure 6.3 shows the stack
that results when a 0 is given as run-time input, while the right side shows the
stack resulting from a non-zero run-time input.
Since there is no declaration of x or y in the definition of proc2, we must
search back through the call chain. When a 0 is input, a backward search of the call
chain reveals that the first declarations of x and y appear in proc1 (see the left
side of Figure 6.3), so the output of the program is (5 5 20 25). When a non-zero
integer is input, the same search reveals that the first declarations of x and y
appear in the lambda expression (see the right side of Figure 6.3), so the output of
the program is (10 11 21).
Shadowed declarations and, thus, scope holes can exist in dynamically scoped
programs, too. However, with dynamic scoping, the hole is created not by
an intervening declaration (in a block nested within the block containing the
shadowed declaration), but rather by an intervening activation record (sometimes
called a stack frame or environment frame) on the stack. For instance, when the run-
time input to the example program is 0, the declarations of x and y in proc1 on
line 3 shadow the declarations of x and y in the lambda expression on line 1,
creating a scope hole for those declarations in the body of proc1 as well as any of
the functions it or its descendants call.
The lexical graph of a program illustrates how the units or blocks of the program
are spatially nested, while a static call graph indicates which procedures have
access to each other. Both can be determined before run-time. The lexical graph
is typically a tree, whereas the static call graph is often a non-tree graph. The
call chain of a program depicts the series of functions called by the program as
they would appear on the run-time call stack and is always linear—that is, a tree
structure where every vertex has exactly one parent and child except for the first
vertex, which has no parent, and the last vertex, which has no child. While all
possible call chains can be extracted from the static call graph, every process (i.e.,
program in execution) has only one call chain, and it cannot always be determined
before run-time, especially if the execution of the program depends on run-time
input.
Do not assume dynamic scoping when the only run-time call chain of a program
matches the lexical structure of the nested blocks of that program. For instance, the
run-time call chain of the program in Section 6.4.1 mirrors its lexical structure
exactly, yet that program uses lexical scoping. When the call chain of a program
matches its lexical structure, the declarations to which its references are bound
are the same when using either lexical or dynamic scoping. Note that the lexical
structure of the nested blocks of the lambda expression in the example program
containing the call to read (i.e., (lambda (x y)) → proc2 → proc1) does not match
any of its three possible run-time call chains; thus, the resolutions of the nonlocal
references (and output of the program) are different using lexical and dynamic
scoping.
Similarly, do not assume static scoping when you can determine the call chain
and, therefore, resolve the nonlocal references before run-time. Consider the following
Scheme expression:
1 ((lambda (x y)
2    (let ((proc2 (lambda () (cons x (cons y (cons (+ x y) '()))))))
3      (let ((proc1 (lambda (x y) (cons x (proc2)))))
4        (proc1 5 20))))
5  10 11)
(let ((a 1))
  (let ((a (+ a 2)))
    a))
Exercise 6.8.2 Can the Scheme expression from Conceptual Exercise 6.8.1 be
rewritten with only let*? Explain.
1 #include <iostream>
2 using namespace std;
3
4 int main() {
5   int a = 10; {
6     int a = a + 2;
7     cout << a << endl;
8   }
9 }
Does the reference to a on the right-hand side of the assignment operator on line
7 of the first program bind to the declaration of the global variable a on line 4?
Similarly, does the reference to a on line 6 of the second program bind to the local
variable a declared on line 5? Run these programs. What can you infer about how
C++ addresses scope based on the outputs?
Exercise 6.8.4 Consider the Java expression int x = x + 1;. Determine where
the scope of x begins. In other words, is the x on the right-hand side of the
assignment from another scope or does it refer to the x being declared on the left-
hand side? Alternatively, is this expression even valid in Java? Explain.
1  int x;
2
3  void p(void) {
4    char x;
5    x = 'a';  /* assigns to char x */
6  }
7
8  int main() {
9    x = 2;    /* assigns to global x */
10 }
Using static scoping, the declaration of x in p (line 4) takes precedence over the
global declaration of x (line 1) in the body of p. Thus, the global integer x cannot
be accessed from within the procedure p. The global declaration of x has a scope
hole inside of p.
In C++, can you access the x declared in line 1 from the body of p? If so, how?
(define i 0)
(define f
  (lambda (lat)
    (cond
      ((null? lat) (cons i '()))
      (else (let ((i (+ i 1)))
              (let ((i (+ i 1)))
                (cons i (f (cdr lat)))))))))
(define i 0)
(define g
  (let ((i (+ i 1)))
    (let ((i (+ i 1)))
      (lambda (lat)
        (cond
          ((null? lat) (cons i '()))
          (else (cons i (g (cdr lat)))))))))
1  program main;
2  var x: integer;
3
4  procedure p1;
5
6    var x: real;
7
8    procedure p2; begin
9      ...
10   end;
11
12   begin
13     ...
14   end;
15
16 procedure p3; begin
17   write(x);
18 end;
19
20 begin
21   ...
22 end.
(a) If this language uses static scoping, what is the type of the variable x printed on
line 17 in procedure p3?
(b) If this language uses dynamic scoping, what is the type of the variable x printed
on line 17 in procedure p3?
1 > (define f
2 (lambda (f)
3 (map
4 (lambda (f)
5 (* f 2))
6 f)))
7 > (f '(2 4 6))
8 (4 8 12)
(a) Annotate lines 5–7 of this program with comments indicating to which
declaration of f on lines 1, 2, and 4 the references to f on lines 5–7 are bound.
(b) Annotate lines 1, 2, and 4 of this program with comments indicating, with line
numbers, the scope of the declarations of f on lines 1, 2, and 4.
Exercise 6.8.10 Evaluate the Scheme expression in the last paragraph of Section 6.7
using lexical scoping.
(define x 10)
(define y 11)
(define proc2
(lambda ()
(cons x (cons y '()))))
(define proc1
(lambda (x y)
(proc2)))
(cond
((zero? (read)) (proc1 5 20))
(else (proc2)))
(define x 10)
(define y 11)
(define proc2
(lambda ()
(cons x (cons y '()))))
(define proc1
(lambda (x y)
(proc2)))
6.9. MIXING LEXICALLY AND DYNAMICALLY SCOPED VARIABLES 207
(define main
(lambda ()
(cond
((zero? (read)) (proc1 5 20))
(else (proc2)))))
(main)
Exercise 6.8.13 Can a programming language that uses dynamic scoping also use
static type checking? Explain.
Exercise 6.8.14 Can a programming language that uses dynamic type checking also
use static scoping? Explain.
Will this expression return 1 when evaluated under dynamic scoping, even in the
absence of a letrec expression? Explain.
Exercise 6.8.16 Write a Scheme program that outputs different results when run
using lexical scoping and dynamic scoping.
of the variable l specifies that l is a lexically scoped variable. This means that any
reference to l in proc1 or any blocks nested therein follows the lexical scoping
rule given previously, unless there is an intervening declaration of l. The local
qualifier on the declaration of the variable d specifies that d is a dynamically
scoped variable. This means that any reference to d in proc1 or any procedure
called from proc1 or called from that procedure, and so on, is bound to this
declaration of d unless there is an intervening declaration of d. Thus, the first two
lines of program output are
Figure 6.4 Depiction of run-time stack at call to print on line 37 of Listing 6.2.
can evaluate the print statement, we must determine to which declarations the
references to l and d are bound.
Examining this program, we see that the only possible run-time call sequence of
procedures is main → proc1 → proc2. Figure 6.4 depicts the run-time stack at the
time of the print statement on line 37. While static scoping involves a search of the
program text, dynamic scoping involves a search of the run-time stack. Specifically,
while determining the declaration to which a reference is bound in a lexically
scoped language involves an outward search of the nested blocks enclosing the
block where the reference is made, doing the same in a dynamically scoped
language involves a downward search from the top of the stack to the bottom.
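This downward search can be made concrete with a small Python simulation. The sketch below is illustrative only: real implementations use richer activation records, and the frame contents here simply mirror the bindings in Listing 6.2.

```python
# Simulate dynamic-scope name resolution: search activation records
# from the top of the run-time stack downward to the bottom.

stack = []  # the run-time stack; the end of the list is the top

def lookup(name):
    """Return the value bound to name in the topmost frame declaring it."""
    for frame in reversed(stack):  # top of stack first
        if name in frame:
            return frame[name]
    raise NameError(name)

# main declares l = 10 and d = 11
stack.append({"l": 10, "d": 11})
# proc1 declares a dynamically scoped d = 20, shadowing main's d
stack.append({"d": 20})
# proc2 declares nothing
stack.append({})

# Resolving l and d inside proc2: l is found in main's frame,
# d in proc1's intervening frame.
print(lookup("l"), lookup("d"))  # prints: 10 20
```

Popping proc1's frame when it returns automatically uncovers main's binding of d, which is exactly how `local` variables in Perl behave.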
Using the approach outlined in Section 6.4.1 for determining the declaration
associated with a reference to a lexically scoped variable, we discover that the
reference to l on line 37 is bound to the declaration of l on line 1. Since d is a
dynamically scoped variable and d is not declared in the definition of proc2, we
must search back through the call chain. Examining the definition of the procedure
that called proc2 (i.e., proc1), we find a declaration for d. Thus, our search is
complete and we use the denotation of d in proc1 at the time proc2 is called: 20.
Therefore, proc2 prints
Listing 6.3 A Perl program, whose run-time call chain depends on its input,
demonstrating dynamic scoping.
1  $l = 10;
2  $d = 11;
3
4  # reads an integer from standard input
5  $input = <STDIN>;
6
7  print "Before the call to proc1 --- l: $l, d: $d\n";
8
9  if ($input == 5) {
10   &proc1(); # call to proc1
11 } else {
12   &proc2(); # call to proc2
13 }
14
15 $l++;
16 $d++;
17
18 print "After the call to proc1 --- l: $l, d: $d\n";
19
20 exit(0);
21
22 sub proc1 {
23
24   # keyword "my" makes l a lexically scoped variable,
25   # meaning it is accessible only in this block
26   # and any nested blocks herein
27   my $l;
28
29   # keyword "local" makes d a dynamically scoped variable,
30   # meaning it is accessible to its descendants in the call chain
31   local $d;
32
33   $l = 5;
34   $d = 20;
35
36   print "Inside the call to proc1 --- l: $l, d: $d\n";
37
38   &proc2(); # call to proc2
39
40   print "After the call to proc2 --- l: $l, d: $d\n";
41 }
42
43 sub proc2 {
44   print "Inside the call to proc2 --- l: $l, d: $d\n";
45 }
Consider the Perl program in Listing 6.3, which is similar to Listing 6.2 except
that the call chain depends on program input. If the input is 5, then the call
chain is

main → proc1 → proc2

and the output is the same as the output for Listing 6.2. Otherwise, the call chain is

main → proc2
As we can see, just because we can determine the declaration to which a reference
is bound before run-time in a particular program, that does not mean that the
language in which the program is written uses static scoping.
Listings 6.2 and 6.3 contain both shadowed lexical and dynamic declarations
and, therefore, lexical and dynamic scope holes, respectively. For instance, the
declaration of l on line 20 in Listing 6.2 shadows the declaration of l on line 1.
Furthermore, the declaration of d on line 24 in Listing 6.2 shadows the declaration
of d on line 2, creating a scope hole in the definition of proc1 as well as any of
the functions it or its descendants (on the stack) call. In other words, the shadow is
cast into proc2. In contrast, the declaration of l on line 20 in Listing 6.2 does not
create scope holes in any descendant procedures.
Exercise 6.9.2 Identify all of the scope holes on lines 7, 18, 36, 40, 44 of Listing 6.3.
For each of those lines, state which declarations create shadows and indicate the
declarations they obscure.
Exercise 6.9.3 Sketch a graph depicting the lexical structure of the procedures (i.e.,
a lexical graph), including the body of the main program, in Listing 6.2.
Exercise 6.9.4 Sketch the static call graph for Listing 6.2. Is the static call graph for
Listing 6.3 the same? If not, give the static call graph for Listing 6.3 as well.
Exercise 6.9.5 Consider the following Scheme, C, and Perl programs, which are
analogs of each other:
1 ((lambda (x y)
2    (let ((proc1 (lambda (x) (+ x 1)))
3          (proc2 (lambda (y) (* y 2))))
4      (proc2 (* y (proc1 x))))) 2 3)
1  int main() {
2    int x=2;
3    int y=3;
4
5    int proc1(int x) {
6      return x+1;
7    }
8    int proc2(int y) {
9      return y*2;
10   }
11
12   return proc2(proc1(x)*y);
13 }
1  sub main() {
2    $x=2;
3    $y=3;
4
5    sub proc1 {
6      return $_[0]+1;
7    }
8
9    sub proc2 {
10     return $_[0]*2;
11   }
12   print proc2(proc1($x)*$y);
13 }
14 main;
The following graph depicts the lexical structure of these three programs (i.e., a
lexical graph):
lambda/main
proc1 proc2
The rules in Scheme, C, and Perl that specify which procedures have access to call
other procedures are different. Therefore, while each program has the same lexical
structure, they may not have the same static call graph.
Exercise 6.9.6 Consider the following Scheme, C, and Perl programs, which are
analogs of each other:
1 (define x 2)
2 (define y 3)
3 (define proc1 (lambda (x) (+ x 1)))
4 (define proc2 (lambda (y) (* y 2)))
5 (proc2 (* y (proc1 x)))
1  int proc1(int x) {
2    return x+1;
3  }
4
5  int proc2(int y) {
6    return y*2;
7  }
8
9  int main() {
10   int x=2;
11   int y=3;
12
13   return proc2(proc1(x)*y);
14 }
6.10. THE FUNARG PROBLEM 213
1  $x=2;
2  $y=3;
3
4  print proc2(proc1($x)*$y);
5
6  sub proc1 {
7    return $_[0]+1;
8  }
9
10 sub proc2 {
11   return $_[0]*2;
12 }
The following graph depicts the lexical structure of these three programs (i.e., a
lexical graph):
The rules in Scheme, C, and Perl that specify which procedures have access to call
other procedures are different. Therefore, while each program has the same lexical
structure, they may not have the same static call graph.
Exercise 6.9.7 Does line 4 [i.e., print proc2 (proc1($x)*$y);] of the Perl
program in Exercise 6.9.6 demonstrate that Perl supports first-class functions?
Explain why or why not.
Exercise 6.9.8 Common Lisp, like Perl, allows the programmer to declare statically
or dynamically scoped variables. Figure out how to set the scoping method of
a variable in a Common Lisp program and write a Common Lisp program that
illustrates the difference between static and dynamic scoping, similarly to the
Perl programs in this section. (Do not replicate that program in Common Lisp.)
Use the GNU CLISP implementation of Common Lisp, available at https://clisp.sourceforge.io/. Writing a program that only demonstrates how to set the scoping
method in Common Lisp is insufficient.
containing the nonlocal reference. McCarthy’s first version of Lisp used dynamic
scoping, though this was unintentional. This is an instance of a programming
language being historically designed based on ease of implementation rather than
the abilities of programmers.
When we include first-class procedures in the discussion of scope, the issue
of resolving nonlocal references suddenly becomes more complex, particularly
with respect to implementation. The issue of determining the declaration to
which a reference is bound is more interesting in languages with first-class
procedures implemented using a run-time stack. The question is: To which
declaration does a reference in the body of a passed or returned function
bind? The difficulty of implementing first-class procedures in a stack-based
programming language is dubbed the FUNARG (FUNctional ARGument) problem.
The FUNARG problem helps to illustrate the relationship between scope and
closures and ties together multiple concepts related to scope (within the
context of broader themes and history). Moreover, this discussion provides
background for more implementation-oriented issues addressed elsewhere in
this text.
The difficulty arises when a nested function makes a nonlocal reference (i.e.,
a reference to an identifier not representing a parameter) to an identifier in the
environment in which the function is defined, but not invoked. In such a case, we
must determine the environment in which to resolve that reference so that we can
evaluate the body of the function. The problem is that the environment in which the
function is created may not be on the stack. In other words, what do we do when a
function refers to something that may not be currently executing (i.e., not on the
run-time stack)? There are two instances of the FUNARG problem: the downward
FUNARG problem and the upward FUNARG problem.
1 ((lambda (x y)
2 ((lambda (proc2)
3 ((lambda (proc1) (proc1 5 20))
4 (lambda (x y) (cons x (proc2)))))
5 (lambda () (cons x (cons y (cons (+ x y) '()))))))
6 10 11)
The functions passed on lines 4 and 5, and accessed through the parameters proc1
and proc2, respectively, are downward FUNARGs.
1  (define add_x
2    (lambda (x)
3      (lambda (y)
4        (+ x y))))
5
6  (define main
7    (lambda ()
8      (let ((add5 (add_x 5))
9            (add6 (add_x 6)))
10       (cons (add5 2) (cons (add6 2) '())))))
11 (main)
The function add_x returns a closure (lines 3–4), which adds its argument (i.e.,
y) to the argument to add_x (i.e., x) and returns the result. The add_x function,
which creates (and returns) a closure around the inner function, is the simplest
nontrivial example of a closure.
The left side of Figure 6.5 illustrates the upward FUNARG problem by
depicting the run-time stack after add_x is called, but before it returns to main
(line 8). The right side of Figure 6.5 depicts the run-time stack after add_x
returns to main (line 9). As seen in this figure, the function returned by the
(add_x 5) call to add_x is (lambda (y) (+ x y)), which appears to have
no free variables. The reference to y is bound to the declaration of y in the
inner lambda expression; the reference to x is bound to the declaration of x
in the outer lambda expression. However, once add_x returns the function
(lambda (y) (+ x y)), its activation record is popped off the stack and
destroyed. Therefore, the binding of x to 5 no longer exists; moreover, the x itself
no longer exists. In other words, a closure can outlive its lexical parent. A closure
1  #include <stdio.h>
2
3  /* f is a function that accepts an int x
4     as an argument and returns a pointer
5     to a function that accepts an int as an
6     argument and returns an int as a return value. */
7  int (*f(int x))(int) {
8
9    int g(int y) {
10     return (x+y);
11   }
12
13   /* return a pointer to the function g */
14   return &g;
15 }
16
17 int main() {
18   /* add5 is a pointer to a function that
19      accepts an int as an argument and returns
20      an int as a return value. */
21   int (*add5)(int) = f(5);
22   int (*add6)(int) = f(6);
23
24   printf("%d\n", add5(2));
25   printf("%d\n", add6(2));
26 }
$ gcc makeadder.c
$ ./a.out
8
8
The program is trying to access a stack frame that is no longer guaranteed to exist.
Another way of saying that C does not address the FUNARG problem is to say
that it does not support first-class closures. Given the presence of function pointers
in C, it is more accurate to say that C does have first-class procedures, but does not
have first-class closures and, therefore, does not solve the FUNARG problem. Trying
to simulate first-class closures in a language without direct support for them is an
arduous task.
Python supports both first-class procedures and first-class closures. The
following program is the Python analog of the Scheme program given at the
beginning of this subsection:
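Assuming the Scheme program in question is the add_x example, a minimal Python sketch of such an analog is the following (a reconstruction, not the book's own listing):

```python
def add_x(x):
    # The returned function closes over x; the binding survives after
    # add_x returns because Python closures have indefinite extent.
    def add(y):
        return x + y
    return add

def main():
    add5 = add_x(5)
    add6 = add_x(6)
    return [add5(2), add6(2)]

print(main())  # prints: [7, 8]
```

Unlike the C version, the two function values each retain their own binding of x, so the program produces 7 and 8 rather than two identical results.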
1 (define new_counter
2   (lambda ()
3     (let ((current 0))
4       (lambda ()
5         (set! current (+ current 1))
6         current))))
Lines 4–6 represent a closure. The set! special form is the assignment operator in
Scheme. Although there is only one declaration of the variable current in the
3. The output of C programs like this is highly compiler- and system-dependent; such programs
may generate a fatal run-time error rather than producing erroneous output. Moreover, you
may need to compile such programs with the -fnested-functions option to gcc (e.g.,
gcc -fnested-functions makeadder.c).
program (line 3), the two counter closures each have their own copy of it and,
therefore, are independent:
As a result, the counters never get mixed up (lines 4–17). The binding to data
popped off the stack (e.g., current) still exists.
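The independence of the two counters can be illustrated with a Python sketch of new_counter. (Python requires the nonlocal declaration to rebind current; without it, the assignment would declare a new local variable, an issue discussed later in this section.)

```python
def new_counter():
    current = 0
    def counter():
        nonlocal current  # rebind the enclosing current, not a new local
        current += 1
        return current
    return counter

counter1 = new_counter()
counter2 = new_counter()

# Each call to new_counter creates a fresh binding for current,
# so the two counters advance independently.
print(counter1())  # prints: 1
print(counter1())  # prints: 2
print(counter2())  # prints: 1
print(counter1())  # prints: 3
```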
The new_counter function resembles a constructor—it constructs new
counters (i.e., objects). Often constructors are parameterized so that the
constructed objects are created with a user-specified state rather than a default
state. For example, they might set the maximum number of items in a queue object
to be 11 rather than the default of 10. Here, we can parameterize the constructor so
that the counter created is initialized to a user-specified value rather than 0:
(define new_counter
  (lambda (initial)
    (let ((current initial))
      (lambda ()
        (set! current (+ current 1))
        current))))
This example makes the analogy between closures and objects stronger: In
addition to packaging behavior and state, these closures hide and protect
Again, the analogous C program does not work because once the new_counter
function returns and is popped off the stack, the local variable current no longer
exists:
#include <stdio.h>

int (*new_counter(int initial))() {
  int current = initial;

  int increment() {
    current++;
    return current;
  }

  return &increment;
}

int main() {
  int (*counter1)() = new_counter(1);
  int (*counter2)() = new_counter(100);
$ gcc makecounter.c
$ ./a.out
101
102
103
104
#include <stdio.h>

int increment1() {
  static int current = 1;
  return ++current;
}

int increment100() {
  static int current = 100;
  return ++current;
}

int main() {
  printf("%d\n", increment1());
  printf("%d\n", increment100());
  printf("%d\n", increment1());
  printf("%d\n", increment100());
  printf("%d\n", increment1());
  printf("%d\n", increment100());
  printf("%d\n", increment1());
  printf("%d\n", increment100());
}
$ gcc increment.c
$ ./a.out
2
101
3
102
4
103
5
104
Notice that even though each function declares a variable named current with
static (i.e., global) storage, each function has its own copy. Thus, the functions
maintain separate counters. This approach is not much different from the
following:
#include <stdio.h>

int counter1 = 1;
int counter100 = 100;

int increment1() {
  return ++counter1;
}

int increment100() {
  return ++counter100;
}

int main() {
  printf("%d\n", increment1());
  printf("%d\n", increment100());
  printf("%d\n", increment1());
  printf("%d\n", increment100());
  printf("%d\n", increment1());
  printf("%d\n", increment100());
  printf("%d\n", increment1());
  printf("%d\n", increment100());
}
$ gcc increment2.c
$ ./a.out
2
101
3
102
4
103
5
104
The reason the call to counter1() on line 16 does not run is not related to the
FUNARG problem because Python addresses the FUNARG problem. Instead, the
interference arises from implicit typing. In this example, we want current on
the left-hand side of the assignment operator in line 4 to be interpreted as a
reference bound to the declaration of current in line 2, rather than as a new
declaration of a variable with the same name. While current on the right-hand
side of the assignment operator in line 4 is a reference, the Python interpreter
thinks it is a reference to a variable that has yet to be assigned a value. In this case,
as with other languages that use implicit typing, it is unclear whether current on
the left-hand side of the assignment operator in line 4 is intended as a reference or
a declaration.
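A minimal sketch of the failing version, assuming a counter analogous to new_counter, reproduces the behavior described here:

```python
def new_counter():
    current = 0
    def increment():
        # The assignment makes current local to increment, so the read
        # on the right-hand side refers to a local variable that has not
        # yet been assigned: UnboundLocalError at call time.
        current = current + 1
        return current
    return increment

counter = new_counter()
try:
    counter()
except UnboundLocalError as e:
    print("UnboundLocalError:", e)
```

The function definition itself is accepted; the error surfaces only when the closure is invoked and the unassigned local is read.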
To force the semantics we want upon this program, so that the current in
the definition of increment refers to the declaration of current in the enclosing
new_counter function, we wrap current in a list:
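A minimal sketch of this list-wrapping technique, assuming the new_counter example:

```python
def new_counter():
    current = [0]  # wrap the counter value in a one-element list
    def increment():
        # current[0] = ... is unambiguously a reference to the list
        # declared in the enclosing scope, not a new local declaration.
        current[0] = current[0] + 1
        return current[0]
    return increment

counter = new_counter()
print(counter(), counter(), counter())  # prints: 1 2 3
```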
Wrapping the initial value in a list of one element named current has
the convenient side effect of making the intended semantics unambiguous. The
occurrence of current using list bracket notation on the left-hand side of the
assignment operator (line 8) is a reference bound to the list current declared
in the enclosing scope (i.e., the definition of the new_counter function) rather than
a new intervening declaration. (We return to the concept of implicit/manifest
typing in Chapter 7.) Also notice here that we use a named (i.e., def) rather than an
anonymous (i.e., lambda) function.
The first-class function returned in this program (increment) is bound to the
environment in which it is created. In object-oriented programming, an object
encapsulates multiple functions (called methods) and one environment. In other
words, an object binds multiple functions to the same environment. The same effect
can be achieved with first-class closures by returning a list of closures:
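A sketch of this idea in Python: a single call to the constructor builds one environment, and every closure in the returned list shares it (the function names here are illustrative):

```python
def new_counter(initial):
    current = [initial]  # the shared environment: one mutable cell

    def increment():
        current[0] += 1
        return current[0]

    def reset():
        current[0] = initial
        return current[0]

    # Return multiple closures bound to the same environment,
    # much like an object's methods share its state.
    return [increment, reset]

inc, reset = new_counter(10)
print(inc(), inc())  # prints: 11 12
print(reset())       # prints: 10
print(inc())         # prints: 11
```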
4. Another use of the term closure in computer science is for the Kleene closure or Kleene star operator
(discussed in Chapter 2) used in regular expressions and EBNF grammars to match zero or more of the
preceding expression (e.g., the regular expression aa* matches the strings a, aa, aaa and so on).
All modern languages relevant to this discussion use static scoping and, thus,
all functions are closed; no longer do functions exist containing free variables
whose declarations are unknown until run-time. The term closure has persisted,
but assumed a new meaning. Instead of referring to a function whose free variables
are all bound to a declaration before run-time, it now means a function containing
free variables bound to declarations before run-time that may not exist at run-
time (e.g., the function returned by add_x, which references the environment of
add_x even after add_x has returned). Of course, this mutated sense is difficult to
implement and creates the upward FUNARG problem discussed in Section 6.10.2.
In turn, the term closure has persisted to distinguish between two different types
of closed functions rather than between open and closed functions as originally
conceived. Some people refer to a closure as a function that “remembers” the
lexical environment in which it is created, because its defining environment is
packaged within it. This is why we define a closure as an abstract data type with
only two pointers: one to an expression and one to an environment (in which to
evaluate that expression).
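This two-pointer view can be sketched concretely (a toy representation, not any real interpreter's internals; here the parameter name and body function together stand in for the "expression" pointer):

```python
from collections import namedtuple

# A closure is just a pair: code plus the environment in which
# to evaluate that code.
Closure = namedtuple("Closure", ["param", "body", "env"])

def apply_closure(clo, arg):
    # Evaluate the body in the saved environment, extended with
    # a binding for the parameter.
    env = dict(clo.env)
    env[clo.param] = arg
    return clo.body(env)

# (lambda (y) (+ x y)) created where x = 5
add5 = Closure(param="y", body=lambda env: env["x"] + env["y"], env={"x": 5})
print(apply_closure(add5, 2))  # prints: 7
```

Because the environment travels with the code, it does not matter whether the frame that originally held x is still on the stack.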
The terms closure and anonymous function are often mistakenly used
interchangeably. Most languages that support anonymous functions allow them
to be nested inside another function or scope and returned—thus creating a
closure. While a closure “remembers” the environment in which it is created, an
anonymous function—which is simply an unnamed function—may not. Multiple
languages support closures and anonymous functions (e.g., Python, C#).
(define compose
(lambda (f g)
(lambda (x)
(f (g x)))))
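A Python sketch of an analogous compose closure (the helper functions add1 and double are illustrative, not part of the original example):

```python
def compose(f, g):
    # The returned function closes over both f and g.
    return lambda x: f(g(x))

add1 = lambda n: n + 1
double = lambda n: n * 2

add1_then_double = compose(double, add1)
print(add1_then_double(5))  # prints: 12
```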
(define list-of
  (lambda (pred)
    (lambda (lst)
      (cond
        ((null? lst) #t)
        ((pred (car lst)) ((list-of pred) (cdr lst)))
        (else #f)))))
1 ((lambda (x y)
2 ((lambda (proc2)
3 ((lambda (proc1) (proc1 5 20))
4 ;; this function gets bound to proc1
5 (lambda (x y) (cons x (proc2)))))
6 ((lambda (x y)
7 ;; this function (closure) gets bound to proc2
8 (lambda () (cons x (cons y (cons (+ x y) '())))))
9 100 101)))
10 10 11)
5. The word heap when used in the context of dynamic memory allocation does not refer to the heap
data structure. Rather, it simply means a heap (or pile) of memory.
Figure 6.6 The heap (of memory) in a process from which dynamic memory is
allocated.
records in Scheme are stored on the heap, so closures in Scheme have indefinite
extent.
It is important to note which attribute the adjectives static and dynamic modify
in this context. All memory is allocated at run-time when a program becomes a
process—that is, a program in execution. For instance, a variable local to a function
(e.g., int x) is not allocated until the function is called and its activation record is
pushed onto the stack at run-time. Thus, the adjectives static and dynamic cannot be
referring to the time at which memory is allocated, but instead refer to the size of the
memory. The size of static data is fixed before run-time (e.g., int x is 4 bytes) even
though it is not allocated until run-time. Conversely, the size of dynamic memory
can grow or shrink at run-time.
Figure 6.6 illustrates the allocation of dynamic memory from the heap using
the C function malloc (i.e., memory allocation). The function malloc accepts the
number of bytes the programmer wants to allocate from the heap and returns
a pointer to the memory. We use the sizeof operator to make the allocation
portable. On many architectures, ints are 4 bytes, so we are allocating 32 bytes
of memory from the heap, or an array of 8 integers. However, this allocation is
from the heap. If we declared the array as int arrayofints[8], the allocation
would come from the stack or static data region.
Although the C programming language supports function pointers, and
functions are first-class entities because they can be passed and returned
(through pointers), historically functions could not be nested in a C program.
Consequently, the language was not required to address the upward FUNARG
problem. The prevention of function nesting also mitigated the downward
FUNARG problem.
Since functions could not nest, the environment of each function could be entirely
specified as the local environment of the function plus the statically allocated
global variables and the top-level functions.
Since the allocation of a closure is automatic, happening implicitly when
a function is called, languages that allocate closures from the heap (called
heap-allocated closures) typically use garbage collection as opposed to manual
memory management, such as through the use of the free() function in C. Recall
that a closure must remember the activation record of its parent closure, and that
the memory occupied by this activation record must not be reclaimed until it is no
longer required (i.e., until there are no more remaining references to the closure).
Garbage collection is an ideal solution for dealing with closures with unlimited
extent. Scheme uses garbage collection, and its use was later adopted by other
languages, including Java. We return to the idea of allocating first-class entities
from the heap in subsequent chapters, particularly in Chapter 13, where we discuss
control.
int main() {
  int (*counter1)() = new_counter(1);
  int (*counter2)() = new_counter(100);
$ gcc makecounter.c
$ ./a.out
104 103 102 101
Exercise 6.10.3 Under which conditions will λ-lifting not work to convert a closure
(i.e., a λ-expression with free variables) into a pure function (i.e., a λ-expression
with no free variables)?
> (counter1)
1
> (counter1)
2
> (counter2)
3
> (counter2)
5
> (counter1)
3
> (counter1)
4
> (counter2)
7
> (counter50)
150
> (counter50)
200
> (counter50)
250
> (counter1)
5
package main
import "fmt"
func main() {
f := fib()
// Function calls are evaluated left-to-right.
// Prints: 1 1 2 3 5
fmt.Println(f(), f(), f(), f(), f())
}
Exercise 6.10.8 Go, unlike C, does not have a static keyword: A function name
or variable whose identifier starts with a lowercase letter has internal linkage,
while one starting with an uppercase letter has external linkage. How can we
simulate in Go a variable local to a function with static (i.e., global) storage? Write
a program demonstrating a variable with both local scope to a function and static
(i.e., global) storage. Hint: Use a closure.
Apply λ-lifting to this expression so that values for the free variables article1
and article2 referenced in the λ-expression on lines 2–3 are passed to the
λ-expression itself.
Exercise 6.10.10 Rather than using λ-lifting (which does not work in all cases),
eliminate the free variables in the Scheme expression from Programming
Exercise 6.10.9 by building a closure as a Scheme vector. The vector must contain
the λ-expression and the values for the free variables in the λ-expression, in
that order. Pass this constructed closure to the λ-expression as an argument
when the function is invoked so it can be used to retrieve values for the free
variables when they are accessed. The function vector is the constructor for a
Scheme vector and accepts the ordered values of the vector as arguments—for
example, (define fruit (vector 'apple 'orange 'pear)). The function vector-ref is the vector accessor; for example, (vector-ref fruit 1)
returns 'orange.
Exercise 6.10.11 Define a class Circle in Scheme with member variable radius
and member functions getRadius, getArea, and getCircumference. Access
these member functions in the vector representing an object of the class through
accessor functions:
Exercise 6.10.12 Create a stack object in Scheme, where the stack is a vector
of closures and the stack data structure is represented as a list. Specifically,
define an argumentless function6 new-stack that returns a vector of closures—
reset-stack, empty-stack?, push, pop, and top—that access the stack list.
You may use the functions vector, vector-ref, and set!. The following client
code must work with your stack:
Exercise 6.10.13 (Friedman, Wand, and Haynes 2001, Section 2.4, p. 66) Create
a queue object in Scheme, where the queue is a vector of closures. Specifically,
define an argumentless function new-queue that returns a vector of closures—
6. The arity of a function with zero arguments (i.e., 0-ary) is nullary (from nũllus in Latin) and niladic
(from Greek).
Exercise 6.10.14 Consider the binary tree abstraction, and the suite of functions
accessing it, created in Section 5.7.1. Specifically, consider the addition of the
functions root, left, and right at the end of the example to make the
definition of the preorder and inorder traversals more readable (by obviating
the necessity of the car-cdr call chains). The inclusion of the root, left, and
right helper functions creates a function protection problem. Specifically, because
these helper functions are defined at the outermost block of the program, any other
functions in that outermost block also have access to them—in addition to the
preorder and inorder functions—even though they may not need access to
them. To protect these root, left, and right helper functions from functions
that do not use them, we can nest them within the preorder function with a
letrec expression. That approach creates another problem: The definitions of
the root, left, and right functions need to be replicated in the inorder
function and any other functions requiring access to them (e.g., postorder).
Solve this function-protection-access problem in the binary tree program without
duplicating any code by using first-class closures.
6.11 Deep, Shallow, and Ad Hoc Binding

When a procedure is passed as an argument and later invoked, there are three options for the environment in which the body of the passed procedure is evaluated:

• deep binding uses the environment at the time the passed function was created
• shallow binding uses the environment of the expression that invokes the passed
function
• ad hoc binding uses the environment of the invocation expression in which the
procedure is passed as an argument
1 (let ((y 3))
2   (let ((x 10)
3        ;; to which declaration of y is the reference to y bound?
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12
13          (g f x))))))
The question is: To which declaration of y does the reference to y on line 4 bind? In
other words, from which environment does the denotation of y on line 4 derive?
There are multiple options:
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 ? 6 6
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 6 f 6
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 6
13          (g f x))))))
Deep binding evaluates the body of the passed procedure in the environment in
which it is created. The environment in which f is created is ((y 3)). Therefore,
when the argument f is invoked using the formal parameter x on line 10, which is
passed the argument y bound to 6 (because the reference to x on line 13 is bound to
the declaration of x on line 8; i.e., static scoping), the return value of (x y) on line
10 is (* 3 (+ 6 6)). This expression equals 36, so the return value of the call to
g (on line 13) is (* 6 36), which equals 216. The next three Scheme expressions
are progressively annotated with comments to help illustrate the return value of
216 with deep binding:
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 3 12
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 6
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 6
13          (g f x))))))
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 36
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 6 36
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 6
13          (g f x))))))
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 36
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 216
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 216
13          (g f x))))))
Shallow binding evaluates the body of the passed procedure in the environment of the expression that invokes it. The expression that invokes the passed procedure f is (x y) on line 10, and the environment at line 10 is

(((y 4))
 ((x 10)
  (f (lambda (x) (* y (+ x x)))))
 ((y 3)))

Thus, the free variable y on line 4 is bound to 4 on line 6. Evaluating the body, (* y (+ x x)), of the passed procedure f in this environment results in (* 4 (+ 6 6)), which equals 48, so the return value of the call to g (on line 13) is (* 6 48), which equals 288. The next three Scheme expressions are progressively annotated with comments to help illustrate the return value of 288 with shallow binding:
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 4 12
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 6
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 6
13          (g f x))))))
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 48
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 6 48
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 6
13          (g f x))))))
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 48
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 288
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 288
13          (g f x))))))
Ad hoc binding evaluates the body of the passed procedure in the environment of the invocation expression in which the procedure is passed as an argument; here, that is (g f x) on line 13. The environment at line 13 is

(((y 2))
 ((y 5)
  (x 6)
  (g (lambda (x y) (* y (x y)))))
 ((y 4))
 ((x 10)
  (f (lambda (x) (* y (+ x x)))))
 ((y 3)))
Thus, the free variable y on line 4 is bound to 2 on line 11. Evaluating the
body, (* y (+ x x)), of the passed procedure f in this environment results
in (* 2 (+ 6 6)), which equals 24. Thus, the return value of the call to g (on
line 13) is (* 6 24), which equals 144. The next three Scheme expressions are
progressively annotated with comments to help illustrate the return value of 144
with ad hoc binding:
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 2 12
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 6
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 6
13          (g f x))))))
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 24
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 6 24
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 6
13          (g f x))))))
1 (let ((y 3))
2   (let ((x 10)
3        ; 6 24
4        (f (lambda (x) (* y (+ x x)))))
5
6     (let ((y 4))
7       (let ((y 5)
8             (x 6)
9             ; f 6 144
10            (g (lambda (x y) (* y (x y)))))
11        (let ((y 2))
12          ; 144
13          (g f x))))))
The terms shallow and deep derive from the means used to search the run-
time stack. Resolving nonlocal references with shallow binding often results in
only searching a few activation records back in the stack (i.e., a shallow search).
Resolving nonlocal references with deep binding (even though we do not think of
searching the stack) often involves searching deeper into the stack—that is, going
beyond the first few activation records on the top of the stack.
Deep binding most closely resembles lexical scoping not only because it can be
done before run-time, but also because resolving nonlocal references depends on
the nesting of blocks. Conversely, shallow binding most closely resembles dynamic
scoping because we cannot determine the calling environment until run-time. Ad
hoc binding lies somewhere in between the two. However, deep binding is not the
same as static scoping, and shallow binding is not the same as dynamic scoping.
Scope: the determination of which variable declaration a variable reference is (to be) bound to.

Environment binding: the determination of which environment the closure of a passed or returned function is bound to.
A language that uses lexical scoping can also use shallow binding for passed
procedures. Even though we cannot determine the calling environment until run-
time (i.e., shallow binding), that environment can contain bindings as a result of
static scoping. In other words, while we cannot determine the point in the program
where the passed procedure is invoked until run-time (i.e., shallow binding), once
it is determined, the environment at that point can be determined before run-time
if the language uses static scoping. For instance, the expression that invokes the
passed procedure f in our example Scheme expression is (x y) on line 10, and
we said the environment at line 10 is
(((y 4))
 ((x 10)
  (f (lambda (x) (* y (+ x x)))))
 ((y 3)))
(define g
(lambda (f)
(f)))
(define e
(lambda ()
(cdr x)))
(define d
(lambda (f x)
(g f)))
(define c
(lambda ()
(d e '(m n o))))
(define b
(lambda (x)
(c)))
(define a
(lambda ()
(b '(c d e))))
(a)
(a) Draw the sequence of procedures on the run-time stack (horizontally, where it
grows from left to right) when e is invoked (including e). Clearly label local
variables and parameters, where present, in each activation record on the stack.
(b) Using dynamic scoping and shallow binding, what value is returned by e?
(c) Using dynamic scoping and ad hoc binding, what value is returned by e?
Exercise 6.11.2 Give the value of the following JavaScript expression when
executed using (a) deep, (b) shallow, and (c) ad hoc binding:
1 ((x, y) => (
2 ((proc2) => (
3 ((proc1) => proc1(5,20))((x, y) => [x, ...proc2()])
4 )
5 )(() => [x, y, x + y])
6 )
7 )(10, 11)
Exercise 6.11.5 Give a Scheme program that outputs different results when run
using deep, shallow, and ad hoc binding.
The adjective static in such a phrase indicates that the binding takes place before run-time; the adjective dynamic
indicates that the binding takes place at run-time. For instance, the binding of a
variable to a data type (e.g., int a;) takes place before run-time—typically at
compile time—while the binding of a variable to a value takes place at run-time—
typically when an assignment statement (e.g., a = 1;) is executed. Binding is
one of the most foundational concepts in programming languages because other
language concepts involve binding. Scope is a language concept that can be studied
as a type of binding.
Identifiers in a program appear as declarations [e.g., in the expressions (lambda (tail) ...) and (let ((tail ...)) ...) the occurrences of tail are declarations] and as references [e.g., in the expression (cons head tail), cons, head, and tail are references]. There is a binding relationship—defined by
the programming language—between declarations of and references to identifiers
in a program. Each reference is statically or dynamically bound to a declaration
that has limited scope. The scope of a variable declaration in a program is the
region of that program (a range of lines of code) within which references to
that variable refer to the declaration (Friedman, Wand, and Haynes 2001). In
programming languages that use static scoping (e.g., Scheme, Python, and Java),
the relationship between a reference and its declaration is established before run-
time. In a language using dynamic scoping, the determination of the declaration
to which a reference is bound requires run-time information, such as the calling
sequence of procedures.
Languages have scoping rules for determining to which declaration a
particular reference is bound. Lexical scoping is a type of static scoping in which the
scope of a declaration is determined by examining the lexical layout of the blocks
of the program. The procedure for determining the declaration to which a reference
is bound in a lexically scoped language is to search the blocks enclosing the
reference in an inside-out fashion (i.e., from the innermost block to the outermost
block) until a declaration is found. If a declaration is not found, the variable
reference is free (as opposed to bound). Bound references to a declaration can be
shadowed by inner declarations using the same identifier, creating a scope hole.
Lexically scoped identifiers are useful for writing and understanding
programs, but are superfluous and unnecessary for evaluating expressions and
executing programs. Thus, we can replace each reference to a lexically scoped
identifier in a program with its lexical depth and position; this pair of non-negative
integers serves to identify the declaration to which the reference is bound. Depth
indicates the block in which the declaration is found, and position indicates
precisely where in the declaration list of that block the declaration is found;
they use zero-based indexing from inside-out relative to the reference and left-
to-right in the declaration list, respectively. The functions occurs-free? and
occurs-bound? each accept a λ-expression and an identifier and determine
whether the identifier occurs free or bound, respectively, in the expression.
These functions are examples of programs that process other programs, which
we increasingly encounter and develop as we progress toward the interpreter-
implementation part of this text (i.e., Chapters 10–12).
Type Systems
[A] proof is a program; the formula it proves is a type for the program.
— Haskell Curry and his intellectual descendants
We study programming language concepts related to types—particularly, type systems and type inference—in this chapter.
7.2 Introduction
The type system in a programming language broadly refers to the language’s
approach to type checking. In a static type system, types are checked and almost all
type errors are detected before run-time. In a dynamic type system, types are checked
and most type errors are detected at run-time. Languages with static type systems
are said to be statically typed or to use static typing. Languages with dynamic
type systems are said to be dynamically typed or to use dynamic typing. Reliability,
predictability, safety, and ease of debugging are advantages of a statically typed language.
There are a variety of methods for achieving a degree of flexibility within the
confines of the type safety afforded by some statically typed languages: parametric
and ad hoc polymorphism, and type inference.
The type concepts we study in this chapter were pioneered and/or made
accessible to programmers in the research projects that led to the development
of the languages ML and Haskell. For this reason as well as because of the
elegant and concise syntax employed in ML/Haskell for expressing types, we
use ML/Haskell as vehicles through which to experience and explore most type
concepts in Chapters 7–9.1 Bear in mind that our objective is not to study how
a particular language addresses type concepts, but rather to learn type concepts
so that we can understand and evaluate how a variety of languages address
type concepts. The interpreted nature, interactive REPL, and terse syntax in
ML/Haskell render them appropriate languages through which concepts related
to types can be demonstrated with ease and efficacy and, therefore, support this
objective.
#include <stdio.h>

void f(int x) {
   printf("f accepts a value of type int.\n");
}

int main() {
   f(1.7);
}
1. The value and utility of ML (Harper, n.d.a, n.d.b) and Haskell (Thompson 2007, p. 6) as teaching
languages have been well established.
2. Note that ints in C are not guaranteed to be 16 bits; an int is only guaranteed to be at least 16 bits. Commonly, on 32-bit and 64-bit processors, an int is 32 bits. Programmers can use int8_t, int16_t, and int32_t to avoid any ambiguity.
$ gcc notypechecking.c
$
$ ./a.out
f accepts a value of type int.
Data types for function parameters in C are not required in function definitions or
function declarations (i.e., prototypes):
#include <stdio.h>

void f(x) {
   printf("f accepts a value of any type.\n");
}

int main() {
   f(1.7);
}
$ gcc notypechecking.c
$
$ ./a.out
f accepts a value of any type.
A warning is issued if data types for function parameters are not used in function
declarations (line 3):
1 #include <stdio.h>
2
3 void f(x);
4
5 int main() {
6    f(1.7);
7 }
8
9 void f(x) {
10    printf("f accepts a value of any type.\n");
11 }
$ gcc notypechecking.c
notypechecking.c:3: warning: parameter names (without types)
in function declaration
$
$ ./a.out
f accepts a value of any type.
class TypeChecking {

   static void f(int x) {
      System.out.println("f accepts a value of type int.");
   }

   public static void main(String[] args) {
      f(1.7);
   }
}
$ javac TypeChecking.java
TypeChecking.java:8: error: incompatible types: possible
lossy conversion from double to int
f(1.7);
^
1 error
The terms strongly and weakly typed do not have universally agreed-upon definitions in reference to languages or type systems. Generally, a weakly or strongly typed language is one that does or does not, respectively, permit the programmer to violate the integrity constraints on types. The terms strong and weak typing are often incorrectly used to mean static and dynamic typing, respectively, but the two pairs of terms should not be conflated. The nature of a type system (e.g., static or dynamic) and type safety are orthogonal concepts. For instance, C is a statically typed language that has an unsafe type system, whereas Python is a dynamically typed language that has a safe type system.
There are a variety of methods for providing programmers with a degree
of flexibility within the confines of the type safety afforded by some statically
typed languages, thereby mitigating the rigidity enforced by a sound type
system. These methods, which include conversions of various sorts, parametric
and ad hoc polymorphism, and type inference, are discussed in the following
sections.
1 #include <stdio.h>
2
3 int main() {
4
5    int y;
6
7    /* 3.7 is coerced into an int (3) by truncation */
8    y = 3.7;
9
10   printf("y as an int: %d\n", y);
11   printf("y as a float: %f\n", y);
12
13   /* 4.1 is coerced into an int (4) by truncation */
14   y = 4.1;
15
16   printf("y as an int: %d\n", y);
17   printf("y as a float: %f\n", y);
18
19   /* 1 is coerced into
20      a double (the default floating-point type) */
21   printf("3.1+1=%f\n", 3.1+1);
22
23   /* 1 is coerced into a double and then the result of
24      the addition is coerced into an int by truncation */
25   y = 3.1+1;
26
27   printf("y as an int: %d\n", y);
28   printf("y as a float: %f\n", y);
29 }
30 $ gcc coercion.c
31 $
32 $ ./a.out
33 y as an int: 3
34 y as a float: -0.000000
35 y as an int: 4
36 y as a float: -0.000000
37 3.1+1=4.100000
38 y as an int: 4
39 y as a float: 4.099998
There are five coercions in this program: one each on lines 8, 14, and 21, and two on
line 25. Notice also that coercion happens automatically without any intervention
from the programmer.
While the details of how coercions happen can be complex and vary from language to language, when integers and floating-point numbers are operands to an arithmetic operator, the integers are usually coerced into floating-point numbers. For example, a coercion is made from an integer to a floating-point number when mixing an integer and a floating-point number with the addition operator; likewise, the integer operand is coerced to floating point when the two are mixed with the division operator. In the program just given, when adding an integer and a floating-point number on line 21, the integer (1) is coerced into a floating-point number (1.0) and the result is a floating-point number (line 37).
Such implicit conversions are generally a language implementation issue and
dependent on the targeted hardware platform and operating system (because of
storage implications). Consequently, language specifications and standards might
be general or silent on how coercions happen and leave such decisions to the
language implementer. In some cases, the results are predictable:
1 #include <stdio.h>
2
3 int main() {
4
5    int fourbyteint = 4;
6    double eightbytedouble = 8.22;
7
8    printf("The storage required for an int: %d.\n", sizeof(int));
9    printf("The storage required for a double: %d.\n\n",
10          sizeof(double));
11
12   printf("fourbyteint: %d.\n", fourbyteint);
13   printf("eightbytedouble: %f.\n\n", eightbytedouble);
14
15   /* int coerced into a double; */
16   /* smaller type coerced into a larger type; */
17   /* no loss of data */
18   eightbytedouble = fourbyteint;
19
20   printf("eightbytedouble: %f.\n", eightbytedouble);
21
22   eightbytedouble = 8.0;
23
24   /* double coerced into an int; */
25   /* larger type coerced into a smaller type; */
26   /* truncation results in loss of data */
27   fourbyteint = eightbytedouble;
28
29   printf("fourbyteint: %d.\n", fourbyteint);
30 }
31 $ gcc storage.c
32 $
33 $ ./a.out
34 The storage required for an int: 4.
35 The storage required for a double: 8.
36
37 fourbyteint: 4.
38 eightbytedouble: 8.220000.
39
40 eightbytedouble: 4.000000.
41 fourbyteint: 8.
In this program, a value of a type requiring less storage can generally be coerced (or cast) into one requiring more storage without loss of data (lines 18 and 40). However, a value of a type requiring more storage cannot generally be coerced (or cast) into one requiring less storage without loss of data (lines 27 and 41).
In the program coercion.c, when the floating-point result of adding an
integer and a floating-point number is assigned to a variable of type int (line 25),
unlike the results of the expressions on lines 8 and 14 (lines 34 and 36, respectively),
it remains a floating-point number (line 39). Thus, there are no guarantees with
coercion. The programmer forfeits a level of control depending on the language
implementation, hardware platform, and OS being used. As a result, coercion,
while offering flexibility and relieving the programmer of the burden of using
explicit conversions when deviating from the types required by an operator or
function, is generally unpredictable, rendering a program using coercion less
safe. Moreover, while coercions between values of differing types add flexibility
to a program and can be convenient from the programmer’s perspective when
intended, they also happen automatically—and so can be a source of difficult-
to-detect bugs (because of the lack of warnings or errors before run-time) when
unintended. Java does not perform coercion, as seen in this program:
1 public class NoCoercion {
2    public static void main(String[] args) {
3
4       int x = 2 + 3.2;
5
6       if (false && (1/0))
7          System.out.println("type mismatch");
8    }
9 }
$ javac NoCoercion.java
NoCoercion.java:4: error: incompatible types:
  possible lossy conversion from double to int
      int x = 2 + 3.2;
                ^
NoCoercion.java:6: error: bad operand types for
  binary operator '&&'
      if (false && (1/0))
                ^
  first type:  boolean
  second type: int
2 errors
0 $ cat NoCoercion2.java
1 public class NoCoercion2 {
2    public static void main(String[] args) {
3
4       F f = new F();
5
6       f.f(1.1);
7    }
8 }
9
10 class F {
11    void f(float x) {
12       System.out.println("f accepts a value of type float.");
13    }
14 }
$ javac NoCoercion2.java
NoCoercion2.java:6: error: incompatible types:
possible lossy conversion from double to float
f.f(1.1);
^
1 error
1 #include <stdio.h>
2
3 int main() {
4
5    /* integer division truncates by default */
6    printf("%d\n", 10/3);
7
8    /* the cast converts the int 10 to a float; the 3 is then
9       coerced to a float and the floating-point division
10      retains the fractional part */
11   printf("%f\n", (float) 10/3);
12 }
13 $ gcc cast.c
14 $
15 $ ./a.out
16 3
17 3.333333
Here, a type cast, (float), is used on line 11. The cast applies to the operand 10, converting it to a floating-point number, so the division is performed in floating point and the result retains its fractional part (line 17) rather than being truncated (line 16).
The strtol function, which converts a string representing an integer into the corresponding long integer, can be used to convert the string "250" to the integer 250:3
#include <stdio.h>
#include <stdlib.h>

int main() {
   int i = strtol("250", NULL, 10);
   printf("The string \"250\" is represented by the integer %d.\n", i);
}
$ gcc conversion.c
$
$ ./a.out
The string "250" is represented by the integer 250.
Since the statically typed language ML does not have coercion, it needs
provisions for converting values between types. ML supports conversions of
values between types through functions. Conversion functions are necessary in
Haskell, even though types can be mixed in some Haskell expressions.
3. Technically, the strtol function, which replaces the deprecated atoi (ASCII to integer) function, accepts a pointer to a character (the idiom for a string in C, since C does not have a primitive string type) and returns a long, which in this example is then coerced into an int. Nevertheless, it serves to convey the intended point here.
4. The prefixes mono and morph are of Greek origin and mean one and form, respectively.
5. The prefix poly is of Greek origin and means many.
6. The type of the (+) (i.e., prefix addition) operator in Haskell is actually Num a => a -> a -> a
because all built-in functions are fully curried in Haskell. Here, we write the type of the domain as a
tuple, and we introduce currying in Section 8.3.
7. The type variable a indicates an “arbitrary type” (as discussed in online Appendices B and C).
is an operator that maps two Ints to an Int. This means that the (+) operator
is polymorphic. With this type of polymorphism, referred to as parametric poly-
morphism, a function or data type can be defined generically so that it can handle
arguments in an identical manner, no matter what their type. In other words, the
types themselves in the type signature are parameterized. In general, when we use
the term polymorphism in this text, we are referring to parametric polymorphism.
A polymorphic function type in ML or Haskell specifies that the type of any function having that polymorphic type is one of multiple monomorphic types. Recall that a polymorphic function type is a type expression containing type variables. For example, the polymorphic type reverse :: [a] -> [a] in Haskell is shorthand for a (non-exhaustive) collection of types including reverse :: [Int] -> [Int], reverse :: [String] -> [String], and so on. The same holds for a qualified polymorphic type. For example, show :: Show a => a -> String in Haskell is shorthand for a collection of types including show :: Int -> String, show :: Bool -> String, and so on. In ML, which has no coercion, mixing integer and real operands in an arithmetic expression is a type error:
- 3.1 + 1;
stdIn:1.2-1.9 Error: operator and operand do not agree
[overload - bad instantiation]
- 3.1 + 1.0;
val it = 4.1 : real
- 3 + 1.0;
stdIn:2.1-2.8 Error: operator and operand do not agree
[overload - bad instantiation]
operator domain: 'Z[INT] * 'Z[INT]
operand: 'Z[INT] * real
in expression:
3 + 1.0
- 3 + 1;
val it = 4 : int
This does not mean we cannot have a function in ML that accepts a combination of ints and reals. For instance, the following is a valid function in ML (the definition is reconstructed here; only its int * real domain matters for the discussion):

- fun f(x : int, y : real) = 3;
val f = fn : int * real -> int
- f(1, 1.1);
val it = 3 : int
Similarly, the div division operator only accepts two int operands while the /
division operator only accepts two real operands. For instance:
- 10 div 2;
val it = 5 : int
- 10 div 3;
val it = 3 : int
- 10 div 2.0;
stdIn:3.1-3.11 Error: operator and operand do not agree
[overload - bad instantiation]
operator domain: 'Z[INT] * 'Z[INT]
operand: 'Z[INT] * real
in expression:
10 div 2.0
- 10.0 div 2;
stdIn:1.2-2.1 Error: operator and operand do not agree
[overload - bad instantiation]
operator domain: real * real
operand: real * 'Z[INT]
in expression:
10.0 div 2
stdIn:1.7-1.10 Error: overloaded variable not defined at type
symbol: div
type: real
- 10.0 / 3.0;
val it = 3.33333333333 : real
- 4.0 / 2.0;
val it = 2.0 : real
- 4.2 / 2.1;
val it = 2.0 : real
- 4.3 / 2.5;
val it = 1.72 : real
- 10.0 / 3;
stdIn:7.1-7.9 Error: operator and operand do not agree
[overload - bad instantiation]
operator domain: real * real
operand: real * 'Z[INT]
in expression:
10.0 / 3
- 10 / 3.0;
stdIn:1.2-1.10 Error: operator and operand do not agree
[overload - bad instantiation]
operator domain: real * real
operand: 'Z[INT] * real
in expression:
10 / 3.0
- 10 / 3;
stdIn:1.2-1.8 Error: operator and operand do not agree
[overload - bad instantiation]
operator domain: real * real
operand: 'Z[INT] * 'Y[INT]
in expression:
10 / 3
This response indicates that if type a is in the type class Num, then 1 has the type a. In other words, 1 is of some type in the Num class. Such a type is called a qualified type or constrained type (Table 7.2). The left-hand side of the => symbol—which here is in the form C a—is called the class constraint or context, where C is a type class and a is a type variable.
e :: C a => a

where e is an expression, C a (the left-hand side of =>) is the class constraint or context, C is a type class, and a is a type variable.
A type class is a collection of types that are guaranteed to have definitions for a set
of functions—like a Java interface.
The fromRational function is similarly implicitly applied to every literal
number with a decimal point:
General:
e :: C a => a means “If type a is in type class C, then e has type a.”
Example:
3 :: Num a => a means “If type a is in type class Num, then 3 has type a.”
Table 7.2 The General Form of a Qualified Type or Constrained Type and an Example
<interactive>:1:12: error:
    No instance for (Fractional Int) arising from the literal '1.1'
    In the second argument of '(+)', namely '1.1'
<interactive>:2:1: error:
    No instance for (Fractional Int) arising from the literal '1.1'
    In the first argument of '(+)', namely '1.1'
    In the expression: 1.1 + (1 :: Int)
    In an equation for 'it': it = 1.1 + (1 :: Int)
In Haskell, we can divide a number with a decimal point by a number without a decimal point, or vice versa, or divide a number with a decimal point by another with a decimal point:
<interactive>:1:1: error:
    Ambiguous type variable 'a0' arising from a use of 'print'
    prevents the constraint '(Show a0)' from being solved.
    Probable fix: use a type annotation to specify what 'a0'
    should be.
    These potential instances exist:
      instance Show Ordering -- Defined in 'GHC.Show'
      instance Show Integer -- Defined in 'GHC.Show'
      instance Show a => Show (Maybe a) -- Defined in 'GHC.Show'
      ...plus 22 others
      ...plus 13 instances involving out-of-scope types
      (use -fprint-potential-instances to see them all)
    In a stmt of an interactive GHCi command: print it
<interactive>:2:1: error:
    Ambiguous type variable 'a0' arising from a use of 'print'
    prevents the constraint '(Show a0)' from being solved.
    Probable fix: use a type annotation to specify what 'a0'
    should be.
    These potential instances exist:
      instance Show Ordering -- Defined in 'GHC.Show'
      instance Show Integer -- Defined in 'GHC.Show'
      instance Show a => Show (Maybe a) -- Defined in 'GHC.Show'
      ...plus 22 others
      ...plus 13 instances involving out-of-scope types
      (use -fprint-potential-instances to see them all)
    In a stmt of an interactive GHCi command: print it
<interactive>:3:1: error:
    Ambiguous type variable 'a0' arising from a use of 'print'
    prevents the constraint '(Show a0)' from being solved.
    Probable fix: use a type annotation to specify what 'a0'
    should be.
    These potential instances exist:
      instance Show Ordering -- Defined in 'GHC.Show'
      instance Show Integer -- Defined in 'GHC.Show'
      instance Show a => Show (Maybe a) -- Defined in 'GHC.Show'
      ...plus 22 others
      ...plus 13 instances involving out-of-scope types
      (use -fprint-potential-instances to see them all)
    In a stmt of an interactive GHCi command: print it
Prelude> 4 / 2
2.0
This function, like the / Fractional division operator, can be passed a number
without a decimal point:
*Main> halve 2
1.0
However, consider the following definition of a function intended to compute the numeric average of a list of numbers:
<interactive>:1:23: error:
    Could not deduce (Fractional Int) arising from a use of '/'
    from the context: Foldable t
      bound by the inferred type of
        listaverage_wrong :: Foldable t => t Int -> Int
      at <interactive>:1:1-38
    In the expression: sum l / length l
    In an equation for 'listaverage_wrong':
      listaverage_wrong l = sum l / length l
The problem here is that while the type of the sum function is
(Foldable t, Num a) => t a -> a, the type of the length function
is Foldable t => t a -> Int. Thus, it returns a value of type Int, not one
of type Num a => a, and the type Int is not a member of the Fractional class
required by the / Fractional division operator. The type class system with
coercion used in Haskell to deal with the rigidity of a sound type system adds
complexity to the language.
The following transcript of a session with Haskell demonstrates the same
arithmetic expressions given previously in ML, but formatted in Haskell syntax:
Prelude> 3 + 1
4
Prelude> 10 / 3
3.33333333333333
Operator/function overloading refers to using the same function name for multiple
function definitions, where the type signature of each definition involves a
different return type, different types of parameters, and/or a different number
of parameters. When an overloaded function is invoked, the applicable function
definition to bind to the function call (obtained from a collection of definitions
with the same name) is determined based on the number and/or the types
of arguments used in the invocation. Function/operator overloading is also
called ad hoc polymorphism. In general, operators/functions cannot be overloaded
in ML and Haskell because every operator/function must have only one
type:
- f(1,2);
stdIn:8.1-8.7 Error: operator and operand do not agree
[overload - bad instantiation]
operator domain: int * real
operand: int * 'Z[INT]
in expression:
f (1,2)
- f(1,2.2);
val it = 3 : int
0 $ cat overloading.hs
1 f :: (Int , I n t ) -> I n t
2 f (x,y) = 3
3
4 f :: (Int , F l o a t ) -> I n t
5 f (x,y) = 3
$ ghci overloading.hs
GHCi, version 8.10.1: https://www.haskell.org/ghc/ :? for help
[1 of 1] Compiling Main ( overloading.hs, interpreted )
overloading.hs:4:1: error:
    Duplicate type signatures for 'f'
    at overloading.hs:1:1
       overloading.hs:4:1
  |
4 | f :: (Int, Float) -> Int
  | ^
overloading.hs:5:1: error:
    Multiple declarations of 'f'
    Declared at: overloading.hs:2:1
                 overloading.hs:5:1
  |
5 | f (x,y) = 3
  | ^
Failed, no modules loaded.
0 $ cat nooverloading.c
1 #include <stdio.h>
2
3 void f(int x) {
4    printf("f accepts a value of type int.\n");
5 }
6
7 void f(double x) {
8    printf("f accepts a value of type double.\n");
9 }
10
11 int main() {
12    f(1.7);
13 }
$ gcc nooverloading.c
nooverloading.c:7:6: error: conflicting types for 'f'
void f(double x) {
^
nooverloading.c:3:6: note: previous definition of 'f' was here
void f(int x) {
^
Thus, ML, Haskell, and C do not support function overloading; C++ and Java do
support function overloading:
$ cat overloading.cpp
#include <iostream>

using namespace std;

void f(int x) {
   cout << "f accepts a value of type int." << endl;
}

void f(double x) {
   cout << "f accepts a value of type double." << endl;
}

int main() {
   f(1.7);
}
$
$ g++ overloading.cpp
$
$ ./a.out
f accepts a value of type double.
$ cat overloading2.cpp
#include <iostream>

using namespace std;

class Overloading {
   public:
      void f(int x) {
         cout << "f accepts a value of type int." << endl;
      }
      void f(double x) {
         cout << "f accepts a value of type double." << endl;
      }
};

int main() {
   Overloading o;
   o.f(1);
   o.f(1.1);
}
$
$ g++ overloading2.cpp
$
$ ./a.out
f accepts a value of type int.
f accepts a value of type double.
$ cat Overloading.java
public class Overloading {
   public static void main(String[] args) {
      Overload o = new Overload();
      o.f(1);
      o.f(1.1);
   }
}

class Overload {
   void f(int x) {
      System.out.println("f accepts a value of type int.");
   }
   void f(double x) {
      System.out.println("f accepts a value of type double.");
   }
}
$
$ javac Overloading.java
$
$ java Overloading
f accepts a value of type int.
f accepts a value of type double.
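C++ and Java resolve an overloaded call at compile time from the number and types of the arguments. In a dynamically typed language the same effect can be approximated by dispatching on the run-time type of the first argument. The following Python sketch (an illustration only, not from the text) uses the standard-library functools.singledispatch decorator to mimic the overloaded f just shown:

```python
from functools import singledispatch

@singledispatch
def f(x):
    # fallback for argument types with no registered definition
    raise TypeError("no definition of f for this type")

@f.register(int)
def _(x):
    return "f accepts a value of type int."

@f.register(float)
def _(x):
    return "f accepts a value of type double."

print(f(1))    # dispatches to the int definition
print(f(1.1))  # dispatches to the float definition
```

Unlike C++ overload resolution, which is static and considers all arguments, singledispatch chooses a definition at run time and only from the first argument's type.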
The extraction (i.e., input, >>) and insertion (i.e., output, <<) operators are
commonly overloaded in C++ to make I/O of user-defined objects convenient:
$ cat overloading_operators.cpp
#include <iostream>
#include <string>

using namespace std;

class Employee {
   private:
      int id;
      string name;
      double rate;
   public:
      friend ostream& operator << (ostream &out, Employee &e);
      friend istream& operator >> (istream &in, Employee &e);
};

istream& operator >> (istream &in, Employee &e) {
   in >> e.id >> e.name >> e.rate;
   return in;
}
int main() {
   Employee Mary, Lucia;
8. Some of the commonly used (arithmetic) primitive operators in ML are overloaded (e.g., binary
addition).
7.7. FUNCTION OVERRIDING 267
The data type int is the default numeric type in ML (Section 7.9). However, we
can define a square function in Haskell that accepts any numeric value:
The Haskell type class system supports the definition of what seem to be
overloaded functions like square.9 Recall that the type class system allows values
of different types to be used interchangeably if those types are properly related in
the hierarchy. The flexibility fostered by a type or class hierarchy in the definition
of functions is similar to ad hoc polymorphism (i.e., overloading), but is called
interface polymorphism.
While they take advantage of the various concepts that render a static type
system more flexible, ML and Haskell come with irremovable type checks
for safety that generate error messages for discovered type errors and type
mismatches.10 Put simply, ML and Haskell programs are thoroughly type-checked
before run-time. As a result, an ML or Haskell program that passes all of the
requisite type checks almost never fails with a run-time type error.
1 (define overriding
2 (lambda ()
3 (let ((f (lambda ()
4 (let ((g (lambda ()
5 (let ((f (lambda () (+ 1 2))))
6 ;; call to inner f
7 (f)))))
8 (g)))))
9 ;; call to outer f
10 (f))))
9. Functions like square are often generally referred to as overloaded functions in Haskell
programming books and resources.
10. Ada gives the programmer the ability to suspend type checking.
Here, the call to function f on line 10 binds to the outermost definition of f (starting
on line 3) because the innermost definition of f (line 5) is not visible on line 10—it
is defined in a nested block. The call to function f on line 7 binds to the innermost
definition of f (line 5) because on line 7 where f is called, the innermost definition
of f (line 5) shadows the outermost definition of f. In other words, the outermost
definition of f is not visible on line 7.
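The same shadowing behavior can be reproduced in any language with nested function definitions. A Python sketch of the Scheme example above (an illustration, not from the text):

```python
def overriding():
    def f():                  # outermost f
        def g():
            def f():          # innermost f; shadows the outer f inside g
                return 1 + 2
            return f()        # binds to the innermost f
        return g()
    return f()                # binds to the outermost f

print(overriding())
```

As in the Scheme version, the outer call to f binds to the outermost definition, while the call inside g binds to the innermost definition that shadows it, so the whole expression evaluates to 3.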
Types for values, variables, function parameters, and return types are similarly
declared in Haskell:
7.9. TYPE INFERENCE 269
$ cat declaring.hs
square(n :: Double) = n*n :: Double
$ cat declaring.hs
square :: Double -> Double
square(n) = n*n
Explicitly declaring types requires effort on the part of the programmer and can be
perceived as requiring more effort than necessary to justify the benefits of a static
type system. Type inference is a concept of programming languages that represents
a compromise and attempts to provide the best of both worlds. Type inference refers
to the automatic deduction of the type of a value or variable without an explicit
type declaration. ML and Haskell use type inference, so the programmer is not
required to declare the type of any variable unless necessary (e.g., in cases where it
is impossible for type inference to deduce a type). Both languages include a built-in
type inference engine to deduce the type of a value based on context. Thus, ML and
Haskell use type inference to relieve the programmer of the burden of associating
a type with every name in a program. However, an explicit type declaration is
required when it is impossible for the inference algorithm to deduce a type. ML
introduced the idea of type inference in programming languages in the 1970s.
Both ML and Haskell use the Hindley–Milner algorithm for type inference. While
the details of this algorithm are complex and beyond the scope of this text, we
will make some cursory remarks on its use. Understanding the fundamentals of
how these languages deduce types helps the programmer know when explicit
type declarations are required and when they can be omitted. Though not always
necessary, in ML and Haskell, a programmer can associate a type with (1) values,
(2) variables, (3) function parameters, and (4) return types. The main idea in type
inference is this: Since all operands to a function or operator must be of the
required type, and since values of differing numeric types cannot be mixed as
operands to arithmetic operators, once we know the type of one or more values in
an expression (because, for example, it was explicitly declared to be of that type),
we can, by transitive inference, progressively determine the type of every other value.
In essence, knowledge of the type of a value (e.g., a parameter or return value)
can be leveraged as context to determine the types of other entities in the same
expression. For instance, in ML:
Declaring the parameter x to be of type real is enough for ML to deduce the type
of the function add’ as fn : real * real -> real. Since the first operand
to the + operator is a value of type real, the second operand must also be of type
real because the types of the two operands must be the same. In turn, the return
type is a value of type real because the sum of two values of type real is a value
of type real. A similar line of reasoning is used in ML to deduce that the type
of add" and the type of add"' is fn : real * real -> real. The Haskell
analogs of these examples follow:
$ cat declaring.hs
square'(n :: Double) = n*n
add'(x :: Double, y) = x + y
add''(x, y :: Double) = x + y
add'''(x,y) = x + y :: Double
$
$ ghci -XScopedTypeVariables declaring.hs
GHCi, version 8.10.1: https://www.haskell.org/ghc/ :? for help
[1 of 1] Compiling Main ( declaring.hs, interpreted )
Ok, one module loaded.
- 3.0;
val it = 3.0 : real
- let val x = 3.0 in x end;
val it = 3.0 : real
- fun add(x,y) = x + y;
val add = fn : int * int -> int
In Haskell, for these examples, the inferred type is never the same as the declared
type:
Here, the type of f is inferred: Adding 0.0 to a means that a must be of type
real (because the numeric type of each operand must match), so b must be of
type real. Consider another example where information other than an explicitly
declared type is used as a basis for type inference:
- fun sum([]) = 0
= | sum(x::xs) = x + sum(xs);
val sum = fn : int list -> int
Here, the 0 returned in the first case of the sum function causes ML to infer the type
int list -> int for the function sum because 0 is an integer and a function
can only return a value of one type.
- fun reverse([]) = []
= | reverse(x::xs) = reverse(xs) @ [x];
val reverse = fn : 'a list -> 'a list
f :: A -> B    e :: A
---------------------
      f e :: B
For example, the typing not False :: Bool can be inferred from
this rule using the fact that not :: Bool -> Bool and False :: Bool.
(Hutton 2007, pp. 17-18)
Recall that it is not possible in ML and Haskell to deviate from the types
required by operators and functions. However, type inference offers some relief
from having to declare a type for all entities. Notably, it supports static typing
without explicit type declarations. If you know the intended type of a user-
defined function, but are not sure which type will be inferred for it, you may
explicitly declare the type of the entire function (rather than explicitly declaring
types of selective parameters or values, or the return type, to assist the inference
engine in deducing the intended type), if possible, rather than risk that the
inferred type is not the intended type. Conversely, if it is clear that the type that
will be inferred is the same as the intended type, there is no need to explicitly
declare the type of a user-defined function. Let the inference engine do that work
for you.
Strong typing provides safety, but requires a type to be associated with every
name. The use of type inference in a statically typed language obviates the need to
associate a type with each identifier:
Static, Safe Type System + Type Inference Obviates the Need to Declare Types
Static, Safe Type System + Type Inference = Reliability/Safety Without Manifest Typing
> (f 1)
1
> (f 1 2)
procedure f: expects 1 argument, given 2: 1 2
> (f 1 2 3)
procedure f: expects 1 argument, given 3: 1 2 3
> (f '(1 2 3))
'(1 2 3)
The second and third cases fail because f is defined to accept only one argument,
and not two and three arguments, respectively.
Every function in Scheme is defined to accept only one list argument. We
did not present Scheme functions in this way initially because most readers are
probably familiar with C, C++, or Java functions that can accept one or more
arguments. Arguments to any Scheme function are always received collectively
as one list, not as individual arguments. Moreover, Scheme, like ML and Haskell,
does pattern matching from this single list of arguments to the specification of the
parameter list in the function definition. For instance, in the first invocation just
given, the argument 1 is received as (1) and then pattern matched against the
parameter specification (x); as a result, x is bound to 1. In the second invocation,
the arguments 1 2 are received as the list (1 2) and then pattern matched against
the parameter specification (x), but the two cannot be matched. Similarly, in the
third invocation, the arguments 1 2 3 are received as the list (1 2 3) and then
pattern matched against the parameter specification (x), but the two cannot be
7.10. VARIABLE-LENGTH ARGUMENT LISTS IN SCHEME 275
matched. In the fourth invocation, the argument ’(1 2 3) is received as the list
((1 2 3)) and then pattern matched against the parameter specification (x); as
a result, x is bound to (1 2 3).
Scheme, like ML and Haskell, performs pattern matching from arguments
to parameters. However, since lists in ML and Haskell must contain elements
of the same type (i.e., homogeneous), the pattern matching in those languages
is performed against the arguments represented as a tuple (which can be
heterogeneous). In Scheme, the pattern matching is performed against a list
(which can be heterogeneous). This difference is syntactically transparent
since both lists in Scheme and tuples in ML and Haskell are enclosed in
parentheses.
Even though any Scheme function can accept only one list argument, because
a list may contain any number of elements, including none, any Scheme function
can effectively accept any fixed or variable number of arguments. (A function capable
of accepting a variable number of input arguments is called a variadic function.11 )
To restrict a function to a particular number of arguments, a Scheme programmer
must write the parameter specification, from which the arguments are matched,
in a particular way. For instance, (x) is a one-element list that, when used as a
parameter list, forces a function to accept only one argument. Similarly, (x y) is
a two-element list that, when used as a parameter list, forces a function to accept
only two arguments, and so on. This is the typical way in which we have defined
Scheme functions:
By removing the parentheses around the parameter list in Scheme, and thereby
altering the pattern from which arguments are matched, we can specify a function
that accepts a variable number of arguments. For instance, consider a slightly
modified definition of the identity function, and the same four invocations as
shown previously:
> (f 1)
'(1)
> (f 1 2)
'(1 2)
> (f 1 2 3)
'(1 2 3)
> (f '(1 2 3))
'((1 2 3))
In the first invocation, the argument 1 is received as the list (1) and then pattern
matched against the parameter specification x; as a result, x is bound to (1). In
the second invocation, the arguments 1 2 are received as the list (1 2) and then
pattern matched against the parameter specification x; as a result, x is bound to
the list (1 2). In the third invocation, the arguments 1 2 3 are received as the
list (1 2 3) and then pattern matched against the parameter specification x; x is
bound to the list (1 2 3). In the fourth invocation, the argument ’(1 2 3) is
received as the list ((1 2 3)) and then pattern matched against the parameter
specification x; x is bound to ((1 2 3)). Thus, now the second and third cases
work because this modified identity function can accept a variable number of
arguments.
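Python's *args parameter is an analogous mechanism: all positional arguments are collected into a single tuple, so one definition handles any number of arguments. A sketch of the modified identity function (an illustration, not from the text):

```python
def f(*x):
    # all arguments are received collectively, as in the Scheme version
    return x

print(f(1))          # a one-element tuple
print(f(1, 2))       # a two-element tuple
print(f(1, 2, 3))    # a three-element tuple
print(f((1, 2, 3)))  # a single tuple argument stays nested, as in Scheme
```

As with the unparenthesized Scheme parameter x, the name x here is bound to the entire (possibly empty) argument tuple, and a single compound argument is received nested one level deep.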
A programmer in ML or Haskell can decompose a single list argument in
the formal parameter specification of a function definition using the :: and :
operators, respectively [e.g., fun f (x::xs, y::ys) = ... in ML]. A Scheme
programmer can decompose an entire argument list in the formal parameter
specification of a function definition using the dot notation. Note that an argument
list is not the same as a list argument. A function can accept multiple list arguments,
but has only one argument list. Therefore, while ML and Haskell allow the
programmer to decompose individual list arguments using the :: and : operators,
respectively, a Scheme programmer can only decompose the entire argument list
using the dot notation.
The ability to decompose the entire argument list (and the fact that arguments
are received into any function as a single list) provides another way for a function
to accept a variable number of arguments. For instance, consider the following
definitions of argcar and argcdr, which return the car and cdr of the argument
list received:
> (argcdr 1)
'()
> (argcdr 1 2)
'(2)
> (argcdr 1 2 3)
'(2 3)
Here, the dot (.) in the parameter specifications is being used as the Scheme analog
of :: and : in ML and Haskell, respectively, albeit over an entire argument list
rather than over an individual list argument as in ML or Haskell. Again, the dot in
Scheme cannot be used to decompose individual list arguments:
Again, though transparent, Scheme, like ML and Haskell, also does pattern
matching from arguments to parameters. However, in ML and Haskell, individual
list arguments can be pattern matched as well. In Scheme, functions can accept
only a single list argument, which appears to be restrictive, but means that Scheme
functions are flexible and general—they can effectively accept a variable number
of arguments. In contrast, any ML or Haskell function can have only one type. If
such a function accepted a variable number of parameters, it would have multiple
types. Tables 7.4 and 7.5 summarize these nuances of argument lists in Scheme
vis-à-vis ML and Haskell.
Table 7.4 Scheme Vis-à-Vis ML and Haskell for Fixed- and Variable-Sized
Argument Lists
Table 7.5 Scheme Vis-à-Vis ML and Haskell for Reception and Decomposition of
Argument(s)
Exercise 7.2 Is the addition operator (+) overloaded in ML? Explain why or why
not.
Exercise 7.3 Explain why the following ML expressions do not type check:
Exercise 7.4 Explain why the following Haskell expressions do not type check:
Exercise 7.5 Why does integer division in C truncate the fractional part of the
result?
Exercise 7.6 Languages with coercion, such as Fortran, C, and C++, are less
reliable than those languages with little or no coercion, such as Java, ML,
and Haskell. What advantages do languages with coercion offer in return for
compromising reliability?
Exercise 7.7 In C++, why is the return type not considered when the compiler tries
to resolve (i.e., disambiguate) the call to an overloaded function?
Exercise 7.8 Identify a programming language suitable for each cell in the
following table:
Exercise 7.9
(c) Is duck typing the same concept as dynamic binding of messages to methods
(based on the type of an object at run-time rather than its declared type) in
languages supporting object-oriented programming (e.g., Java and Smalltalk)?
Explain.
Exercise 7.11 Given a function mystery with two parameters, the SML-NJ
environment produces the following response:
val mystery = fn: int list -> int list -> int list
List everything you can determine from this type about the definition of mystery
as well as the ways in which it can be invoked.
Explain what in this function definition causes the ML type inference algorithm to
deduce its type as:
Exercise 7.14 Explain why the ML function reverse (defined in Section 7.9) is
polymorphic, while the ML function sum (also defined in Section 7.9) is not.
(define f
(lambda (x)
(car x)))
(define f
(lambda (x y)
(cons x y)))
(b) Run this program in DrRacket with the language set to Racket (i.e., #lang
racket). Run it with the language set to R5RS (i.e., #lang r5rs). What do
you notice?
• Conversion functions (explicit)
• Monomorphic types
• Polymorphic types (parametric polymorphism)
• Type signatures
• Function overloading (ad hoc polymorphism)
• Function overriding
contrast, can accept a fixed-size argument tuple containing one or more arguments
[e.g., (x), (x, y), and so on], but cannot accept a variable number of arguments.
(Any function in ML and Haskell must have only one type.) Arguments in ML
and Haskell are not received as a list, but rather as a tuple, and any individual
list argument can be decomposed using the :: and : operators, respectively
[e.g., fun f (x::xs, y::ys) = ... in ML]. Decomposition of individual
list arguments (using dot notation) is not possible in Scheme. The ability of a
function to accept a variable number of arguments offers flexibility. Not only does
it allow the function to be defined in a general manner, but it also empowers
the programmer to implement programming abstractions, which we explore in
Chapter 8.
Currying and
Higher-Order Functions
Table 8.1 Type Signatures and λ-Calculus for a Variety of Higher-Order Functions.
Each signature assumes a ternary function f : (a × b × c) → d. All of these
functions except apply return a function. In other words, all but apply are closed
operators.
f, and that applies f to these (individual) arguments and returns the result.
Thus, the function apply applies a function to arguments, and the function eval
evaluates an expression in an environment. The functions eval and apply are at the
heart of any interpreter, as we see in Chapters 10-12.
Partial function application (also called partial argument application or partial
function instantiation), papply1, refers to the concept that if a function, which
accepts at least one parameter, is invoked with only an argument for its
first parameter (i.e., partially applied), it returns a new function accepting the
arguments for the remaining parameters; this new function, when invoked with
arguments for those parameters, yields the same result as would have been
returned had the original function been invoked with arguments for all of its
8.2. PARTIAL FUNCTION APPLICATION 287
parameters. Thus, with partial function application, for any function f(p_1, p_2, ..., p_n),

f(a_1) = g(p_2, p_3, ..., p_n)

such that

g(a_2, a_3, ..., a_n) = f(a_1, a_2, a_3, ..., a_n)
The type signature and λ-calculus for papply1 are given in Table 8.1. The
papply1 function, defined in Scheme in Table 8.2 (left), accepts a function fun
and its first argument arg and returns a function accepting arguments for the
remainder of the parameters. Intuitively, the papply1 function can partially apply
a function with respect to an argument for only its first parameter:
6
> (apply f '())
6
We can generalize partial function application from accepting only the first
argument of its input function to accepting arguments for any prefix of the
parameters of its input function. Thus, more generally, partial function application,
papply, refers to the concept that if a function, which accepts at least one
parameter, is invoked with only arguments for a prefix of its parameters (i.e.,
partially applied), it returns a new function accepting the arguments for the
unsupplied parameters; this new function, when invoked with arguments for
those parameters, yields the same result as would have been returned had the
original function been invoked with arguments for all of its parameters. Thus,
more generally, with partial function application, for any function ƒ pp1 , p2 , ¨ ¨ ¨ , pn q,
f(a_1, a_2, ..., a_m) = g(p_{m+1}, p_{m+2}, ..., p_n)

where m ≤ n, such that

g(a_{m+1}, a_{m+2}, ..., a_n) = f(a_1, a_2, ..., a_m, a_{m+1}, a_{m+2}, ..., a_n)
The type signature and λ-calculus for papply are given in Table 8.1. The papply
function, defined in Scheme in Table 8.2 (right), accepts a function fun and
arguments for the first m of the n parameters to f, where m ≤ n, and returns
a function accepting the remaining (n − m) parameters. Intuitively, the
papply function can partially apply a function with respect to arguments for any
prefix of its parameters, including all of them:
Thus, the papply function subsumes the papply1 function because the papply
function generalizes the papply1 function. For instance, we can replace papply1
with papply in all of the preceding examples:
papply(... papply(papply(papply(f, a_1), a_2), a_3) ..., a_n)

Here papply(f, a_1) is an (n−1)-ary function, papply(papply(f, a_1), a_2) is an
(n−2)-ary function, and so on; the complete expression is an argumentless
function (a fixpoint).
;; three applications
((papply1 (papply1 (papply1 add 1) 2) 3))
((papply1 (papply1 (papply add 1) 2) 3))
((papply1 (papply (papply1 add 1) 2) 3))
((papply1 (papply (papply add 1) 2) 3))
((papply (papply (papply1 add 1) 2) 3))
((papply (papply (papply add 1) 2) 3))
((papply (papply1 (papply1 add 1) 2) 3))
((papply (papply1 (papply add 1) 2) 3))
;; two applications
((papply1 (papply add 1 2) 3))
((papply (papply add 1 2) 3))
((papply1 (papply1 add 1) 2 3))
((papply (papply1 add 1) 2 3))
;; one application
((papply add 1 2 3))
(define pow
(lambda (e b)
(cond
((eqv? b 0) 0)
((eqv? b 1) 1)
((eqv? e 0) 1)
((eqv? e 1) b)
(else (* b (pow (- e 1) b))))))
The disadvantages of this approach are the need to explicitly call eval (lines
16 and 30) when invoking the residual function and the need to define multiple
versions of this function, each corresponding to all possible ways of partially
applying a function of n parameters. For instance, partially applying a ternary
function in all possible ways (i.e., all possible partitions of parameters) requires
functions s111 (each argument individually), s12 (first argument individually
and last two in one stroke), and s21 (first two arguments in one stroke and
last argument individually). As n increases, the number of functions required
combinatorially explodes. However, this approach is advantageous if we desire
to restrict the ways in which a function can be partially applied since the function
papply cannot enforce any restrictions on how a function is partially applied.
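For a ternary function, fixed-partition partial appliers of this kind can be sketched in Python with closures, avoiding the explicit eval of the book's Scheme approach. The names s21 and s12 follow the text; the bodies and the add helper are assumptions for illustration:

```python
def s21(f):
    # accept the first two arguments in one stroke, then the last individually
    return lambda x, y: lambda z: f(x, y, z)

def s12(f):
    # accept the first argument individually, then the last two in one stroke
    return lambda x: lambda y, z: f(x, y, z)

def add(x, y, z):
    return x + y + z

assert s21(add)(1, 2)(3) == 6
assert s12(add)(1)(2, 3) == 6
```

Each partition of the parameter list needs its own function, which is exactly the combinatorial explosion the text describes; in exchange, each fixed-partition version enforces one particular way of partially applying f.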
Exercise 8.2.2 Define a function s21 that enables you to partially apply the
following ternary Scheme function add using the approach illustrated at the end
of Section 8.2 (lines 1–31):
(define add
(lambda (x y z)
(+ x y z)))
Exercise 8.2.3 Define a function s12 that enables you to partially apply the ternary
Scheme function add in Programming Exercise 8.2.2 using the approach illustrated
at the end of Section 8.2 (lines 1–31).
8.3 Currying
Currying refers to converting an n-ary function into one that accepts only one
argument and returns a function, which also accepts only one argument and
returns a function that accepts only one argument, and so on. This technique was
introduced by Moses Schönfinkel, although the term was coined by Christopher
Strachey in 1967 and refers to logician Haskell Curry. For now, we can think of
a curried function as one that permits transparent partial function application
(i.e., without calling papply1 or papply). In other words, a curried function
(or a function written in curried form, as discussed next) can be partially applied
without calling papply1 or papply. Later, we see that a curried function is not
being partially applied at all.
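The uncurried/curried contrast can be sketched in Python, where curried form must be written out manually as nested functions. This sketch follows the book's argument order (exponent first) but computes the power directly rather than reproducing the recursive Scheme definition:

```python
def powucf(e, b):
    # uncurried form: one call supplying both arguments
    return b ** e

def powcf(e):
    # curried form: one argument per call; each call returns a function
    return lambda b: b ** e

assert powucf(2, 3) == 9
assert powcf(2)(3) == 9   # complete application, one argument at a time
square = powcf(2)         # transparent partial application: no papply needed
assert square(4) == 16
```

As with the Haskell powcf below, supplying only the first argument yields a residual function (here, square) without any explicit partial-application operator.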
1 Prelude > :{
2 Prelude | powucf(0, _) = 1
3 Prelude | powucf(1, b) = b
4 Prelude | powucf(_, 0) = 0
5 Prelude | powucf(e, b) = b * powucf(e-1, b)
6 Prelude |
7 Prelude | powcf 0 _ = 1
8 Prelude | powcf 1 b = b
9 Prelude | powcf _ 0 = 0
10 Prelude | powcf e b = b * powcf (e-1) b
11 Prelude | :}
These definitions are almost the same. Notice that the definition of the powucf
function has a comma between each parameter in the tuple of parameters, and that
tuple is enclosed in parentheses; conversely, there are no commas and parentheses
in the parameters tuple in the definition of the powcf function. As a result, the
types of these functions are different.
The type of the powucf function states that it accepts a tuple of values of a type in
the Num class and returns a value of a type in the Num class. In contrast, the type
of the powcf function indicates that it accepts a value of a type in the Num class
and returns a function mapping a value of a type in the Num class to a value of
the same type in the Num class. The definition of powcf is written in curried form,
meaning that it accepts only one argument and returns a function, also with only
one argument:
45 Prelude >
46 Prelude > powucf 2
47
48 <interactive>:38:1: error:
49 Non type -variable argument in the constraint: Num (a, b)
50 (Use FlexibleContexts to permit this)
51 When checking the inferred type
52 it :: forall a b. (Eq a, Eq b, Num a, Num b, Num (a, b)) => b
These examples bring us face-to-face with the fact that Haskell (and ML)
perform literal pattern matching from function arguments to parameters (i.e., the
parentheses and commas must also match).
Currying transforms a function f_uncurried with the type signature

(p_1 × p_2 × ... × p_n) → r

into a function f_curried with the type signature

p_1 → (p_2 → (... → (p_n → r) ...))

such that currying f_uncurried and running the resulting f_curried function has the same
effect as progressively partially applying f_uncurried. Inversely, uncurrying
transforms a function f_curried with the type signature

p_1 → (p_2 → (... → (p_n → r) ...))

into a function f_uncurried with the type signature

(p_1 × p_2 × ... × p_n) → r

such that
Currying and uncurrying are defined as higher-order functions (i.e., curry and
uncurry, respectively) that each accept a function as an argument and return a
function as a result (i.e., they are closed functions). In Haskell, the built-in function
curry can accept only an uncurried binary function with type (a,b) -> c as
input. Similarly, the built-in function uncurry can accept only a curried function
with type a -> b -> c as input. The type signatures and λ-calculus for the
functions curry and uncurry are given in Table 8.1. Definitions of curry and
uncurry for binary functions in Haskell are given in Table 8.3. Notice that
the definitions of curry and uncurry in Haskell are written in curried form.
(Programming Exercises 8.3.22 and 8.3.23 involve defining curry and uncurry,
respectively, in uncurried form in Haskell for binary functions.) Definitions of
curry and uncurry for binary functions in Scheme are given in Table 8.4 and
applied in the following examples:
Table 8.3 Definitions of curry and uncurry in Curried Form in Haskell for Binary
Functions
Table 8.4 Definitions of curry and uncurry in Scheme for Binary Functions
The functions papply1, papply, curry, and uncurry are closed: Each
accepts a function as input and returns a function as output. It is necessary, but
not sufficient, for a function to be closed to be able to be reapplied to its result. For
instance, curry and uncurry are both closed, but neither can be reapplied to its
own result. The functions papply1 and papply are closed, however, so each can
be reapplied to its result as demonstrated previously.
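The binary-function versions of curry and uncurry can be mirrored in Python with closures (a sketch, not the book's Table 8.3/8.4 definitions; the add helper is an assumption):

```python
def curry(f):
    # (a, b) -> c  becomes  a -> (b -> c)
    return lambda a: lambda b: f(a, b)

def uncurry(f):
    # a -> (b -> c)  becomes  (a, b) -> c
    return lambda a, b: f(a)(b)

def add(x, y):
    return x + y

assert curry(add)(1)(2) == 3
assert uncurry(curry(add))(1, 2) == 3  # uncurry inverts curry
```

Both are closed (function in, function out), yet neither can be usefully reapplied to its own result: curry(add) is unary, so applying curry to it again produces a function that fails when invoked, which illustrates that closedness is necessary but not sufficient for reapplication.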
A curried function, the function it returns, and so on, accept only one argument.
Therefore, with respect to its uncurried version, invoking a curried function
appears to correspond to partially applying it, and partially applying its result,
and so on. Consider the following definitions of a ternary addition function in
uncurried and curried forms in Haskell:
While the function adducf can only be invoked one way [i.e., with the same
number and types of arguments; e.g., adducf(1,2,3)], the function addcf can
effectively be invoked in the following ways, including the one and only way the
type of adducf specifies it must be invoked (i.e., with only one argument, as in
the first invocation here):
addcf 1
addcf 1 2
addcf 1 2 3
Because the type of addcf is Num a => a -> a -> a -> a, we know it
can accept only one argument. However, the second and third invocations of
addcf just given make it appear as if it can accept two or three arguments as
well. The absence of parentheses for precedence makes this illusion stronger. Let
us consider the third invocation of addcf—that is, addcf 1 2 3. The addcf
function is called as required with only one argument (addcf 1), which returns
a new, unnamed function that is then implicitly invoked with one argument
(⟨first returned proc⟩ 2), which returns another new, unnamed
function, which is then implicitly invoked with one argument (⟨second returned
proc⟩ 3) and returns the sum 6. Using parentheses to make
the implied precedence salient, the expression addcf 1 2 3 is evaluated as
(((addcf 1) 2) 3):
addcf 1 2 3 = (((addcf 1) 2) 3)
Thus, even though a function written in curried form (e.g., addcf) can appear
to be invoked with more than one argument (e.g., addcf 1 2 3), it can never
accept more than one argument because the type of a curried function (or a
function written in curried form) specifies that it must accept only one argument
(e.g., Num a => a -> a -> a -> a).
The omission of superfluous parentheses for precedence in an invocation of a
curried function must not be confused with the required absence of parentheses
around the list of arguments:
Moreover, notice that in Haskell (and ML) an open parenthesis to the immediate
right of the returned function is not required to force its implicit application, as is
required in Scheme:
• the one and only way its uncurried analog is invoked (i.e., with all arguments
as a complete application)
• the one and only way it itself can be invoked (i.e., with only one argument)
• n − 2 other ways corresponding to implicit partial applications of each
returned function
More generally, if a curried function, whose uncurried analog accepts more than
one parameter, is invoked with only arguments for a prefix of the parameters of
its uncurried analog, it returns a new function accepting the arguments for the
parameters of the uncurried analog whose arguments were left unsupplied; that
new function, when invoked with arguments for those parameters, yields the same
result as would have been returned had the original, uncurried function been
invoked with arguments for all of its parameters. Thus, akin to partial function
application:

f(a_1, a_2, ..., a_m) = g(p_{m+1}, p_{m+2}, ..., p_n)
g(a_{m+1}, a_{m+2}, ..., a_n) = f(a_1, a_2, ..., a_m, a_{m+1}, a_{m+2}, ..., a_n)
Thus, any curried function can effectively be invoked with arguments for any
prefix, including all of the parameters of its uncurried analog, without parentheses
around the list of arguments or commas between individual arguments:
(define curry4ary
(lambda (f)
(lambda (a)
(lambda (b)
(lambda (c)
(lambda (d)
(f a b c d)))))))
We can build general curry and uncurry functions that accept functions of any
arity greater than 1, called implicit currying, through the use of Scheme macros,
which we do not discuss here.
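Although the Scheme macro approach is not shown here, a runtime sketch of a general curry is possible in any language with first-class closures. The following illustrative Python version (the names and the explicit arity parameter are our assumptions) accumulates arguments one at a time until the specified arity is reached:

```python
def curry(f, n):
    """Curry the n-ary function f into a chain of unary functions.
    The arity n must be supplied explicitly in this sketch."""
    def curried(args):
        if len(args) == n:
            return f(*args)          # all arguments gathered; apply f
        return lambda a: curried(args + [a])
    return curried([])

def add4(a, b, c, d):
    return a + b + c + d

add4cf = curry(add4, 4)
assert add4cf(1)(2)(3)(4) == 10
```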
Since all functions built into Haskell are curried, in online Appendix C we do
not use parentheses around the argument tuples (or commas between individual
arguments) when invoking built-in Haskell functions. For instance, consider our
final definition of mergesort in Haskell given in online Appendix C:
1 Prelude > :{
2 Prelude | mergesort(_, []) = []
3 Prelude | mergesort(_, [x]) = [x]
4 Prelude | mergesort(compop, lat) =
5 Prelude |    let
6 Prelude |       mergesort1([]) = []
7 Prelude |       mergesort1([x]) = [x]
8 Prelude |       mergesort1(lat1) =
9 Prelude |          let
10 Prelude |            split([]) = ([], [])
11 Prelude |            split([x]) = ([], [x])
12 Prelude |            split(x:y:excess) =
13 Prelude |               let
14 Prelude |                  (left, right) = split(excess)
15 Prelude |               in
16 Prelude |                  (x:left, y:right)
17 Prelude |
18 Prelude |            merge(l, []) = l
19 Prelude |            merge([], l) = l
20 Prelude |            merge(l:ls, r:rs) =
21 Prelude |               if compop(l, r) then l:merge(ls, r:rs)
22 Prelude |               else r:merge(l:ls, rs)
23 Prelude |
24 Prelude |            -- split it
25 Prelude |            (left, right) = split(lat1)
26 Prelude |
27 Prelude |            -- mergesort each side
28 Prelude |            leftsorted = mergesort1(left)
29 Prelude |            rightsorted = mergesort1(right)
30 Prelude |          in
31 Prelude |            -- merge
32 Prelude |            merge(leftsorted, rightsorted)
33 Prelude |    in
34 Prelude |       mergesort1(lat)
35 Prelude | :}
36 Prelude >
37 Prelude > :type mergesort
38 mergesort :: ((a, a) -> Bool, [a]) -> [a]
Neither the mergesort function nor the compop function is curried. Thus, we
cannot pass one of the built-in, curried Haskell comparison operators [e.g., (<)
or (>)] as is to mergesort without causing a type error:
For this version of mergesort to accept one of the built-in, curried Haskell
comparison operators as a first argument, we must replace the subexpression
compop(l, r) in line 21 of the definition of mergesort with (compop l r);
that is, we must call compop without parentheses around (or a comma between) its arguments. This
changes the type of mergesort from ((a, a) -> Bool, [a]) -> [a] to
(a -> a -> Bool, [a]) -> [a]:
While this simple change causes the following invocations to work, we are
mixing curried and uncurried functions. Specifically, the function mergesort is
uncurried, while the function compop is curried:
The following is the final version of mergesort, written fully in curried form:
1 Prelude > :{
2 Prelude | mergesort _ [] = []
3 Prelude | mergesort _ [x] = [x]
4 Prelude | mergesort compop lat =
5 Prelude |    let
6 Prelude |       mergesort1 [] = []
7 Prelude |       mergesort1 [x] = [x]
8 Prelude |       mergesort1 lat1 =
9 Prelude |          let
10 Prelude |            split [] = ([], [])
11 Prelude |            split [x] = ([], [x])
12 Prelude |            split (x:y:excess) =
13 Prelude |               let
14 Prelude |                  (left, right) = split excess
15 Prelude |               in
16 Prelude |                  (x:left, y:right)
17 Prelude |
18 Prelude |            merge l [] = l
19 Prelude |            merge [] l = l
20 Prelude |            merge (l:ls) (r:rs) =
21 Prelude |               if compop l r then l:(merge ls (r:rs))
22 Prelude |               else r:(merge (l:ls) rs)
23 Prelude |
24 Prelude |            -- split it
25 Prelude |            (left, right) = split lat1
26 Prelude |
27 Prelude |            -- mergesort each side
28 Prelude |            leftsorted = mergesort1 left
29 Prelude |            rightsorted = mergesort1 right
30 Prelude |          in
31 Prelude |            -- merge
32 Prelude |            merge leftsorted rightsorted
33 Prelude |    in
34 Prelude |       mergesort1 lat
35 Prelude | :}
36 Prelude >
37 Prelude > :type mergesort
38 mergesort :: (a -> a -> Bool) -> [a] -> [a]
39 Prelude >
40 Prelude > mergesort (<) [9,8,7,6,5,4,3,2,1]
41 [1,2,3,4,5,6,7,8,9]
42 Prelude >
43 Prelude > ascending_sort = mergesort (<)
44 Prelude >
45 Prelude > :type ascending_sort
46 ascending_sort :: Ord a => [a] -> [a]
47 Prelude >
48 Prelude > ascending_sort [9,8,7,6,5,4,3,2,1]
49 [1,2,3,4,5,6,7,8,9]
50 Prelude >
51 Prelude > mergesort (>) [1,2,3,4,5,6,7,8,9]
52 [9,8,7,6,5,4,3,2,1]
53 Prelude >
54 Prelude > descending_sort = mergesort (>)
55 Prelude >
56 Prelude > :type descending_sort
57 descending_sort :: Ord a => [a] -> [a]
58 Prelude >
59 Prelude > descending_sort [1,2,3,4,5,6,7,8,9]
60 [9,8,7,6,5,4,3,2,1]
Of these three types, the first (fully uncurried) and the last (fully curried) are
recommended for purposes of uniformity, and the last type is preferred.
A consequence of all functions being fully curried in Haskell is that sometimes
we must use parentheses to group syntactic entities. (We can think of this practice
as forcing order or precedence, though that is not entirely true in Haskell; see
Chapter 12.) For instance, in the expression isDigit (head "string"), the
parentheses around head "string" are required to indicate that the entire
argument to isDigit is head "string". Omitting these parentheses, as in
isDigit head "string", causes the head function to be passed to the
function isDigit, with the argument "string" then being passed to the result.
Prelude > :{
Prelude | pow e = (\b -> if e == 0 then 1 else
Prelude |                if e == 1 then b else
Prelude |                if b == 0 then 0 else
Prelude |                b*(pow (e-1) b))
Prelude | :}
Prelude >
Prelude > :type pow
pow :: (Num t1, Num t2, Eq t1, Eq t2) => t1 -> t2 -> t2
Prelude >
Prelude > pow 2 3
9
Prelude >
Prelude > square = pow 2
Prelude >
Prelude > :type square
square :: (Num t2, Eq t2) => t2 -> t2
Prelude >
Prelude > square 3
9
Prelude >
Prelude > pow 3 3
27
Prelude > cube = pow 3
Prelude >
Prelude > :type cube
cube :: (Num t2, Eq t2) => t2 -> t2
Prelude >
Prelude > cube 3
27
Defining functions in this manner weaves the curried form too tightly into the
definition of the function and, as a result, makes the definition of the function
cumbersome. Again, the main idea in these examples is that we can support the
definition of functions in curried form in any language with first-class closures.
For instance, because Python supports first-class closures, we can define the pow
function in curried form in Python as well:
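A sketch of such a definition follows (the name pow_cf is ours, chosen to avoid shadowing Python's built-in pow; the logic mirrors the Haskell pow above):

```python
# A curried Python analog of the Haskell pow defined above.
def pow_cf(e):
    def inner(b):
        if e == 0:
            return 1
        if e == 1:
            return b
        if b == 0:
            return 0
        return b * pow_cf(e - 1)(b)
    return inner

square = pow_cf(2)   # partial application through currying
cube = pow_cf(3)
assert square(3) == 9 and cube(3) == 27
```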
8.3.7 ML Analogs
Curried form is the same in ML as it is in Haskell:
- fun powucf(0,_) = 1
=    | powucf(1,b) = b
=    | powucf(_,0) = 0
=    | powucf(e,b) = b * powucf(e-1, b);
val powucf = fn : int * int -> int
- fun powcf 0 _ = 1
=    | powcf 1 b = b
=    | powcf _ 0 = 0
=    | powcf e b = b * powcf (e-1) b;
val powcf = fn : int -> int -> int
- val square = powcf 2;
val square = fn : int -> int
- square 3;
val it = 9 : int
- (powcf 3) 3;
val it = 27 : int
- powcf 3 3;
val it = 27 : int
- val cube = powcf 3;
val cube = fn : int -> int
- cube 3;
val it = 27 : int
- powucf(2,3);
val it = 9 : int
- powucf(2);
stdIn:4.1-4.10 Error: operator and operand don't agree [literal]
operator domain: int * int
operand: int
in expression:
powucf 2
- powucf 2;
stdIn:1.1-1.9 Error: operator and operand don't agree [literal]
operator domain: int * int
operand: int
in expression:
powucf 2
- powcf(2,3);
stdIn:1.1-1.11 Error: operator and operand don't agree
[tycon mismatch]
operator domain: int
operand: int * int
in expression:
powcf (2,3)
- powucf 2 3;
stdIn:1.1-1.11 Error: operator and operand don't agree [literal]
operator domain: int * int
operand: int
in expression:
powucf 2
- (powcf 2) 3;
val it = 9 : int
- powcf 2 3;
val it = 9 : int
Not all built-in ML functions are curried as in Haskell. For example, map is
curried, while Int.+ is uncurried. Also, there are no built-in curry and uncurry
functions in ML. User-defined and built-in functions in ML that accept only one
argument, and which are neither uncurried nor curried, can be invoked with or
without parentheses around that single argument:
More generally, when a function is defined in curried form in ML, parentheses can
be placed around any individual argument (as in Haskell):
- fun f x y z = x+y+z;
val f = fn : int -> int -> int -> int
- f 1 2 3;
val it = 6 : int
- f 1 (2) 3;
val it = 6 : int
- f (1) 2 (3);
val it = 6 : int
- f (1) (2) (3);
val it = 6 : int
Exercise 8.3.2 Give one reason why you might want to curry a function.
f a (b,c) d = c
This definition requires that the arguments for parameters b and c arrive together,
as would happen when calling an uncurried function. Is f curried? Explain.
Exercise 8.3.5 Would the definition of curry in Haskell given in this section work
as intended if curry was defined in uncurried form? Explain.
Exercise 8.3.6 Can a function f be defined in Haskell that returns a function with
the same type as itself (i.e., as f)? If so, define f. If not, explain why not.
Exercise 8.3.8 What might it mean to state that the curry operation acts as a virtual
compiler (i.e., translator) to λ-calculus? Explain.
Exercise 8.3.9 We can sometimes factor out constant parameters from recursive
function definitions so as to avoid passing arguments that are not modified across
multiple recursive calls (see Section 5.10.3 and Design Guideline 6: Factor Out
Constant Parameters in Table 5.7).
(a) Does a recursive function with any constant parameters factored out execute
more efficiently than one that is automatically generated using partial function
application or currying to factor out those parameters?
(b) Which approach makes the function easier to define? Discuss trade-offs.
(c) Is the order of the parameters in the parameter list of the function definition
relevant to each approach? Explain.
(d) Does the programming language used in each case raise any issues? Consider
the language Scheme vis-à-vis the language Haskell.
Exercise 8.3.10 Define the function papply1 in curried form in Haskell for binary
functions.
Exercise 8.3.11 Define the function papply1 in curried form in ML for binary
functions.
Exercise 8.3.12 Define the function papply1 in uncurried form in Haskell for
binary functions.
Exercise 8.3.13 Define the function papply1 in uncurried form in ML for binary
functions.
Exercise 8.3.14 Define an ML function in curried form and then partially apply it
to an argument to create a new function. The function in curried form and the function
resulting from applying it must be practical. For example, we could apply a sorting
function parameterized on the list to be sorted and the type of items in the list
or the comparison operator to be used, a root finding function parameterized
by the degree and the number whose nth-root we desire, or a number converter
parameterized by the base from which to be converted and the base to which to be
converted.
Exercise 8.3.16 Complete Programming Exercise 8.3.15, but this time define the
function in uncurried form and then curry it using curry.
Exercise 8.3.17 Using higher-order functions and curried form, define a Haskell
function dec2bin that converts a non-negative decimal integer to a list of zeros
and ones representing the binary equivalent of that input integer.
Examples:
Exercise 8.3.19 Define the pow function from this section in Scheme so that it
can be partially applied without the use of the functions papply1, papply, or
curry. The pow function must have the type integer → integer → integer.
Then use that definition to define the functions square and cube. Do not define
any other named function or any named, nested function other than pow.
Exercise 8.3.20 Define the function curry in curried form in ML for binary
functions. Do not return an anonymous function.
Exercise 8.3.21 Define the function uncurry in curried form in ML for binary
functions. Do not return an anonymous function.
Exercise 8.3.22 Define the function curry in uncurried form in Haskell for binary
functions.
Exercise 8.3.23 Define the function uncurry in uncurried form in Haskell for
binary functions.
Exercise 8.3.24 Define the function curry in uncurried form in ML for binary
functions.
Exercise 8.3.25 Define the function uncurry in uncurried form in ML for binary
functions.
Exercise 8.3.26 Define the function curry in Python for binary functions.
Exercise 8.3.27 Define the function uncurry in Python for binary functions.
- map;
val it = fn : ('a -> 'b) -> 'a list -> 'b list
- map (fn (x) => x*x) [1,2,3,4,5,6];
val it = [1,4,9,16,25,36] : int list
- fun square x = x*x;
val square = fn : int -> int
In the last two examples, while map accepts only a unary function as an argument,
that function can be curried. Notice also the difference in the following two uses
of map, even though both produce the same result:
The first use of map (line 1) is in the context of a new function definition. The
function map is called (as a complete application) in the body of the new function
every time the function is invoked, which is unnecessary. The second use of
map (line 4) involves partially applying it, which returns a function (with type
int list -> int list) that is then bound to the identifier squarelist. In
the second case, map is invoked only once, rather than every time squarelist is
invoked as in the first case. The function map has the same semantics in Haskell:
<interactive>:7:1: error:
Non type -variable argument in the constraint: Num (b, b)
(Use FlexibleContexts to permit this)
When checking the inferred type
it :: forall b. (Num b, Num (b, b)) => [b]
Prelude >
Prelude > square x = x*x
Prelude >
Prelude > :type square
square :: Num a => a -> a
Prelude >
Prelude > map square [1,2,3,4,5,6]
[1,4,9,16,25,36]
Prelude >
Prelude > squarelist lon = map square lon
Prelude >
Prelude > :type squarelist
squarelist :: Num b => [b] -> [b]
Prelude >
Prelude > squarelist [1,2,3,4,5,6]
[1,4,9,16,25,36]
Prelude >
Prelude > squarelist1 = map square
Prelude >
Prelude > :type squarelist1
squarelist1 :: Num b => [b] -> [b]
Prelude >
Prelude > squarelist1 [1,2,3,4,5,6]
[1,4,9,16,25,36]
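The same distinction can be sketched in Python using functools.partial (an illustration of ours; note that Python's map returns an iterator rather than a list):

```python
from functools import partial

square = lambda x: x * x

# Analog of the first use: map is re-invoked on every call.
squarelist = lambda lon: list(map(square, lon))

# Analog of the second use: map is partially applied exactly once.
squarelist1 = partial(map, square)

assert squarelist([1, 2, 3, 4, 5, 6]) == [1, 4, 9, 16, 25, 36]
assert list(squarelist1([1, 2, 3, 4, 5, 6])) == [1, 4, 9, 16, 25, 36]
```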
- (op o);
val it = fn : ('a -> 'b) * ('c -> 'a) -> 'c -> 'b
- fun add3 x = x+3;
val add3 = fn : int -> int
- fun mult2 x = x*2;
val mult2 = fn : int -> int
- val add3_then_mult2 = mult2 o add3;
val add3_then_mult2 = fn : int -> int
- val mult2_then_add3 = add3 o mult2;
val mult2_then_add3 = fn : int -> int
- add3_then_mult2 4;
val it = 14 : int
- mult2_then_add3 4;
val it = 11 : int
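The same compositions can be sketched in Python, which has no built-in composition operator, so we define one ourselves (an illustration; the names mirror the ML transcript):

```python
def compose(f, g):
    """Analog of ML's o: compose(f, g)(x) applies g first, then f."""
    return lambda x: f(g(x))

add3 = lambda x: x + 3
mult2 = lambda x: x * 2

add3_then_mult2 = compose(mult2, add3)
mult2_then_add3 = compose(add3, mult2)

assert add3_then_mult2(4) == 14   # (4 + 3) * 2
assert mult2_then_add3(4) == 11   # (4 * 2) + 3
```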
More importantly for the discussion at hand, the converse is also possible—
parenthesizing an infix operator in Haskell converts it to the equivalent curried
prefix operator:
Uses 1 and 3 are discussed in detail in Section 8.4. Returning to the topic of
functional composition, we can define the functions using sections in Haskell:
14
Prelude > add3_then_mult2_2 4
14
Prelude > mult2_then_add3 4
11
The same is not possible in ML because built-in operators (e.g., + and *) are not
curried. In ML, to convert an infix operator (e.g., + and *) to the equivalent prefix
operator, we must enclose the operator in parentheses (as in Haskell) and also
include the lexeme op after the opening parenthesis:
Recall that while built-in operators in Haskell are curried, built-in operators in ML
are not curried. Thus, unlike in Haskell, in ML converting an infix operator to the
equivalent prefix operator does not curry the operator, but merely converts it to
prefix form:
- (op +) (1,2);
val it = 3 : int
- (op +) 1;
stdIn:4.1-4.9 Error: operator and operand don't agree [literal]
operator domain: 'Z * 'Z
operand: int
in expression:
+ 1
Prelude >
Prelude > -- purges space from a string
Prelude > filter (/=' ') "th e uq r q mm io p q g ra "
"theuqrqmmiopqgra"
The function foldr folds a function, given an initial value, across a list from right
to left:
foldr ⊕ i [e0, e1, ···, en] = e0 ⊕ (e1 ⊕ (··· (en−1 ⊕ (en ⊕ i)) ···))
where ⊕ is a symbol representing an operator and i is an initial value. Although foldr captures a pattern
of recursion, in practice it is helpful to think of its semantics in a non-recursive
way. Consider the expression foldr (+) 0 [1,2,3,4]. Think of the input list
as a series of calls to cons, which we know associates from right to left:
Now replace the base of the recursion [] with 0 and the cons operator with +:
Notice that the function sumlist, through the use of foldr, implicitly captures
the pattern of recursion, including the base case, that is explicitly captured in the
definition of sumlist1. Figure 8.1 illustrates the use of foldr in Haskell.
The function foldl folds a function, given an initial value, across a list from
left to right:
foldl ⊕ i [e0, e1, ···, en] = ((··· ((i ⊕ e0) ⊕ e1) ···) ⊕ en−1) ⊕ en
where ⊕ is a symbol representing an operator and i is an initial value. Notice that the initial value
appears on the left-hand side of the operator with foldl and on the right-hand
side with foldr.
Since cons associates from right to left, when thinking of foldl in a non-
recursive manner we must replace cons with an operator that associates from left to
right. We use the symbol ⊕l→r to indicate a left-associative operator. For instance,
consider the expression foldl (-) 0 [1,2,3,4]. Think of the input list as a
series of calls to ⊕l→r, which associates from left to right:
[] ⊕l→r 1 ⊕l→r 2 ⊕l→r 3 ⊕l→r 4
((([] ⊕l→r 1) ⊕l→r 2) ⊕l→r 3) ⊕l→r 4
Now replace the base of the recursion [] with 0 and the ⊕l→r operator with -:
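The two folds can also be sketched in Python (an illustration of ours): functools.reduce is a left fold, and a right fold can be written as a short recursive function:

```python
from functools import reduce

# Left fold: reduce(op, [e0, ..., en], i) computes
# ((((i op e0) op e1) ...) op en).
assert reduce(lambda acc, x: acc - x, [1, 2, 3, 4], 0) == -10

# A recursive sketch of a right fold:
def foldr(op, init, lst):
    if not lst:
        return init
    return op(lst[0], foldr(op, init, lst[1:]))

# e0 - (e1 - (e2 - (e3 - 0))) = 1 - (2 - (3 - (4 - 0)))
assert foldr(lambda x, acc: x - acc, 0, [1, 2, 3, 4]) == -2
```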
Folding Lists in ML
The types of foldr in ML and Haskell are essentially the same; the only difference is that the folding function itself is uncurried in ML.
[Figure: in the tree for foldr f b [1, 2, 3, 4], each cons (:) in the list is replaced with an application of f and the empty list [] is replaced with the initial value b; in the trees for foldl f b [1, 2, 3, 4], the applications of f nest from the left, beginning with b and the first element.]
- foldr;
val it = fn : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
Moreover, foldr has the same semantics in ML and Haskell. Figure 8.1 illustrates
the use of foldr in ML.1
- foldl;
val it = fn : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
However, the function foldl has different semantics in ML and Haskell. In ML,
the function foldl is computed as follows:
foldl ⊕ i [e0, e1, ···, en] = en ⊕ (en−1 ⊕ (··· (e1 ⊕ (e0 ⊕ i)) ···))
Therefore, unlike in Haskell, foldl in ML is the same as foldr in ML (or Haskell)
with a reversed list:
Figure 8.2 illustrates the difference between foldl in Haskell and ML.
The pattern of recursion encapsulated in these higher-order functions is
recognized as important in other languages, too. For instance, reduce in Python,
inject in Ruby, Aggregate in C#, accumulate in C++, reduce in Clojure,
List::Util::reduce in Perl, array_reduce in PHP, inject:into: in
Smalltalk, and Fold in Mathematica are analogs of the foldl family of functions.
The reduce function in Common Lisp defaults to a left fold, but there is an option
for a right fold.
Haskell includes the built-in, higher-order functions foldl1 and foldr1 that
operate like foldl and foldr, respectively, but do not require an initial value
because they use the first and last elements of the list, respectively, as base values.
Thus, foldl1 and foldr1 are only defined for non-empty lists. The function
foldl1 folds a function across a list from left to right:
foldl1 ⊕ [e0, e1, ···, en] = ((··· ((e0 ⊕ e1) ⊕ e2) ···) ⊕ en−1) ⊕ en
The function foldr1 folds a function across a list from right to left:
foldr1 ⊕ [e0, e1, ···, en] = e0 ⊕ (e1 ⊕ (··· (en−2 ⊕ (en−1 ⊕ en)) ···))
However, since foldl and foldr have different semantics, if the folding operator
is non-associative (i.e., associates in a particular evaluation order), such as
subtraction, foldr and foldl produce different values. In such a case, we need to
use the higher-order function that is appropriate for the operator and application:
Here, the folding operator (i.e., (\acc _ -> acc+1)) is non-associative. However,
since the values of the elements of the list are not considered, the length of the list
is always the same regardless of the order in which we traverse it. Thus, even
though the folding operator is non-associative, foldr is equally as applicable as
foldl here. However, to use foldr we must invert the parameters of the folding
operator. With foldl, the accumulator value (which starts at 0 in this case) always
appears on the left-hand side of the folding operator, so it is the first operand; with
foldr, it appears on the right-hand side, so it is the second operand:
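The inversion of the folding operator's parameters can be sketched in Python (our illustration, with a small recursive right fold defined for the purpose):

```python
from functools import reduce

def foldr(op, init, lst):
    # A recursive sketch of a right fold.
    return init if not lst else op(lst[0], foldr(op, init, lst[1:]))

lst = [10, 20, 30, 40]

# Left fold: the accumulator is the first operand of the folding operator.
length_l = reduce(lambda acc, _: acc + 1, lst, 0)

# Right fold: the accumulator is the second operand, so the
# parameters of the folding operator are inverted.
length_r = foldr(lambda _, acc: acc + 1, 0, lst)

assert length_l == length_r == 4
```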
Thus, when the values of the elements of the input list are not considered, even
though the folding operator is non-associative, both foldl and foldr result in
the same value, although the parameters of the folding operator must be inverted
in each application. The following is a summary of when foldl and foldr are
applicable based on the associativity of the folding operator:
While foldl and foldr may result in the same value (i.e., the last two items in
the list in our example), one typically results in a more efficient execution and,
therefore, is preferred over the other.
• In a language with an eager evaluation strategy (e.g., ML; see Chapter 12), if
the folding operator is associative (in other words, when foldl and foldr
yield the same result), it is advisable to use foldl rather than foldr for
reasons of efficiency. Sections 13.7 and 13.7.4 explain this point in more
detail.
• In a language with a lazy evaluation strategy (e.g., Haskell; see Chapter 12),
if the folding operator is associative, depending on the context of the
application, the two functions may not yield the same result, because one
may not yield a result at all. If both yield a result, that result will be the
same if the folding operator is associative. However, even though they
yield the same result, one function may be more efficient than the other.
Follow the guidelines given in Section 13.7.4 for which function to use when
programming in a lazy language.
implode
Consider the following explode and implode functions from online Appendix B:
- explode;
val it = fn : string -> char list
- explode "apple";
val it = [#"a",#"p",#"p",#"l",#"e"] : char list
- implode;
val it = fn : char list -> string
- implode [#"a", #"p", #"p", #"l", #"e"];
val it = "apple" : string
- implode (explode "apple");
val it = "apple" : string
The problem here is that the string concatenation operator ^ only concatenates
strings, and not characters:
Thus, we need a helper function that converts a value of type char to a value of
type string:
- str;
val it = fn : char -> string
Now we can use the HOFs foldr, map, and o (i.e., functional composition) to
compose the atomic elements:
val it = "apple" : string
- foldr (op ^) "" ["a", "p", "p", "l", "e"];
val it = "apple" : string
- "a" ^ ("p" ^ ("p" ^ ("l" ^ ("e" ^ ""))));
val it = "apple" : string
string2int
Now we can define another helper function that invokes char2int and acts as an
accumulator for the integer being computed:
Since we use foldl in ML, we can think of the characters of the reversed string
as being processed from right to left. The function helper converts the current
character to an int and then adds that value to the product of 10 times the running
sum of the integer representation of the characters to the right of the current
character:
Thus, we have:
After inlining an anonymous function for helper, the final version of the
function is:
- fun string2int s =
=    foldl (fn (c, sum) => ord c - ord #"0" + 10*sum) 0 (explode s);
val string2int = fn : string -> int
- string2int "0";
val it = 0 : int
- string2int "1";
val it = 1 : int
- string2int "123";
val it = 123 : int
- string2int "321";
val it = 321 : int
- string2int "5643452";
val it = 5643452 : int
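An analogous Python sketch of string2int uses functools.reduce, Python's left fold (the function name mirrors the ML version; this is our illustration):

```python
from functools import reduce

def string2int(s):
    # Fold left across the characters: multiply the running sum by 10,
    # then add the integer value of the current digit character.
    return reduce(lambda total, c: (ord(c) - ord('0')) + 10 * total, s, 0)

assert string2int("123") == 123
assert string2int("5643452") == 5643452
```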
powerset
$ cat powerset.sml
fun powerset(nil) = [nil]
| powerset(x::xs) =
let
fun insertineach(_, nil) = nil
| insertineach(item, x::xs) =
(item::x)::insertineach(item, xs);
val y = powerset(xs)
in
insertineach(x, y)@y
end;
$
$ sml powerset.sml
Standard ML of New Jersey (64-bit) v110.98
[opening powerset.sml]
val powerset = fn : 'a list -> 'a list list
Using the HOF map, we can make this definition more succinct:
$ cat powerset.sml
fun powerset nil = [nil]
| powerset (x::xs) =
let
val temp = powerset xs
in
(map (fn excess => x::excess) temp) @ temp
end;
$
$ sml powerset.sml
Standard ML of New Jersey (64-bit) v110.98
[opening powerset.sml]
val powerset = fn : 'a list -> 'a list list
- powerset [1];
val it = [[1],[]] : int list list
- powerset [1,2];
val it = [[1,2],[1],[2],[]] : int list list
- powerset [1,2,3];
val it = [[1,2,3],[1,2],[1,3],[1],[2,3],[2],[3],[]] : int list list
Use of the built-in HOF map in this revised definition obviates the need for the
nested helper function insertineach. Using sections, we can make this definition
even more succinct in Haskell (Programming Exercise 8.4.23).
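The map-based definition also translates naturally to Python; in this sketch of ours, a list comprehension plays the role of map:

```python
def powerset(lst):
    """Return all subsets of lst, mirroring the ML definition."""
    if not lst:
        return [[]]
    temp = powerset(lst[1:])
    # Prepend the head to each subset of the tail (the role of map),
    # then append the subsets that omit the head (the role of @).
    return [[lst[0]] + subset for subset in temp] + temp

assert powerset([1, 2]) == [[1, 2], [1], [2], []]
```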
Until now we have discussed the use of curried HOFs to create new functions.
Here, we briefly discuss the use of such functions to support partial application.
Recall that a function can only be partially applied with respect to its first argument
or a prefix of its arguments, rather than, for example, its third argument only. To
simulate partially applying a function with respect to an argument or arguments
other than its first argument or a prefix of its arguments, we need to first transform
the order in which the function accepts its arguments and only then partially
apply it. The built-in Haskell function flip is a step in this direction. The
function flip reverses (i.e., flips) the order of the parameters to a binary curried
function:
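A Python sketch of flip for a binary curried function follows (our illustration; the names sub and flip are ours):

```python
def flip(f):
    """Return a version of the curried binary function f
    with the order of its two parameters reversed."""
    return lambda b: lambda a: f(a)(b)

# A curried subtraction for demonstration: sub(a)(b) = a - b.
sub = lambda a: lambda b: a - b

assert sub(10)(3) == 7
assert flip(sub)(10)(3) == -7   # equivalent to sub(3)(10)
```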
Exercise 8.4.4 Typically when composing functions using the functional composi-
tion operator, the two operators being composed must both be unary operators,
and the second function applied must be capable of receiving a value of the same
type as returned by the first function applied. For instance, in Haskell:
Explain why the composition on line 10 in the first listing here works in Haskell,
but does not work on line 3 in the second listing in ML. The first function
applied—(+1) in Haskell and plus1 in ML—accepts only one argument, while
the second function applied—(:) in Haskell and (op ::) in ML—accepts two
arguments:
1 - fun plus1 x = 1 + x;
2 val plus1 = fn : int -> int
3 - val composition = ((op ::) o plus1);
summing l = foldl (+) 0 l
summing = foldl (+) 0
Exercise 8.4.6 Explain with function type notation why Programming Exer-
cise 8.4.18 cannot be completed in ML.
Exercise 8.4.17 Use one higher-order function and one anonymous function to
define a one-line function length1 in Haskell that accepts only a list as an
argument and returns the length of the list.
Examples:
Prelude >
Prelude > length1 []
0
Prelude > length1 [1]
1
Prelude > length1 [1,2]
2
Prelude > length1 [1,2,3]
3
Prelude > length1 [1,2,3,4]
4
Prelude > length1 [1,2,3,4,5,6,7,8,9,10]
10
Examples:
Exercise 8.4.19 In one line of code, use a higher-order function to define a Haskell
function dneppa that appends two lists without using the ++ operator.
Examples:
You may assume the Haskell ord function, which returns the integer
representation of its ASCII character argument. For example:
Exercise 8.4.23 Use a section to define in Haskell, in no more than six lines of code,
a more succinct version of the powerset function defined in ML in this chapter.
Examples:
Examples:
Exercise 8.4.25 Define flip2 in Haskell using one line of code. The function
flip2 transposes (i.e., reverses) the arguments to its binary, curried function
argument.
Examples:
Exercise 8.4.26 Define flip2 in Haskell using one line of code. The function
flip2 flips (i.e., reverses) the arguments to its binary, uncurried function
argument.
Examples:
8.5 Analysis
Higher-order functions capture common, typically recursive, programming
patterns as functions. When HOFs are curried, they can be used to automatically
define atomic functions—rendering the HOFs more powerful. Curried HOFs
help programmers define functions in a modular, succinct, and easily
modifiable/reconfigurable fashion. They provide the glue that enables these
atomic functions to be combined to construct more complex functions, as the
examples in the prior section demonstrate. The use of curried HOFs lifts us
to a higher-order style of functional programming—the third tier of functional
programming in Figure 5.10. In this style of programming, programs are composed
of a series of concise function definitions that are defined through the application
of (curried) HOFs (e.g., map; functional composition: o in ML and . in Haskell;
and foldl/foldr). For instance, in our ML definition of string2int, we use
foldl, explode, and char2int. With this approach, programming becomes
essentially the process of creating composable building blocks and combining
them like LEGO® bricks in creative ways to solve a problem. The resulting
programs are more concise, modular, and easily reconfigurable than programs
where each individual function is defined literally (i.e., hardcoded).
The challenge and creativity in this style of programming require determining
the appropriate level of granularity of the atomic functions, figuring out how to
automatically define them using (built-in) HOFs, and then combining them using
other HOFs into a program so that they work in concert to solve the problem at
hand. This style of programming resembles building a library or API more than an
application program. The focus is more on identifying, developing, and using the
appropriate higher-order abstractions than on solving the target problem. Once the
abstractions and essential elements have crystallized, solving the problem at hand
is an afterthought. The pay-off, of course, is that the resulting abstractions can be
reused in different arrangements in new programs to solve future problems. Lastly,
encapsulating patterns of recursion in curried HOFs and applying them in programs
is a step toward bottom-up programming. Instead of writing an all-encompassing
program, using a bottom-up style of programming involves building a language
with abstract operators and then using that language to write a concise program
(Graham 1993, p. 4).
that also accepts only one argument and returns a function that accepts only one
argument, and so on. Function currying helps us achieve the same end as partial
function application (i.e., invoking a function with arguments for only a prefix of
its parameters) in a transparent manner—that is, without having to call a function
such as papply1 every time we desire to do so. Thus, while the invocation of a
curried function might appear as if it is being partially applied, it is not because
every curried function is a unary function.
Higher-order functions support the capture and reuse of a pattern of recursion
or, more generally, a pattern of control. (The concept of programming abstractions
in this manner is explored further in Section 13.6.) Curried HOFs provide the
glue that enables programmers to compose reusable atomic functions together in
creative ways. (Lazy evaluation supports gluing whole programs together and is
the topic of Section 12.5.) The resulting functions can be used in concert to craft
a malleable/reconfigurable program. What results is a general set of (reusable)
tools resembling an API rather than a monolithic program. This style of modular
programming makes programs easier to debug, maintain, and reuse (Hughes
1989).
Data Abstraction
Type systems support data abstraction and, in particular, the definition of user-
defined data types that have the properties and behavior of primitive
types. In this chapter, we discuss a variety of aggregate and inductive data
types and the type systems through which they are constructed. The type
system of a programming language includes the mechanism for creating new
data types from existing types, and it should support the creation of new
types easily and flexibly. We also introduce variant records and
abstract syntax, which are of particular use in data structures for representing
computer programs. Armed with an understanding of how new types are
constructed, we introduce data abstraction, which involves factoring the conception
and use of a data structure into an interface, implementation, and application.
The implementation is hidden from the application such that a variety of
representations can be used for the data structure in the implementation without
requiring changes to the application since both conform to the interface. A data
structure created in this way is called an abstract data type. We discuss a variety
of representation strategies for data structures, including abstract syntax and
closure representations. This chapter prepares us for designing efficacious and
efficient data structures for the interpreters we build in Part III of this text
(Chapters 10–12).
• Introduce inductive data types—an aggregate data type that refers to itself—
and variant records—a data type useful as a node in a tree representing a
computer program.
• Introduce abstract syntax and its role in representing a computer program.
• Describe the design, implementation, and manipulation of efficacious and
efficient data structures representing computer programs.
• Explore the conception and use of a data structure as an interface,
implementation, and application, which render it an abstract data type.
• Recognize and use a closure representation of a data structure.
• Describe the design and implementation of data structures for language
environments using a variety of representations.
9.2.1 Arrays
An array is an aggregate data type indexed by integers:
9.2.2 Records
A record (also referred to as a struct) is an aggregate data type indexed by strings
called field names:
Here, we are declaring the variable employee to be of the nameless, literal type
preceding it, rather than naming the literal type employee. The C reserved word
typedef, with syntax typedef <type> <type-identifier>, is used to give
a new name to an existing type or to name a literal type. For instance, to give a new
name to an existing type, we write typedef int boolean;. To give a name to
a literal type, for example, we write:
In contrast, the next example assigns a name to a literal data type (lines 2–5)
and then, using the type name int_and_double given to the literal data type,
declares X to be an instance of int_and_double (line 8):
ML and Haskell each have an expressive type system for creating new types
with a clean and elegant syntax. The reserved word type in ML and Haskell
introduces a new name for an existing type (akin to typedef in C or C++):
The C compiler allocates memory at least large enough to store the largest
of the fields, since the union can store a value of only one of the types
at any time.1 The following C program, using the sizeof(<type>) operator,
demonstrates that for a struct, the system allocates memory at least equal to
the sum of the sizes of its fields. This program also demonstrates that the
system allocates memory only large enough to store the largest of the
constituent types of a union:
#include <stdio.h>

int main() {

   /* declaration of a new struct data type int_and_double */
   typedef struct {
      int id;
      double rate;
   } int_and_double;

   /* declaration of a new union data type int_or_double */
   typedef union {
      /* C compiler does no checking or enforcement */
      int id;
      double rate;
   } int_or_double;

   /* declaration of X as type int_or_double */
   int_or_double X;

   printf("An int is %lu bytes.\n", sizeof(int));
   printf("A double is %lu bytes.\n", sizeof(double));
   printf("A struct of an int and a double is %lu bytes.\n",
          sizeof(int_and_double));
   printf("A union of an int or a double is %lu bytes.\n",
          sizeof(int_or_double));
   printf("A pointer to an int is %lu bytes.\n", sizeof(int*));
   printf("A pointer to a double is %lu bytes.\n",
          sizeof(double*));
   printf("A pointer to a union of the two is %lu bytes.\n",
          sizeof(int_or_double*));

   X.rate = 7.777;
   printf("%f\n", X.id);
}
1. Memory allocation generally involves padding to address an architecture’s support for aligned
versus unaligned reads; processors generally require either 1-, 2-, or 4-byte alignment for reads.
$ gcc sizeof.c
$ ./a.out
An int is 4 bytes.
A double is 8 bytes.
A struct of an int and a double is 16 bytes.
A union of an int or a double is 8 bytes.
A pointer to an int is 8 bytes.
A pointer to a double is 8 bytes.
A pointer to a union of the two is 8 bytes.
0.000000
1     // example 1:
2     struct {
3        int id;
4        double rate;
5     } lucia;
6
7     // example 2:
8     struct employee_tag {
9        int id;
10       double rate;
11    };
12
13    // can omit the reserved word struct in C++
14    struct employee_tag lucia;
15
16    // example 3:
17    struct employee_tag {
18       int id;
19       double rate;
20    };
21
22    typedef struct employee_tag employee;
23
24    employee lucia;
25
26    // example 4:
27    typedef struct {
28       int id;
29       double rate;
30    } employee;
31
32    employee lucia;
Each of the previous four declarations in C (or C++) of the variable lucia is valid.
Use of the literal, unnamed type in the first example (lines 1–5) is recommended
only if the type will be used just once to declare a variable. Which of the other three
styles to use is a matter of preference.
While most readers are probably more familiar with records (or structs) than
unions, unions are helpful types for nodes of a parse or abstract-syntax tree
because each node must store values of different types (e.g., ints, floats, chars),
but the tree must be declared to store a single type of node.
#include <stdio.h>

int main() {

   /* declaration of a discriminated union
      int_or_double_wrapper */
   struct {
      /* declaration of flag as an enumerated type */
      enum {i, f} flag;

      /* declaration of a union int_or_double */
      union {
         /* C compiler does no checking or enforcement */
         int id;
         double rate;
      } int_or_double;
   } int_or_double_wrapper;

   int_or_double_wrapper.flag = i;
   int_or_double_wrapper.int_or_double.id = 555;

   int_or_double_wrapper.flag = f;
   int_or_double_wrapper.int_or_double.rate = 7.25;

   if (int_or_double_wrapper.flag == i)
      printf("%d\n", int_or_double_wrapper.int_or_double.id);
   else
      printf("%f\n",
             int_or_double_wrapper.int_or_double.rate);
}
$ gcc discr_union.c
$ ./a.out
7.250000
struct recordA {
   int x;
   double y;
};

struct recordB {
   double y;
   int x;
};

struct recordA A;
struct recordB B;
Do variables A and B require the same amount of memory? If not, why not? Write
a program using the sizeof(<type>) operator to determine the answer to this
question, and give the answer in a comment in the program.
Exercise 9.2.2 Can a union in C be used to convert ints to doubles, and vice
versa? Write a C program to answer this question. Show your program and explain
how it illustrates that a union in C can or cannot be used for these in a comment
in the program.
Exercise 9.2.4 Rewrite the ML program in Section 9.2.2 in Haskell. The two
programs are nearly identical, with the differences resulting from the syntax in
Haskell being slightly more terse than that in ML. See Table 9.7 later in this
chapter for a comparison of the main concepts and features, including syntactic
differences, of ML and Haskell.
Technically, this example type is not an inductive data type because the type being
defined (struct node_tag) is not a member of itself. Rather, this type contains
a pointer to a value of its type (struct node_tag*). This discrepancy highlights
a key difference between a compiled language and an interpreted language. C is a
compiled language, so, when the compiler encounters the preceding code, it must
generate low-level code that allocates enough memory to store a value of type
struct node_tag. To determine the number of bytes to allocate, the compiler
must sum the sizes of the constituent parts. On a 32-bit system, for example, an
int is four bytes and a pointer (to any type) is also four bytes, so the compiler
generates code to allocate eight bytes. Had
the compiler encountered the following definition, which is a pure inductive data
type because a struct node_tag contains a field of type struct node_tag,
it would have no way of determining statically (i.e., before run-time) how much
memory to allocate for the variable head:
struct node_tag {
   int id;
   struct node_tag next;
};

struct node_tag head;
While the recursion must end somewhere (because the memory of a computer
is finite), there is no way for the compiler to know in advance how much memory
is required. C, and other compiled languages, address this problem by using
pointers, which are always a consistent size irrespective of the size of the data to
which they point. In contrast, interpreted languages do not encounter this problem
because an interpreter only operates at run-time—a point at which the size of a
data type is known or can be grown or shrunk. Moreover, in some languages, including
Scheme, all denoted values are references to literal values, and references are
implicitly dereferenced when used. A denoted value is the value to which a variable
refers. For instance, if x = 1, the denotation of x is the value 1. In Scheme, since all
denoted values are references to literal values, the denotation of x is a reference to
the value 1. The following C program demonstrates that in C all denoted values are
not references, and includes an example of explicit pointer dereferencing (line 15):
1  #include <stdio.h>
2
3  int main() {
4
5     /* the denotation of x is the value 1 */
6     int x = 1;
7
8     /* the denotation of ptr_x is the address of x */
9     int* ptr_x = &x;
10
11    printf("The denotation of x is %d.\n", x);
12    printf("The denotation of ptr_x is %x.\n", ptr_x);
13
14    /* explicit dereferencing of ptr_x */
15    printf("The denotation of ptr_x points to %d.\n", *ptr_x);
16 }

$ gcc deref.c
$ ./a.out
The denotation of x is 1.
The denotation of ptr_x is bffff628.
The denotation of ptr_x points to 1.
We cannot write an equivalent Scheme program. Since all denoted values are
references in Scheme, it is not possible to distinguish between a denoted value that
is a literal and a denoted value that is a reference:
;; the denotation of x is a reference to the value 1
( l e t ((x 1))
;; x is implicitly dereferenced
(+ x 1))
Similarly, in Java, all denoted values except primitive types are references. In other
words, in Java, unlike in C++, it is not possible to refer to an object literally. All
objects must be accessed through a reference. However, since Java, like Scheme,
also has implicit dereferencing, the fact that all objects are accessed through a
reference is transparent to the programmer, as the C++ and Java programs that follow illustrate.
1  #include <iostream>
2
3  using namespace std;
4
5  class Ball {
6
7  public:
8     void roll1();
9  };
10
11 void Ball::roll1() {
12    cout << "Ball roll." << endl;
13 }
14
15 int main() {
16
17    // the denotation of b is a value of type Ball
18    Ball b = Ball();
19
20    // the denotation of ref_b is a reference to
21    // the same value of type Ball
22    Ball* ref_b = &b;
23
24    // sending the message roll to the object b of type Ball
25    b.roll1();
26
27    // sending the message roll to the object b of type Ball
28    // through the reference ref_b
29    ref_b->roll1();
30
31    // explicit pointer dereferencing
32    // (*ref_b) results in a value of type Ball
33    // sending the message roll to that value
34    (*ref_b).roll1();
35 }

$ g++ BallDemo.cpp
$ ./a.out
Ball roll.
Ball roll.
Ball roll.
1  class Ball {
2     public void roll() {
3        System.out.println("Ball roll.");
4     }
5  }
6
7  public class BallDemo {
8     public static void main(String[] args) {
9
10       // the denotation of b is a reference to a value of type Ball
11       Ball b = new Ball();
12
13       // the denotation of ref_b is a reference to
14       // the same value of type Ball
15       Ball ref_b = b;
16
17       // both references are implicitly dereferenced
18       b.roll();
19
20       ref_b.roll();
21    }
22 }

$ javac BallDemo.java
$ java BallDemo
Ball roll.
Ball roll.
This Java program demonstrates object access through a pointer with implicit
dereferencing (lines 18 and 20). In short, it is natural to create pure inductive data
types in languages where all denoted values are references (e.g., Scheme and Java
for all non-primitives).
$ gcc List.c
$ ./a.out
llist is 16 bytes.
Table 9.1 Support for C/C++ Style structs and unions in ML, Haskell, Python,
and Java
Table 9.1 summarizes the support for C/C++-style structs and unions in ML,
Haskell, Python, and Java.
Syntax:

(define-datatype <type-name> <type-predicate-name>
   {(<variant-name> {(<field-name> <predicate>)}*)}+)

#lang eopl
(define-datatype llist llist?
(aatom
(aatom_tag number?))
(aatom_llist
(aatom_tag number?)
(next llist?)))
> (aatom 3)
#(struct:aatom 3)
> (define ouraatom (aatom 3))
> (llist? ouraatom)
#t
> (aatom_llist 2 (aatom 3))
#(struct:aatom_llist 2 #(struct:aatom 3))
> (define ouraatom_llist (aatom_llist 2 (aatom 3)))
> (llist? ouraatom_llist)
#t
> (llist? (aatom_llist 1 ouraatom_llist))
#t
> (define ourllist (aatom_llist 1 ouraatom_llist))
> (llist? ourllist)
#t
The (cases ...) form, in the EOPL extension to Racket Scheme, provides
support for decomposing and manipulating the constituent parts of a variant
record created with the constructors automatically generated with the
(define-datatype ...) form.
Syntax:

(cases <type-name> <expression>
   {(<variant-name> ({<field-name>}*) <consequent>)}*
   (else <default>))
The following function accepts a value of type llist as an argument and
manipulates its fields with the cases form to sum its nodes:
(define llist_sum
(lambda (ll)
(cases llist ll
(aatom (aatom_tag) aatom_tag)
(aatom_llist (aatom_tag next)
(+ aatom_tag (llist_sum next))))))
> (llist_sum ouraatom)
3
> (llist_sum ouraatom_llist)
5
> (llist_sum ourllist)
6
Notice that the (cases ...) form binds the values of the fields of the value of
the data type to symbols (for subsequent manipulation). The define-datatype
and cases forms are the analogs of the composition and decomposition
operators, respectively. Data types defined with (define-datatype ...) can
also be mutually recursive (recall the grammar for S-expressions). In SLLGEN, the
sllgen:make-define-datatypes procedure is used to automatically generate
the define-datatype declarations from the grammar (or we can manually
define them). Table 9.2 summarizes the support for defining and manipulating
variant records in the programming languages we have discussed here.
union bintree {
   struct {
      int number;
   } leaf;

   struct {
      int key;
      union bintree left;
      union bintree right;
   } interior_node;
} B;
Show your code and explain your observations in a comment in the program.
Exercise 9.4.2 Rewrite the Haskell program in Section 9.4.1 in ML. The two
programs are nearly identical, with the differences resulting from the syntax in ML
being slightly more verbose than that in Haskell. Table 9.7 (later in the chapter)
compares the main concepts and features, including the syntactic differences, of
ML and Haskell.
Define a variant record list in C++ for this list data structure. The data type must
be inductive and must completely conform to (i.e., naturally reflect) the grammar
shown here. Do not use more than 25 lines of code in your definition of the data
type, and do not use a class or any other object-oriented features of C++.
Exercise 9.4.5 (Ullman 1997, Exercise 6.2.8, pp. 209–210) Define a Haskell data
type for boolean expressions. Boolean expressions are made up of boolean
values, boolean variables, and operators. There are two boolean values: True or
False. A boolean variable (e.g., “p”) can be bound to either of the two boolean
values. Boolean expressions are constructed from boolean variables and values
using the operators AND, OR, and NOT. An example of a boolean expression
is (AND (OR p q) (NOT q)), where p and q are boolean variables. Another
example is (AND p True).
(a) Define a Haskell data type Boolexp whose values represent legal boolean
expressions. You may assume that boolean variables (but not the expressions
themselves) are represented as strings.
(b) Define a function eval :: Boolexp -> [[Char]] -> Bool in
Haskell that accepts a boolean expression exp and a list of true boolean
variables env, and determines the truth value of exp based on the assumption
that the boolean variables in env are true and all other boolean variables are
false. You may use the Haskell elem list member function in your definition of
eval.
Bear in mind that exp is not a string, but rather a value constructed from the
Boolexp data type.
2. https://www.freepascal.org
Examples:
Solve this exercise with at most one data type definition and a five-line eval
function.
(b) Define a function sum_leaves in Racket Scheme using (cases ...) to sum
the leaves of a binary tree created using the data type defined in (a).
Notably, the preceding program is more manipulable, and thus more easily processed, when
represented using the following definition of an expression data type:
An abstract-syntax tree (AST) is similar to a parse tree, except that it uses abstract
syntax or an internal representation (i.e., it is internal to the system processing it)
rather than concrete syntax. Specifically, while the structure of a parse tree depicts
how a sentence (in concrete syntax) conforms to a grammar, the structure of an
abstract-syntax tree illustrates how the sentence is represented internally, typically
with an inductive, variant record data type. For instance, Figure 9.1 illustrates
an AST for the λ-calculus expression ((lambda (x) (f x)) (g y)). Abstract
syntax is a representation of a program as a data structure—in this case, an
inductive variant record. Consider the following grammar for λ-calculus, which
is annotated with variants of this expression inductive variant record data type
above the right-hand side of each production rule:3
variable-expression (identifier)
<expression> ::= <identifier>

lambda-expression (identifier body)
<expression> ::= (lambda (<identifier>) <expression>)

application-expression (operator operand)
<expression> ::= (<expression> <expression>)
3. This is the annotative style used in Friedman, Wand, and Haynes (2001).
Figure 9.1 Abstract-syntax tree for the λ-calculus expression
((lambda (x) (f x)) (g y)):

application-expression
   operator: lambda-expression
      identifier: x
      body: application-expression
         operator: variable-expression (identifier: f)
         operand: variable-expression (identifier: x)
   operand: application-expression
      operator: variable-expression (identifier: g)
      operand: variable-expression (identifier: y)
(define occurs-free?
(lambda (variable expr)
(cases expression expr
(variable-expression (identifier)
(eqv? identifier variable))
(lambda-expression (identifier body)
(and (not (eqv? identifier variable))
(occurs-free? variable body)))
(application-expression (operator operand)
(or (occurs-free? variable operator)
(occurs-free? variable operand))))))
(define concrete2abstract
(lambda (expr)
(cond
Use of abstract syntax makes data representing code easier to manipulate and a
program that processes code (i.e., programs) more readable.
1 import re
2 import sys
3 import operator
4 import ply.lex as lex
5 import ply.yacc as yacc
6 from collections import defaultdict
7
8 # begin expression data type #
9
10 #list of node types
11 ntPrimitive = 'Primitive'
12 ntPrimitive_op = 'Primitive Operator'
13
14 ntNumber = 'Number'
15 ntIdentifier = 'Identifier'
16
17 ntIfElse = 'Conditional'
18
19 ntArguments = 'Arguments'
20 ntFuncCall = 'Function Call'
21 ntFuncDecl = 'Function Declaration'
22 ntRecFuncDecl = 'Recursive Function Declaration'
23
24 ntParameters = 'Parameters'
25 ntExpressions = 'Expressions'
26
27 ntLetRec = 'Let Rec'
28 ntLetStar = 'Let Star'
29 ntLet = 'Let'
30
31 ntLetStatement = 'Let Statement'
32 ntLetStarStatement = 'Let* Statement'
33 ntLetRecStatement = 'Letrec Statement'
34
35 ntLetAssignment = 'Let Assignment'
36 ntLetRecAssignment = 'Letrec Assignment'
37 ntLetStarAssignment = 'Letstar Assignment'
38
39 ntAssignment = 'Assignment'
40
41 class Tree_Node:
42    def __init__(self, type, children, leaf, linenumber):
43       self.type = type
44       # save the line number of the node so run-time
45       # errors can be indicated
46       self.linenumber = linenumber
47       if children:
48          self.children = children
49       else:
50          self.children = [ ]
51       self.leaf = leaf
52 # end expression data type #
118 class ParserException(Exception):
119    def __init__(self, message):
120       self.message = message
121
122 def p_error(t):
123    if (t != None):
124       raise ParserException("Syntax error: Line %d " % (t.lineno))
125    else:
126       raise ParserException("Syntax error near: Line %d" %
127                             (lexer.lineno - (lexer.lineno > 1)))
128
4. The PLY lexical specification is not shown here; lines 8–72 of the lexical specification shown in
Section 3.6.2 can be used here as lines 53–117.
192    else:
193       t[0] = Tree_Node(ntFuncCall, None, t[2], t.lineno(1))
194
195 def p_expression_rec_func_decl(t):
196    '''rec_func_decl : FUN LPAREN parameters RPAREN expression
197                     | FUN LPAREN RPAREN expression'''
198    if len(t)==6:
199       t[0] = Tree_Node(ntRecFuncDecl, [t[3], t[5]], None, t.lineno(1))
200    else:
201       t[0] = Tree_Node(ntRecFuncDecl, [t[4]], None, t.lineno(1))
202
203 def p_parameters(t):
204    '''parameters : IDENTIFIER
205                  | IDENTIFIER COMMA parameters'''
206    if len(t) == 4:
207       t[0] = Tree_Node(ntParameters, [t[1], t[3]], None, t.lineno(1))
208    elif len(t) == 2:
209       t[0] = Tree_Node(ntParameters, [t[1]], None, t.lineno(1))
210
211 def p_arguments(t):
212    '''arguments : expression
213                 | expression COMMA arguments'''
214    if len(t) == 2:
215       t[0] = Tree_Node(ntArguments, [t[1]], None, t.lineno(1))
216    elif len(t) == 4:
217       t[0] = Tree_Node(ntArguments, [t[1], t[3]], None, t.lineno(1))
218
219 def p_expressions(t):
220    '''expressions : expression
221                   | expression COMMA expressions'''
222    if len(t) == 4:
223       t[0] = Tree_Node(ntExpressions, [t[1], t[3]], None, t.lineno(1))
224    elif len(t) == 2:
225       t[0] = Tree_Node(ntExpressions, [t[1]], None, t.lineno(1))
226
227 def p_let_statement(t):
228    '''let_statement : let_assignment
229                     | let_assignment let_statement'''
230    if len(t) == 3:
231       t[0] = Tree_Node(ntLetStatement, [t[1], t[2]], None, t.lineno(1))
232    else:
233       t[0] = Tree_Node(ntLetStatement, [t[1]], None, t.lineno(1))
234
235 def p_letstar_statement(t):
236    '''letstar_statement : letstar_assignment
237                         | letstar_assignment letstar_statement'''
238    if len(t) == 3:
239       t[0] = Tree_Node(ntLetStarStatement, [t[1], t[2]], None,
240                        t.lineno(1))
241    else:
242       t[0] = Tree_Node(ntLetStarStatement, [t[1]], None, t.lineno(1))
243
244 def p_letrec_statement(t):
245    '''letrec_statement : letrec_assignment
246                        | letrec_assignment letrec_statement'''
247    if len(t) == 3:
248       t[0] = Tree_Node(ntLetRecStatement, [t[1], t[2]], None, t.lineno(1))
249    else:
250       t[0] = Tree_Node(ntLetRecStatement, [t[1]], None, t.lineno(1))
251
252 def p_let_assignment(t):
253    '''let_assignment : IDENTIFIER EQ expression'''
254    t[0] = Tree_Node(ntLetAssignment, [t[3]], t[1], t.lineno(1))
Figure 9.2 (left) Visual representation of the Tree_Node Python class. (right) A value
of type Tree_Node for an identifier.
255
256 def p_letstar_assignment(t):
257 '''letstar_assignment : IDENTIFIER EQ expression'''
258 t[0] = Tree_Node(ntLetStarAssignment, [t[3]], t[1], t.lineno(1))
259
260 def p_letrec_assignment(t):
261 '''letrec_assignment : IDENTIFIER EQ rec_func_decl'''
262 t[0] = Tree_Node( ntLetRecAssignment, [t[3]], t[1], t.lineno(1))
263 # end syntactic specification #
This Camille parser generator in PLY is the same as that shown in Section 3.6.2,
but contains actions to build the abstract-syntax tree (AST) in the pattern-action
rules. Specifically, the Camille parser builds an AST in which each node contains
the node type, a leaf, a list of children, and a line number. The Tree_Node
structure is shown on the left side of Figure 9.2. For all number (ntNumber),
identifier (ntIdentifier), and primitive operator (ntPrimitive) node types,
the value of the token is stored in the leaf of the node (shown on the right side
of Figure 9.2). In the p_line_expr function (lines 135–139), notice that the final
abstract-syntax tree is assigned to the global variable global_tree (line 139)
so that it can be referenced by the function that invokes the parser—namely,
the following concrete2abstract function, which is the Python analog of the
concrete2abstract Racket Scheme function given in Section 9.5:
285
286    if interactiveMode:
287       program = ""
288       try:
289          prompt = 'Camille> '
290          while True:
291             line = input(prompt)
292             if (line == "" and program != ""):
293                print(concrete2abstract(program, parser))
294                lexer.lineno = 1
295                program = ""
296                prompt = 'Camille> '
297             else:
298                if (line != ""):
299                   program += (line + '\n')
300                prompt = ''
301
302       except EOFError as e:
303          sys.exit(0)
304
305       except Exception as e:
306          print(e)
307          sys.exit(-1)
308    else:
309       try:
310          with open(sys.argv[1], 'r') as script:
311             file_string = script.read()
312             print(concrete2abstract(file_string, parser))
313             sys.exit(0)
314       except Exception as e:
315          print(e)
316          sys.exit(-1)
317
318 main_func()
Examples:
$ python3.8 camilleAST.py
Camille> let a=5 in a
<camilleTreeStruct.Tree_Node object at 0x104c6d820>
Camille> let a = 5 in a
<camilleTreeStruct.Tree_Node object at 0x104c6dac0>
Camille> let a=2 in let b =3 in a
<camilleTreeStruct.Tree_Node object at 0x104c6dfd0>
Camille> let f = fun (y, z) +(y,-(z,5)) in (f 2, 28)
<camilleTreeStruct.Tree_Node object at 0x104c6da30>
The function list-of, used in the definition of the data type, is defined
in Section 5.10.3 and repeated here:
(define list-of
   (lambda (predicate)
      (letrec ((list-of-helper
                  (lambda (lst)
                     (or (null? lst)
                         (and (pair? lst)
                              (predicate (car lst))
                              (list-of-helper (cdr lst)))))))
         list-of-helper)))
This function is also built into the #lang eopl language of DrRacket.
print(abstract2concrete(concrete2abstract(program, parser)))
Examples:
$ python3.8 camilleAST.py
Camille> let a = 5 in a
let a = 5 in a
Camille> let a=2 in let b =3 in a
let a = 2 in let b = 3 in a
Camille> let f = fun (y, z) +(y,-(z,5)) in (f 2, 28)
let f = fun(y, z) +(y, -(z, 5)) in (f 2, 28)
Camille>
The underlying implementation can change without disrupting the client code
as long as the contractual signature of each function declaration in the interface
remains unchanged. In this way, the implementation is hidden from the application.
A data type developed this way is called an abstract data type (ADT). Consider
a list abstract data type. One possible representation for the list used in the
implementation might be an array or vector. Another possible representation
might be a linked list. (Note that Church Numerals are a representation of
numbers in λ-calculus; see Programming Exercise 5.2.2.) A goal of a type system
is to support the definition of abstract data types that have the properties
and behavior of primitive types. One advantage of using an ADT is that the
application is independent of the representation of the data structure used in the
implementation. In turn, any implementation of the interface can be substituted
without requiring modifications to the client application. In Section 9.8, we
demonstrate a variety of possible representations for an environment ADT, all
of which satisfy the requirements for the interface of the abstract data type and,
therefore, maintain the integrity of the independence between the representation
and the application.
5. In the von Neumann architecture, we think of and represent code as data; in other words, code
and data are represented uniformly in main memory.
#lang eopl

;;; closure representation of environment

(define empty-environment
   (lambda ()
      (lambda (identifier)
         (eopl:error 'apply-environment
            "No binding for ~s" identifier))))

(define extend-environment
   (lambda (identifiers values environ)
      (lambda (identifier)
         (let ((position (list-find-position identifier identifiers)))
            (cond
               ((number? position) (list-ref values position))
               (else (apply-environment environ identifier)))))))

(define apply-environment
   (lambda (environ identifier)
      (environ identifier)))

(define list-find-position
   (lambda (identifier los)
      (list-index
         (lambda (identifier1) (eqv? identifier1 identifier)) los)))

(define list-index
   (lambda (predicate ls)
      (cond
         ((null? ls) #f)
         ((predicate (car ls)) 0)
         (else (let ((list-index-r
                        (list-index predicate (cdr ls))))
                  (cond
                     ((number? list-index-r) (+ list-index-r 1))
                     (else #f)))))))
Getting acclimated to the reality that the data structure is a function can be a
cognitive challenge. One way to get accustomed to this representation is to reify
the function representing an environment every time one is created or extended
and unpack it every time one is applied (i.e., accessed). For instance, let us step
through the evaluation of the following application code:
(lambda (symbol)
   (eopl:error 'apply-environment "No binding for ~s" symbol))
1 (define simple-environment
2    (extend-environment '(a b) '(1 2)
3       (extend-environment '(c d e) '(3 4 5)
4          (lambda (symbol)
5             (eopl:error 'apply-environment
6                "No binding for ~s" symbol)))))
(lambda (symbol)
   (let ((position (list-find-position symbol '(c d e))))
      (cond
         ((number? position)
          (list-ref '(3 4 5) position))
         (else (apply-environment
                  (lambda (symbol)
                     (eopl:error 'apply-environment
                        "No binding for ~s"
                        symbol))
                  symbol)))))
Thus, we have
1  (define simple-environment
2     (extend-environment '(a b) '(1 2)
3        (lambda (symbol)
4           (let ((position
5                    (list-find-position symbol '(c d e))))
6              (cond
7                 ((number? position) (list-ref '(3 4 5) position))
8                 (else (apply-environment
9                          (lambda (symbol)
10                            (eopl:error 'apply-environment
11                               "No binding for ~s" symbol))
12                          symbol)))))))
(lambda (symbol)
   (let ((position (list-find-position symbol '(a b))))
      (cond
         ((number? position) (list-ref '(1 2) position))
         (else (apply-environment
                  (lambda (symbol)
                     (let ((position
                              (list-find-position symbol '(c d e))))
                        (cond
                           ((number? position)
                            (list-ref '(3 4 5) position))
                           (else (apply-environment
                                    (lambda (symbol)
                                       (eopl:error
                                          'apply-environment
                                          "No binding for ~s"
                                          symbol))
                                    symbol)))))
                  symbol)))))
Thus, we have
1  (define simple-environment
2     (lambda (symbol)
3        (let ((position (list-find-position symbol '(a b))))
4           (cond
5              ((number? position) (list-ref '(1 2) position))
6              (else (apply-environment
7
8                       (lambda (symbol)
9                          (let ((position
10                                  (list-find-position symbol '(c d e))))
11                            (cond
12                               ((number? position)
13                                (list-ref '(3 4 5) position))
14                               (else (apply-environment
15                                        (lambda (symbol)
16                                           (eopl:error
17                                              'apply-environment
18                                              "No binding for ~s"
19                                              symbol))
20                                        symbol)))))
21                       symbol))))))
Now consider the application (apply-environment simple-environment 'e), which expands to:

(apply-environment
  (lambda (symbol)
    (let ((position (list-find-position symbol '(a b))))
      (cond
        ((number? position) (list-ref '(1 2) position))
        (else (apply-environment
                (lambda (symbol)
                  (let ((position
                         (list-find-position symbol '(c d e))))
                    (cond
                      ((number? position)
                       (list-ref '(3 4 5) position))
                      (else (apply-environment
                              (lambda (symbol)
                                (eopl:error
                                  'apply-environment
                                  "No binding for ~s"
                                  symbol))
                              symbol)))))
                symbol)))))
  'e)
Since apply-environment simply applies its first argument to its second, this reduces to:

1 ((lambda (symbol)
2    (let ((position (list-find-position symbol '(a b))))
3      (cond
4        ((number? position) (list-ref '(1 2) position))
5        (else (apply-environment
6
7               (lambda (symbol)
8                 (let ((position
9                        (list-find-position symbol '(c d e))))
10                  (cond
11                    ((number? position)
12                     (list-ref '(3 4 5) position))
13                    (else (apply-environment
14                            (lambda (symbol)
15                              (eopl:error
16                                'apply-environment
17                                "No binding for ~s"
18                                symbol))
19                            symbol)))))
20              symbol)))))
21  'e)
Since the symbol e (line 21) is not found in the list of symbols in the outermost
environment '(a b) (line 2), this expression, when evaluated, returns
(apply-environment
  (lambda (symbol)
    (let ((position (list-find-position symbol '(c d e))))
      (cond
        ((number? position) (list-ref '(3 4 5) position))
        (else (apply-environment
                (lambda (symbol)
                  (eopl:error 'apply-environment
                    "No binding for ~s" symbol))
                symbol)))))
  'e)
1 ((lambda (symbol)
2    (let ((position (list-find-position symbol '(c d e))))
3      (cond
4        ((number? position) (list-ref '(3 4 5) position))
5        (else (apply-environment
6                (lambda (symbol)
7                  (eopl:error 'apply-environment
8                    "No binding for ~s" symbol))
9                symbol)))))
10 'e)
Since the symbol 'e (line 10) is found at position 2 in the list of symbols in the intermediate environment '(c d e) (line 2), this expression, when evaluated, returns (list-ref '(3 4 5) position), which, when evaluated, returns 5. This example brings us face to face with the fact that a program is nothing more than data. In turn, a data structure can be represented as a program.
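One way to internalize this representation is to replay the trace in Python, the defining language used for the interpreters in Part III. The following sketch is illustrative only; the function names mirror the Scheme versions, and LookupError is a stand-in for eopl:error.

```python
def empty_environment():
    # applying the empty environment to any symbol is an error
    def raise_unbound(symbol):
        raise LookupError("No binding for %s" % symbol)
    return raise_unbound

def extend_environment(symbols, values, environment):
    # the environment IS the lookup function
    def lookup(symbol):
        if symbol in symbols:
            return values[symbols.index(symbol)]
        return apply_environment(environment, symbol)
    return lookup

def apply_environment(environment, symbol):
    return environment(symbol)

simple_environment = extend_environment(
    ["a", "b"], [1, 2],
    extend_environment(["c", "d", "e"], [3, 4, 5], empty_environment()))

print(apply_environment(simple_environment, "e"))  # 5, as in the trace above
```

Looking up e fails in the outer rib (a b) and falls through to the inner rib (c d e), exactly as in the expansion traced above.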
We can extract the interface for, and the closure-representation implementation of, an ADT from the application code:
1. Identify all of the lambda expressions in the application code whose evaluation yields values of the data type. Define a constructor function for each such lambda expression. The parameters of the constructor are the free variables of the lambda expression. Replace each of these lambda expressions in the application code with an invocation of the corresponding constructor.
2. Define an observer function such as apply-environment. Identify all
the points in the application code, including the bodies of the constructors,
where a value of the type is applied. Replace each of these applications with
an invocation of the observer function (Friedman, Wand, and Haynes 2001,
p. 58).
If we do this, then
• the interface consists of the constructor functions and the observer function;
• the application is independent of the representation; and
• we are free to substitute any other implementation of the interface without breaking the application code (Friedman, Wand, and Haynes 2001, p. 58).
[Figure: closure representation of an environment. Each extension pairs a list of identifiers with a list of values (e.g., (c d e) with (3 4 5)) and retains a reference to the rest of the environment.]
Implement this interface in Scheme using a closure representation for the stack. The functions empty-stack and push are the constructors, and the functions pop, top, and empty-stack? are the observers. Therefore, the closure representation of the stack must take only a single atom argument and use it to determine which observation to make. Call this parameter message. The messages can be the atoms 'empty-stack?, 'top, or 'pop. The implementation requires approximately 20 lines of code.
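A sketch of one possible solution shape, rendered in Python for concision (the exercise itself asks for Scheme, and all names here are hypothetical). The stack is a function that takes a message atom and returns the corresponding observation:

```python
def empty_stack():
    def dispatch(message):
        if message == "empty-stack?":
            return True
        raise IndexError("stack underflow")  # top or pop on an empty stack
    return dispatch

def push(element, stack):
    def dispatch(message):
        if message == "empty-stack?":
            return False
        elif message == "top":
            return element
        elif message == "pop":
            return stack
        raise ValueError("unknown message %s" % message)
    return dispatch

# the observers simply send the appropriate message to the closure
def pop(stack): return stack("pop")
def top(stack): return stack("top")
def is_empty_stack(stack): return stack("empty-stack?")

s = push(2, push(1, empty_stack()))
```

Note that, as in the closure representation of an environment, the constructors build the dispatching closure and the observers merely apply it; the Scheme version follows the same shape in roughly 20 lines.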
Exercise 9.8.3 (Friedman, Wand, and Haynes 2001) Define and implement in
Racket Scheme an abstract-syntax representation of the environment shown in
Section 9.8 (Figure 9.4).
(a) Define a grammar in EBNF (i.e., a concrete syntax) that defines a language of
environment expressions in the following form:
Table 9.3 Summary of the Programming Exercises in This Chapter Involving the Implementation of a Variety of Representations for an Environment (Key: ASR = abstract-syntax representation; CLS = closure; LOLR = list-of-lists representation; and PE = programming exercise.)

Named ((Racket) Scheme)            Nameless ((Racket) Scheme)
CLS                                CLS
ASR (Section 9.8.4; Figure 9.3)    ASR (Figure 9.10; PE 9.8.9)
LOLR (Figure 9.7; PE 9.8.5.a)      LOLR (Figure 9.8; PE 9.8.5.b)
<environment> ::=
<environment> ::=

[Example environment expression elided; it binds identifiers such as b, c, d, and e to the values 2, 3, 4, and 5.]
(b) Annotate that grammar (i.e., concrete syntax) with abstract syntax as shown at
the beginning of Section 9.5 for λ-calculus; in other words, represent it as an
abstract syntax.
(c) Define the environment data type using (define-datatype ...). You
may use the function list-of, which is given in Programming Exercise 9.6.1.
(d) Define the implementation of this environment; that is, define the
empty-environment, extend-environment, and apply-environment
functions. Use the function rib-find-position in your implementation:
(define list-find-position
  (lambda (symbol los)
    (list-index (lambda (symbol1) (eqv? symbol1 symbol)) los)))

(define list-index
  (lambda (predicate ls)
    (cond
      ((null? ls) #f)
      ((predicate (car ls)) 0)
      (else (let ((list-index-r
                   (list-index predicate (cdr ls))))
              (cond
                ((number? list-index-r) (+ list-index-r 1))
                (else #f)))))))
Exercise 9.8.4 (Friedman, Wand, and Haynes 2001) In this programming exercise
you implement a list representation of an environment in Scheme and make three
progressive improvements to it (Table 9.5). Start with the solution to Programming
Exercise 9.8.3.a.
This is called the ribcage representation (Friedman, Wand, and Haynes 2001).
The environment is represented by a list of lists. The lists contained
in the environment list are called ribs. The car of each rib is a list
of symbols, and the cadr of each rib is the corresponding list of
values. Define the implementation of this environment; that is, define
the empty-environment and extend-environment functions. Use the
functions list-find-position and list-index, shown previously, in your implementation. Also, use the following definition:
[Figure: ribcage representation with vector ribs. Each rib pairs a list of identifiers with a vector of values, followed by the rest of the environment.]
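Before attempting the Scheme version, it may help to see the ribcage idea in Python, the book's defining language for the interpreters. This sketch is illustrative only, not the exercise's required solution; each rib is a two-element list pairing symbols with values:

```python
def empty_environment():
    return []  # no ribs

def extend_environment(symbols, values, environment):
    # cons a new rib onto the front of the environment
    return [[symbols, values]] + environment

def apply_environment(environment, symbol):
    for symbols, values in environment:
        if symbol in symbols:                 # plays the role of list-find-position
            return values[symbols.index(symbol)]
    raise LookupError("No binding for %s" % symbol)

abcd_environ = extend_environment(["a", "b"], [1, 2],
               extend_environment(["b", "c", "d"], [3, 4, 5],
                                  empty_environment()))
```

Because lookup scans ribs front to back, a binding in an outer rib (here, b = 2) shadows one for the same symbol in an inner rib (b = 3), mirroring lexical scope.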
(b) Improve the efficiency of access in the solution to (a) by using a vector for the
value of each rib instead of a list:
> abcd-environ
( ((a b) #(1 2)) ((b c d) #(3 4 5)) )
(c) Improve the efficiency of access in the solution to (b) by changing the
representation of a rib from a list of two elements to a single pair—so that the
values of each rib can be accessed simply by taking the cdr of the rib rather
than the car of the cdr (Figure 9.5):
> abcd-environ
( ((a b) . #(1 2)) ((b c d) . #(3 4 5)) )
> abcd-environ
( #(1 2) #(3 4 5) )
[Figure: nameless environment as a list of vectors of values (e.g., #(1 2) and #(3 4 5)), each followed by the rest of the nameless environment.]
Improve the solution to (c) to incorporate this optimization. Use the following
interface for the nameless environment:
(define empty-nameless-environment
(lambda ()
...))
(define extend-nameless-environment
(lambda (values environ)
...))
(define apply-nameless-lexical-environment
  (lambda (environ depth position)
    ...))
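Assuming the interface above, a minimal Python sketch (the exercise asks for Scheme) clarifies the roles of depth and position: identifiers disappear entirely, and a variable reference is a pair of lexical coordinates.

```python
def empty_nameless_environment():
    return []

def extend_nameless_environment(values, environ):
    # each rib is just a sequence of values; the identifiers are gone
    return [list(values)] + environ

def apply_nameless_environment(environ, depth, position):
    # depth selects the rib; position selects the value within that rib
    return environ[depth][position]

env = extend_nameless_environment([1, 2],
      extend_nameless_environment([3, 4, 5],
                                  empty_nameless_environment()))
```

For example, the coordinates (0, 1) name the value 2 in the outermost rib, and (1, 2) name the value 5 in the next rib.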
Exercise 9.8.5 In this programming exercise, you build two different ribcage
representations of the environment in Python (Table 9.6).
[Figure: ribcage representations in Python. A named environment has ribs pairing the identifiers (a b) and (c d e) with the values (1 2) and (3 4 5); a nameless environment keeps only the value lists (1 2) and (3 4 5).]
(b) (Friedman, Wand, and Haynes 2001, Exercise 3.25, p. 90) (list-of-lists
representation of a nameless environment) Build a list-of-lists (i.e., ribcage)
representation of a nameless environment (Figure 9.8) with the following
interface:
def empty_nameless_environment()
def extend_nameless_environment(values, environment)
def apply_nameless_environment(environment, depth, position)
[Figures: the vector-of-values and list-of-values representations of a nameless environment. Each rib holds only values (e.g., 1 2 and 3 4 5), followed by the rest of the nameless environment.]
9.9.1 ML Summary
ML is a statically scoped programming language that primarily supports functional programming with a safe type system, type inference, an eager evaluation strategy, parametric polymorphism, algebraic data types, pattern matching, automatic memory management through garbage collection, a rich and expressive polymorphic type and module system, and some imperative features. ML integrates functional features from Lisp, rule-based programming (i.e., pattern matching) from Prolog, and data abstraction from Smalltalk, and it has a more readable syntax than Lisp. As a result, ML is a useful general-purpose programming language.
9.9.4 Applications
The features of ML are ideally applied in language-processing systems,
including compilers and theorem provers (Appel 2004). Haskell is also increasingly being used for application development in commercial settings. Examples of applications developed in Haskell include a revision control system and a window manager for the X Window System. Galois is a software development and computer science research company that has used Haskell in multiple projects.6
ML and Haskell are also used for artificial intelligence (AI) applications.
Traditionally, Prolog, which is presented in Chapter 14, has been recognized as
a language for AI, particularly because it has a built-in theorem-proving algorithm
called resolution and implements the associated techniques of unification and
backtracking, which make resolution practical in a computer system. As a result,
the semantics of Prolog are more complex than those of languages such as Scheme,
C, and Java. A Prolog program consists of a set of facts and rules. An ML or
Haskell program involving a series of function definitions using pattern-directed
invocation has much the same appearance. (The built-in list data structures
in Prolog and ML/Haskell are nearly identical.) Moreover, the pattern-directed
invocation built into ML and Haskell is similar to the rule system in Prolog, albeit
6. https://galois.com/about/haskell/
Concept                        ML                                 Haskell
lists                          homogeneous                        homogeneous
cons                           ::                                 :
append                         @                                  ++
integer equality               =                                  ==
integer inequality             <>                                 /=
strings                        not a list of characters;          a list of characters
                               use explode
renaming parameters            lst as (x::xs)                     lst@(x:xs)
functional redefinition        permitted                          not permitted
pattern-directed invocation    yes, with |                        yes
parameter passing              call-by-value, strict,             call-by-need, non-strict,
                               applicative-order evaluation       normal-order evaluation
functional composition         o                                  .
infix to prefix                (op operator)                      (operator)
sections                       not supported                      supported; use (operator)
prefix to infix                                                   `operator`
user-defined functions         introduced with fun;               must be defined in a script
                               can be defined at the
                               prompt or in a script
anonymous functions            (fn x => body)                     (\x -> body)
curried form                   omit parentheses, commas           omit parentheses, commas
curried                        partially                          fully
type declaration               :                                  ::
type definition                type                               type
data type definition           datatype                           data
type variables                 prefaced with ';                   not prefaced with ';
                               written before data type name      written after data type name
function type                  optional, but if used,             optional, but if used,
                               embedded within the                precedes the
                               function definition                function definition
type inference/checking        Hindley-Milner                     Hindley-Milner
function overloading           not supported                      supported through qualified
                                                                  types and type classes
ADTs                           module system (structures,         class system
                               signatures, and functors)
Table 9.7 Comparison of the Main Concepts and Features of ML and Haskell
9.9.5 Analysis
Some beginning programmers find the constraints of the safe type system in ML and Haskell to be a source of frustration, and some find type classes in Haskell similarly frustrating. Once these concepts are properly understood, however, advanced ML and Haskell programmers appreciate the safe, algebraic type systems of both languages.
and closure representations. This chapter prepares us for designing efficacious and efficient data structures for the interpreters we build in Part III (Chapters 10–12). Chapters 10–11 and Sections 12.2, 12.4, and 12.6–12.7 are inspired by Friedman, Wand, and Haynes (2001, Chapter 3). The primary difference between the two approaches is the implementation language: we use Python to build environment-passing interpreters, whereas Friedman, Wand, and Haynes (2001) use Scheme.
Appendix A provides an introduction to the Python programming language.
We recommend that readers begin with online Appendix D, which is a guide
to getting started with Camille and includes details of its syntax and semantics,
how to acquire access to the Camille Git repository necessary for using Camille,
and the pedagogical approach to using the language. Online Appendix E provides
the individual grammars for the progressive versions of Camille in one central
location.
Chapter 10
Local Binding and Conditional Evaluation
Les yeux sont les interprètes du coeur, mais il n’y a que celui qui y a intérêt
qui entend leur langage.
(Translation: The eyes are the interpreters of the heart, but only those
who have an interest can hear their language.)
— Blaise Pascal
THIS book is about programming language concepts. One approach to learning language concepts is to implement them by building interpreters for computer languages. Interpreter implementation also provides the operational semantics for the interpreted programs. In this and the following two chapters we put into practice the language concepts we have encountered in Chapters 1–9.
10.2 Checkpoint
Thus far in this course of study of programming languages, we have
explored:
are compelling: We can be mystified by the drastic changes we can effect in the
semantics of the implemented language by changing only a few lines of code in the
interpreter—sometimes as little as one line (e.g., using dynamic scoping rather
than static scoping, or using lazy evaluation as opposed to eager evaluation).
We use Python as the implementation language in the construction of these
interpreters. Thus, an understanding of Python is requisite for the construction of
interpreters in Python in Chapters 10–12. We refer readers to Appendix A for an
introduction to the Python programming language.
Online Appendix D is a guide to getting started with Camille and includes
details of its syntax and semantics, how to acquire access to the Camille Git
repository necessary for using Camille, and the pedagogical approach to using
the language. The Camille Git repository is available at https://bitbucket
.org/camilleinterpreter/camille-interpreter-in-python-release/src/master/. Its
structure and contents are described in online Appendix D and at https:
//bitbucket.org/camilleinterpreter/camille-interpreter-in-python-release/src
/master/PAPER/paper.md. Online Appendix E provides the individual gram-
mars for the progressive versions of Camille in one central location.
programming, and explored the use of multiple configuration options for both
aspects of the design of the interpreter as well as the semantics of implemented
concepts (see Table 10.3 later in this chapter).
Next, we start slowly to morph Camille, in Chapter 12, through its interpreter,
into a language with imperative programming features by adding provisions for
side effect (e.g., through variable assignment). Variable assignment requires a
modification to the representation of the environment. Now, the environment must
store references to expressed values, rather than the expressed values themselves.
This raises the issue of implicit versus explicit dereferencing, and naturally
leads to exploring a variety of parameter-passing mechanisms, such as pass-by-
reference or pass-by-name/lazy evaluation. Finally, in Chapter 12, we close the
loop on the imperative approach by eliminating the need to use recursion for
repetition by recalibrating the language, through its interpreter, to be a statement-
oriented, rather than expression-oriented, language. This involves adding support
for statement blocks, while loops, and I / O operations.
We present each of the first three of these components in Section 10.6. We first
encounter the need for supporting data types (in this case, an environment) and
libraries in Section 10.7.
1. The component of a language implementation that accepts an abstract-syntax tree and evaluates
it is called an interpreter—see Chapter 4 and the rightmost component labeled “Interpreter” in
Figure 10.1. However, we generally refer to the entire language implementation as the interpreter. To the
programmer of the source program being interpreted, the entire language implementation (Figure 4.1)
is the interpreter rather than just the last component of it.
• Expressed values are the possible (return) values of expressions (e.g., numbers,
characters, and strings in Java or Scheme).
• Denoted values are values bound to variables (e.g., references to locations
containing expressed values in Java or Scheme).
• The defined programming language (or source language) is the language specified
(or operationalized) by the interpreter.
• The defining programming language (or host language) is the language in which
we implement the interpreter (for the defined language).
Here, our defined language is Camille and our defining language is Python.
ntNumber
<expression> ::= <number>
ntPrimitive_op
<expression> ::= <primitive> ( {<expression>}+(,) )
ntPrimitive
<primitive> ::= + | - | * | inc1 | dec1 | zero? | eqv?
At this point, the language only has support for numbers and primitive operations.
Sample expressions in Camille are:
32
+(33,1)
inc1(2)
dec1(4)
dec1(-(33,1))
+(inc1(2),-(6,4))
+(-(35,33), inc1(8))
Currently, in Camille,
[Figure 10.1: The front end comprises a scanner, specified by a regular grammar, which converts the source program (a string or list of lexemes) into a list of tokens (a concrete representation), and a parser, specified by a context-free grammar, which converts the list of tokens into an abstract-syntax tree that is passed to the interpreter.]
Interpreter
1 import re
2 import sys
3 import operator
4 import traceback
5 import ply.lex as lex
6 import ply.yacc as yacc
7 from collections import defaultdict
8
9 # begin lexical specification #
73 else:
74     raise ParserException("Syntax error near: Line %d" %
75                           (lexer.lineno - (lexer.lineno > 1)))
76
77 def p_program_expr(t):
78     '''programs : program programs
79                 | program'''
80     # do nothing
81
82 def p_line_expr(t):
83     '''program : expression'''
84     t[0] = t[1]
85     print(evaluate_expr(t[0]))
86
87 def p_primitive_op(t):
88     '''expression : primitive LPAREN expressions RPAREN'''
89     t[0] = Tree_Node(ntPrimitive_op, [t[3]], t[1], t.lineno(1))
90
91 def p_primitive(t):
92     '''primitive : PLUS
93                  | MINUS
94                  | INC1
95                  | MULT
96                  | DEC1
97                  | ZERO
98                  | EQV'''
99     t[0] = Tree_Node(ntPrimitive, None, t[1], t.lineno(1))
100
101 def p_expression_number(t):
102     '''expression : NUMBER'''
103     t[0] = Tree_Node(ntNumber, None, t[1], t.lineno(1))
104
105 def p_expressions(t):
106     '''expressions : expression
107                    | expression COMMA expressions'''
108     if len(t) == 4:
109         t[0] = Tree_Node(ntExpressions, [t[1], t[3]], None, t.lineno(1))
110     elif len(t) == 2:
111         t[0] = Tree_Node(ntExpressions, [t[1]], None, t.lineno(1))
112 # end syntactic specification
113
114 def parser_feed(s, parser):
115     pattern = re.compile("[^ \t]+")
116     if pattern.search(s):
117         try:
118             parser.parse(s)
119         except InterpreterException as e:
120             print("Line %s: %s" % (e.linenumber, e.message))
121             if e.additional_information != None:
122                 print("Additional information:")
123                 print(e.additional_information)
124         except ParserException as e:
125             print(e.message)
126         except Exception as e:
127             print("Unknown Error occurred "
128                   "(this is normally caused by a Python syntax error)")
129             raise e
Lines 9–63 and 65–112 constitute the lexical and syntactic specifications,
respectively. Comments in Camille programs begin with the lexeme --- (i.e., three
consecutive dashes) and continue to the end of the line. Multi-line comments are not supported. Comments are ignored by the scanner (lines 51–53). Recall from Chapter 3 that the call lex.lex() (line 62) generates a scanner. Similarly, the
function yacc.yacc() generates a parser and is called in the interpreter from
the REPL definition (Section 10.6.4). Notice that the p_line_expr function (lines
82–85) has changed slightly from the version shown on lines 135–139 in the
parser generator listing in Section 9.6.2. In particular, lines 138–139 in the original
definition
82 def p_line_expr(t):
83     '''program : expression'''
84     t[0] = t[1]
85     print(evaluate_expr(t[0]))
Rather than assign the final abstract-syntax tree to the global variable
global_tree (line 139) so that it can be referenced by a function that invokes
the parser (e.g., the concrete2abstract function), now we pass the tree to the
interpreter (i.e., the evaluate_expr function) on line 85.
For details on PLY, see https://www.dabeaz.com/ply/. The use of a
scanner/parser generator facilitates this incremental development approach,
which leads to a more malleable interpreter/language. Thus, the lexical and
syntactic specifications given here can be used as is, and the scanner and parser
generated from them can be considered black boxes.
212 else:
213     raise InterpreterException(expr.linenumber,
214                                "Invalid tree node type %s" % expr.type)
215 except InterpreterException as e:
216     # Raise exception to the next level until
217     # we reach the top level of the interpreter.
218     # Exceptions are fatal for a single tree,
219     # but other programs within a single file may
220     # otherwise be OK.
221     raise e
222 except Exception as e:
223     # We want to catch the Python interpreter exception and
224     # format it such that it can be used
225     # to debug the Camille program.
226     print(traceback.format_exc())
227     raise InterpreterException(expr.linenumber,
228                                "Unhandled error in %s" % expr.type, str(e), e)
229 # end interpreter #
This segment of code contains both the definitions of the abstract-syntax tree
data structure (lines 144–163) and the evaluate_expr function (lines 194–228).
Notice that for each variant (lines 147–150) of the TreeNode data type (lines
152–162) that represents a Camille expression, there is a corresponding action
in the evaluate_expr function (lines 194–228). Each variant in the TreeNode
variant record2 has a case in the evaluate_expr function. This interpreter
is the component on the right-hand side of Figure 4.1, replicated here as
Figure 10.1.
ntArguments
ntParameters
ntExpressions
<arguments> ::= <expression>
<arguments> ::= <expression>, <arguments>
<arguments> ::= ε
Since all primitive operators in Camille accept arguments, the rule <arguments> ::= ε applies to (forthcoming) user-defined functions that may or may not accept arguments (as discussed in Chapter 11).
Consider the expression *(7,x) and its abstract-syntax tree presented in Figure 10.2. The top half of each node represents the type field of the TreeNode, the bottom right quarter of each node represents one member of the children list, and the bottom left quarter of each node represents the leaf field. The ntExpressionList variant of TreeNode represents an argument list.

[Figure 10.2: abstract-syntax tree for *(7,x), built from ntPrimitiveOp, ntPrimitive, ntExpressionList, ntNumber, and ntIdentifier nodes.]

2. Technically, it is not a variant record as strictly defined, but rather a data type with fixed fields, where one of the fields, the type flag, indicates the interpretation of the fields.
The ntExpressionList variant of an abstract-syntax tree constructed by the
parser is flattened into a Python list by the interpreter for subsequent processing.
A post-order traversal of the ntExpressionList variant is conducted, with the
values in the leaf nodes being inserted into a Python list in the order in which they
appear in the application of the primitive operator in the Camille source code. Each
leaf is evaluated using evaluate_expr and its value is inserted into the Python
list. Lines 205–211 of the evaluate_expr function (replicated here) demonstrate
this process:
The result of each recursive call to evaluate_expr is appended to the list created with the leaf of the node (line 210).
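The flattening can be sketched independently of the full interpreter. The Node class below is a hypothetical stand-in for the interpreter's Tree_Node, and the leaves here hold already-evaluated values rather than calls to evaluate_expr:

```python
class Node:
    """Simplified stand-in for the interpreter's Tree_Node."""
    def __init__(self, children=None, leaf=None):
        self.children = children or []
        self.leaf = leaf

def flatten_expression_list(node):
    # take the value at the head of this expression-list node,
    # then recur on the rest of the right-nested chain, if any
    result = [node.children[0].leaf]
    if len(node.children) > 1:
        result.extend(flatten_expression_list(node.children[1]))
    return result

# the argument list (7, x, 9) as a right-nested chain of expression-list nodes
chain = Node([Node(leaf=7), Node([Node(leaf="x"), Node([Node(leaf=9)])])])

print(flatten_expression_list(chain))  # [7, 'x', 9]
```

The recursion mirrors the ntExpressions case of evaluate_expr: the head child contributes one element, and the optional second child contributes the flattened remainder, so the values appear in source order.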
The function yacc.yacc() invoked on line 232 generates a parser and returns
an object (here, named parser) that contains a function (named parse). This
function accepts a string (representing a Camille program) and parses it (line 118
in the parser generator listing).
This REPL supports two ways of running Camille programs: interactively and
non-interactively. In interactive mode (lines 238–259), the function main_func
prints the prompt, reads a string from standard input (line 243), and passes
that string to the parser (line 245). In non-interactive mode (lines 261–268),
the prompt for input is not printed. Instead, the REPL receives one or more
Camille programs in a single source code file passed as a command-line argument
(line 262), reads it as a string (line 263), and passes that string to the parser
(line 264).
The REPL reads a string and passes it to the front end (parser.parse;
line 118). The front end parses that string, while concomitantly building an
abstract-syntax representation/tree for it, and passes that tree to the interpreter
(evaluate_expr—the entry point of the interpreter; line 85). The interpreter
traverses the tree to evaluate the program that the tree represents. Notice that this
diagram is an instantiated view of Figure 10.1 with respect to the components of
the Camille interpreter presented here.
#!/usr/bin/env bash
python3.8 camilleinterpreter.py $1
$ ./run
Camille> 32
32
Camille> +(33,1)
34
Camille> inc1(2)
3
Camille> dec1(4)
3
Camille> dec1(-(33,1))
31
Camille> +(inc1(2),-(6,4))
5
Camille> +(-(35,33),inc1(7))
10
Non-interactive mode is invoked by passing the run script a single source code
filename representing one or more Camille programs:
$ cat tests.cam
32
--- add a comment
+(33,1)
inc1(2)
dec1(4)
dec1(-(33,1))
+(inc1(2),-(6,4))
+(-(35,33),inc1(7))
$ ./run tests.cam
32
34
3
3
31
5
10
ntIdentifier
<expression> ::= <identifier>
<expression> ::= <let_expression>
ntLet
<let_expression> ::= let <let_statement> in <expression>
ntLetStatement
<let_statement> ::= <let_assignment>
<let_statement> ::= <let_assignment> <let_statement>
ntLetAssignment
<let_assignment> ::= <identifier> = <expression>
We must also add the let and in keywords to the generator of the scanner on
lines 10–16 at the beginning of Section 10.6.1. The following are the corresponding
pattern-action rules in the PLY parser generator:
def p_expression_identifier(t):
    '''expression : IDENTIFIER'''
    t[0] = Tree_Node(ntIdentifier, None, t[1], t.lineno(1))

def p_expression_let(t):
    '''expression : LET let_statement IN expression'''
    t[0] = Tree_Node(ntLet, [t[2], t[4]], None, t.lineno(1))

def p_let_statement(t):
    '''let_statement : let_assignment
                     | let_assignment let_statement'''
    if len(t) == 3:
        t[0] = Tree_Node(ntLetStatement, [t[1], t[2]], None, t.lineno(1))
    else:
        t[0] = Tree_Node(ntLetStatement, [t[1]], None, t.lineno(1))

def p_let_assignment(t):
    '''let_assignment : IDENTIFIER EQ expression'''
    t[0] = Tree_Node(ntLetAssignment, [t[3]], t[1], t.lineno(1))
We also must augment the t_WORD function in the lexical analyzer generator so
that it can recognize locally bound identifiers:
1 def t_WORD(t):
2     r'[A-Za-z_][A-Za-z_0-9*?!]*'
3     pattern = re.compile("^[A-Za-z_][A-Za-z_0-9?!]*$")
4
5     # if the identifier is a keyword, parse it as such
6     if t.value in keywords:
7         t.type = keyword_lookup[t.value]
8     # otherwise it might be a variable so check that
9     elif pattern.match(t.value):
10         t.type = 'IDENTIFIER'
11     # otherwise it is a syntax error
12     else:
13         print("Runtime error: Unknown word %s %d" %
14               (t.value[0], t.lexer.lineno))
15         sys.exit(-1)
16     return t
Lines 8–10 are the new lines of code inserted into the middle (between lines 32 and
33) of the original definition of the t_WORD function defined on lines 26–38 at the
beginning of Section 10.6.1.
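The classification logic in t_WORD can be exercised in isolation. The keyword table below is an illustrative subset, not the interpreter's actual keywords dictionary:

```python
import re

KEYWORDS = {"let": "LET", "in": "IN"}  # illustrative subset of Camille keywords
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z_0-9?!]*$")

def classify_word(value):
    # keywords take precedence over identifiers
    if value in KEYWORDS:
        return KEYWORDS[value]
    # otherwise it might be a variable, so check the identifier pattern
    elif IDENTIFIER.match(value):
        return "IDENTIFIER"
    # otherwise it is a syntax error
    else:
        raise SyntaxError("Unknown word %s" % value)
```

For example, classify_word("let") yields the keyword token type, while classify_word("x1?") yields IDENTIFIER; a word the pattern rejects raises an error, paralleling the three branches of t_WORD.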
To bind values to identifiers, we require a data structure in which to store the
values so that they can be retrieved using the identifier—in other words, we need
an environment. The following is the closure representation of an environment in
Python from Section 9.8 (repeated here for convenience):
def extend_environment(symbols, values, environment):
    def tryexcept(symbol):
        try:
            val = values[symbols.index(symbol)]
        except:
            val = apply_environment(environment, symbol)
        return val
    return lambda symbol: tryexcept(symbol)
# end closure representation of environment #
1 # begin interpreter #
2 def evaluate_operands(operands, environ):
3     return map(lambda x: evaluate_operand(x, environ), operands)
4
5 def evaluate_operand(operand, environ):
6     return evaluate_expr(operand, environ)
7
8 def apply_primitive(prim, arguments):
9     return primitive_op_dict[prim.leaf](*arguments)
10
11 def printtree(expr):
12     print(expr.leaf)
13     for child in expr.children:
14         printtree(child)
15
16 def evaluate_expr(expr, environ):
17     if expr.type == ntPrimitive_op:
18         # expr leaf is mapped during parsing to
19         # the appropriate binary operator function
20         arguments = list(evaluate_operands(expr.children, environ))[0]
21         return apply_primitive(expr.leaf, arguments)
22
23     elif expr.type == ntNumber:
24         return expr.leaf
25
26     elif expr.type == ntIdentifier:
27         try:
28             return apply_environment(environ, expr.leaf)
29         except:
30             raise InterpreterException(expr.linenumber,
31                                        "Unbound identifier '%s'" % expr.leaf)
32
33     elif expr.type == ntLet:
34
35         temp = evaluate_expr(expr.children[0], environ) # assignment
36         identifiers = []
37         arguments = []
38         for name in temp:
39             identifiers.append(name)
40             arguments.append(temp[name])
41
42         temp = evaluate_expr(expr.children[1],
43                              extend_environment(identifiers, arguments, environ))
44         return temp
45
46     elif expr.type == ntLetStatement:
47         # perform assignment
48         temp = evaluate_expr(expr.children[0], environ)
49         # perform subsequent assignment(s) if there are any (recursive)
50         if len(expr.children) > 1:
51             temp.update(evaluate_expr(expr.children[1], environ))
52         return temp
53
54     elif expr.type == ntLetAssignment:
55         return { expr.leaf : evaluate_expr(expr.children[0], environ) }
56
57     elif expr.type == ntExpressions:
58         ExprList = []
59         ExprList.append(evaluate_expr(expr.children[0], environ))
60
61         if len(expr.children) > 1:
62             ExprList.extend(evaluate_expr(expr.children[1], environ))
63         return ExprList
64     else:
65         raise InterpreterException(expr.linenumber,
66                                    "Invalid tree node type %s" % expr.type)
67 # end interpreter #
Lines 33–44 of the evaluate_expr function access the ntLet variant of the
abstract-syntax tree of type TreeNode and evaluate the let expression it
represents. In particular, line 35 evaluates the right-hand side of the = sign in each
binding, and lines 42–43 evaluate the body of the let expression (line 42) in an
environment extended with the newly created bindings (line 43). Notice that we
build support for local binding in Camille from first principles—specifically, by
defining an environment.
We briefly discuss how the bindings in a let expression are both represented
in the abstract-syntax tree and evaluated. The abstract-syntax tree that describes a
let expression is similar to the abstract-syntax tree that describes an argument
list.3 Figure 10.3 presents a simplified version of an abstract-syntax tree that
represents a let expression. Again, the top half of each node represents the type
field of the TreeNode, the bottom right quarter of each node represents one
member of the children list, and the bottom left quarter of each node
represents the leaf field.4
Consider the ntLet, ntLetStatement, and ntLetAssignment cases in the
evaluate_expr function:
33         elif expr.type == ntLet:
34
35             temp = evaluate_expr(expr.children[0], environ) # assignment
36             identifiers = []
37             arguments = []
38             for name in temp:
3. The same approach is used in the abstract-syntax tree for let* (Programming Exercise 10.6) and
letrec expressions (Section 11.3).
4. This figure is also applicable for let* and letrec expressions.
10.7. LOCAL BINDING 409
[Figure 10.3: a simplified abstract-syntax tree representing a let expression, built from ntLet, ntLetStatement, and ntLetAssignment nodes.]
39                 identifiers.append(name)
40                 arguments.append(temp[name])
41
42             temp = evaluate_expr(expr.children[1],
43                       extend_environment(identifiers, arguments, environ))
44             return temp
45
46         elif expr.type == ntLetStatement:
47             # perform assignment
48             temp = evaluate_expr(expr.children[0], environ)
49             # perform subsequent assignment(s) if there are any (recursive)
50             if len(expr.children) > 1:
51                 temp.update(evaluate_expr(expr.children[1], environ))
52             return temp
53
54         elif expr.type == ntLetAssignment:
55             return { expr.leaf : evaluate_expr(expr.children[0], environ) }
The ntLetAssignment case returns a single-binding dictionary, and the ntLetStatement case merges these dictionaries, so a dictionary containing all name–value pairs is returned to the ntLet case. The
Python dictionary is then split into two lists: one containing only names (line
39) and another containing only values (line 40). These values are placed into an
environment (line 43). The body of the let expression is then evaluated using this
new environment (line 42).
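The name/value split and environment extension just described can be sketched in isolation. The following is a minimal sketch using the chapter's closure representation of an environment; evaluate_let and the callable body are stand-ins for the interpreter's ntLet machinery, not the book's code:

```python
def empty_environment():
    # the empty environment knows no symbols
    def raise_lookup_error(symbol):
        raise IndexError
    return raise_lookup_error

def extend_environment(symbols, values, environment):
    # look a symbol up locally first; defer to the outer environment otherwise
    def lookup(symbol):
        try:
            return values[symbols.index(symbol)]
        except ValueError:
            return environment(symbol)
    return lookup

def evaluate_let(bindings, body, environ):
    # split the name-value dictionary into two parallel lists,
    # as the ntLet case does (lines 36-40)
    identifiers = list(bindings.keys())
    arguments = [bindings[name] for name in identifiers]
    # evaluate the body in the extended environment (lines 42-43)
    return body(extend_environment(identifiers, arguments, environ))

# models  let a=32 b=33 in -(b,a)
result = evaluate_let({"a": 32, "b": 33},
                      lambda env: env("b") - env("a"),
                      empty_environment())
```

As in the interpreter, the body sees the new bindings only through the extended environment; the enclosing environment is untouched.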
It is also important to note that the last line of the p_line_expr
function in the parser generator, print(evaluate_expr(t[0])) (line 85
of the listing at the beginning of Section 10.6.1), needs to be replaced with
print(evaluate_expr(t[0], empty_environment())) so that an empty
environment is passed to the evaluate_expr function with the AST of the
program.
Example expressions in this version of Camille5 with their evaluated results
follow:
Camille> let
a=32
b=33
in
-(b,a)
1
Camille> --- demonstrates a scope hole
let
a=32
in
let
--- shadows the a on line 9
a = -(a,16)
in
dec1(a)
15
Camille> let a = 9 in i
Line 1: Unbound identifier 'i'
ntIfElse
⟨conditional_expression⟩ ::= if ⟨expression⟩ ⟨expression⟩ else ⟨expression⟩
def p_expression_condition(t):
'''expression : IF expression expression ELSE expression'''
t[0] = Tree_Node(ntIfElse, [t[2], t[3], t[5]], None, t.lineno(1))
We must also add the if and else keywords to the generator of the scanner on
lines 10–16 of the listing at the beginning of Section 10.6.1.
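The scanner change amounts to checking a matched word against a keyword table before falling back to the identifier rule. A minimal sketch of that decision logic follows; the keyword table and identifier pattern here are illustrative assumptions, not the book's exact listing:

```python
import re

# hypothetical keyword table; the scanner's identifier rule would
# consult something like this after matching a word
keywords = {"let": "LET", "in": "IN", "if": "IF", "else": "ELSE"}
identifier_pattern = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*$")

def classify(word):
    # keywords take precedence over identifiers
    if word in keywords:
        return keywords[word]
    elif identifier_pattern.match(word):
        return "IDENTIFIER"
    else:
        raise ValueError("Unknown word %s" % word)
```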
1 import re
2 import sys
3 import operator
4 import traceback
5 import ply.lex as lex
6 import ply.yacc as yacc
7 from collections import defaultdict
8
9 # begin closure representation of environment #
10 def empty_environment():
11     def raise_IE():
12         raise IndexError
13     return lambda symbol: raise_IE()
14
15 def apply_environment(environment, symbol):
16     return environment(symbol)
17
18 def extend_environment(symbols, values, environment):
19     def tryexcept(symbol):
20         try:
21             val = values[symbols.index(symbol)]
22         except:
23             val = apply_environment(environment, symbol)
24         return val
25
26     return lambda symbol: tryexcept(symbol)
27 # end closure representation of environment #
28
29 # begin implementation of primitive operations #
30 def eqv(op1, op2):
31     return op1 == op2
32
33 def decl1(op):
34     return op - 1
35
36 def inc1(op):
37     return op + 1
38
39 def isZero(op):
40     return op == 0
41 # end implementation of primitive operations #
42
43 # begin expression data type #
44
45 # list of node types
46 ntPrimitive = 'Primitive'
47 ntPrimitive_op = 'Primitive Operator'
48
49 ntNumber = 'Number'
50 ntIdentifier = 'Identifier'
51
52 ntIfElse = 'Conditional'
53
54 ntExpressions = 'Expressions'
55
56 ntLet = 'Let'
57 ntLetStatement = 'Let Statement'
58 ntLetAssignment = 'Let Assignment'
59
60 class Tree_Node:
61     def __init__(self, type, children, leaf, linenumber):
62         self.type = type
63         # save the line number of the node so run-time
64         # errors can be indicated
65         self.linenumber = linenumber
66         if children:
67             self.children = children
68         else:
69             self.children = []
70         self.leaf = leaf
71 # end expression data type #
72
73 # begin interpreter #
74 class InterpreterException(Exception):
75     def __init__(self, linenumber, message,
76                  additional_information=None, exception=None):
77         self.linenumber = linenumber
78         self.message = message
79         self.additional_information = additional_information
80         self.exception = exception
81
82 primitive_op_dict = { "+" : operator.add, "-" : operator.sub,
83 "*" : operator.mul, "dec1" : decl1,
84 "inc1" : inc1, "zero?" : isZero,
10.9. PUTTING IT ALL TOGETHER 413
85 "eqv?" : eqv }
86
87 primitive_op_dict = defaultdict(lambda: -1, primitive_op_dict)
88
89 def evaluate_operands(operands, environ):
90     return map(lambda x: evaluate_operand(x, environ), operands)
91
92 def evaluate_operand(operand, environ):
93     return evaluate_expr(operand, environ)
94
95 def apply_primitive(prim, arguments):
96     return primitive_op_dict[prim.leaf](*arguments)
97
98 def printtree(expr):
99     print(expr.leaf)
100     for child in expr.children:
101         printtree(child)
102
103 def evaluate_expr(expr, environ):
104     try:
105         if expr.type == ntPrimitive_op:
106             # expr leaf is mapped during parsing to
107             # the appropriate binary operator function
108             arguments = list(evaluate_operands(expr.children,
109                                                environ))[0]
110             return apply_primitive(expr.leaf, arguments)
111
112         elif expr.type == ntNumber:
113             return expr.leaf
114
115         elif expr.type == ntIdentifier:
116             try:
117                 return apply_environment(environ, expr.leaf)
118             except:
119                 raise InterpreterException(expr.linenumber,
120                     "Unbound identifier '%s'" % expr.leaf)
121
122         elif expr.type == ntIfElse:
123             if evaluate_expr(expr.children[0], environ):
124                 return evaluate_expr(expr.children[1], environ)
125             else:
126                 return evaluate_expr(expr.children[2], environ)
127
128         elif expr.type == ntLet:
129
130             # assignment
131             temp = evaluate_expr(expr.children[0], environ)
132             identifiers = []
133             arguments = []
134             for name in temp:
135                 identifiers.append(name)
136                 arguments.append(temp[name])
137
138             # evaluation
139             temp = evaluate_expr(expr.children[1],
140                       extend_environment(identifiers, arguments,
141                                          environ))
142             return temp
143
144         elif expr.type == ntLetStatement:
145             # perform assignment
146             temp = evaluate_expr(expr.children[0], environ)
147             # perform subsequent assignment(s)
211
212     # if the identifier is a keyword, parse it as such
213     if t.value in keywords:
214         t.type = keyword_lookup[t.value]
215     # otherwise it might be a variable so check that
216     elif pattern.match(t.value):
217         t.type = 'IDENTIFIER'
218     # otherwise it is a syntax error
219     else:
220         print("Runtime error: Unknown word %s %d" %
221               (t.value[0], t.lexer.lineno))
222         sys.exit(-1)
223     return t
224
225 def t_NUMBER(t):
226     r'-?\d+'
227     # try to convert the string to an int, flag overflows
228     try:
229         t.value = int(t.value)
230     except ValueError:
231         print("Runtime error: number too large %s %d" %
232               (t.value[0], t.lexer.lineno))
233         sys.exit(-1)
234     return t
235
236 def t_COMMENT(t):
237     r'---.*'
238     pass
239
240 def t_newline(t):
241     r'\n'
242     # continue to next line
243     t.lexer.lineno = t.lexer.lineno + 1
244
245 def t_error(t):
246     print("Unrecognized token %s on line %d." % (t.value.rstrip(),
247           t.lexer.lineno))
248 lexer = lex.lex()
249 # end lexical specification #
250
251 # begin syntactic specification #
252 class ParserException(Exception):
253     def __init__(self, message):
254         self.message = message
255
256 def p_error(t):
257     if (t != None):
258         raise ParserException("Syntax error: Line %d " % (t.lineno))
259     else:
260         raise ParserException("Syntax error near: Line %d" %
261               (lexer.lineno - (lexer.lineno > 1)))
262
263 def p_program_expr(t):
264     '''programs : program programs
265                 | program'''
266     # do nothing
267
268 def p_line_expr(t):
269     '''program : expression'''
270     t[0] = t[1]
271     print(evaluate_expr(t[0], empty_environment()))
272
273 def p_primitive_op(t):
337     except ParserException as e:
338         print(e.message)
339     except Exception as e:
340         print("Unknown Error occurred "
341               "(This is normally caused by "
342               "a Python syntax error.)")
343         raise e
344
345 # begin REPL
346 def main_func():
347     parser = yacc.yacc()
348     interactiveMode = False
349
350     if len(sys.argv) == 1:
351         interactiveMode = True
352
353     if interactiveMode:
354         program = ""
355         try:
356             prompt = 'Camille> '
357             while True:
358                 line = input(prompt)
359                 if (line == "" and program != ""):
360                     parser_feed(program, parser)
361                     lexer.lineno = 1
362                     program = ""
363                     prompt = 'Camille> '
364                 else:
365                     if (line != ""):
366                         program += (line + '\n')
367                         prompt = ''
368
369         except EOFError as e:
370             sys.exit(0)
371
372         except Exception as e:
373             print(e)
374             sys.exit(-1)
375     else:
376         try:
377             with open(sys.argv[1], 'r') as script:
378                 file_string = script.read()
379                 parser_feed(file_string, parser)
380             sys.exit(0)
381         except Exception as e:
382             sys.exit(-1)
383
384 main_func()
385 # end REPL
Exercise 10.1 Reimplement the interpreter given in this chapter for Camille 1.2.a
to use the abstract-syntax representation of a named environment given in
Section 9.8.4. This is Camille 1.2(named ASR).
Table 10.1 New Versions of Camille, and Their Essential Properties, Created in the
Chapter 10 Programming Exercises. (Key: ASR = abstract-syntax representation;
CLS = closure; LOLR = list-of-lists representation.)
Exercise 10.2 Reimplement the interpreter given in this chapter for Camille 1.2
to use the list-of-lists representation of a named environment developed in
Programming Exercise 9.8.5.a. This is Camille 1.2(named LOLR).
82 def p_line_expr(t):
83     '''program : expression'''
84     t[0] = t[1]
85     print(evaluate_expr(t[0]))
We must replace line 85 with lines 85 and 86 in the following new definition:
82 def p_line_expr(t):
83     '''program : expression'''
84     t[0] = t[1]
85     lexical_addresser(t[0], 0, [])
86     print(evaluate_expr(t[0], empty_nameless_environment()))
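The nameless environment assumed by this new definition (Programming Exercise 9.8.9) is indexed by lexical address rather than by name. One possible list-of-lists sketch of that interface, offered only as an assumption about its shape:

```python
def empty_nameless_environment():
    return []

def extend_nameless_environment(values, environment):
    # each contour contributes one rib of values; no identifiers are stored
    return [values] + environment

def apply_nameless_environment(environment, depth, position):
    # (depth, position) is the lexical address of the reference
    return environment[depth][position]

# models  let a=7 in let b=5 c=6 in ...  where address (1,0) refers to a
env = extend_nameless_environment(
        [5, 6], extend_nameless_environment([7], empty_nameless_environment()))
```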
Exercise 10.3 Reimplement the interpreter for Camille 1.2 to use the abstract-
syntax representation of a nameless environment developed in Programming
Exercise 9.8.9. This is Camille 1.2(nameless ASR).
Exercise 10.4 Reimplement the interpreter for Camille 1.2 to use the list-of-
lists representation of a nameless environment developed in Programming
Exercise 9.8.5.b. This is Camille 1.2(nameless LOLR).
Exercise 10.5 Reimplement the interpreter given in this chapter for Camille
1.2 to use the closure representation of a nameless environment developed in
Programming Exercise 9.8.7. This is Camille 1.2(nameless CLS).
Exercise 10.6 Implement let* in Camille (with the same semantics it has in
Scheme). For instance:
10.11. CHAPTER SUMMARY 419
Camille> let*
a = 3
b = +(a, 4)
in
+(a, b)
10
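let* differs from let in that each binding is evaluated in an environment already extended with the bindings to its left. That folding can be sketched with plain Python dictionaries standing in for environments; evaluate_letstar is a stand-in for this sketch, not the interpreter's representation:

```python
def evaluate_letstar(bindings, body, environ):
    # bindings: ordered (name, expression) pairs; each expression
    # already sees the names bound to its left
    env = dict(environ)
    for name, expression in bindings:
        env[name] = expression(env)
    return body(env)

# models  let* a = 3  b = +(a, 4) in +(a, b)
result = evaluate_letstar(
    [("a", lambda e: 3),
     ("b", lambda e: e["a"] + 4)],
    lambda e: e["a"] + e["b"],
    {})
```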
Figure 10.4 and Table 10.2 indicate the dependencies between the versions of
Camille developed in this chapter, including the programming exercises. Table 10.3
summarizes the concepts and features implemented in the progressive versions of
Camille developed in this chapter, including the programming exercises. Table 10.4
outlines the configuration options available in Camille for aspects of the design of
the interpreter (e.g., choice of representation of referencing environment).
[Figure 10.4: dependency graph relating Camille 1.1 (let), 1.2 (let, if/else), and 1.3 (let, let*, if/else) and their named/nameless CLS, ASR, and LOLR environment variants.]
To support functions in Camille, we add the following rules to the grammar and
corresponding pattern-action rules to the PLY parser generator:
⟨expression⟩ ::= ⟨non_recursive_function⟩
⟨expression⟩ ::= ⟨function_call⟩
ntFuncDecl
⟨non_recursive_function⟩ ::= fun ( {⟨identifier⟩}*(,) ) ⟨expression⟩
ntFuncCall
⟨function_call⟩ ::= ( ⟨expression⟩ {⟨expression⟩}*(,) )
def p_expression_function_decl(t):
'''expression : FUN LPAREN parameters RPAREN expression
| FUN LPAREN RPAREN expression'''
    if len(t) == 6:
        t[0] = Tree_Node(ntFuncDecl, [t[3], t[5]], None, t.lineno(1))
    else:
        t[0] = Tree_Node(ntFuncDecl, [t[4]], None, t.lineno(1))
def p_expression_function_call(t):
'''expression : LPAREN expression arguments RPAREN
| LPAREN expression RPAREN '''
    if len(t) == 5:
        t[0] = Tree_Node(ntFuncCall, [t[3]], t[2], t.lineno(1))
    else:
        t[0] = Tree_Node(ntFuncCall, None, t[2], t.lineno(1))
The following example Camille expressions show functions with their evaluated
results:
11.2. NON-RECURSIVE FUNCTIONS 425
Camille> let
area = fun (width,height) *(width,height)
in
(area 2,3)
6
1 let
2 a = 1
3 in
4 let
5 f = fun (x) +(x,a)
6 a = 2
7 in
8 (f a)
What value should be inserted into the environment and mapped to the identifier
f (line 5)? Alternatively, what value should be retrieved from the environment
when the identifier f is evaluated (line 8)? The identifier f must be evaluated when
f is applied (line 8). Thus, we must determine the information necessary to store in
the environment to represent the value of a user-defined function. The necessary
information that must be stored in a function value depends on which data is
required to evaluate that function when it is applied (or invoked). To determine
this, let us examine what must happen to invoke a function.
Assuming the use of lexical scoping (to bind each reference to a declaration),
when a function is applied, the body of the function must be evaluated in an
environment that binds the formal parameters to the arguments and binds the
free variables in the body of the function to their values at the time the function was
created (i.e., deep binding). In the Camille expression previously shown, when f is
called, its body must be evaluated in the environment
{(x,2), (a,1)} (i.e., static scoping)
and not in the environment
{(x,2), (a,2)} (i.e., dynamic scoping)
Thus, we must call
evaluate_expr (+(x,a), (x,2), (a,1))
and not call
evaluate_expr (+(x,a), (x,2), (a,2))
Thus,
Camille> let
a = 1
in
let
f = fun (x) +(x,a)
a = 2
in
(f a)
3
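Python's own lexically scoped closures exhibit the same behavior, which makes this result easy to check; demo is a hypothetical name for the sketch:

```python
def demo():
    a = 1
    f = lambda x: x + a      # f closes over the a bound to 1
    def inner(a):
        # the parameter a (= 2) shadows the outer a,
        # but f still sees the a captured at creation time
        return f(a)
    return inner(2)
```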
426 CHAPTER 11. FUNCTIONS AND CLOSURES
For a function to retain the bindings of its free variables at the time it was created,
it must be a closed package and completely independent of the environment in which
it is called. This package is called a closure (as discussed in Chapters 5 and 6).
11.2.2 Closures
A closure must contain:
• the parameters of the function,
• the body of the function, and
• the bindings of the function's free variables at the time the function is created (i.e., its creation environment).
We say that this function is closed over or closed in its creation environment. A
closure resembles an object from object-oriented programming—both have state
and behavior. A closure consists of a pair of (expression, environment) pointers.
Thus, we can think of a closure as a cons cell, which also contains two pointers
(Section 5.5.1). In turn, we can think of a function value as an abstract data type
( ADT) with the following interface:
1. Recall, from Section 5.4.1, the distinction between formal and actual parameters or, in other words,
the difference between parameters and arguments.
class Closure:
    def __init__(self, parameters, body, environ):
        self.parameters = parameters
        self.body = body
        self.environ = environ

# is_closure for the class-based representation
def is_closure(cls):
    return isinstance(cls, Closure)

# is_closure for the closure-based (Python function) representation
def is_closure(cls):
    return callable(cls)
Using either of these representations for Camille closures, the following equality
holds:
apply_closure (make_closure(arglist, expr.children[1], environ), arguments) =
evaluate_expr(cls.body, extend_environment(cls.parameters, arguments, cls.environ))
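The stated equality can be exercised with a small self-contained sketch of the class-based representation, in which an expression body is modeled as a Python function of an environment; that modeling, and the environment helpers, are assumptions of the sketch:

```python
def empty_environment():
    def fail(symbol):
        raise IndexError
    return fail

def extend_environment(symbols, values, environment):
    def lookup(symbol):
        try:
            return values[symbols.index(symbol)]
        except ValueError:
            return environment(symbol)
    return lookup

class Closure:
    def __init__(self, parameters, body, environ):
        self.parameters = parameters
        self.body = body          # body: a function of an environment
        self.environ = environ    # the creation environment (deep binding)

def make_closure(parameters, body, environ):
    return Closure(parameters, body, environ)

def apply_closure(cls, arguments):
    # evaluate the body in the creation environment extended with arguments
    return cls.body(extend_environment(cls.parameters, arguments, cls.environ))

# f = fun (x) +(x,a)  created where a = 1
creation_env = extend_environment(["a"], [1], empty_environment())
f = make_closure(["x"], lambda env: env("x") + env("a"), creation_env)
```

Applying f to 2 evaluates the body in the creation environment extended with x = 2, which is exactly the right-hand side of the equality.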
Figures 11.2 and 11.3 illustrate how closures are stored in abstract-syntax and list-
of-lists representations, respectively, of named environments.
[Figures 11.2 and 11.3: the closures square and increment stored in the list of values of a named environment, alongside its list of identifiers and the rest of the environment.]
14     arglist = []
15     body = expr.children[0]
16     return make_closure(arglist, body, environ)
17
18 elif expr.type == ntParameters:
19     ParamList = []
20     ParamList.append(expr.children[0])
21
22     if len(expr.children) > 1:
23         ParamList.extend(evaluate_expr(expr.children[1], environ))
24     return ParamList
25
26 elif expr.type == ntArguments:
27     ArgList = []
28     ArgList.append(evaluate_expr(expr.children[0], environ))
29
30     if len(expr.children) > 1:
31         ArgList.extend(evaluate_expr(expr.children[1], environ))
32     return ArgList
33
34 elif expr.type == ntFuncCall:
35     cls = evaluate_expr(expr.leaf, environ)
36     if len(expr.children) != 0:
37         arguments = evaluate_expr(expr.children[0], environ)
38     else:
39         arguments = []
40
41     if is_closure(cls):
42         return apply_closure(cls, arguments)
43     else:
[Figure: an environment in which the list of identifiers (fun_names) square and increment maps to a list of Closure values.]
Example expressions in this version of Camille with their evaluated results follow:

Camille> let f = fun (x) *(x,x) in (f 2)
4
Camille> let f = fun (width,height) *(width,height) in (f 2,3)
6
Camille> let a = 1 in let f = fun (x) +(x,a) a = 2 in (f a)
3
Consider the Camille rendition (and its output) of the Scheme program shown
at the start of Section 6.11 to demonstrate deep, shallow, and ad hoc binding:
Camille> let
y = 3
in
let
x = 10
--- create closure here: deep binding
f = fun (x) *(y, +(x,x))
in
let
y = 4
in
let
y = 5
x = 6
--- create closure here: shallow binding
g = fun (x, y) *(y, (x y))
in
let
y = 2
in
--- create closure here: ad hoc binding
(g f,x)
216
This result (216) demonstrates that Camille implements deep binding to resolve
nonlocal references in the body of first-class functions.
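The 216 result can be replayed with Python's lexically scoped closures, whose deep binding mirrors Camille's; the names follow the Camille expression and run is a hypothetical wrapper:

```python
def run():
    y = 3
    x = 10
    f = lambda x: y * (x + x)        # deep binding: f captures y = 3
    def level2():
        y = 4
        def level3():
            y = 5
            x = 6
            g = lambda x, y: y * x(y)    # x and y here are parameters
            def level4():
                y = 2
                return g(f, x)           # x resolves to 6 from level3
            return level4()
        return level3()
    return level2()
```

Tracing the call: g(f, 6) computes 6 * f(6), and f(6) is 3 * (6 + 6) = 36 because f's free y was bound to 3 at creation, giving 216.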
Note that this version of Camille does not support recursion:
Camille> let
sum = fun (x) if zero?(x) 0 else +(x, (sum dec1(x)))
in
(sum 5)
However, we can simulate recursion with let as done in the definition of the
function length in Section 5.9.3:
Camille> let
sum = fun (s, x)
if zero?(x)
0
else
+(x, (s s,dec1(x)))
in
(sum sum, 5)
15
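The self-passing trick translates directly into Python (a sketch):

```python
# sum_fun receives itself as s, so the recursive call is (s s, dec1(x))
sum_fun = lambda s, x: 0 if x == 0 else x + s(s, x - 1)
result = sum_fun(sum_fun, 5)
```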
let
--- constructor
new_stack = fun ()
fun(msg)
if eqv?(msg, 1)
-1 --- error: cannot top an empty stack
else
if eqv?(msg, 2)
-2 --- error: cannot pop an empty stack
else
1 --- represents true: stack is empty
--- constructor
push = fun (elem, stack)
fun (msg)
if eqv?(msg,1) elem
else if eqv?(msg,2) stack
else 0
--- observers
emptystack? = fun (stack) (stack 0)
top = fun (stack) (stack 1)
pop = fun (stack) (stack 2)
in
let
simplestack = (new_stack)
in
(top (push 3, (push 2, (push 1, simplestack))))
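The stack above is pure closure-based message passing; a Python transliteration follows (a sketch that keeps the Camille code's integer message protocol):

```python
def new_stack():
    # an empty stack answers every message with an error/empty flag
    def dispatch(msg):
        if msg == 1:
            return -1      # error: cannot top an empty stack
        elif msg == 2:
            return -2      # error: cannot pop an empty stack
        else:
            return 1       # true: stack is empty
    return dispatch

def push(elem, stack):
    def dispatch(msg):
        if msg == 1:
            return elem    # top
        elif msg == 2:
            return stack   # pop
        else:
            return 0       # false: stack is not empty
    return dispatch

# observers
def is_emptystack(stack):
    return stack(0)

def top(stack):
    return stack(1)

def pop(stack):
    return stack(2)

s = push(3, push(2, push(1, new_stack())))
```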
Exercise 11.2.3 As discussed in this section, this version of Camille does not
support recursion. However, we simulated recursion by passing a function to
itself—so it can call itself. Is there another method of simulating recursion in
this non-recursive version of the Camille interpreter? In particular, explore the
relationship between dynamic scoping and the let* expression (Programming
Exercise 10.6). Consider the following Camille expression:
Will this expression evaluate properly using lexical scoping in the version of the
Camille interpreter supporting only non-recursive functions? Will this expression
evaluate properly using dynamic scoping in the version of the Camille interpreter
supporting only non-recursive functions? Explain.
Table 11.1 New Versions of Camille, and Their Essential Properties, Created in the
Section 11.2.4 Programming Exercises. (Key: ASR = abstract-syntax representation;
CLS = closure; LOLR = list-of-lists representation.) [One recoverable row: Exercise 11.2.12 extends Camille 2.0 to 2.0(dynamic scoping), applying dynamic scoping to lambda expressions, with CLS | ASR | LOLR environments.]
[Figure 11.4 appears here: Camille 1.1 (let) extends to 2.0 (non-recursive functions; CLS | ASR | LOLR env; static scoping), which extends to 2.0(dynamic scoping), 2.0(verify), and the nameless variants.]
Figure 11.4 Dependencies between the Camille interpreters developed thus far, including those in the programming
exercises. The semantics of a directed edge a → b are that version b of the Camille interpreter is an extension of
version a (i.e., version b subsumes version a). (Key: circle = instantiated interpreter; diamond = abstract interpreter;
ASR = abstract-syntax representation; CLS = closure; LOLR = list-of-lists representation.)
434 CHAPTER 11. FUNCTIONS AND CLOSURES
Exercise 11.2.5 (Friedman, Wand, and Haynes 2001, Exercise 3.23, p. 90)
Implement a lexical-address calculator, like that of Programming Exercise 6.5.3, for
the version of Camille defined in this section. The calculator must take an abstract-
syntax representation of a Camille expression and return another abstract-syntax
representation of it. In the new representation, the leaf of every ntIdentifier
parse tree node should be replaced with a [var, depth, pos] list, where
(depth, pos) is the lexical address for this occurrence of the variable var,
unless the occurrence of ntIdentifier is free. Name the top-level function
of the lexical-address calculator lexical_address, and define it to accept
and return an abstract-syntax representation of a Camille program. However,
use the generated parser and concrete2abstract function in Section 9.6
to build the abstract-syntax representation of the Camille input expression.
Use the abstract2concrete function to translate the lexically addressed
abstract-syntax representation of a Camille program to a string (Programming
Exercise 9.6.2). Thus, the program must take a string representation of a Camille
expression as input and return another string representation of it where the
occurrence of each variable reference is replaced with a [v, depth, pos]
list, where (depth, pos) is the lexical address for this occurrence of the variable
v, unless the occurrence of v is free. If the variable reference is free, print
['v','free'] as shown in line 7 of the following examples.
Examples:
1 $ ./run
2 Camille> let a = 5 in a
3
4 let a = 5 in ['a', 0, 0]
5 Camille> let a = 5 in i
6
7 let a = 5 in ['i','free']
8 Camille> let a = 2 in let b = 3 in a
9
10 let a = 2 in let b = 3 in ['a', 1, 0]
11 Camille> let f = fun (y,z) +(y,-(z,5)) in (f 2,28)
12
13 let f=fun(y, z) +(['y', 0, 0],-(['z', 0, 1], 5)) in (['f', 0, 0] 2,28)
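The addresses in these examples can be computed by scanning a stack of static scopes, innermost first. A sketch of that computation follows; lexical_address_of is a hypothetical helper, not the exercise's required lexical_address function:

```python
def lexical_address_of(identifier, scopes):
    # scopes: innermost contour first; each contour lists its identifiers
    for depth, contour in enumerate(scopes):
        if identifier in contour:
            return [identifier, depth, contour.index(identifier)]
    return [identifier, "free"]

# scope stack at the reference site of  let a = 2 in let b = 3 in a
scopes = [["b"], ["a"]]
```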
Exercise 11.2.6 (Friedman, Wand, and Haynes 2001, Exercise 3.24, p. 90) Modify
the Camille interpreter defined in this section to demonstrate that the value
bound to each identifier is found at the position given by its lexical address.
Specifically, modify the evaluate_expr function so that it accepts the
output of the lexical-address calculator function lexical_address built in
Programming Exercise 11.2.5 and passes both the identifier and the lexical
address of each reference to the apply_environment function. The function
apply_environment must look up the value bound to the identifier in the usual
way. It must then compare the lexical address to the actual rib (i.e., depth and
position) in which the value is found and print an informative message in the
format demonstrated in the following examples. If the leaf of an ntIdentifier
parse tree node is free, print [⟨identifier⟩ : free] as shown in line 9. Name the
lexical-address calculator function lexical_address and invoke it from the
main_func function (lines 46 and 69):
1 ...
2
3 global_tree = ""
4
5 ...
6
7 def p_line_expr(t):
8     '''program : expression'''
9     t[0] = t[1]
10     # save global_tree
11     global global_tree
12     global_tree = t[0]
13
14 ...
15
16 def parser_feed(s, parser):
17     pattern = re.compile("[^ \t]+")
18     if pattern.search(s):
19         try:
20             parser.parse(s)
21         except InterpreterException as e:
22             print("Line %s: %s" % (e.linenumber, e.message))
23             if (e.additional_information != None):
24                 print("Additional information:")
25                 print(e.additional_information)
26         except Exception as e:
27             print("Unknown Error occurred ")
28             print("(this is normally caused by a Python syntax error)")
29             raise e
30
31 def main_func():
32     parser = yacc.yacc()
33     interactiveMode = False
34     global global_tree
35     if len(sys.argv) == 1:
36         interactiveMode = True
37
38     if interactiveMode:
39         program = ""
40         try:
41             prompt = 'Camille> '
42             while True:
43                 line = input(prompt)
44                 if (line == "" and program != ""):
45                     parser_feed(program, parser)
46                     lexical_address(global_tree[0], 0, [])
47                     print(evaluate_expr(global_tree[0], empty_environment()))
48                     lexer.lineno = 1
49                     global_tree = []
50                     program = ""
51                     prompt = 'Camille> '
52                 else:
53                     if (line != ""):
54                         program += (line + '\n')
55                         prompt = ''
56
57         except EOFError as e:
58             sys.exit(0)
59
60         except Exception as e:
61             print(e)
62             sys.exit(-1)
63     else:
64         try:
65             with open(sys.argv[1], 'r') as script:
66                 file_string = script.read()
67                 parser_feed(file_string, parser)
68                 for tree in global_tree:
69                     lexical_address(tree, 0, [])
70                     print(evaluate_expr(tree, empty_environment()))
71             sys.exit(0)
72         except Exception as e:
73             print(e)
74             sys.exit(-1)
75
76 main_func()
def apply_environment(environ, symbol, depth, position):
    def apply_environment_with_depth(environ1, current_depth):
        if environ1.flag == "empty-environment-record":
            raise IndexError
        elif environ1.flag == "extended-environment-record":
            try:
                pos = environ1.symbols.index(symbol)
                value = environ1.values[pos]
                print("Just found the value %s at depth %s = %s and "
                      "position %s = %s." % (value, current_depth,
                                             depth, pos, position))
                return value
            except ValueError:
                return apply_environment_with_depth(environ1.environ,
                                                    current_depth + 1)
        elif environ1.flag == \
             "recursively-extended-environment-record":
            try:
                pos = environ1.fun_names.index(symbol)
                value = make_closure(environ1.parameterlists[pos],
                                     environ1.bodies[pos], environ1)
                print("Just found the value %s at depth %s = %s and "
                      "position %s = %s." % (value, current_depth,
                                             depth, pos, position))
                return value
            except ValueError:
                return apply_environment_with_depth(environ1.environ,
                                                    current_depth + 1)
    return apply_environment_with_depth(environ, 0)
Examples:
1 $ ./run
2 Camille> let a = 5 in a
3
4 Just found the value 5 at depth 0 = 0 and position 0 = 0.
5 5
6 let a = 5 in [0,0]
7 Camille> let a = 5 in i
8
9 [i : free]
10 (3, "Unbound identifier 'i'")
11 Camille> let a = 2 in let b = 3 in a
12
13 Just found the value 2 at depth 1 = 1 and position 0 = 0.
14 2
15 Camille> let f = fun (y,z) +(y,-(z,5)) in (f 2,28)
16
17 Just found the value <function make_closure.<locals>.<lambda> at
18 0x1085b9378> at depth 0 = 0 and position 0 = 0.
19 Just found the value 2 at depth 0 = 0 and position 0 = 0.
20 Just found the value 28 at depth 0 = 0 and position 1 = 1.
21 25
Exercise 11.2.7 Complete Programming Exercise 11.2.6, but this time use a list-of-
lists representation of an environment from Programming Exercise 9.8.5.a.
Exercise 11.2.8 Complete Programming Exercise 11.2.6, but this time use a closure
representation of an environment from Section 9.8.3.
Exercise 11.2.9 Since lexically bound identifiers are superfluous in the abstract-
syntax tree processed by an interpreter, we can completely replace each lexically
bound identifier with its lexical address. In this exercise, you build an interpreter
that supports functions and uses a list-of-lists representation of a nameless envi-
ronment. In other words, extend Camille 2.0(named LOLR) built in Programming
Exercise 11.2.7 to use a completely nameless environment. Alternatively, extend
Camille 1.2(nameless LOLR) built in Programming Exercise 10.4 with functions.
(a) Modify your solution to Programming Exercise 11.2.5 so that its output for a
reference contains only the lexical address, not the identifier. That is, replace
the leaf of each ntIdentifier node with a [depth, pos] list, where
(depth, pos) is the lexical address for this occurrence of the identifier, unless
the occurrence of ntIdentifier is free. If the leaf of an ntIdentifier
node is free, print [free] as shown in line 7 of the following examples.
Examples:
1 $ ./run
2 Camille> let a = 5 in a
3
4 let a = 5 in [0,0]
5 Camille> let a = 5 in i
6
7 let a = 5 in [free]
8 Camille> let a = 2 in let b = 3 in a
9
10 let a = 2 in let b = 3 in [1,0]
11 Camille> let f = fun (y,z) +(y,-(z,5)) in (f 2,28)
12
13 let f = fun(y, z) +([0,0], -([0,1], 5)) in ([0,0] 2, 28)
14 Camille>
(b) (Friedman, Wand, and Haynes 2001, Exercise 3.25, p. 90) Build a list-of-lists
(i.e., ribcage) representation of a nameless environment (Figure 11.5) with the
following interface:
def empty_nameless_environment()
def extend_nameless_environment (values, environment)
def apply_nameless_environment (environment, depth, position)
[Figure 11.5: a list-of-lists (ribcage) representation of a nameless environment; each rib is a list of values (including Closure values) followed by the rest of the environment.]
Exercise 11.2.10 Complete Programming Exercise 11.2.9, but this time use an
abstract-syntax representation of a nameless environment (Figure 11.6). In other
words, modify Camille 2.0(verify ASR) as built in Programming Exercise 11.2.6
to use a completely nameless environment. Alternatively, extend Camille
1.2(nameless ASR) as built in Programming Exercise 10.3 with functions. Start
by solving Programming Exercise 9.8.9 (i.e., developing an abstract-syntax
representation of a nameless environment).
Exercise 11.2.11 Complete Programming Exercise 11.2.9, but this time use a
closure representation of a nameless environment. In other words, modify
Camille 2.0(verify CLS) as built in Programming Exercise 11.2.8 to use a
completely nameless environment. Alternatively, extend Camille 1.2(nameless
CLS ) as built in Programming Exercise 10.5 with functions. Start by solving Pro-
gramming Exercise 9.8.7 (i.e., developing a closure representation of a nameless
environment).
Exercise 11.2.12 (Friedman, Wand, and Haynes 2001, Exercise 3.30, p. 91) Modify
the Camille interpreter defined in this section to use dynamic scoping to bind
references to declarations. For instance, in the Camille function f shown here, the
reference to the identifier s in the expression *(t,s) on line 5 is bound to 15,
not 10; thus, the return value of the call to (f s) on line 8 is 225 (under dynamic
scoping), not 150 (under static/lexical scoping).
Example:
1 Camille> let
2 s = 10
3 in
4 let
5 f = fun (t) *(t,s)
6 s = 15
7 in
8 (f s)
9
10 225
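Both results can be reproduced by a tiny environment-passing sketch in which the only difference is which environment the function body is evaluated in; dictionaries model environments, and make_fun and apply_fun are hypothetical names for this sketch:

```python
def make_fun(params, body, env):
    # remember the creation environment so static scoping can use it
    return {"params": params, "body": body, "env": env}

def apply_fun(f, args, call_env, dynamic=False):
    # static scoping extends the creation environment;
    # dynamic scoping extends the caller's environment instead
    base = call_env if dynamic else f["env"]
    env = dict(base)
    env.update(zip(f["params"], args))
    return f["body"](env)

outer = {"s": 10}                          # let s = 10 in ...
f = make_fun(["t"], lambda e: e["t"] * e["s"], outer)
inner = dict(outer, s=15)                  # ... let f = ... s = 15 in (f s)
static_result = apply_fun(f, [inner["s"]], inner)                 # 150
dynamic_result = apply_fun(f, [inner["s"]], inner, dynamic=True)  # 225
```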
ntLetRec
⟨letrec_expression⟩ ::= letrec ⟨letrec_statement⟩ in ⟨expression⟩
ntLetRecStatement
⟨letrec_statement⟩ ::= ⟨letrec_assignment⟩
⟨letrec_statement⟩ ::= ⟨letrec_assignment⟩ ⟨letrec_statement⟩
ntLetRecAssignment
⟨letrec_assignment⟩ ::= ⟨identifier⟩ = ⟨recursive_function⟩
11.3. RECURSIVE FUNCTIONS 441
ntRecFuncDecl
⟨recursive_function⟩ ::= fun ( {⟨identifier⟩}*(,) ) ⟨expression⟩
def p_expression_let_rec(t):
'''expression : LETREC letrec_statement IN expression'''
t[0] = Tree_Node(ntLetRec, [t[2], t[4]], None, t.lineno(1))
def p_letrec_statement(t):
'''letrec_statement : letrec_assignment
| letrec_assignment letrec_statement'''
    if len(t) == 3:
        t[0] = Tree_Node(ntLetRecStatement, [t[1], t[2]], None,
                         t.lineno(1))
    else:
        t[0] = Tree_Node(ntLetRecStatement, [t[1]], None, t.lineno(1))
def p_letrec_assignment(t):
'''letrec_assignment : IDENTIFIER EQ rec_func_decl'''
t[0] = Tree_Node(ntLetRecAssignment, [t[3]], t[1], t.lineno(1))
def p_expression_rec_func_decl(t):
'''rec_func_decl : FUN LPAREN parameters RPAREN expression
| FUN LPAREN RPAREN expression'''
if len(t) == 6:
t[0] = Tree_Node(ntRecFuncDecl, [t[3], t[5]], None, t.lineno(1))
else:
t[0] = Tree_Node(ntRecFuncDecl, [t[4]], None, t.lineno(1))
2. Else,

apply_environment(e1, name) = apply_environment(environ, name)
class Environment:
    def __init__(self, symbols=None, values=None, fun_names=None,
                 parameterlists=None, bodies=None, environ=None):
        if symbols == None and values == None and fun_names == None and \
           parameterlists == None and bodies == None and environ == None:
            self.flag = "empty-environment-record"
        elif fun_names == None and parameterlists == None and \
             bodies == None:
            self.flag = "extended-environment-record"
            self.symbols = symbols
            self.values = values
            self.environ = environ
        elif symbols == None and values == None:
            self.flag = "recursively-extended-environment-record"
            self.fun_names = fun_names
            self.parameterlists = parameterlists
            self.bodies = bodies
            self.environ = environ

def extend_environment_recursively(fun_names1, parameterlists1, bodies1,
                                   environ1):
    return Environment(fun_names=fun_names1,
                       parameterlists=parameterlists1,
                       bodies=bodies1, environ=environ1)

def apply_environment(environ, symbol):
    if environ.flag == "empty-environment-record":
        ...
    elif environ.flag == "extended-environment-record":
        ...
    elif environ.flag == "recursively-extended-environment-record":
        try:
            position = environ.fun_names.index(symbol)
            return make_closure(environ.parameterlists[position],
                                environ.bodies[position], environ)
        except:
            return apply_environment(environ.environ, symbol)
[Figure: the closure representation of a recursive environment for the mutually recursive functions even and odd — the list of identifiers, the list of values (Closure S), fun_names mapping to Closure values, and the rest of the environment.] The environment stored in such a closure is the environment containing the closure, not the environment in which the closure is created.
Exercise 11.3.1 Even though the make_closure function is called in the definition of extend_environment_recursively for the closure representation of a recursive environment, the closure is still created every time the name of the recursive function is looked up in the environment. Explain.
Exercise 11.3.2 Can a let* expression evaluated using dynamic scoping achieve
the same result (i.e., recursion) as a letrec expression evaluated using lexical
scoping? In other words, does a let* expression evaluated using dynamic scoping
simulate a letrec expression? Explain.
11.3.6  2.1(nameless ASR)     letrec  nameless environment  2.0(nameless ASR) or 2.1(named ASR)    ASR | CLS          ASR
11.3.7  2.1(nameless LOLR)    letrec  nameless environment  2.0(nameless LOLR) or 2.1(named LOLR)  ASR | CLS          LOLR
11.3.8  2.1(nameless CLS)     letrec  nameless environment  2.0(nameless CLS) or 2.1(named CLS)    ASR | CLS          CLS
11.3.9  2.1(dynamic scoping)  letrec  dynamic scoping       2.0(dynamic scoping) or 2.1            lambda expression  CLS | ASR | LOLR
Table 11.2 New Versions of Camille, and Their Essential Properties, Created in the
Section 11.3.3 Programming Exercises. (Key: ASR = abstract-syntax representation;
CLS = closure; LOLR = list-of-lists representation.)
Exercise 11.3.4 (Friedman, Wand, and Haynes 2001, Exercise 3.34, p. 95) Build
a list-of-lists representation of a nameless, recursive environment (Figure 11.11).
Complete Programming Exercise 9.8.5.b or 11.2.9.b, but this time make the list-of-
lists representation of the nameless environment recursive.
Exercise 11.3.6 (Friedman, Wand, and Haynes 2001) Augment the solution to
Programming Exercise 11.2.10 with letrec. In other words, extend Camille
2.0(nameless ASR) with letrec. Alternatively, modify Camille 2.1(named ASR)
to use a nameless environment. Reuse the abstract-syntax representation of a
recursive, nameless environment built in Programming Exercise 11.3.3.
Exercise 11.3.7 (Friedman, Wand, and Haynes 2001, Exercise 3.34, p. 95) Augment
the solution to Programming Exercise 11.2.9 with letrec. In other words,
extend Camille 2.0(nameless LOLR) with letrec. Alternatively, modify Camille
2.1(named LOLR) to use a nameless environment. Reuse the list-of-lists
representation of a recursive, nameless environment built in Programming
Exercise 11.3.4.
[Figure: the evolution of Camille interpreters. Camille 1.3 (let, let*, if/else) is extended to 2.0 (non-recursive functions; CLS | ASR | LOLR environments; static scoping), from which branch 2.0(dynamic scoping), the verify variants 2.0(verify ASR/LOLR/CLS), the nameless versions 2.0(nameless ASR/LOLR/CLS), and the recursive versions 2.1 (recursive functions; static scoping), 2.1(dynamic scoping), and 2.1(nameless).]
Exercise 11.3.8 (Friedman, Wand, and Haynes 2001) Augment the solution to
Programming Exercise 11.2.11 with letrec. In other words, extend Camille
2.0(nameless CLS) with letrec. Alternatively, modify Camille 2.1(named CLS)
to use a nameless environment.
[Figure: the closure representation of a recursive, nameless environment — a list of values (Closure S) and the rest of the environment.]
Exercise 11.3.9 Modify the Camille interpreter defined in this section to use
dynamic scoping to bind references to declarations. For instance, in the recursive
Camille function pow shown here, the reference to the identifier s in the expression
*(s, (pow -(t,1))) on line 5 is bound to 3, not 2; thus, the return value of the
call to (pow 2) on line 10 is 9 (under dynamic scoping), not 4 (under static/lexical
scoping).
Example:
1 Camille> let
2 s = 2
3 in
4 letrec
5 pow = fun(t) if zero?(t) 1 else *(s, (pow -(t,1)))
6 in
7 let
8 s = 3
9 in
10 (pow 2)
11
12 9
                Named                               Nameless
Non-recursive   CLS (Section 9.8.3)                 CLS (PE 9.8.7)
                ASR (Section 9.8.4; Figure 11.2)    ASR (Figure 11.6; PE 9.8.9)
                LOLR (Figure 11.3; PE 9.8.5.a)      LOLR (Figure 11.5; PE 9.8.5.b/11.2.9.b)
Recursive
Table 11.4 Camille Interpreters in Python Developed in This Text Using All
Combinations of Non-recursive and Recursive Functions, and Named and
Nameless Environments. All interpreters identified in this table work with both the
CLS and ASR of closures (Key: ASR = abstract-syntax representation; CLS = closure;
LOLR = list-of-lists representation; PE = programming exercise.)
[Figure: the full progression of Camille interpreters from Chapter 10 onward. Camille 1.0 (simple; no environment; Chapter 10: Conditionals) grows into 1.1 (let), 1.1(named CLS), and 1.2 (let, if/else; CLS environment); 1.2 is extended to 2.0 (non-recursive functions; CLS | ASR | LOLR environments; static scoping), which branches into 2.0(dynamic scoping), the 2.0(verify) and 2.0(nameless) variants, and the recursive versions 2.1 (recursive functions; static scoping), 2.1(dynamic scoping), and 2.1(nameless).]
Local Binding            ×   ↑ let   ↑ let       ↑ let, let*   ↑ let, let*   ↑ let, let*
Conditionals             ×   ×       ↓ if/else   ↓ if/else     ↓ if/else     ↓ if/else
Non-recursive Functions  ×   ×       ×           ×             ↑ fun         ↑ fun
Recursive Functions      ×   ×       ×           ×             ×             ↑ letrec
Table 11.6 Concepts and Features Implemented in Progressive Versions of Camille. The symbol ↓ indicates that the concept is
supported through its implementation in the defining language (here, Python). The Python keyword included in each cell, where
applicable, indicates which Python construct is used to implement the feature in Camille. The symbol ↑ indicates that the concept
is implemented manually. The Camille keyword included in each cell, where applicable, indicates the syntactic construct through
which the concept is operationalized. (Key: × = not supported; ASR = abstract-syntax representation; CLS = closure; LOLR = list-of-lists representation.
Cells in boldface font highlight the enhancements across the versions.)
Parameter Passing
ntAssignment
<expression> ::= assign! <identifier> = <expression>
def p_expression_assign(t):
'''expression : ASSIGN IDENTIFIER EQ expression'''
t[0] = Tree_Node(ntAssignment, [t[4]], t[2], t.lineno(1))
let
a = 1
b = 2
in
{ ignored = assign! a = inc1(a); --- a++;
ignored2 = assign! b = inc1(b); --- b++;
+(a,b) }
The identifier ignored receives the return value of the two assignment
statements. The return value of the assignment statement in C and C++ is the value
of the expression on the right-hand side of the assignment operator.
swap = fun(x,y)
let
temp = x
in
let
ignored1 = assign! x = y
in
assign! y = temp
in
let
ignored2 = (swap a,b)
in
-(a, b) --- returns -1, not 1
-1
Here, the values of a and b are not swapped because both are passed to the swap
function by value.
Reference
[Figure: a Reference consists of a position (here 3) into a vector — a Python list [7, 5, 1, 3, 8] — and thus denotes the value 3 stored at index 3.]
class Reference:
    def __init__(self, position, vector):
        self.position = position
        self.vector = vector

    def primitive_dereference(self):
        return self.vector[self.position]

    def dereference(self):
        try:
            return self.primitive_dereference()
        except:
            raise Exception("Illegal dereference.")

    def assignreference(self, value):
        try:
            self.primitive_assignreference(value)
        except:
            raise Exception("Illegal creation of reference.")
Now that we have a Reference data type, we must modify the environment
implementation so that it can make use of references. We assume that denoted
values in an environment are of the form Ref(v) for some value v. We realize this
environment structure by adding the function apply_environment_reference
to the environment interface. This function is similar to apply_environment,
except that when it finds the matching identifier, it returns the “reference to its
value” instead of its value (Friedman, Wand, and Haynes 2001). Therefore, as in
Scheme, all denoted values in Camille are references:
expressed value = integer ∪ closure
denoted value = reference to an expressed value
Thus,
denoted value ≠ expressed value (= integer ∪ closure)
The function apply_environment then can be defined through the
apply_environment_reference and dereference (Friedman, Wand,
and Haynes 2001) functions:
elif environ.flag == "recursively-extended-environment-record":
    try:
        position = environ.fun_names.index(identifier)
        # pass-by-value
        return Reference(0,
                         [make_closure(environ.parameterlists[position],
                                       environ.bodies[position], environ)])
    except:
        return apply_environment_reference(environ.environ, identifier)
elif expr.type == ntAssignment:
    tempref = apply_environment_reference(environ, expr.leaf)
    tempref.assignreference(evaluate_expr(expr.children[0], environ))
    return 1
Notice that a value is returned. Here, we explicitly return the integer 1 (as seen in
the last line of code) because the return value of the function assignreference
is unspecified and we must always return an expressed value. When using
assignment statements in a variety of programming languages, the return value
can be ignored (e.g., x--; in C). In Camille, the return value of an assignment
statement is ignored, especially when a series of assignment statements are used
within a series of let expressions to simulate sequential execution, as illustrated
in this section.
1 let
2 new_stack = fun ()
3 let*
4 empty_stack = fun(msg)
5 if eqv?(msg,1)
6 200 --- cannot top an empty stack
7 else
8 if eqv?(msg,2)
9 100 --- cannot pop an empty stack
10 else if eqv?(msg,3)
11 1 --- represents true: stack is empty
12 else
13 300 --- not a valid message
14 stack_data = empty_stack
15 prior_stack_data = empty_stack
16 in
17 let
18 --- constructor
19 push = fun (item)
20 let
21 ignore = assign! prior_stack_data = stack_data
22 in
23 assign! stack_data =
24 fun(msg)
25 if eqv?(msg,1)
26 item
27 else
28 if eqv?(msg,2)
29 assign! stack_data = prior_stack_data
30 else if eqv?(msg,3)
31 0 --- represents false:
32 --- stack is not empty
33 else
34 300 --- not a valid message
35 --- observers
36 empty? = fun () (stack_data 3)
37 top = fun () (stack_data 1)
38 pop = fun () (stack_data 2)
39 reset = fun () assign! stack_data = empty_stack
40 in
41 let
42 --- collection_of_functions uses
43 --- a closure to simulate an array
44 collection_of_functions = fun(i)
45 if eqv?(i,3)
46 empty?
47 else
48 if eqv?(i,1)
49 top
50 else
51 if eqv?(i,2)
52 pop
53 else
54 if eqv?(i,4)
55 push
56 else if eqv?(i,5)
57 reset
58 else
59 400
60 in
61 collection_of_functions
62 get_empty?_method = fun(stk) (stk 3)
63 get_push_method = fun(stk) (stk 4)
64 get_top_method = fun(stk) (stk 1)
65 get_pop_method = fun(stk) (stk 2)
66 get_reset_method = fun(stk) (stk 5)
67 in
68 let
69 s1 = (new_stack)
70 s2 = (new_stack)
71 in
72 let
73 empty1? = (get_empty?_method s1)
74 push1 = (get_push_method s1)
75 top1 = (get_top_method s1)
76 pop1 = (get_pop_method s1)
77 reset1 = (get_reset_method s1)
78 empty2? = (get_empty?_method s2)
79 push2 = (get_push_method s2)
80 top2 = (get_top_method s2)
81 pop2 = (get_pop_method s2)
82 reset2 = (get_reset_method s2)
83 in
84 --- main program
85 let*
86 t1 = (push1 15)
87 t2 = (push1 16)
88 t3 = (push2 inc1((top1)))
89 t4 = (push2 31)
90 in
91 if eqv?((top2),0)
92 (top1)
93 else
94 let
95 d = (pop2)
96 in
97 (top2)
In this version of the stack object, the stack is a true object because its methods
are encapsulated within it. Notice that the let expression on lines 41–61 builds
and returns a closure that simulates an array (of stack functions): It accepts
an index i as an argument and returns the stack function located at that
index.
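The same closure-as-array dispatch idiom is easy to sketch in Python itself; make_counter and its message indices below are illustrative and not taken from the Camille listing:

```python
def make_counter():
    count = [0]   # one-element list: mutable state shared by the closures

    def increment():
        count[0] += 1
        return count[0]

    def value():
        return count[0]

    # a closure that simulates an array of methods:
    # it accepts an index i and returns the function stored at that index
    def dispatch(i):
        if i == 1:
            return increment
        elif i == 2:
            return value
        else:
            return None   # not a valid message

    return dispatch

c = make_counter()
(c(1))()               # increment
(c(1))()               # increment
print((c(2))())        # 2
```

As in the Camille stack, the "object" is just the dispatch closure, and its "methods" are retrieved by indexing into it.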
Table 12.1 New Versions of Camille, and Their Essential Properties, Created in the
Programming Exercises of This Section (Key: ASR = abstract-syntax representation;
CLS = closure.)
Table 12.1 summarizes the properties of the new versions of the Camille
interpreter developed in the programming exercises in this section.
Exercise 12.2.2 Write a Camille program that defines the mutually recursive
functions iseven? and isodd? (i.e., each function invokes the other). Neither of
these functions accepts any arguments. Instead, they communicate with each other
by changing the state of a shared “global” variable n that represents the number
being checked. The functions should each decrement the variable n throughout
the lifetime of the program until it reaches 0—the base case. Thus, the functions
iseven? and isodd? communicate by side effect rather than by returning
values.
Exercise 12.2.3 (Friedman, Wand, and Haynes 2001, Exercise 3.41, p. 103) In
Scheme and Java, everything is a reference (except for primitives in Java), although
both languages use implicit (pointer) dereferencing. Thus, it may appear as
if no denoted value represents a reference in these languages. In contrast, C
has reference (e.g., int* intptr;) and non-reference (e.g., int x;) types and
uses explicit (pointer) dereferencing (e.g., *intptr). Thus, an alternative scheme for
variable assignment in Camille is to have references be expressed values, and
have allocation, dereferencing, and assignment operators be explicitly used by the
programmer (as in C):
Modify the Camille interpreter of this section to implement this alternative design,
with the following new primitives:
In this version of Camille, the counter program at the beginning of Section 12.2 is
rendered as follows:
let
g = let
count = cell(0)
in
fun()
let
ignored = assigncell(count, inc1(contents(count)))
in
contents(count)
in
+((g), (g))
Exercise 12.2.4 (Friedman, Wand, and Haynes 2001, Exercise 3.42, p. 105) Add
support for arrays to Camille. Modify the Camille interpreter presented in this
section to implement arrays. Use the following interface for arrays:
Thus,
Note that the first occurrence of “reference” (on the right-hand side of the equal
sign in the first equality expression) can be a different implementation of references
than that described in this section. For example, a Python list is already a sequence
of references.
What is the result of the following Camille program?
let
a = array(2) --- allocates a two-element array
p = fun(x)
let
v = arrayreference(x,1)
in
arrayassign(x, 1, inc1(v))
in
let
ignored = arrayassign(a, 1, 0)
in
let
ignored = (p a)
in
let
ignored = (p a)
in
arrayreference(a,1)
Exercise 12.2.5 Rewrite the Camille stack object program in Section 12.2.5 so that
it uses arrays. Specifically, eliminate the closure that simulates an array (of stack
functions) built and returned through the let expression on lines 41–61 and use
an array instead to store the collection of stack functions. Use the array-creation
and -manipulation interface presented in Programming Exercise 12.2.4.
12.3.1 Pass-by-Value
Pass-by-value is a parameter-passing mechanism in which copies of the arguments
are passed to the function. For this reason, pass-by-value is sometimes referred to
as pass-by-copy. Consider the classical swap function in C:
$ cat swap_pbv.c
#include <stdio.h>

/* swap pass-by-value */
void swap(int a, int b) {
   int temp = a;
   a = b;
   b = temp;

   printf("In swap: ");
   printf("a = %d, b = %d.\n", a, b);
}

int main() {
   int x = 3;
   int y = 4;

   printf("In main, before call to swap: x = %d, y = %d.\n", x, y);
   swap(x, y);
   printf("In main, after call to swap: x = %d, y = %d.\n", x, y);
}
$ gcc swap_pbv.c
$ ./a.out
C only passes arguments by value (i.e., by copy). Figure 12.2 shows the run-time
stack of this swap function with signature void swap(int a, int b):
As can be seen, the function does not swap the two integers.
[Figure 12.2 shows four snapshots of the run-time stack: before the call (x = 3 and y = 4 in main); after the call (swap holds the copies a = 3 and b = 4); while swap executes (a = 4 and b = 3 within swap's activation record); and after swap returns (x and y in main remain 3 and 4).]
Figure 12.2 Passing arguments by value in C. The run-time stack grows upward.
(Key: □ = memory cell; ··· = activation-record boundary.)
Java also only passes arguments by value. Consider the following swap
method in Java, which accepts integer primitives as arguments:
class NoSwapPrimitive {
   private static void swap(int a, int b) {
      int temp = a;
      a = b;
      b = temp;
      System.out.println("In swap: x = " + a + ", y = " + b + ".");
   }

   public static void main(String[] args) {
      int x = 3;
      int y = 4;
      System.out.println("In main, before call to swap: x = " + x + ", y = " + y + ".");
      NoSwapPrimitive.swap(x, y);
      System.out.println("In main, after call to swap: x = " + x + ", y = " + y + ".");
   }
}
$ javac NoSwapPrimitive.java
$ java NoSwapPrimitive
In main, before call to swap: x = 3, y = 4.
In swap: x = 4, y = 3.
In main, after call to swap: x = 3, y = 4.
The status of the run-time stack in Figure 12.2 applies to this Java swap method
with signature void swap(int a, int b) as well. Since all parameters, including
primitives, are passed by value in Java, this swap method does not swap the two
integers. Consider the following version of the swap program in Java, where the
arguments to the swap method are references to objects instead of primitives:
class NoSwapObject {
   private static void swap(Integer a, Integer b) {
      Integer temp = a;
      a = b;
      b = temp;
      System.out.println("In swap: x = " + a + ", y = " + b + ".");
   }

   public static void main(String[] args) {
      Integer x = Integer.valueOf(3);
      Integer y = Integer.valueOf(4);
      System.out.println("In main, before call to swap: x = " + x + ", y = " + y + ".");
      NoSwapObject.swap(x, y);
      System.out.println("In main, after call to swap: x = " + x + ", y = " + y + ".");
   }
}
$ javac NoSwapObject.java
$ java NoSwapObject
In main, before call to swap: x = 3, y = 4.
In swap: x = 4, y = 3.
In main, after call to swap: x = 3, y = 4.
Figure 12.3 illustrates the run-time stack during the execution of this Java swap
method with signature void swap(Integer a, Integer b):
1. (top left) Before swap is called. Notice the denoted values of x and y are
references to objects.
2. (top right) After swap is called. Notice that copies of the references x and y are
passed in.
3. (bottom left) While swap executes. Notice that the references are swapped
rather than the objects to which they point. As before, the swap takes place
within the activation record of the swap method, not main.
4. (bottom right) After swap returns.
As can be seen, this swap method does not swap its Integer object-reference
arguments. The references to the objects in main are not swapped because “Java
manipulates objects ’by reference,’ but it passes object references to methods
’by value’” (Flanagan 2005). Consequently, a swap method intended to swap
primitives or references to objects cannot be defined in Java.
Scheme also only supports passing arguments by value. Thus, as in Java,
references in Scheme are passed by value. However, unlike in Java, all denoted
values are references to expressed values in Scheme. Consider the following
Scheme program:
(define swap
  (lambda (a b)
    (let ((temp a))          ; temp = a
      (begin
        (set! a b)           ; a = b
        (set! b temp)        ; b = temp
        (display "In swap: a=")
        (display a)
        (display ", b=")
        (display b)
        (display ".")
        (newline)))))

(let ((x 3) (y 4))
  (begin
    (display "Before call to swap: x=")
    (display x)
    (display ", y=")
    (display y)
    (display ".")
    (newline)
    (swap x y)
    (display "After call to swap: x=")
    (display x)
    (display ", y=")
    (display y)
    (display ".")
    (newline)))
Figure 12.4 depicts the run-time stack as this Scheme program executes:
1. (top left) Before swap is called. Notice the denoted values of x and y are
references to expressed values.
2. (top right) After swap is called. Notice that copies of the references x and y are
passed in.
3. (bottom left) While swap executes. Notice that it is the references that are
swapped. As before, the swap takes place within the activation record of
the swap function, not the outermost let expression.
4. (bottom right) After swap returns.
As can be seen, this swap function does not swap its reference arguments.
Passing a reference by copy has been referred to as pass-by-sharing, especially in
languages where all denoted values are references (e.g., Scheme, and Java except
for primitives), though use of that term is not common.
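Python is another language in this family: every argument is a reference passed by value (i.e., by sharing), so rebinding a parameter never affects the caller's binding. A small sketch with hypothetical names:

```python
def swap(a, b):
    # a and b receive copies of the references x and y;
    # rebinding them exchanges only the local copies
    a, b = b, a
    return (a, b)          # the exchange is visible only inside swap

x, y = 3, 4
inside = swap(x, y)
print(inside, (x, y))      # (4, 3) (3, 4)
```

The caller's bindings x and y are untouched after the call, exactly as in the Java and Scheme examples.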
Notice also the primary difference between denoted values in C and Scheme
in Figures 12.2 and 12.4, respectively. In Scheme, all denoted values are references
to expressed values; in C, denoted values are the same as expressed values. We
need to explore the pass-by-reference parameter-passing mechanism to define a
swap function that successfully swaps its arguments in the calling function (i.e.,
persistently).
472 CHAPTER 12. PARAMETER PASSING
temp
b
swap
x
main 3
x
main 3 y
4
y
4
temp
b
swap
x x
main 3 main 3
y y
4 4
Figure 12.3 Passing of references (to objects) by value in Java. The run-time stack
grows upward. (Key: l = memory cell; ˝ = object; ˛Ñ = reference; ¨ ¨ ¨ = activation-
record boundary.)
12.3.2 Pass-by-Reference
In the pass-by-reference parameter-passing mechanism, the called function is passed
a direct reference to the argument. As a result, changes made to the corresponding
parameter in the called function affect the value of the argument in the calling
function. Consider the classical swap function in C++:
[Figure 12.4 shows four snapshots of the run-time stack: before the call (x and y in the let reference the expressed values 3 and 4); after the call (swap holds copies of those references); while swap executes (the references a and b within swap are exchanged, with temp referencing 3); and after swap returns (x and y in the let still reference 3 and 4).]
Figure 12.4 Passing arguments by value in Scheme. The run-time stack grows
upward. (Key: □ = memory cell; → = reference; ··· = activation-record boundary.)
$ cat swap_pbv.cpp
#include <iostream>

using namespace std;

/* swap pass-by-reference */
void swap(int& a, int& b) {
   int temp = a;
   a = b;
   b = temp;

   cout << "In swap: ";
   cout << "a = " << a << ", b = " << b << endl;
}

int main() {
   int x = 3;
   int y = 4;

   cout << "In main, before call to swap: x = " << x << ", y = " << y << endl;
   swap(x, y);
   cout << "In main, after call to swap: x = " << x << ", y = " << y << endl;
}

$ g++ swap_pbv.cpp
$ ./a.out
In main, before call to swap: x = 3, y = 4
In swap: a = 4, b = 3
In main, after call to swap: x = 4, y = 3
1. (top left) Before swap is called. Notice the denoted values of x and y are int
and not references to integers.
2. (top right) After swap is called. Notice that references to x and y are passed
in.
3. (bottom left) While swap executes. Notice that changes to the parameters a
and b are reflected in the arguments x and y in main. Thus, unlike with pass-
by-value, the swap here takes place within the activation record of the main
function, and not swap.
4. (bottom right) After swap returns.
As can be seen, this swap function does swap its integer arguments.
As discussed previously, C supports only pass-by-value. However, we can
simulate pass-by-reference in C by passing the memory address of a variable by
value. Consider the following C program:
1 $ cat swap_pabv.c
2 # include <stdio.h>
3
4 /* swap pass address by value: simulated pass-by-reference */
5 void swap(int *a, int *b) {
6    int temp = *a;
7    *a = *b;
8    *b = temp;
9
10 printf("In swap: ");
11 printf("a = %x, b = %x and ", a, b);
12 printf("*a = %d, *b = %d.\n", *a, *b);
13 }
14
15 int main() {
16
17    int x = 3;
18    int y = 4;
19
20 printf("In main, before call to swap: ");
21 printf("&x = %x, &y = %x and ", &x, &y);
22 printf("x = %d, y = %d.\n", x, y);
23
24 swap (&x, &y);
25
26 printf("In main, after call to swap: ");
27 printf("&x = %x, &y = %x and ", &x, &y);
28 printf("x = %d, y = %d.\n", x, y);
29 }
30
31 $ gcc swap_pabv.c
32 $ ./a.out
33 In main() before call to swap: &x = ef0816ec, &y = ef0816e8 and
34 x = 3, y = 4.
35 In swap, a = ef0816ec, b = ef0816e8 and *a = 4, *b = 3.
36 In main() after call to swap: &x = ef0816ec, &y = ef0816e8 and
37 x = 4, y = 3.
[Figure: two snapshots of the run-time stack for the simulated pass-by-reference swap in C — before the call, x = 3 and y = 4 in main; after the call returns, x = 4 and y = 3 in main.]
In general, for a C function to modify (i.e., mutate) a variable that is not local to the
function, but is also not a global variable, the function must receive a copy of the
memory address of the variable it intends to modify rather than its value.
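A comparable workaround is available in languages without an address-of operator: pass a mutable container by value and mutate through it. The sketch below uses Python one-element lists standing in for C pointers; the names are illustrative:

```python
def swap(a, b):
    # a and b are one-element lists acting as references;
    # mutating their contents (rather than rebinding a or b)
    # is visible to the caller
    a[0], b[0] = b[0], a[0]

x = [3]
y = [4]
swap(x, y)
print(x[0], y[0])   # 4 3
```

The references themselves are still copied in, but both copies point at the same cells, so the mutation persists after the call.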
Pass-by-value and pass-by-reference are the two most widely supported
parameter-passing mechanisms in programming languages. However, a variety
of other mechanisms are supported, especially the pass-by-name and pass-by-
need approaches, which are commonly referred to as lazy evaluation (Section 12.5).
We complete this section by briefly discussing pass-by-result and pass-by-value-
result.
[Figure: the run-time stack during the simulated pass-by-reference swap. The parameters a and b in swap hold the addresses 16ec and 16e8 of x and y in main; by assigning through these addresses, swap changes x from 3 to 4 and y from 4 to 3.]
12.3.3 Pass-by-Result
In the pass-by-value mechanism, copies of the values of the arguments are passed
to the called function, but nothing is passed back to the caller. The pass-
by-result parameter-passing mechanism is the reverse of this approach: No data is
passed in to the called function, but copies of the values of the parameters in the
called function are passed back to the caller. Consider the following C program:
1 void f(int a, int b) {
2 printf("In f, before assignments: ");
12.3.4 Pass-by-Value-Result
Pass-by-value-result (sometimes referred to as pass-by-copy-restore) is a combination
of the pass-by-value (on the front end of the call) and pass-by-result (on the
back end of the call) parameter-passing mechanisms. In the pass-by-value-result
mechanism, arguments are passed into the called function in the same manner
as with the pass-by-value approach (i.e., by copy). However, the values of the
corresponding parameters within the called function are passed back to the caller
in the same manner as with the pass-by-result mechanism (i.e., by copy). Consider
the following C program:
[Figure 12.7 shows two snapshots of the run-time stack: after the call, the parameters a and b in f are uninitialized while x = 3 and y = 4 in main; after f assigns a = 1 and b = 2 and returns, those values are copied back, leaving x = 1 and y = 2 in main.]
Figure 12.7 Passing arguments by result. The run-time stack grows upward. (Key:
□ = memory cell; ··· = activation-record boundary.)
void f(int a, int b) {
   printf("In f, before assignments: ");
   printf("a = %d, b = %d.\n", a, b);

   a = a + 2;
   b = b + 2;

   printf("In f, after assignments: ");
   printf("a = %d, b = %d.\n", a, b);
}

int main() {
   int x = 3;
   int y = 4;

   f(x,y);
}
[Figure 12.8 shows two snapshots of the run-time stack: after the call, f receives the copies a = 3 and b = 4 of x and y; after f adds 2 to each and returns, those values are copied back, leaving x = 5 and y = 6 in main.]
Figure 12.8 Passing arguments by value-result. The run-time stack grows upward.
(Key: □ = memory cell; ··· = activation-record boundary.)
Again, C syntax is used here only for purposes of illustration; it is not intended to
convey that C uses the pass-by-value-result mechanism. Figure 12.8 presents the
run-time stack of this function with signature void f(int a, int b).
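Python supports neither pass-by-result nor pass-by-value-result, but the copy-restore behavior can be simulated by returning the final parameter values and rebinding the arguments; the following sketch mirrors the C example above:

```python
def f(a, b):
    # front end: a and b are copies of the arguments (pass-by-value)
    a = a + 2
    b = b + 2
    # back end: copy the final parameter values out (pass-by-result)
    return a, b

x, y = 3, 4
x, y = f(x, y)    # "restore" the copies into the arguments
print(x, y)       # 5 6
```

The explicit rebinding on the caller's side plays the role of the automatic copy-back that a true pass-by-value-result language would perform.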
12.3.5 Summary
The following abbreviations identify the direction in which data flows between the
calling and called functions in parameter-passing mechanisms:
The following is a classification using these mnemonics to help think about the
parameter-passing mechanisms discussed in this section.
• pass-by-value: IN
• pass-by-result: OUT
• pass-by-reference: IN - OUT
• pass-by-value-result: IN at the front; OUT at the back
1 void f(int a, int b) {
2    a = a + 1;
3    b = b + 1;
4 }
5
6 int main() {
7    int x = 1;
8    f(x, x);
9    printf("x = %d.\n", x); /* what is the value of x here? */
10 }
11 /* if pass-by-value is used, x=?
12    if pass-by-reference is used, x=?
13    if pass-by-result is used, x=?
14    if pass-by-value-result is used, x=? */
Give the output that the printf statement on line 9 produces if the arguments
to the function f on line 8 are passed using the following parameter-passing
mechanisms:
(a) pass-by-value
(b) pass-by-reference
(c) pass-by-result
(d) pass-by-value-result
1 class Exercise {
2
3 private static void increment(Integer i) {
4
5 i = Integer.valueOf(Integer.valueOf(i) + 1);
6
7 System.err.print("In increment: ");
8 System.err.println("i = " + Integer.valueOf(i) + ".");
9 }
10
11 public static void main(String args[]) {
12
13 Integer i = Integer.valueOf(5);
14
15 System.err.print("In main, before call to increment: ");
16 System.err.println("i = " + Integer.valueOf(i) + ".");
17
18 Exercise.increment(i);
19
20 System.err.print("In main, after call to increment: ");
21 System.err.println("i = " + Integer.valueOf(i) + ".");
22 }
23 }
$ javac Exercise.java
$ java Exercise
In main, before call to increment: i = 5.
In increment: i = 6.
In main, after call to increment: i = 5.
Given that denoted values in Java are references for all identifiers declared as
objects, such as i on line 13, explain why the i on line 21 in the main method
does not reflect the incremented value of i (i.e., the value 6) after the call to the
method increment on line 18.
1 # include <stdio.h>
2
3 int i = 2;
4 int A[200];
5
6 void f(int x, int y) {
7 i = x+y;
8 }
9
10 int main() {
11 A[i] = 99;
12 f (i, A[i]);
13 printf("i = %d\n", i);
14 printf("A[i] = %d\n", A[i]);
15 }
Passing the arguments to the function f on line 12 using which of the parameter-
passing mechanisms discussed in this section produces the following output:
i = 2
A[i] = 99
Exercise 12.3.8 How can a called function, which is evaluated using the pass-by-
result parameter-passing mechanism, reference a parameter whose corresponding
argument is a literal or an expression [e.g., f(1,a+b);]?
Exercise 12.3.13 Define a Python function, and present an invocation of it, that
produces different results when its arguments are passed by value-result than
when passed by reference. Explain in comments in the program how one of the
mechanisms produces different results than the other. Keep the function to five
lines of code or less.
Exercise 12.3.14 Write a swap method in Java that successfully swaps its
arguments in the calling function. Hint: The arguments being swapped cannot
be of type int or Integer, but rather must be references to objects whose data
members of type int are the values being swapped.
Camille> let
x = 3
f = fun(a) assign! a = 4
in
let
d = (f x)
in
x
The denoted value of a is a reference that initially contains a copy of the value
with which the reference x is associated, but these references are distinct. Thus, the
assignment to a in the body of the function f has no effect on the x in the outermost
let expression; as a result, the value of the expression is 3.
Let us implement pass-by-reference in Camille. We want to modify Camille
so that literals (i.e., integers and functions/closures) are passed by value and
variables are passed by reference. The difference between the purely pass-by-
value Camille interpreter and the new hybrid pass-by-value (for literals), pass-
by-reference (for variables) Camille interpreter is summarized as follows:
In other words, unlike the prior implementation of Camille, now we only create a
new reference for literal operands. In the prior implementation, we created a new
reference for every operand. As a consequence, in Camille now, we use:
• A list element that contains an expressed value is called a direct target (i.e.,
pass-by-value).
• A list element that contains a denoted value is called an indirect target (i.e.,
pass-by-reference).
1 def expressedvalue(x):
2     return (isinstance(x, int) or is_closure(x))
3
4 # begin abstract-syntax representation of Target
5
6 class Target:
7     def __init__(self, value, flag):
8
9         type_flag_dict = { "directtarget" : expressedvalue,
10                           "indirecttarget" :
11                               (lambda x : isinstance(x, Reference)) }
12
13         # if flag is not a valid flag value, build a lambda
14         # function that always returns False so we throw an error
15         # (defaultdict is imported from the collections module)
16         type_flag_dict = \
17             defaultdict(lambda: lambda x: False, type_flag_dict)
18         if (type_flag_dict[flag](value)):
19             self.flag = flag
20             self.value = value
21         else:
22             raise Exception("Invalid Target Construction.")
23
24 # end abstract-syntax representation of Target
25
26 # begin abstract-syntax representation of Reference
27
28 class Reference:
29     # ...
30     # definitions of primitive_dereference and
31     # primitive_assignreference functions are same as earlier
32     # ...
33
34     def dereference(self):
35         target = self.primitive_dereference()
36         if target.flag == "directtarget":
37             return target.value
38         elif target.flag == "indirecttarget":
39             innertarget = target.value.primitive_dereference()
40             if innertarget.flag == "directtarget":
41                 return innertarget.value
42         # double indirect references not allowed
43         raise Exception("Invalid dereference.")
44
45     def assignreference(self, expressedvalue):
46         target = self.primitive_dereference()
47
48         if target.flag == "directtarget":
49             temp = self
50         elif target.flag == "indirecttarget":
51             innertarget = target.value.primitive_dereference()
52             if innertarget.flag == "directtarget":
53                 temp = target.value
54             elif innertarget.flag == "indirecttarget":
55                 # double indirect references not allowed
56                 raise Exception("Invalid creation of reference.")
57
58         temp.primitive_assignreference(Target(expressedvalue, "directtarget"))
59
60 # end abstract-syntax representation of Reference
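To see the Target/Reference machinery in action, the following self-contained sketch fills in a simplified list-based store for the elided primitive_dereference and primitive_assignreference operations; the Reference constructor and store layout here are assumptions for illustration, not the interpreter's actual representation:

```python
# Simplified sketch: a Reference is a position in a Python list that
# serves as the store; each store cell holds a Target.
class Target:
    def __init__(self, value, flag):
        self.value, self.flag = value, flag

class Reference:
    def __init__(self, position, vector):
        self.position, self.vector = position, vector

    def primitive_dereference(self):
        return self.vector[self.position]

    def primitive_assignreference(self, target):
        self.vector[self.position] = target

    def dereference(self):
        target = self.primitive_dereference()
        if target.flag == "directtarget":
            return target.value
        innertarget = target.value.primitive_dereference()
        if innertarget.flag == "directtarget":
            return innertarget.value
        raise Exception("Invalid dereference.")

    def assignreference(self, expressedvalue):
        target = self.primitive_dereference()
        # assign through the shared location, never through two levels
        temp = self if target.flag == "directtarget" else target.value
        temp.primitive_assignreference(Target(expressedvalue, "directtarget"))

# the argument x denotes a direct target containing 7
x = Reference(0, [Target(7, "directtarget")])
# the parameter a denotes an indirect target sharing x's location
a = Reference(0, [Target(x, "indirecttarget")])

print(a.dereference())          # 7
a.assignreference(8)            # assignment through the parameter ...
print(x.dereference())          # ... is visible through the argument: 8
```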
21 elif expr.type == ntLet:
22     temp = evaluate_expr(expr.children[0], environ) # assignment
23
24     identifiers = []
25     arguments = []
26
27     for name in temp:
28         identifiers.append(name)
29         arguments.append(evaluate_let_expr_operand(temp[name]))
30
31     temp = evaluate_expr(expr.children[1],
32                          extend_environment(identifiers, arguments,
33                                             environ)) # evaluation
34
35     return localbindingDereference(temp)
36
37 # rest of cases in evaluate_expr
38 elif ...
39 ...
def evaluate_let_expr_operand(operand):
    if isinstance(operand, Reference):
        operand = operand.dereference()
    return Target(operand, "directtarget")

def localbindingDereference(possiblereference):
    if isinstance(possiblereference, Reference):
        return possiblereference.dereference()
    else:
        return possiblereference
8     target = operand.primitive_dereference()
9
10    ## if the variable is bound to a "location" that
11    ## contains a direct target,
12
13    if target.flag == "directtarget":
14
15        ## then we return an indirect target to that location
16        return Target(operand, "indirecttarget")
17
18    ## but if the variable is bound to a "location"
19    ## that contains an indirect target, then
20    ## we return the same indirect target
21
22    elif target.flag == "indirecttarget":
23        innertarget = target.value.primitive_dereference()
24        if innertarget.flag == "indirecttarget":
25
26            # double indirect references not allowed
27            return Target(innertarget, "indirecttarget")
28        else:
29            return innertarget
30
31    ## if the operand is a literal (i.e., integer or function/closure),
32    ## then we create a new location, as before, by returning
33    ## a "direct target" to it (i.e., pass-by-value)
34
35    elif isinstance(operand, int) or is_closure(operand):
36        return Target(operand, "directtarget")
Figure 12.10 presents the references associated with the arguments to the three
literal functions in this program. Notice that both parameters b and y are indirect
targets to parameter v, which is a direct target to the argument 7, rather than y
being an indirect target to the indirect target b—double indirect pointers are not
supported. Figure 12.11 depicts the relationship of the references b and y to each
other and to the argument 7 in more detail. Since the Camille interpreter now
supports pass-by-reference for variable arguments, a Camille function is now able
to modify the value of an argument:
Camille> let
x = 1
in
let
f = fun(a) assign! a = inc1(a) --- a++
in
let
d = (f x)
in --- function f has changed
x --- the value of x from 1 to 2
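The same effect can be emulated in Python by passing a mutable cell, so that the callee can update the caller's variable; Cell is a hypothetical helper for this sketch, not part of the Camille interpreter:

```python
# A mutable cell stands in for a Camille reference: the function
# updates the shared location rather than rebinding a local name.
class Cell:
    def __init__(self, value):
        self.value = value

def f(a):                      # a refers to the caller's cell
    a.value = a.value + 1      # the analog of assign! a = inc1(a)

x = Cell(1)
f(x)
print(x.value)                 # 2: f has changed the value of x from 1 to 2
```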
Now we can also define a swap function in Camille that successfully swaps its
arguments in the calling expression/function:
Camille> let
x = 3
y = 4
Figure 12.10 Three layers of references to indirect and direct targets representing
parameters to functions (Friedman, Wand, and Haynes 2001). (Key: □ = memory
cell; → = reference.)
[Figure 12.11 panel titles: after the call to f1 but before the call to f2; after the call to f2 but before the call to f3; after the call to f3 but before the assign! expression; after the assign! expression in f3 but before f3 returns.]
Figure 12.11 Passing variables by reference in Camille. The run-time stack grows
upward. (Key: □ = memory cell; → = reference; ⋯ = activation-record boundary.)
12.5.2 β-Reduction
Lazy evaluation supports the simplest possible form of reasoning about a program.
Formally, this evaluation strategy in λ-calculus is called β-reduction (or the copy
rule). More practically, we can say lazy evaluation involves simple string substitution
(e.g., substitution of the body of a function for the function name, and substitution
of arguments for parameters); for this reason, the lazy evaluation parameter-passing
mechanism is sometimes generally referred to as pass-by-name.
We use Scheme to demonstrate β-reduction. Consider the following
simple squaring function: (define square (lambda (x) (* x x))). Let
us temporarily forget that this is a Scheme function that can be evaluated.
Instead, we will simply think of this expression as associating the string
(lambda (x) (* x x)) with the mnemonic square. Now, consider the
following expression: (square 2). We will temporarily suspend the association
of this expression with an “invocation” of square and simply think of it as a
string. Now, let us apply the two substitutions. Step 1 involves replacing the
mnemonic (i.e., identifier) square with the string associated with it (i.e., the body
of the function); step 2 involves replacing each x in the replacement string (from
step 1) with 2 (i.e., replacing each reference to a parameter in the body of the
function with the corresponding argument):
(square 2) ⇒ ((lambda (x) (* x x)) 2)   (apply step 1)
           ⇒ (* 2 2)                    (apply step 2)
Expressing the steps of β-reduction in λ-calculus, if square = (λx . x*x), then

square(2) ⇒ (λx . x*x)(2)   (apply step 1)
          ⇒ 2*2             (apply step 2)
Thinking of these steps as a parameter-passing mechanism (i.e., pass-by-name)
may seem foreign, especially since most readers may be most familiar with pass-
by-value semantics and internally conceptualize run-time stacks visually (e.g.,
Figures 12.2–12.8). However, when viewed through a purely mathematical lens,
this “parameter-passing mechanism” is quite natural. For instance, if we told
someone without a background in computing that x = (3 ⋄ 2), and then inquired
as to the representation of x ⋄ x, that person would likely intuitively respond with
(3 ⋄ 2) ⋄ (3 ⋄ 2). Thus, if x = (3 * 2), the representation of x * x is, similarly,
(3 * 2) * (3 * 2), not 6 * 6. Again, this interpretation is purely mathematical and
independent of any implementation approaches or constraints. Now let us consider
another “invocation” of square: (square (* 3 2)). Using β-reduction:
(square (* 3 2)) ⇒ ((lambda (x) (* x x)) (* 3 2))       (apply step 1)
                 ⇒ (* (* 3 2) (* 3 2)) ⇒ (* 6 6) ⇒ 36   (apply step 2)
We can compare this evaluation of the function with the typical programming
language semantics for this invocation:
(square (* 3 2)) ⇒ (square 6) ⇒ ((lambda (x) (* x x)) 6) ⇒ (* 6 6) ⇒ 36
The following Scheme code presents another comparison of these two approaches:
;; normal-order evaluation
> (square (* 3 2))
(* (* 3 2) (* 3 2))
(* 6 6)
36
;; applicative-order evaluation
> (square (* 3 2))
(square 6)
(* 6 6)
36
Intuitively, it would seem that the use of lazy evaluation (of arguments) is
intended for purposes of efficiency. Specifically, if the argument is not needed in
the body of the function, the time that would have been spent on evaluating it is
saved. However, upon closer examination, in a (perceived) attempt to be efficient,
the evaluation of the expression (square (* 3 2)) requires double the work—
the expression (* 3 2) passed as an argument is evaluated twice! (We discuss the
relationship between lazy evaluation and space complexity in Section 13.7.4).
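The doubled work can be made concrete by counting evaluations of the argument expression in Python; times, which stands in for (* 3 2), is a hypothetical helper:

```python
# Count how many times the argument expression is evaluated under each
# evaluation order for (square (* 3 2)).
count = 0

def times():                 # stands in for the argument expression (* 3 2)
    global count
    count += 1
    return 3 * 2

# applicative order: evaluate the argument once, then pass the value
v = times()
applicative = v * v
evals_applicative = count    # 1 evaluation

# normal order: substitute the expression; each reference re-evaluates it
count = 0
normal = times() * times()
evals_normal = count         # 2 evaluations

print(applicative, normal, evals_applicative, evals_normal)  # 36 36 1 2
```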
When considering the savings in time resulting from not evaluating an unused
argument, one might question why a programmer would define a function
that accepts an argument it does not use. In other words, it seems as if lazy
evaluation is a safeguard against poorly defined functions. However, when we
think about boolean operators as functions, and operands to boolean operators as
arguments to a function, then suddenly it makes sense not to use eager evaluation:
false && (true || false). Similarly, when thinking of an if conditional
structure as a ternary boolean function and thinking of the conditional expression,
true branch, and false branch as arguments to this function, using eager evaluation
is unreasonable:
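For instance, the following Python sketch (eager_if, lazy_if, and safe_div are hypothetical names for illustration) shows an eager ternary "if" evaluating a branch it should have skipped, and thunks restoring the short-circuit behavior:

```python
# An eager ternary "if" evaluates both branch arguments before choosing.
def eager_if(condition, true_branch, false_branch):
    return true_branch if condition else false_branch

def safe_div(n, d):
    # intended to return 0 when d == 0, but the argument n / d
    # is evaluated eagerly, before eager_if is even entered
    return eager_if(d == 0, 0, n / d)

try:
    safe_div(1, 0)
    raised = False
except ZeroDivisionError:
    raised = True            # the unused branch 1/0 was evaluated anyway

# Passing the branches as thunks delays evaluation until one is chosen.
def lazy_if(condition, true_thunk, false_thunk):
    return true_thunk() if condition else false_thunk()

result = lazy_if(0 == 0, lambda: 0, lambda: 1 / 0)   # only one branch thawed
print(raised, result)        # True 0
```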
Table 12.3 Terms Used to Refer to Evaluation Strategies for Function Arguments
in Three Progressive Contexts
1 $ cat macros.c
2 # include <stdio.h>
3
4 # define FIVE 5
5
6 # define SQUARE(X) ((X)*(X))
7
8 /* #A in a replacement string of a macro:
9 1. replace by argument (i.e., actual parameter)
10 2. enclose it in quotes */
11 # define PRINT(A, B) printf(#A ": %d, " #B ": %d\n", A, B)
12
13 /* max of two ints macro */
14 # define MAX(a,b) ((a) > (b) ? (a) : (b))
15
16 int main() {
17     int x = SQUARE(3);
18     int y = SQUARE(x+1);
19
20 printf("%d\n", FIVE);
21
22 PRINT(x, y);
23
24 printf ("The max of %d and %d is %d.\n", 1, 2, MAX(1,2));
25 printf ("The max of %d and %d is %d.\n", x, y, MAX(x,y));
26 printf ("The max of %d and %d is %d.\n", y, x, MAX(y,x));
27 printf ("The max of %d and %d is %d.\n", x+1, y+1,
28 MAX(++x,++y));
29 x--; y--;
30
31 printf ("The max of %d and %d is %d.\n", x+1, y,
32 MAX(x++,y));
33 }
1. The examples of C macros in this chapter are not intended to convey that C macros correspond
to lazy evaluation. “Macros do not correspond to lazy evaluation. Laziness is a property of when the
implementation evaluates arguments to functions. . . . Indeed, macro expansion (like type-checking)
happens in a completely different phase than evaluation, while laziness is very much a part of
evaluation. So please don’t confuse the two” (Krishnamurthi 2003). The examples of C macros used
here are simply intended to help the reader get a feel for the pass-by-name parameter-passing
mechanism and β-reduction; they are used entirely for purposes of demonstration.
The following code is the result of the expansion of the four macros on lines 4, 6,
11, and 14:
1 $ gcc -E macros.c > macros.E # cpp macros.c > macros.E can be used as well
2 $ cat macros.E
3 int main() {
4     int x = ((3)*(3));
5     int y = ((x+1)*(x+1));
6
7 printf("%d\n", 5);
8
9 printf("x" ": %d, " "y" ": %d\n", x, y);
10
11 printf ("The max of %d and %d is %d.\n", 1, 2, ((1) > (2) ? (1) : (2)));
12 printf ("The max of %d and %d is %d.\n", x, y, ((x) > (y) ? (x) : (y)));
13 printf ("The max of %d and %d is %d.\n", y, x, ((y) > (x) ? (y) : (x)));
14 printf ("The max of %d and %d is %d.\n", x+1, y+1,
15 ((++x) > (++y) ? (++x) : (++y)));
16 x--; y--;
17
18 printf ("The max of %d and %d is %d.\n", x+1, y,
19 ((x++) > (y) ? (x++) : (y)));
20 }
In other words, FIVE is textually replaced with 5. This substitution can be thought
of as solely step 1 of β-reduction.
Expanding the macros defined on lines 6, 11, and 14 involves both steps 1 and
2 of β-reduction. For instance, consider the SQUARE macro defined on line 6 of the
unexpanded version. Using this macro to demonstrate β-reduction:
1. All occurrences of the string SQUARE in the program (e.g., lines 17 and 18)
are replaced with ((X)*(X)).
2. All occurrences of X in the replacement string are substituted with the
argument [i.e., 3 (line 17) or x+1 (line 18)].
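These two steps can be simulated as plain string substitution in Python; expand_square is a hypothetical helper mimicking the preprocessor:

```python
# Step 2 of macro expansion as string replacement: substitute the
# argument text for each X in the replacement string of SQUARE.
REPLACEMENT = "((X)*(X))"            # replacement string of the macro

def expand_square(argument):
    return REPLACEMENT.replace("X", argument)

print(expand_square("3"))            # ((3)*(3))
print(expand_square("x+1"))          # ((x+1)*(x+1))
print(eval(expand_square("3")))      # 9
```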
1 $ gcc macros.c
2 $ ./a.out
3 5
4 x: 9, y: 100
5 The max of 1 and 2 is 2.
6 The max of 9 and 100 is 100.
7 The max of 100 and 9 is 100.
8 The max of 10 and 101 is 102.
9 The max of 10 and 101 is 101.
1 # include <stdio.h>
2
3 /* pass-by-name swap macro */
4 # define swap(a, b) { int temp = (a); (a) = (b); (b) = temp; }
5
6 int main() {
7
8     int x = 3;
9     int y = 4;
10    int temp = 5;
11
12 printf ("Before pass-by-name swap(x,y) macro: x = %d, y = %d\n", x, y);
13
14 swap(x,y)
15
16 printf (" After pass-by-name swap(x,y) macro: x = %d, y = %d\n\n", x, y);
17 }
The swap macro is defined on line 4. The preprocessed version of this program
with the swap macro expanded is
1 int main() {
2
3     int x = 3;
4     int y = 4;
5     int temp = 5;
6
7 printf ("Before pass-by-name swap(x,y) macro: x = %d, y = %d\n", x, y);
8
9 { int temp = (x); (x) = (y); (y) = temp; }
10
11 printf ("After pass-by-name swap(x,y) macro: x = %d, y = %d\n\n", x, y);
12 }
$ gcc swap_pbn.c
$ ./a.out
Before pass-by-name swap(x,y) macro: x = 3, y = 4
After pass-by-name swap(x,y) macro: x = 4, y = 3
The output indicates that the pass-by-name swap macro worked. However,
another use of this swap macro tells a different story:
# include <stdio.h>

/* pass-by-name swap macro */
# define swap(a, b) { int temp = (a); (a) = (b); (b) = temp; }

int main() {
    int a[6];
    int i = 1;

    a[1] = 5;

    printf ("Before pass-by-name swap(i, a[i]) macro: i = %d, a[1] = %d\n",
            i, a[1]);

    swap(i, a[i]);

    printf (" After pass-by-name swap(i, a[i]) macro: i = %d, a[1] = %d\n",
            i, a[1]);
}
$ gcc swap_pbn2.c
$ ./a.out
Before pass-by-name swap(i, a[i]) macro: i = 1, a[1] = 5
After pass-by-name swap(i, a[i]) macro: i = 5, a[1] = 5
The values of i and a[1] are not swapped after the expanded code from the swap
macro executes: a[1] is 5 both before and after the replacement code of the macro
executes. This outcome occurs because of a side effect. The expansion of the macro
replaces the statement
swap(i, a[i]);
with
{ int temp = (i); (i) = (a[i]); (a[i]) = temp; };
The side effect of the second assignment statement (i) = (a[i]) changes the
value of i from 1 to 5. Thus, the third assignment, (a[i]) = temp;, places the
original value of i (i.e., 1) in array element a[5] rather than a[1]. Consequently,
after the replacement code of the macro executes, a[1] is unchanged. A side effect
caused a similar problem in the execution of the replacement code of the MAX
macro on line 28 in the first C program in Section 12.5.3, which produced the
following output: The max of 10 and 101 is 102. Thus, we rephrase the
first sentence of Section 12.5.2 as “lazy evaluation in a language without side effects
supports the simplest possible form of reasoning about a program.” We explore
the implications of side effects for the pass-by-name parameter-passing mechanism
further in the Conceptual Exercises.
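The failure of swap(i, a[i]) can be reproduced in Python by passing each operand as a getter/setter pair of thunks, so that every use of a parameter re-evaluates the argument expression; swap_by_name is a hypothetical simulation of the macro's semantics:

```python
# Each parameter use re-evaluates the argument expression, so the third
# assignment re-indexes a with the *new* value of i.
def swap_by_name(get_a, set_a, get_b, set_b):
    temp = get_a()           # temp = i          (temp is 1)
    set_a(get_b())           # i = a[i]          (i becomes 5)
    set_b(temp)              # a[i] = temp, but i is now 5, so a[5] = 1

a = [0, 5, 0, 0, 0, 0]
i = [1]                      # one-element list so the setter can mutate i

swap_by_name(lambda: i[0],
             lambda v: i.__setitem__(0, v),
             lambda: a[i[0]],
             lambda v: a.__setitem__(i[0], v))

print(i[0], a[1], a[5])      # 5 5 1: i and a[1] are not swapped
```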
If the argument inc_x() is passed by name to the double function on line 12, then
the double function returns 3 (= 1 + 2) because the parameter x is referenced
twice in the body of the double function (line 10). Thus, the argument expression
inc_x() is evaluated twice: The first time it is evaluated inc_x() returns 1, and
the second time it returns 2 because inc_x has a side effect (i.e., it increments the
global variable x). In contrast, if the argument inc_x() is passed by need to the
double function on line 12, then the double function returns 2 (= 1 + 1) because
the argument expression inc_x() is evaluated only once: The first time inc_x()
returns the value 1, which is stored so that it can be retrieved the second time the
parameter x is referenced.
Contrast the definitions of the double (lines 9–10) and add (line 16–17)
functions: The double function accepts one parameter, which it references twice
in its body (line 10); the add function accepts two parameters, each of which it
references once in its body (line 17). However, the add function is invoked with the
same expression for each argument (line 22). If each argument inc_x() is passed
by name to the add function on line 22, then the add function returns 3 (= 1 + 2).
While the parameters x and y are each referenced only once in the body of the add
function (line 17), the argument expression inc_x() is evaluated twice, but for a
different reason than it is for the pass-by-name invocation of the double function
on line 12. Here, the argument expression inc_x() is evaluated once for each of
the x and y parameters because the same argument expression is passed for both
parameters. The first time inc_x() is evaluated, it returns 1, and the second time
it returns 2 because inc_x has a side effect. Evaluating the invocation of the add
function on line 22 using pass-by-need semantics yields the same result. Since each
parameter is referenced only once in the body of the add function (line 17), there is
no opportunity to retrieve the return value of each argument expression recorded.
In other words, there is no opportunity to obviate a reevaluation during a
subsequent reference because there are no subsequent references. Thus, unlike the
invocation of the double function on line 12, the invocation of the add function on
line 22 yields the same result when using pass-by-name or pass-by-need semantics.
Note that the Java or C analog of the invocation to the add function on line 22
is x=0; x++ + x++;, where x++ is an argument that is passed (by name or by
need) twice to the + function. In summary,
To avoid this run-time error, we can pass the second argument to f by name. Thus,
instead of passing the expression (1/0) as the second argument, we must pass a
thunk:
11 >>> # This function is a thunk (or a shell) for the expression 1/0.
12 >>> def divbyzero():
13 ...     return 1/0
14 ...
15 >>> # invoking f with a named function as the second argument
16 >>> f(0, divbyzero)
Table 12.4 Forming and Evaluating Thunks
forming a thunk (or a promise) = freezing an expression operand = delaying its evaluation
evaluating a thunk (or a promise) = thawing a thunk = forcing its evaluation
17 1
18 >>> # invoking f with a lambda expression as the second argument
19 >>> f(0, lambda: 1/0)
20 1
When the argument being passed involves references to variables [e.g., (x/y)
instead of (1/0)], the thunk created for the argument requires more information.
Specifically, the thunk needs access to the referencing environment that contains
the bindings to the variables being referenced.
Rather than hard-code a thunk every time we desire to delay the evaluation
of an argument (as shown in the preceding example), we desire to develop a pair
of functions for forming and evaluating a thunk (Table 12.4). We can then invoke
the thunk-formation function each time the evaluation of an argument expression
should be delayed (i.e., each time a pass-by-name argument is desired). Thus, we
want to abstract away the process of thunk formation. Since a thunk is simply a
nullary (i.e., argumentless) function, evaluating it is straightforward:
1. Record the value of the argument expression the first time it is evaluated
(line 36).
2. Record the fact that the expression was evaluated once (line 37).
3. Look up and return the recorded value for all subsequent evaluations (line 41).
34 ...         if first[0]:
35 ...             print("first and only computation")
36 ...             result[0] = eval(expr)
37 ...             first[0] = False
38 ...         else:
39 ...             print("lookup, no recomputation")
40 ...
41 ...         return result[0]
42 ...
43 ...     # return a thunk
44 ...     return thunk
Notice that the delay function builds the thunk as a first-class closure so that it can
“remember” the return value of the evaluated argument expression in the variable
result after delay returns. First-class closures are an important construct for
implementing a variety of concepts from programming languages.
Since delay is a user-defined function and uses applicative-order evaluation,
we must pass a string representing an expression, rather than an expression itself,
to prevent the expression from being evaluated. For instance, in the invocation
delay (1/0), the argument to be delayed [i.e., (1/0)] is a strict argument and
will be evaluated eagerly (i.e., before it is passed to delay). Thus, we must only
pass strings (representing expressions) to delay:
48 >>> x = 0
49
50 >>> def inc_x():
51 ... global x
52 ... x = x + 1
53 ... return x
54
55 >>> # two references to one parameter in body of function
56 >>> def double(x):
57 ...     return force(x) + force(x)
58
59 >>> double(delay("inc_x()"))
60 first and only computation
61 lookup, no recomputation
62 2
63
64 >>> # one reference to each parameter in body of function,
65 >>> # but each parameter is same
66 >>> def add(x,y):
67 ...     return force(x) + force(y)
68
69 >>> x = 0
70
71 >>> add(delay("inc_x()"), delay("inc_x()"))
72 first and only computation
73 first and only computation
74 3
The second reference to x in the body of the double function does not cause a
reevaluation of the thunk. In the invocation of the add function on line 71, one
thunk is created for each argument, and each thunk is separate from the other.
While the two thunks are duplicates of each other, each thunk is evaluated only
once, as the output on lines 72–74 shows.
The Scheme delay and force syntactic forms (which use pass-by-need
semantics, also known as memoized lazy evaluation) are the analogs of the Python
functions delay and force defined here. Programming Exercise 12.5.19 entails
implementing the Scheme delay and force syntactic forms as user-defined
Scheme functions.
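As a variation on the string-based delay shown earlier, a thunk can instead be represented as a nullary lambda, which avoids eval and captures the argument's referencing environment automatically; a sketch under that assumption:

```python
# delay freezes a nullary function into a memoizing thunk;
# force thaws it. The first force computes, later forces look up.
def delay(frozen):
    state = {"evaluated": False, "value": None}
    def thunk():
        if not state["evaluated"]:
            state["value"] = frozen()      # first and only computation
            state["evaluated"] = True
        return state["value"]              # lookup, no recomputation
    return thunk

def force(thunk):
    return thunk()

x = 0
def inc_x():
    global x
    x = x + 1
    return x

def double(d):
    return force(d) + force(d)             # two references, one thunk

def add(d1, d2):
    return force(d1) + force(d2)           # one reference to each thunk

r1 = double(delay(inc_x))                  # 2: the single thunk is memoized
x = 0
r2 = add(delay(inc_x), delay(inc_x))       # 3: two separate thunks
print(r1, r2)                              # 2 3
```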
The Haskell programming language was designed as an intended standard for
lazy, functional programming. In Haskell, pass-by-need is the default parameter-
passing mechanism and, thus, the use of syntactic forms like delay and force is
unnecessary. Consider the following transcript with Haskell:2
2. We cannot use the simpler argument expression 1/0 to demonstrate a non-strict argument in
Haskell because 1/0 does not generate a run-time error in Haskell—it returns Infinity.
The Haskell function fix returns the least fixed point of a function in the domain
theory interpretation of a fixed point. A fixed point of a function f is a value x such
that f(x) = x. For instance, a fixed point of the square root function f(x) = √x
is 1 because √1 = 1. Since there is no least fixed point of the identity function
f(x) = x, the invocation fix (\x -> x) never returns—it searches indefinitely
(lines 2–3). Haskell supports pass-by-value parameters as a special case. When an
argument is prefaced with $!, the argument is passed by value or, in other words,
the evaluation of the argument is forced. In this case, the argument is treated as a
strict argument and evaluated eagerly:
The built-in Haskell function seq evaluates its first argument before returning its
second. Using seq, we can define a function strict:
We can define functions take1 and drop1 to access parts of list comprehensions:3
11 Prelude > :{
12 Prelude | take1 0 _ = []
13 Prelude | take1 _ [] = []
14 Prelude | take1 n (h:t) = h : take1 (n-1) t
15 Prelude |
16 Prelude | drop1 0 l = l
17 Prelude | drop1 _ [] = []
18 Prelude | drop1 n (_:t) = drop1 (n-1) t
19 Prelude | :}
Since only enough of the list comprehension is explicitly realized when needed, we
can think of this as laying down railroad track as we travel rather than building
the entire track before departing.
3. We use the function names take1 and drop1 because these functions are defined in Haskell as
take and drop, respectively.
53 Prelude > :{
54 Prelude | -- implementation of Sieve of Eratosthenes algorithm
55 Prelude | -- for enumerating prime numbers
56 Prelude | sieve [] = []
57 Prelude | sieve (two:lon) = two : sieve [n | n <- lon, (mod n two) /= 0]
58 Prelude | :}
59 Prelude >
60 Prelude > primes = sieve [2..]
61 Prelude >
62 Prelude > take1 100 primes
63 [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,...,523,541]
64 Prelude >
65 Prelude > :{
66 Prelude | quicksort [] = []
67 Prelude | quicksort (h:t) = quicksort [x | x <- t, x <= h]
68 Prelude | ++ [h] ++
69 Prelude | quicksort [x | x <- t, x > h]
70 Prelude | :}
71 Prelude >
quicksort [5,1,9,2,8,3,7,4,6,10] =
quicksort (5 : [1,9,2,8,3,7,4,6,10]) =
(quicksort [] ++ [1] ++
 ([] ++ [2] ++
  ([] ++ [3] ++
   ([] ++ [4] ++
    quicksort []))))
++ [5] ++
(((quicksort [6] ++ [7] ++ [])
  ++ [8] ++ [])
 ++ [9] ++
 ([] ++ [10] ++ [])) =
([1,2,3,4,5,6,7,8,9,10]) = [1,2,3,4,5,6,7,8,9,10]
While Python evaluates arguments eagerly, it does have facilities that enable
the program to define infinite streams, thereby obviating the enumeration of a
large list in memory. Python makes a distinction between a list comprehension and
a generator comprehension or generator expression. In Python, a generator expression
is what we call a list comprehension in Haskell—that is, a function that generates
list elements on demand. List comprehensions in Python, however, are syntactic
sugar for defining an enumerated list without a loop using set-former notation.
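The space distinction can be observed directly in a short sketch (the exact sizes are implementation-dependent; the comparison, not the numbers, is the point):

```python
import sys

# A list comprehension materializes all elements in memory; a generator
# expression occupies a small, constant amount of space and produces
# elements on demand.
squares_list = [x * x for x in range(100000)]   # fully enumerated list
squares_gen = (x * x for x in range(100000))    # generated on demand

smaller = sys.getsizeof(squares_gen) < sys.getsizeof(squares_list)
same_sum = sum(squares_gen) == sum(squares_list)
print(smaller, same_sum)                        # True True
```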
Consider the following transcript with Python:
Syntactically, the only difference between lines 3 and 14 is the use of square
brackets in the definition of the list comprehension (line 3) and the use of
parentheses in the definition of the generator expression (line 14). However,
lines 9 and 20 reveal a significant savings in space required for the generator
expression. In terms of space complexity, a list comprehension is preferred if the
programmer intends to iterate over the list multiple times; a generator expression
is preferred if the list is to be iterated over once and then discarded. Thus, if only
the sum of the list is desired, a generator expression (line 30) is preferable to a
list comprehension (line 27). Generator expressions can be built using functions
calling yield:
1 >>> # the use of yield turns the function into a generator expression;
2 >>> # naturals() is a generator expression
3 >>> def naturals():
4 ... i = 1
5 ... while True:
6 ... yield i
7 ... i += 1
8
9 >>> from itertools import islice
10
11 >>> # analog of Haskell's take function
12 >>> def take(n, iterable):
13 ...     # returns the first n elements of iterable as a list
14 ...     return list(islice(iterable, n))
15
16 >>> take(10, naturals())
17 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Lines 1–7 define a generator for the natural numbers (e.g., [1..] in Haskell).
Without the yield statement on line 6, this function would spin in an infinite
loop and never return. The yield statement is like a return, except that the next
time the function is called, the state in which it was left at the end of the previous
execution is “remembered” (see the concept of coroutine in Section 13.6.1). The
take function defined on lines 11–14 realizes in memory a portion of a generator
and returns it as a list (lines 16–17).
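Combining generators with itertools yields a Python analog of the Haskell sieve shown earlier, in which each prime lazily filters its multiples out of an infinite stream; sieve here is a hypothetical re-implementation for illustration, not the text's code:

```python
from itertools import count, islice

# Sieve of Eratosthenes over an infinite, lazily generated stream:
# each prime wraps the rest of the stream in a divisibility filter.
def sieve():
    stream = count(2)                  # the infinite stream [2..]
    while True:
        prime = next(stream)
        yield prime
        # keep only the n not divisible by this prime (bind prime now)
        stream = filter(lambda n, p=prime: n % p != 0, stream)

first_ten = list(islice(sieve(), 10))
print(first_ten)                       # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```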
> if
if: bad syntax in: if
> cond
cond: bad syntax in: cond
• The boolean operators and and or are also special syntactic forms and use
normal-order evaluation:
> and
and: bad syntax in: and
> or
or: bad syntax in: or
>
• Arithmetic operators such as + and > are procedures (i.e., functions). Thus,
like user-defined functions, they use applicative-order evaluation:
> +
#<procedure:+>
> >
#<procedure:>>
The Scheme syntactic forms delay and force permit the programmer to define
and invoke functions that use normal-order evaluation. A consequence of this
impurity is that programmers cannot extend (or modify) control structures (e.g.,
if, while, or for) in such languages using standard mechanisms (e.g., a user-
defined function).
Why is lazy evaluation not more prevalent in programming languages?
Certainly there is overhead involved in freezing and thawing thunks, but that
overhead can be reduced with memoization (i.e., pass-by-need semantics) in the
absence of side effects. In the presence of side effects, pass-by-need cannot be
used. More importantly, in the presence of side effects, lazy evaluation renders
a program difficult to understand. In particular, lazy evaluation generally makes
it difficult to determine the flow of program control, which is essential to
understanding a program with side effects. An attempt to conceptualize the
control flow of a program with side effects using lazy evaluation requires digging
deep into layers of evaluation, which is contrary to a main advantage of lazy
evaluation—namely, modularity (Hughes 1989). Conversely, in a language with
no side effects, flow of control has no effect on the result of a program. As a result,
lazy evaluation is most common in languages without provisions for side effects
(e.g., Haskell) and rarely found elsewhere.
Exercise 12.5.3 Consider the following swap macro using pass-by-name semantics
defined on line 4 (replicated here) of the second C program in Section 12.5.3:
#define swap(a, b) { int temp = (a); (a) = (b); (b) = temp; }
For each of the following main programs in C, give the expansion of the swap
macro in main and indicate whether the swap works.
(a)
int main() {
    int a[6];
    int i = 1;
    int j = 2;
    a[i] = 3;
    a[j] = 4;
    swap(a[i], a[j]);
}
(b)
int main() {
    int a[6];
    int i = 1;
    int j = 1;
    a[1] = 5;
    swap(i, a[j]);
}
1 # include <stdio.h>
2
3 /* swap macro: pass-by-name */
4 # define swap(x, y) { int temp = (x); (x) = (y); (y) = temp; }
5
6 int main() {
7
8     int x = 3;
9     int y = 4;
10    int temp = 5;
11
12 printf ("Before pass-by-name swap(x,temp) macro: x = %d, temp = %d\n",
13 x, temp);
14
15 swap(x, temp)
16
17 printf (" After pass-by-name swap(x,temp) macro: x = %d, temp = %d\n",
18 x, temp);
19 }
The preprocessed version of this program with the swap macro expanded is
1 int main() {
2
3     int x = 3;
4     int y = 4;
5     int temp = 5;
6
7 printf ("Before pass-by-name swap(x,temp) macro: x = %d, temp = %d\n",
8 x, temp);
9
10 { int temp = (x); (x) = (temp); (temp) = temp; }
11
12 printf (" After pass-by-name swap(x,temp) macro: x = %d, temp = %d\n",
13 x, temp);
14 }
$ gcc collision.c
$ ./a.out
Before pass-by-name swap(x,temp) macro: x = 3, temp = 5
After pass-by-name swap(x,temp) macro: x = 3, temp = 5
The output indicates that the swap macro failed to swap its arguments—the
values of x and temp are the same both before and after the code from the
expanded swap macro executes. This outcome occurs because there is an
identifier in the replacement string of the macro (line 4 of the unexpanded version)
that is the same as the identifier for one of the variables being swapped, namely
temp. When the macro is expanded in main (line 10), the identifier temp in main
is used to refer to two different entities: the variable temp declared in main on
line 5 and the local variable temp declared in the nested scope on line 10 (from
the replacement string of the macro). The identifier temp in main collides with the
identifier temp in the replacement string of the macro. What can be done to avoid
this type of collision in general?
int main() {
    int a[6];
    int i = 0;
    f(i, a[i]);
}
Expand the f macro in main and give the values of i and a[i] before and after
the statement f(i, a[i]).
int main() {
    int i = 0;
    int j = 0;
    int k = 0;
    f(k+1, j, i);
}
Expand the f macro in main and give the values of i, j, and k before and after the
statement f(k+1, j, i).
int main() {
    f(read());
}
Assume the invocation of read() reads an integer from an input stream. Give the
expansion of the f macro in main.
Exercise 12.5.9 Verify which semantics of lazy evaluation Racket uses through
the delay and force syntactic forms: pass-by-name or pass-by-need. Specifically,
modify the following Racket expression so that the parameters are evaluated
lazily. Use the return value of the expression to determine which semantics of lazy
evaluation Racket implements.
(let ((n 0))
  (let ((counter (lambda ()
                   ;; the function counter has a side effect
                   (set! n (+ n 1))
                   n)))
    ((lambda (x) (+ x x)) (counter))))
Given that Scheme makes provisions for side effects (through the set! operator),
are the semantics of lazy evaluation that Scheme implements what you expected?
Explain.
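The distinction this exercise probes can be sketched in Python with explicit thunks. The helper names below (by_name, by_need) are illustrative only; they are not part of Racket or of this text:

```python
def make_counter():
    # a closure with a side effect, analogous to the Racket counter
    n = 0
    def counter():
        nonlocal n
        n += 1
        return n
    return counter

def by_name(thunk):
    # pass-by-name: re-evaluate the suspended expression at every reference
    return lambda: thunk()

def by_need(thunk):
    # pass-by-need: evaluate once, cache, and reuse the saved value
    cache = []
    def force():
        if not cache:
            cache.append(thunk())
        return cache[0]
    return force

x = by_name(make_counter())
name_result = x() + x()   # counter runs twice: 1 + 2 = 3

y = by_need(make_counter())
need_result = y() + y()   # counter runs once: 1 + 1 = 2
```

If the Racket expression, made lazy with delay and force, returns the analog of need_result, then Racket's promises follow pass-by-need; if it returns the analog of name_result, they follow pass-by-name.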
Exercise 12.5.11 The second argument to each of the Haskell built-in boolean op-
erators && and || is non-strict. Define the (&&) :: Bool -> Bool -> Bool
and (||) :: Bool -> Bool -> Bool operators in Haskell.
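The non-strictness the exercise asks for can be imitated in Python by passing the second operand as a thunk; this is an illustrative sketch, not the required Haskell definitions:

```python
def and_ns(a, b_thunk):
    # force the second operand only when the first is True
    return b_thunk() if a else False

def or_ns(a, b_thunk):
    # force the second operand only when the first is False
    return True if a else b_thunk()

def boom():
    raise RuntimeError("second operand was evaluated")

r1 = and_ns(False, boom)          # False; boom is never called
r2 = or_ns(True, boom)            # True; boom is never called
r3 = and_ns(True, lambda: False)  # False; the second operand is forced
```

In Haskell no thunk-wrapping is needed: every argument is already passed by need.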
Exercise 12.5.13 Give an expression that returns different results when evaluated
with applicative-order evaluation and normal-order evaluation.
(b) ML
Exercise 12.5.16 Recall that Haskell is a (nearly) pure functional language (i.e.,
provision for side effect only for I/O) that uses lazy evaluation. Since Haskell has
no provision for side effect and pass-by-name and pass-by-need semantics yield
the same results in a function without side effects, it is reasonable to expect that
any Haskell interpreter would use pass-by-need semantics to avoid reevaluation of
thunks. Since a provision for side effect is necessary to implement the pass-by-need
semantics of lazy evaluation, can a self-interpreter for Haskell (i.e., an interpreter
for Haskell written in Haskell) be defined? Explain. What is the implementation
language of the Glasgow Haskell Compiler?
Rewrite this Scheme expression using the Scheme delay and force syntactic
forms so that the arguments passed to the two anonymous functions on lines 7
and 8 are passed by need. The return value of this expression is ’(2 5) using pass-
by-need.
The thaw and freeze functions are the Scheme analogs of the Python functions
force and delay presented in Section 12.5.5. The thaw and freeze functions are
also the user-defined function analogs of the Scheme built-ins force and delay,
respectively.
In this implementation, an expression subject to lazy evaluation is not evaluated
until its value is required; once evaluated, it is never reevaluated (i.e., pass-by-need semantics). Specifically, the first time the thunk returned by freeze
is thawed, it evaluates expr and remembers the return value of expr as
demonstrated in Section 12.5.5. For each subsequent thawing of the thunk, the
saved value of the expression is returned without any additional evaluation.
Add print statements to the thunk formed by the freeze function, as done in
Section 12.5.5, to distinguish between the first and subsequent evaluations of the
thunk.
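As a sketch of what is being asked for, here is a memoizing freeze/thaw pair in Python; the names echo the text, but the details (in particular, a lambda standing in for the quoted Scheme expression that the text's freeze receives) are assumptions:

```python
def freeze(thunk):
    # return a memoizing thunk (pass-by-need); the print calls distinguish
    # the first thawing from every subsequent one
    cache = []
    def frozen():
        if not cache:
            print("evaluating the frozen expression (first thaw)")
            cache.append(thunk())
        else:
            print("returning the saved value (subsequent thaw)")
        return cache[0]
    return frozen

def thaw(frozen):
    return frozen()

f = freeze(lambda: 6 * 7)
first = thaw(f)    # evaluates the expression
second = thaw(f)   # returns the saved value without reevaluation
```

The Scheme version differs in that the frozen expression is a quoted S-expression evaluated with eval and a namespace, as described next.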
Examples:
Be sure to quote the argument expr passed to freeze (line 1) to prevent it from
being evaluated when freeze is invoked (i.e., eagerly). Also, the body of the
thunk formed by the freeze function must invoke the Scheme function eval (as
discussed in Section 8.2). So that the evaluation of the frozen expression has access
to the base Scheme bindings (e.g., bindings for primitives such as car and cdr)
and any other user-defined functions, place the following lines at the top of your
program:
(define-namespace-anchor a)
(define ns (namespace-anchor->namespace a))
Then pass ns as the second argument to eval [e.g., (eval expr ns)]. See
https://docs.racket-lang.org/guide/eval.html for more information on using
Racket Scheme namespaces.
Exercise 12.5.20 (Scott 2006, Exercise 6.30, pp. 302–303) Use lazy evaluation
through the syntactic forms delay and force to implement a lazy iterator object
in Scheme. Specifically, an iterator is either the null list or a pair consisting of
an element and a promise that, when forced, returns an iterator. Define an
uptoby function that returns an iterator, and a for-iter function that accepts
a one-argument function and an iterator as arguments and returns an empty
list. The functions for-iter and uptoby enable the evaluation of the following
expressions:
;; print the numbers from 1 to 10 in steps of 1, i.e., 1, 2, ..., 9, 10
(for-iter (lambda (e) (display e) (newline)) (uptoby 1 10 1))
The function for-iter, unlike the built-in Scheme form for-each, does not
require the existence of a list containing the elements over which to iterate. Thus,
the space required for (for-iter f (uptoby 1 n 1)) is O(1), rather than
O(n).
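The same O(1)-space behavior can be sketched in Python, where a generator plays the role of the element-plus-promise pair (the function names come from the exercise; the generator realization is an assumption, not the required Scheme solution):

```python
def uptoby(lo, hi, step):
    # lazily yield lo, lo+step, ..., up to hi, without building a list
    while lo <= hi:
        yield lo
        lo += step

def for_iter(f, it):
    # apply f to each element the iterator produces; return the empty list
    for e in it:
        f(e)
    return []

seen = []
result = for_iter(seen.append, uptoby(1, 10, 1))   # visits 1, 2, ..., 10
```

Only one element is live at a time; the rest of the sequence exists only as the suspended state of the generator, just as the Scheme iterator exists only as an unforced promise.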
Exercise 12.5.21 Use lazy evaluation (delay and force) to solve Programming
Exercise 5.10.12 (repeated here) in Scheme. Define a function samefringe in
Scheme that accepts an integer n and two S-expressions, and returns #t if the first
non-null n atoms in each S-expression are equal and in the same order and #f
otherwise.
Examples:
> (samefringe 2 '(1 2 3) '(1 2 3))
#t
> (samefringe 2 '(1 1 2) '(1 2 3))
#f
> (samefringe 5 '(1 2 3 (4 5)) '(1 2 (3 4) 5))
#t
> (samefringe 5 '(1 ((2) 3) (4 5)) '(1 2 (3 4) 5))
#t
> (samefringe 5 '(1 6 3 (7 5)) '(1 2 (3 4) 5))
#f
> (samefringe 3 '(((1)) 2 ((((3))))) '((1) (((((2))))) 3))
#t
> (samefringe 3 '(((1)) 2 ((((3))))) '((1) (((((2))))) 4))
#f
> (samefringe 2 '(((((a)) c))) '(((a) b)))
#f
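The lazy strategy behind this exercise can be sketched in Python, with a generator supplying the fringe so that no more of either structure is flattened than necessary; nested Python lists stand in for S-expressions (an illustration only, not the required Scheme solution):

```python
from itertools import islice

def fringe(sexpr):
    # lazily yield the atoms of a nested list in left-to-right order;
    # empty sublists contribute nothing, so nulls are skipped automatically
    if isinstance(sexpr, list):
        for e in sexpr:
            yield from fringe(e)
    else:
        yield sexpr

def samefringe(n, s1, s2):
    # compare only the first n atoms of each fringe
    return list(islice(fringe(s1), n)) == list(islice(fringe(s2), n))
```

For example, samefringe(5, [1, 2, 3, [4, 5]], [1, 2, [3, 4], 5]) is True because both fringes begin 1, 2, 3, 4, 5, mirroring the fourth example above.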
Examples:
the Glasgow Haskell Compiler (GHC) open so you can enter the expressions as
you read them, which will help you to better understand them. You will need
to make some minor adjustments, such as replacing cons with :. The GHC is
available at https://www.haskell.org/ghc/. Study Sections 1–3 of the article. Then
implement one of the numerical algorithms from Section 4 in Haskell (e.g., Newton-
Raphson square roots, numerical differentiation, or numerical integration). If you
are interested in artificial intelligence, implement the search described in Section 5.
Your code must run using GHCi—the interactive interpreter that is part of GHC.
class Target:
    def __init__(self, value, flag):

        type_flag_dict = { "directtarget"   : expressedvalue,
                           "indirecttarget" : (lambda x: isinstance(x, Reference)),
                           "frozen_expr"    : (lambda x: isinstance(x, list)) }

        # if flag is not a valid flag value, construct a lambda expression
        # that always returns False so we raise an error
        type_flag_dict = \
            defaultdict(lambda: lambda x: False, type_flag_dict)

        if (type_flag_dict[flag](value)):
            self.flag = flag
            self.value = value
        else:
            raise Exception("Invalid Target Construction.")
Note that we added a frozen_expr flag to the dictionary of possible target types.
If the dereference function is passed a reference containing a thunk, it
evaluates the thunk using the thaw_thunk function. This function evaluates the
expression in the thunk and returns the corresponding value:
def dereference(self):
    target = self.primitive_dereference()

    if target.flag == "directtarget":
        return target.value
    elif target.flag == "indirecttarget":
        innertarget = target.value.primitive_dereference()

        if innertarget.flag == "directtarget":
            return innertarget.value

        elif innertarget.flag == "frozen_expr":
            return target.value.thaw_thunk()

    elif target.flag == "frozen_expr":
        return self.thaw_thunk()

    raise Exception("Invalid dereference.")

def thaw_thunk(self):

    # self.vector[self.position].value[0] is the root of the tree
    # self.vector[self.position].value[1] is the environment
    # at the time of the call
    # print ("Thaw")

    if (camilleconfig.__lazy_switch__ == camilleconfig.pass_by_name):
        return evaluate_expr(self.vector[self.position].value[0],
                             self.vector[self.position].value[1])

    elif (camilleconfig.__lazy_switch__ == camilleconfig.pass_by_need):
        # the first time we evaluate the thunk we save the result
        if isinstance(self.vector[self.position].value, list):
            self.vector[self.position].value = evaluate_expr(
                self.vector[self.position].value[0],
                self.vector[self.position].value[1])
            self.vector[self.position].flag = "directtarget"
        return self.vector[self.position].value

    else:
        raise Exception("Configuration Error.")
elif expr.type == ntArguments:
    ArgList = []
    ArgList.append(evaluate_expr(expr.children[0], environ))
    if len(expr.children) > 1:
        ArgList.extend(evaluate_expr(expr.children[1], environ))
    return ArgList
with
elif expr.type == ntArguments:
    return freeze_function_arguments(expr.children, environ)
This function recurses through argument lists. However, now only literals and identifiers
are evaluated. The root TreeNode of every other expression is saved into a list
with the corresponding environment to be evaluated later. Lastly, we must update
the evaluate_operand function:
            return Target(innertarget, "indirecttarget")
        else:
            return innertarget

    elif target.flag == "frozen_expr":
        return Target(operand, "indirecttarget")

    ## if the operand is a literal (i.e., integer or function/closure),
    ## then we create a new location, as before, by returning
    ## a "direct target" to it (i.e., pass-by-value)

    elif isinstance(operand, int) or is_closure(operand):
        return Target(operand, "directtarget")

    elif isinstance(operand, list):
        return Target(operand, "frozen_expr")
Table 12.5 New Versions of Camille, and Their Essential Properties, Created
in Sections 12.6 and 12.7 Programming Exercises (Key: ASR = abstract-syntax
representation; CLS = closure.)
Example:
Camille> let
a = /(1,0)
in
2
2
Exercise 12.6.2 (Friedman, Wand, and Haynes 2001, Exercise 3.56, p. 117) Extend
the solution to Programming Exercise 12.6.1 so that arguments to primitive
operations are evaluated lazily. Then, implement if as a primitive instead of
a syntactic form. Also, add a division (i.e., /) primitive to Camille so the lazy
Camille interpreter can evaluate the following programs demonstrating lazy
evaluation:
11
Camille> let
p = fun (x, y)
if (zero? (x), x, y)
in
(p 0,4)
0
Camille> let
d = fun (x, y) /(x,y)
p = fun (x, y)
if (zero? (x), 10, y)
in
(p 0, /(1,0))
10
ntAssignmentStmt
<statement> ::= <identifier> = <expression>
ntOutputStmt
<statement> ::= writeln (<expression>)
ntCompoundStmt
<statement> ::= { {<statement>}*(;) }
ntIfElseStmt
<statement> ::= if <expression> <statement> else <statement>
ntWhileStmt
<statement> ::= while <expression> do <statement>
ntBlockStmt
<statement> ::= variable {<identifier>}*(,) ; <statement>
def p_line_stmt(t):
    '''program : statement'''
    t[0] = t[1]
    execute_stmt(t[0], empty_environment())

def p_statement_assignment(t):
    '''statement : IDENTIFIER EQ functional_expression'''
    t[0] = Tree_Node(ntAssignmentStmt, [t[3]], t[1], t.lineno(1))

def p_statement_writeln(t):
    '''statement : WRITELN LPAREN functional_expression RPAREN'''
    t[0] = Tree_Node(ntOutputStmt, [t[3]], None, t.lineno(1))

def p_statement_compound(t):
    '''statement : LCURL statement_list RCURL
                 | LCURL RCURL'''
    if len(t) == 4:
        t[0] = Tree_Node(ntCompoundStmt, [t[2]], None, t.lineno(1))
    else:
        t[0] = Tree_Node(ntCompoundStmt, [None], None, t.lineno(1))

def p_statement_list(t):
    '''statement_list : statement SEMICOLON statement_list
                      | statement'''
    if len(t) > 2:
        t[0] = Tree_Node(ntStmtList, [t[1], t[3]], None, t.lineno(1))
    else:
        t[0] = Tree_Node(ntStmtList, [t[1]], None, t.lineno(1))

def p_statement_if(t):
    '''statement : IF functional_expression statement ELSE statement'''
    t[0] = Tree_Node(ntIfElseStmt, [t[3], t[5]], t[2], t.lineno(1))

def p_statement_while(t):
    '''statement : WHILE functional_expression DO statement'''
    t[0] = Tree_Node(ntWhileStmt, [t[4]], t[2], t.lineno(1))

def p_statement_block(t):
    '''statement : VARIABLE id_list SEMICOLON statement
                 | VARIABLE SEMICOLON statement'''
    if len(t) == 5:
        t[0] = Tree_Node(ntBlockStmt, [t[2], t[4]], None, t.lineno(1))
    else:
        t[0] = Tree_Node(ntBlockStmt, [None, t[3]], None, t.lineno(1))

def p_identifier_list(t):
    '''id_list : IDENTIFIER COMMA id_list
               | IDENTIFIER'''
    if len(t) > 2:
        t[0] = Tree_Node(ntIdList, [t[1], t[3]], None, t.lineno(1))
    else:
        t[0] = Tree_Node(ntIdList, [t[1]], None, t.lineno(1))

def p_functional_expression(t):
    '''functional_expression : expression'''
    t[0] = Tree_Node(ntFunctionalExpression, None, t[1], t.lineno(1))
Statements are executed for their (side) effect, not their value. The following are
some example Camille programs involving statements:
Camille> variable x, y, z; {
x = 1;
y = 2;
z = +(x,y);
writeln (z)
}
3
Camille> variable i, j, k; {
i = 3;
j = 2;
k = 1;
writeln (+(i,-(j,k)))
}
4
Camille> if 1
if 0
writeln(5)
else
writeln(6)
else
writeln(7)
6
Camille> --- while loop: 1 .. 5
Camille> variable i, j; {
i = 1;
j = 5;
while j do {
writeln(i);
j = dec1(j);
i = inc1(i)
}
}
1
2
3
4
5
Camille> --- an alternate while loop: 1 .. 5
Camille> variable i; {
i = 1;
1
2
3
4
5
Camille> --- nested blocks and scoping
Camille> variable i; {
i = 2;
writeln(i);
variable j; {
j = 1;
writeln(j)
};
writeln(i)
}
2
1
2
Camille> --- nested blocks and a scope hole
Camille> variable i; {
i = 1;
writeln(i);
variable i; {
i = 3;
writeln(i)
};
writeln(i)
}
1
3
1
Camille> --- use of statements and expressions
Camille> variable increment, i; {
increment = fun(n) inc1(n);
i = 0;
writeln ((increment i))
}
1
Camille> if 1 {
if 0 {
writeln(5)
} else {
writeln(6);
writeln(6)
}
} else {
writeln(7)
}
Syntax error: Line 2
We must define an execute_stmt function to run programs like those just shown
here:
    elif stmt.type == ntBlockStmt:

        # build the identifier list
        IdList = execute_stmt(stmt.children[0], environ)

        ListofZeros = list(map(lambda identifier: 0, IdList))

        TargetListofZeros = list(map(evaluate_let_expr_operand,
                                     ListofZeros))

        localenv = extend_environment(IdList, TargetListofZeros, environ)

        execute_stmt(stmt.children[1], localenv)

    elif stmt.type == ntStmtList:
        execute_stmt(stmt.children[0], environ)

        if len(stmt.children) > 1:
            execute_stmt(stmt.children[1], environ)

    elif stmt.type == ntIdList:
        IdList = []
        IdList.append(stmt.children[0])

        if len(stmt.children) > 1:
            IdList.extend(execute_stmt(stmt.children[1], environ))
        return IdList

    elif stmt.type == ntFunctionalExpression:
        t = evaluate_expr(stmt.leaf, environ)
        return localbindingDereference(t)

    else:
        raise InterpreterException(stmt.linenumber,
                                   "Invalid tree node type %s" % stmt.type)

except Exception as e:
    if isinstance(e, InterpreterException):
        # re-raise the exception to the next level until we reach the top
        # level of the interpreter; exceptions are fatal for a single tree,
        # but other programs within a single file may otherwise be OK
        raise e
    else:
        # we want to catch the Python interpreter exception and format it
        # such that it can be used to debug the Camille program
        if debug_mode__ == detailed_debug:
            print(traceback.format_exc())
        raise InterpreterException(stmt.linenumber,
                                   "Unhandled error in %s" % stmt.type, str(e), e)
do {
writeln(x)
} while x;
writeln(y)
}
0
1
Camille> variable x, y; {
x = 11;
y = -(11,4);
do {
y = +(y,x);
x = dec1(x)
} while x;
writeln (y)
}
73
[Table artwork garbled in extraction: a summary of Camille versions 2.1 (recursive functions; CLS | ASR | LOLR env; static scoping), 3.0 (references; cells; arrays; pass-by-value-result), 3.1 (pass-by-reference), Imperative Camille 4.0 (statements; do while), Lazy Camille 3.2 (lazy funs; lazy let), and Full Lazy Camille 3.2 (full lazy).]
2.1(nameless LOLR) 2.0(nameless LOLR) or 2.1(named LOLR) letrec, nameless LOLR environment
2.1(dynamic scoping) 2.0(dynamic scoping) or 2.1 letrec, dynamic scoping, (named|nameless) (CLS|ASR|LOLR) environment
Chapter 12: Parameter Passing
3.0 2.1 references, named ASR environment, ASR|CLS closure
3.0(cells) 3.0 cells, named ASR environment, ASR|CLS closure
3.0(arrays) 3.0 arrays, named ASR environment, ASR|CLS closure
3.0(pass-by-value-result) 3.0 pass-by-value-result, named ASR environment, ASR|CLS closure
3.1 3.0 pass-by-reference, named ASR environment, ASR|CLS closure
Lazy Camille
3.2(lazy funs) 3.1 lazy evaluation for fun args only, named ASR environment, ASR|CLS closure
3.2(lazy let) 3.2 lazy evaluation for fun args and let expr, named ASR environment, ASR|CLS closure
3.2(full lazy) 3.2 lazy evaluation for fun args, let expr, and primitives, named ASR environment, ASR|CLS closure
Imperative Camille
4.0 3.0 statements, named ASR environment, ASR|CLS closure
4.0(do while) 4.0 do while, named ASR environment, ASR|CLS closure
Table 12.6 Complete Suite of Camille Languages and Interpreters (Key: ASR = abstract-syntax representation; CLS = closure;
LOLR = list-of-lists representation.)
Data from Perugini, Saverio, and Jack L. Watkin. 2018. “ChAmElEoN: A customizable language for teaching programming languages.” Journal of Computing
Sciences in Colleges (USA) 34(1): 44–51.
[Figure 12.13 artwork omitted: the Chapters 10–11 portion of the dependency graph, running from version 1.0 (simple; no env) through 1.1 (let), 1.2 (let, if/else), 2.0 (non-recursive functions; CLS | ASR | LOLR env; static scoping), the 2.0 variants (verify; dynamic scoping; nameless), and 2.1 (recursive functions; CLS | ASR | LOLR env; static scoping), including the nameless 2.1 variants (LOLR, ASR, CLS).]
Figure 12.13 Dependencies between the Camille interpreters developed in this text,
including those in the programming exercises. The semantics of a directed edge
a → b are that version b of the Camille interpreter is an extension of version a
(i.e., version b subsumes version a). (Key: ASR = abstract-syntax representation;
CLS = closure; LOLR = list-of-lists representation.)
of the language (Table 12.7). (Note that the nameless environments are available
for use with neither the interpreter supporting dynamic scoping nor any of the
interpreters in this chapter. Furthermore, not all environment representations are
available with all implementation options. For instance, all of the interpreters in
this chapter use exclusively the named ASR environment.)
Exercise 12.8.2 Write a Camille program using any valid combination of the
features and concepts covered in Chapters 10–12 and use it to stress test—in other
words, spin the wheels of—the Camille interpreter. Your program must be at least
30 lines of code and original (i.e., not an example from the text). You are welcome
to rewrite a program you wrote in the past and use it to flex the muscles of your
interpreter. For instance, you can use Camille to build a closure representation
[Figure artwork omitted: the Chapter 12 portion of the dependency graph, running from 3.0 (references) to 3.0 (cells), 3.0 (arrays), 3.0 (pass-by-value-result), 3.1 (pass-by-reference), Imperative Camille 4.0 (statements), Lazy Camille 3.2 (lazy funs) and 3.2 (lazy let), Full Lazy Camille 3.2 (full lazy), and 4.0 (do while).]
Table 12.7 Concepts and Features Implemented in Progressive Versions of Camille. The symbol ↓ indicates that the concept is
supported through its implementation in the defining language (here, Python). The Python keyword included in each cell, where
applicable, indicates which Python construct is used to implement the feature in Camille. The symbol ↑ indicates that the concept
is implemented manually. The Camille keyword included in each cell, where applicable, indicates the syntactic construct through
which the concept is operationalized. (Key: ASR = abstract-syntax representation; CLS = closure; LOLR = list-of-lists representation.
Cells in boldface font highlight the enhancements across the versions.)
4. The System Browser in the Squeak implementation of Smalltalk catalogs the source code for the
entire Smalltalk class hierarchy.
that are borrowed from the defining language can be more directly and, therefore,
easily expressed in the interpreter—language concepts can be restated in terms of
themselves! (Sometimes this is called bootstrapping a language.) A more compelling
benefit of this direct correspondence between host and source language results
when, conversely, we do not implement features in the defined language using
the same semantics as in the defining language. In that case, a self-interpreter is
an avenue toward modifying language semantics in a programming language. By
implementing pass-by-name semantics in Camille, we did not alter the parameter-
passing mechanism of Python. However, if we built an interpreter for Python in
Python, we could.
A self-interpreter for a homoiconic language—one where programs and data
objects in the language are represented uniformly—is called a metacircular
interpreter. While a metacircular interpreter is a self-interpreter—and, therefore,
has all the benefits of a self-interpreter—since the program being interpreted
in the defined language is expressed as a data structure in the defining
language, there is no need to convert between concrete and abstract
representations. For instance, the concrete2abstract (in Section 9.6) and
abstract2concrete (in Programming Exercise 9.6.1) functions from Chapter 9
are unnecessary.
Thus, the homoiconic property simplifies the ability to change the semantics of
a language from within the language itself! This idea supports a bottom-up style
of programming where a programming language is used not as a tool to write
a target program, but to define a new targeted (or domain-specific) language
and then develop a target program in that language (Graham 1993, p. vi). In
other words, bottom-up programming involves “changing the language to suit
the problem” (Graham 1993, p. 3)—and that language can look quite a bit different
than Lisp. (See Chapter 15 for more information.) It has been said that “[i]f you give
someone Fortran, he has Fortran. If you give someone Lisp, he has any language
he pleases” (Friedman and Felleisen 1996b, Afterword, p. 207, Guy L. Steele Jr.)
and “Lisp is a language for writing Lisp.” Programming Exercise 5.10.20 builds a
metacircular interpreter for a subset of Lisp.
#lang racket
(define-namespace-anchor anc)
(define ns (namespace-anchor->namespace anc))
(require rnrs/mutable-pairs-6)
Once you have the interpreter running, you will self-apply it, repeatedly, until it
churns to a near halt, using the following code:
;; what follows is: ((((((I I) I) I) I) I) expr)
(define copy-of-copy-of-copy-of-copy-of-copy-of-interpreter
  (apply copy-of-copy-of-copy-of-copy-of-interpreter (list int)))

(apply copy-of-copy-of-copy-of-copy-of-copy-of-interpreter test1)
(apply copy-of-copy-of-copy-of-copy-of-copy-of-interpreter test2)
(apply copy-of-copy-of-copy-of-copy-of-copy-of-interpreter test3)
(apply copy-of-copy-of-copy-of-copy-of-copy-of-interpreter test4)
Control and
Exception Handling
Alice: “Would you tell me, please, which way I ought to go from here?”
The Cheshire Cat: “That depends a good deal on where you want to
get to.”
— Lewis Carroll, Alice in Wonderland (1865)
(lambda (returnvalue)
(* 2 returnvalue))
(lambda (returnvalue)
(* 3 (+ 5 (* 2 returnvalue))))
• (rightmost) +
• (+ 1 4)
• (rightmost) *
• (* 2 (+ 1 4))
• (leftmost) +
• (+ 5 (* 2 (+ 1 4)))
• (leftmost) *
• (* 3 (+ 5 (* 2 (+ 1 4))))
(lambda (returnvalue)
(* 3 (+ 5 (returnvalue 2 (+ 1 4)))))
(cond
((eqv? (* 3 (+ 5 (* 2 (+ 1 4)))) 45) "Continuez")
(else "Au revoir"))
is
(lambda (returnvalue)
(cond
((eqv? (* 3 (+ 5 (* 2 returnvalue))) 45) "Continuez")
(else "Au revoir")))
1. The term call/cc in this quote is letcc in Friedman and Felleisen (1996b).
(lambda (x) (+ x 1)). Thus, the invocation of call/cc here captures the
continuation (lambda (x) (+ x 1)). The semantics of call/cc are to call its
function argument with the current continuation captured. Thus, the expression
(call/cc (lambda (k) (k 3)))
translates to
((lambda (k) (k 3)) (lambda (x) (+ x 1)))
The latter expression passes the current continuation, (lambda (x) (+ x 1)),
to the function (lambda (k) (k 3)), which is passed to call/cc in the
former expression. That expression evaluates to ((lambda (x) (+ x 1)) 3)
or (+ 3 1) or 4.
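This reduction can be checked directly; the same two lambda terms compose the same way in Python (a translation for illustration only, since Python has no call/cc):

```python
# the captured continuation: "add 1 to whatever value is supplied"
current_continuation = lambda x: x + 1

# the function passed to call/cc; it immediately invokes the continuation
f = lambda k: k(3)

# (call/cc (lambda (k) (k 3))) reduces to applying f to the continuation,
# i.e., ((lambda (k) (k 3)) (lambda (x) (+ x 1)))
value = f(current_continuation)   # ((lambda (x) (+ x 1)) 3) = 4
```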
Now let us consider additional examples:
> (call/cc
(lambda (k)
(* 2 (+ 1 4))))
10
> (call/cc
(lambda (k)
(* 2 (k 20))))
20
> (call/cc
(lambda (k)
(* 2 (k "break out"))))
"break out"
Now we modify the original expression so that the continuation being captured
by call/cc is no longer the identity function:
(lambda (returnvalue)
(/ 100 returnvalue))
Instead of continuing with the value used as the divisor, we can continue with the
value used as the dividend:
> (/ (call/cc
(lambda (k)
(* 2 (k 20)))) 5)
4
When k is not invoked in the body of the function f passed to call/cc, the
return value of the call to call/cc is the return value of f. In general, a call to
(call/cc (lambda (k) E)), where k is not called in E, is the same as a call to
(call/cc (lambda (k) (k E))) (Haynes, Friedman, and Wand 1986, p. 145).
In the other form demonstrated, the captured continuation is invoked in the body
of the function passed to call/cc:
If the continuation is invoked inside f, then control returns from the call
to call/cc using the value passed to the continuation as a return value.
Control does not return to the function f and all pending computations are
left unfinished—this is called a nonlocal exit and is explored in Section 13.3.1.
The examples of continuations in this section demonstrate that, once captured, a
programmer can use (i.e., call) the captured continuation to replace the current
continuation elsewhere in a program, when desired, to circumvent the normal
flow of control and thereby affect, manipulate, and direct control flow. Figure 13.1
illustrates the general process of capturing the current continuation k through
call/cc in Scheme and later replacing the current continuation k′ with k.
Figure 13.1 The general call/cc continuation capture and invocation process.
[Figure 13.2 and the accompanying stack diagram omitted: in (+ 4 (* 2 (call/cc (lambda (k) (+ 5 (k 3)))))), call/cc captures the current continuation k = (lambda (rv) (+ 4 (* 2 rv))); invoking (k 3) replaces the new current continuation k′ = (lambda (rv) (+ 4 (* 2 (+ 5 rv)))) with k, the pending (+ 5 ...) computation is discarded from the stack, and evaluation continues as (+ 4 (* 2 3)), returning 10.]
Figure 13.3 The run-time stack during the continuation replacement process
depicted in Figure 13.2.
Figure 13.2 provides an example of the process, and Figure 13.3 depicts the run-
time stack during the continuation replacement process from that example.
(a) *
(b) 2
(c) 3
(sqrt (* (call/cc
           (lambda (k)
             (cons 2 (k 20)))) 5))
Explain, by appealing to transfer of control and the run-time stack, why the return
value of this expression is 2 and not 3. Also, reify the continuation captured by
the call to call/cc in this expression. Does a continuation ever return (like a
function)?
> (call/cc
(lambda (k)
(* 2 (k 20))))
20
Modify this expression to also capture the continuation of the expression (k 20)
with call/cc. Name this continuation k2 and use it to complete the entire
computation with the default continuation (now captured in k2).
Exercise 13.2.6 The interface for capturing continuations used in The Seasoned
Schemer (Friedman and Felleisen 1996b) is called letcc. Although letcc has
a slightly different syntax than call/cc, both have approximately the same
semantics (i.e., they capture the current continuation). The letcc function
only accepts an identifier and an expression, in that order, and it captures the
continuation of the expression and binds it to the identifier. For instance, the
following two expressions are analogs of each other:
(a) Give a general rewrite rule that can be used to convert an expression using
letcc to an equivalent expression using call/cc. In other words, give an
expression using only call/cc that can be used as a replacement for every
occurrence of the expression (letcc k e).
Exercise 13.2.7 Investigate and experiment with the interface for first-class
continuations in ML (see the structure SMLofNJ.Cont):
- open SMLofNJ.Cont;
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
opening SMLofNJ.Cont
type 'a cont = 'a ?.cont
val callcc : ('a cont -> 'a) -> 'a
val throw : 'a cont -> 'a -> 'b
val isolate : ('a -> unit) -> 'a cont
type 'a control_cont = 'a ?.InlineT.control_cont
val capture : ('a control_cont -> 'a) -> 'a
val escape : 'a control_cont -> 'a -> 'b
Replicate any three of the examples in Scheme involving call/cc given in this
section in ML.
This function exhibits recursive control behavior, meaning that when the function is
called its execution causes the stack to grow until the base case of the recursion is
reached. At that point, the computation is performed as recursive calls return and
pop off the stack. The following series of expressions depicts this process:
As soon as a zero is encountered in the list, the final return value of the function is
known to be zero. However, the recursive control behavior continues to build up
the stack of pending computations until the base case is reached, which signals the
commencement of the computations to be performed. This function is inefficient
in space whether the input contains a zero or not. It is inefficient in time only when
the input list contains a zero—unnecessary multiplications are performed.
The presence of a zero in the input list can be considered an exception or
exceptional case. Exceptions are unusual situations that happen at run-time, such
as erroneous input. One application of first-class continuations is for exception
handling.
We want to break out of the recursion as soon as we encounter a zero in the
input list of numbers. Consider the following new definition of product (Dybvig
2003):
1 (define product
2 (lambda (lon)
3 (call/cc
4 ;; break stores the current continuation
5 (lambda (break)
6 (letrec ((P (lambda (l)
7 (cond
8 ;; base case
9 ((null? l) 1)
10 ;; exceptional case; abnormal flow of control
11 ((zero? (car l)) (break 0))
12 ;; inductive case; normal flow of control
13 (else (* (car l) (P (cdr l))))))))
14 (P lon))))))
The case where the list does not contain a zero proceeds as usual, using the current
continuation of pending multiplications on the stack rather than the captured
continuation of the initial call to product. Like the examples in Section 13.2,
this product function demonstrates that once a continuation is captured through
call/cc, a programmer can use (i.e., call) the captured continuation to replace
the current continuation elsewhere in a program, when desired, to circumvent the
normal flow of control and, therefore, alter control flow.
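The same shape can be sketched in Ruby, which also supports first-class continuations via the (deprecated) callcc; here the local lambda prod plays the role of the nested function P, and the names are ours rather than the book's:

```ruby
require "continuation"  # provides Kernel#callcc (deprecated, but still shipped)

def product(lon)
  callcc do |brk|                        # brk: continuation of this call to product
    prod = lambda do |l|
      if l.empty? then 1                 # base case
      elsif l[0].zero? then brk.call(0)  # exceptional case: nonlocal exit with 0
      else l[0] * prod.call(l[1..-1])    # inductive case: normal flow of control
      end
    end
    prod.call(lon)
  end
end

product([1, 2, 3, 4])  # => 24
product([1, 2, 0, 4])  # => 0, with no multiplications performed
```

Because callcc wraps only the initial call, brk always holds the continuation of that first call, mirroring the letrec version in Scheme.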
Notice that in this example, the definition of the nested function P within
the letrec expression (lines 6–13) is necessary because we want to capture the
continuation of the first call to product, rather than recapturing a continuation
every time product is called recursively. For instance, the following definition
of product does not achieve the desired effect because the continuation break
is rebound on each recursive call and, therefore, is not the exceptional/abnormal
continuation, but rather the normal continuation of the computation:
1 (define product
2 (lambda (lon)
3 (call/cc
4 ;; break is rebound to the current continuation
5 ;; on every recursive call to product
6 (lambda (break)
7 (cond
8 ;; base case
9 ((null? lon) 1)
10 ;; exceptional case; abnormal flow of control
11 ((zero? (car lon)) (break 5))
12 ;; inductive case; normal flow of control
13 (else (* (car lon) (product (cdr lon)))))))))
We continue with 5 (line 11) to demonstrate that the continuation stored in break
is actually the normal continuation:
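A Ruby sketch of this flawed variant (product_rebound is a hypothetical name) makes the behavior observable: since a fresh continuation is captured on every recursive call, invoking it with 5 merely returns 5 from the innermost call, and the pending multiplications still run:

```ruby
require "continuation"

# Flawed: callcc wraps every recursive call, so brk is rebound each time
# and holds the *normal* continuation rather than an escape route.
def product_rebound(lon)
  callcc do |brk|
    if lon.empty? then 1
    elsif lon[0].zero? then brk.call(5)        # "continue with 5"
    else lon[0] * product_rebound(lon[1..-1])
    end
  end
end

product_rebound([1, 2, 3, 0, 4, 5])  # => 1 * 2 * 3 * 5 = 30, not 5
```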
To break out of this type of letrec-free function definition, the function could
be defined to accept an abnormal continuation, but the caller would then be
responsible for capturing that continuation and passing it to the called function.
For instance:
> (call/cc (lambda (break) (product break '(1 2 3 0 4 5))))
0
> (+ 100 (call/cc (lambda (break) (product break '(1 2 3 0 4 5)))))
100
Factoring out the constant parameter break (using Functional Programming Design
Guideline 6 from Table 5.7 in Chapter 5) again renders a definition of product
using a letrec expression:
(define product
(lambda (break lon)
(letrec ((P (lambda (l)
(cond
;; base case
((null? l) 1)
;; exceptional case; abnormal flow of control
((zero? (car l)) (break 0))
;; inductive case; normal flow of control
(else (* (car l) (P (cdr l))))))))
(P lon))))
13.3.2 Breakpoints
Consider the following recursive definition of a Scheme factorial function that
accepts an integer n and returns the factorial of n:
(define factorial
(lambda (n)
(cond
((zero? n) 1)
(else (* n (factorial (- n 1)))))))
Now consider the same definition of factorial using call/cc to capture the
continuation of the base case (i.e., where n is 0) (Dybvig 2009, pp. 75–76):
Unlike the continuation captured in the product example in Section 13.3.1, where
the continuation captured is of the initial call to the recursive function product
(i.e., the identity function), here the continuation captured includes all of the
pending multiplications built up on the stack when the base of the recursion (i.e.,
where n is 0) is reached:
13.3. GLOBAL TRANSFER OF CONTROL WITH CONTINUATIONS 561
(lambda (returnvalue)
(* 5 (* 4 (* 3 (* 2 (* 1 returnvalue))))))
> (redo 1)
120
> (redo 0)
0
> (redo 2)
240
> (redo 3)
360
> (redo 4)
480
> (redo 5)
600
The natural base case of recursion for factorial is 1. However, by invoking the
continuation captured through the use of call/cc, we can dynamically change
the base case of the recursion at run-time. Moreover, this factorial example
vividly demonstrates the—perhaps mystifying—unlimited extent of a first-class
continuation.
The thought of transferring control to pending computations that no
longer exist on the run-time stack hearkens back to the examples of first-class
closures returned from functions (in Chapter 6) that “remembered” their lexical
environment even though that environment no longer existed because the
activation record for the function that created and returned the closure had been
popped off the stack (Section 6.10).
The continuation captured by call/cc is, more generally, a closure—a pair
of (code, environment) pointers—where the code is the actual continuation and
the environment is the environment in which the code is to be later evaluated.
However, when invoked, the continuation (in the closure) captured with call/cc,
unlike a regular closure (i.e., one whose code component is not a continuation),
does not return a value, but rather transfers control elsewhere. Similarly, when
we invoke redo, we are jumping back to activation records (i.e., stack frames)
that no longer exist on the stack because the factorial function has long
since terminated, and been popped off the stack, by the time redo is called.
The key connection back to our discussion of first-class closures in Chapter 6
is that the first-class continuations captured through call/cc are only possible
because closures in Scheme are allocated from the heap and, therefore, have
unlimited extent. If closures in Scheme were allocated from the run-time stack, an
example such as factorial, which uses a first-class continuation to jump back
to seemingly “phantom” stack frames, would not be possible.
The factorial example illustrates the use of first-class continuations for
breakpoints and can be used as a basis for a breakpoint facility in a debugger. In
particular, the continuation of the breakpoint can be saved so that the computation
may be restarted from the breakpoint—more than once, if desired, and, with
different values.
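The breakpoint idea can be sketched in Ruby as well, since Ruby continuations likewise have unlimited extent; $redo and the driver code are our choices, not the book's listing:

```ruby
require "continuation"

$redo = nil  # will hold the continuation of the base case

def factorial(n)
  if n.zero?
    callcc { |k| $redo = k; 1 }  # capture the base case; the normal base is 1
  else
    n * factorial(n - 1)
  end
end

$results = []
$results << factorial(5)              # 120; $redo is now set
# Re-enter the base case with new base values, replaying 5*4*3*2*1*base
# even though factorial's activation records have long been popped:
$redo.call(2) if $results.size == 1   # second pass appends 240
$redo.call(3) if $results.size == 2   # third pass appends 360
# $results == [120, 240, 360]
```

Each $redo.call re-runs everything after the original callcc, including the append to $results, so the guards keep the program from looping forever.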
Unlike in the prior examples, here we store the captured continuation in a
variable through assignment, using the set! operator. This demonstrates the first-
class status of continuations in Scheme. Once a continuation is captured through
call/cc, a programmer can store the continuation in a variable (or data structure)
for later use. The programmer can then use the captured continuation to replace
the current continuation elsewhere in a program, when and as many times as
desired (now that it is recorded persistently in a variable), to circumvent the
normal flow of control and, therefore, manipulate control flow. There is no limit
on the number of times a continuation can be called, which implies that activation
records must be heap allocated.
1 require "continuation"
2
3 def product(lon)
4
5 # base case
6 if lon == [] then 1
7
8 # exceptional case
9 elsif lon[0] == 0 then $break.call "Encountered a zero. Break out."
10
11 # inductive case
12 else
Ruby does not support nested methods. Thus, instead of capturing the
continuation of a local, nested function P (as done in the second definition
of product in Section 13.3.1), here the caller saves the continuation k, captured
with callcc, of each call to product (lines 22–23 and 28–29) in a global
variable $break (lines 22 and 28) so that the called function has access to it.
The continuation captured in the local variable k on line 22 represents the set of
program statements on lines 24–31. Similarly, the continuation captured in the local
variable k on line 28 represents the set of program statements on lines 30–31. In
each case, the captured continuation in the local variable k is saved persistently in
the global variable $break so that it can be accessed in the definition of product
by using $break and called by using $break.call with the string argument
"Encountered a zero. Break out." (line 9). The output of this program is
1 $ ruby product.rb
2 before recursive call
3 before recursive call
4 before recursive call
5 before recursive call
6 after recursive call
7 after recursive call
8 after recursive call
9 after recursive call
10 24
11 before recursive call
12 before recursive call
13 Encountered a zero. Break out.
Lines 2–10 of the output demonstrate that the product of a list of non-zero numbers
is computed while popping out of the (four) layers of recursive calls. Lines 11–
13 of the output demonstrate that no multiplications are performed when a zero
is encountered in the input list of numbers (i.e., the nonlocal exit abandons the
recursive calls on the stack).
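Assembled into a complete program, the structure described above can be reconstructed as follows (a sketch: line numbers and incidental details will differ from the book's listing):

```ruby
require "continuation"

def product(lon)
  # base case
  if lon == [] then 1
  # exceptional case: escape through the continuation saved by the caller
  elsif lon[0] == 0 then $break.call "Encountered a zero. Break out."
  # inductive case
  else
    puts "before recursive call"
    result = lon[0] * product(lon[1..-1])
    puts "after recursive call"
    result
  end
end

# The caller captures the continuation of each call and saves it in the
# global $break so that product can escape to it.
puts(callcc { |k| $break = k; product([1, 2, 3, 4]) })
puts(callcc { |k| $break = k; product([1, 2, 0, 3]) })
```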
3. While the examples in Ruby in this chapter run in the current version of Ruby, callcc is currently
deprecated in Ruby.
Exercise 13.3.1 Does the following definition of product perform any unneces-
sary multiplications? If so, explain how and why (with reasons). If not, explain
why not (with reasons).
(define product
(lambda (lon)
(call/cc
(lambda (break)
(cond
((null? lon) 1)
((zero? (car lon)) (break 0))
(else (* (car lon) (product (cdr lon)))))))))
Exercise 13.3.2 Can the factorial function using call/cc given in this section
be redefined to remove the side effect (i.e., without the use of set!), yet retain the
ability to dynamically alter the base of the recursion? If so, define it. If not, explain
why not. In other words, why is the side effect necessary in that example (if it is)?
Exercise 13.3.3 Explain why the letrec expression is necessary in the definition
of product using call/cc in this section. In other words, why can’t product be
defined just as effectively as follows? Explain.
(define product
(lambda (lon)
(call/cc
(lambda (break)
(cond
((null? lon) 1)
((zero? (car lon)) (break 0))
(else (* (car lon) (product (cdr lon)))))))))
Exercise 13.3.4 Consider the following attempt to remove the side effect (i.e.,
the use of set!) from the factorial function using call/cc given in this
section:
> (factorial 5)
'(120 . #<continuation>)
The approach taken is to have factorial return a pair whose car is an integer
representing the factorial of its argument and whose cdr is the redo continuation,
rather than just an integer representing the factorial. As can be seen from the
preceding transcript, this approach does not work.
(a) Notice that (cdr (factorial 5)) returns the continuation of the base case
(i.e., the redo continuation). Explain why, rather than passing a single
number to it as done in the example in this section, a pair must now be passed
instead, for example, the list (cons 2 "ignore") in this case.
(c) Explain why the invocation to factorial and subsequent use of the contin-
uation as ((cdr (factorial 5)) (cons 5 (cdr (factorial 5))))
never terminates.
(define product
(lambda (lon)
(call/cc
(lambda (break)
(cond
((null? lon) 1)
((zero? (car lon)) (break 0))
(else (* (car lon) (product (cdr lon)))))))))
(a) Indicate how many (i.e., the number of) continuations are captured when this
function is called as (product ’(9 12 7 3)).
(b) Indicate how many (i.e., the number of) continuations are captured when this
function is called as (product ’(42 11 0 2 -1)).
Exercise 13.3.6 Define a recursive Scheme function member1 that accepts only an
atom a and a list of atoms lat and returns the integer position of a in lat (using
zero-based indexing) if a is a member of lat and #f otherwise. Your definition
of member1 must use call/cc to avoid returning back through all the recursive
calls when the element a is not found in the list, but it must not use the captured
continuation when the element a is found in the list.
[Table 13.1 matrix: Exercises 13.3.13 and 13.3.15 start from scratch (N/A), while
13.3.14 builds on 13.3.13 and 13.3.16 builds on 13.3.15; each row marks the input
(LoN or S-expression), a nonlocal exit for a 1 in the list, an intermediate gcd = 1,
and whether no unnecessary operations are computed. The individual check marks
are not recoverable here.]
Table 13.1 Mapping from the Greatest Common Divisor Exercises in This Section
to the Essential Aspects of First-Class Continuations and call/cc
Examples:
Exercise 13.3.10 Rewrite the Ruby program in Section 13.3.3 so that the caller
passes the captured continuation k of the called function product on lines 23
and 29 to the called function itself (as done in the third definition of product in
Section 13.3.1).
Exercise 13.3.11 Define a Scheme function product that accepts a variable number
of arguments and returns the product of them. Define product using call/cc
such that no multiplications are performed if any of the arguments are zero.
Exercise 13.3.12 (Friedman, Wand, and Haynes 2001, Exercise 1.17.1, p. 27) Con-
sider the following BNF specification of a binary search tree.
<binarysearchtree> ::= ()
<binarysearchtree> ::= (<integer> <binarysearchtree> <binarysearchtree>)
Define a Scheme function path that accepts only an integer n and a list bst
representing a binary search tree, in that order, and returns a list of lefts and
rights indicating how to locate the vertex containing n. If the integer is not found
in the binary search tree, use call/cc to avoid returning back through all the
recursive calls and return the atom ’notfound.
Examples:
Exercise 13.3.15 Define a function gcd* in Scheme using call/cc that accepts
only a non-empty S-expression of positive, non-zero integers, which contains no
empty lists, and returns the greatest common divisor of those integers. If a
1 is encountered in the list, through the use of call/cc, return the string
"1: encountered a 1 in the S-expression" immediately without ever
executing gcd and without returning through each of the recursive calls on the
stack.
Examples:
lists. Your function must not perform any unnecessary computations. Specifically,
if the input list contains an empty list, immediately return () without returning
through each of the recursive calls on the stack. Further, if the input list does
not contain an empty list, but contains two lists whose set intersection is empty,
immediately return (). You may assume that each list in the input list represents
a set (i.e., contains no duplicate elements). Your solution must follow Design
Guidelines 4 and 6 from Table 5.7 in Chapter 5.
1 $ cat goto.c
2 # include <stdio.h>
3
4 int main() {
5 printf("repetez\n");
6 again:
7 printf("encore\n");
8 goto again;
9 }
10 $
11 $ gcc goto.c
12 $ ./a.out
13 repetez
14 encore
15 encore
16 encore
17 ...
This simple example illustrates the use of a label again: (line 6) and a goto (line
8) to create a repeated transfer of control resulting in an infinite loop.
Programmers are generally advised to avoid gotos because they violate the
spirit of structured programming. This style of (typically imperative) programming
is aimed at improving the readability and maintainability, and reducing the
potential for errors, of a computer program through the use of functions and
block control structures (e.g., if, while, and for) with only one entry and exit
point as opposed to tests and jumps (e.g., goto) found in assembly programs. Use
of goto statements can result in “spaghetti code” that is difficult to follow and,
thus, challenging to debug and maintain. Programming languages that originally
13.4. OTHER MECHANISMS FOR GLOBAL TRANSFER OF CONTROL 571
lacked structured programming constructs but now support them include Fortran,
COBOL, and BASIC.
Edsger W. Dijkstra wrote a letter titled “Go To Statement Considered Harmful”
in 1968 arguing against the use of the goto statement. His letter (Dijkstra 1968)
and the emergence of imperative languages with suitably expressive control
structures, including ALGOL, supported a shift toward structured programming.
Later, Donald E. Knuth (1974b), in his paper “Structured Programming with go
to Statements,” identified cases where a jump leads to clearer and more efficient
code. Notwithstanding, goto statements cannot be used to jump across functions
on the stack:
$ cat goto_fun.c
# include <stdio.h>
int f() {
printf ("avant\n");
again:
printf ("apres\n");
}
int main() {
int i=0;
f();
while (i++ < 10) {
printf ("%d\n", i);
goto again;
}
}
$
$ gcc goto_fun.c
goto_fun.c:15:12: error: use of undeclared label 'again'
goto again;
^
The goto statement can only be used to transfer control within a single function.
Therefore, we cannot replicate the previous examples using call/cc with gotos.
In other words, a goto statement is not as powerful as a first-class continuation.
1 $ cat simple_setjmp.c
2 # include <stdio.h>
3 # include <setjmp.h>
4
5 int main() {
6 jmp_buf env;
7 int x = setjmp(env);
8 printf("x = %d\n", x);
9 longjmp(env, 5);
10 }
11 $
12 $ gcc simple_setjmp.c
13 $ ./a.out
14 x = 0
15 x = 5
16 x = 5
17 ...
The setjmp function saves its calling environment in its only argument (named
env here) and returns 0 the first time it is called. Notice the first line of output
on line 14 is x = 0. The setjmp function serves the same purpose as the label
again:; that is, it marks a destination for a subsequent transfer of control.
However, unlike a label and more like capturing a continuation using call/cc,
this function saves the current environment at the time it is called (for later
restoration by longjmp). In this example, the environment is empty, meaning
that it does not contain any name–value pairs. The longjmp function acts like a
goto in that it transfers control. However, unlike goto, the longjmp function also
restores the original environment (captured when setjmp was called) to the point
where control is transferred. The longjmp function never returns. Instead, when
longjmp is called, the call to setjmp that shares the buffer passed in each invocation
returns (line 7), but this time with the value passed as the second argument to
longjmp (in this case 5; line 9). Notice that the lines of output from line 15 onward
contain x = 5. Thus, the setjmp and longjmp functions communicate through
a shared structure of type jmp_buf that represents the captured environment.
When used in the manner just described in the same function (i.e., main)
and with an empty environment, setjmp and longjmp act like a label and a
goto, respectively, and effect a simple nonlocal transfer of control. The captured
environment is unnecessary in this example; that is, it simply serves to convey the
semantics of setjmp/longjmp.
The setjmp function is similar to call/cc; the longjmp function is similar
to (k v) (i.e., it invokes the continuation captured in k with the value v); and
jmp_buf env is similar to the captured continuation k (Table 13.2). Recall that
a closure is a pair consisting of an expression [e.g., (lambda (y) (+ x y))]
and an environment [e.g., (x 8)]. In other words, a closure is program code that
“remembers” its lexical environment. A continuation is also a closure: The “what
to do with the return value” is the expression component of the closure, and
the environment to be restored after the transfer of control is the environment
Semantics                                 Scheme     C
captures branch point and environment     call/cc    setjmp
restores branch point and environment     (k v)      longjmp
environment                               k          jmp_buf env
component. The call/cc function captures a closure that, when called, never
returns.
There is, however, a fundamental difference between setjmp/longjmp and
call/cc. This difference is a consequence of the location where Scheme and C
store closures in the run-time system, or alternatively the extent of closures in
Scheme and C. Consider the following C program using setjmp/longjmp, which
is an attempt to replicate the factorial example using call/cc in Scheme in
Section 13.3.2 to help illustrate this difference:
1 $ cat factorial.c
2 # include <stdio.h>
3 # include <setjmp.h>
4
5 jmp_buf env;
6
7 int factorial(int n) {
8 int x;
9
10 i f (n == 0) {
11 x = setjmp(env);
12 printf ("Inside the factorial function.\n");
13 i f (x == 0)
14 return 1; /* normal base of recursion */
15 else
16 return x; /* new base of recursion passed from longjmp */
17 } else
18 return n*factorial(n-1);
19 }
20
21 int main() {
22 printf ("%d\n", factorial(5));
23 longjmp(env, 3); /* (k 3) */
24 }
25 $
26 $ gcc factorial.c
27 $ ./a.out
28 Inside the factorial function.
29 120
30 Inside the factorial function.
31 Segmentation fault: 11
In this example, unlike in the simple example at the beginning of Section 13.4.2, the
environment captured through setjmp comes into focus. Here, the factorial
function invokes setjmp in the base case (line 11) where its parameter n is 0 (line
10). It then returns normally back through all of the recursive calls, progressively
computing the factorial (i.e., performing the multiplications) as the activation
records for factorial pop off the stack. By the time control returns to main
at line 22 where the factorial is printed, those stack frames for factorial are
gone. The invocation of longjmp on line 23 seeks to transfer control back to the
invocation of factorial corresponding to the base case (when the parameter n
is 0) and to return from the call to setjmp on line 11 with the value 3, effectively
changing the base of the recursion from 1 to 3 and ultimately returning 360.
However, when longjmp is called at line 23, main is the only function on the stack.
The invocation of longjmp on line 23 is tantamount to jumping to a phantom stack
frame, meaning a stack frame that is no longer there (Figure 13.4).
[Figure 13.4: Two snapshots of the run-time stack. Left: during the recursive
descent, frames for factorial(5) through factorial(0), where factorial(0)
executes x = setjmp(env), sit above main's frame. Right: each call has returned
(factorial(1) => 1, factorial(2) => 2, factorial(3) => 6, factorial(4) =>
24, factorial(5) => 120) and its frame has been popped, so the subsequent
longjmp in main targets a phantom factorial(0) frame that no longer exists.]
1 $ cat jumpstack.c
2 # include <stdio.h>
3 # include <setjmp.h>
4
5 jmp_buf env;
6
7 int d(int x) {
8 /* exceptional case; need to break out,
9 but do not want to return back through
10 all of the calls on the stack */
11 fprintf(stderr, "Jumping back to main without ");
Here, we can jump directly back to main because the activation record for main
is still active on the run-time stack (i.e., it still exists). By doing so, we bypass the
functions a, b, and c. The stack frames for d, c, b, and a are removed from the stack
and disposed of properly as if each function had exited normally, in that order,
when the longjmp happens. In other words, setjmp/longjmp can be used to
jump down the stack, but not back up it.
The setjmp function is the analog of a statement label, whereas the longjmp
function is the analog of the goto statement. The main difference between a
label/goto pair and the setjmp/longjmp pair is that longjmp cleans up the stack
in addition to transferring control; goto just transfers control.
Let us compare the factorial example in Section 13.4.2 with this example. In
the factorial example, we attempt to jump from main directly back to a stack
frame for the last invocation of factorial (i.e., for the base case where n is 0), which
no longer exists. Here, we are jumping directly back to the stack frame for main,
from the stack frame for d, which still exists on the stack because it is waiting for d,
c, b, and a to return normally and complete the continuation of the computation.
At the time d is called [as d(12)], the stack is main → a → b → c → d, where the
stack grows left-to-right. Thus, the top of the stack is on the right. The continuation
of pending computations is
1 + return value of b(1+1) =
1 + (2 * return value of c(2+2)) =
1 + (2 * (3 + return value of d(4*3))) =
1 + (2 * (3 + return value of d(12)))
This scenario is illustrated through the stacks presented in Figure 13.5.
[Figure 13.5: Three snapshots of the run-time stack. During execution of d(12),
the frames a(1) (return 1 + b(2)), b(2) (return 2 * c(4)), and c(4) (return
3 + d(12)) are pending above main. During the call to longjmp(env, -1), the
stack is unwound; after the call, main is at the top of the (unwound) stack.]
Facility                  Semantics

label and goto            nonlocal transfer of control only within a single function;
(least flexible/general)  does not clean up the stack

setjmp/longjmp            nonlocal transfer of control both within and between
in C                      functions currently on the stack (i.e., active extent)
                          + restored context/environment
                          + unwinds the stack, but does not restore it

call/cc and (k v)         nonlocal transfer of control both within and between
in Scheme                 any functions
(most flexible/general)   + restored context/environment
                          + unwinds and restores the stack
have popped off the stack. Whenever that continuation is called, we are transferred
directly into the middle of the base case call to factorial, which is executing
normally, with the illusion of all of its parent activation records still on the stack
waiting for the call to the base case to terminate. Moreover, that continuation can
be reused as many times as desired without error—the same is not possible in C. In
essence, the setjmp and longjmp functions represent a middle ground between
the unwieldiness of gotos and the generality of call/cc for nonlocal transfer of
control (Table 13.3).
The important point to observe here is that the combination of
(call/cc (lambda (k) ...)) and (k v) does not just capture
the current continuation and transfer control, respectively. Instead,
(call/cc (lambda (k) ...)) captures the current continuation, including the
environment and the status of the stack, and (k v) transfers control while restoring
the environment and the stack. The setjmp function captures the environment, but
does not capture the status of the stack. Consequently, the longjmp function,
unlike (k v), requires any stack frame to which it is to jump to be active. Thus,
the setjmp and longjmp functions can be implemented in Scheme using first-
class continuations to simulate their semantics (Programming Exercise 13.4.8),
illustrating the generality, power, and flexibility of first-class continuations.
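Programming Exercise 13.4.8 asks for this simulation in Scheme; the same idea can be sketched in Ruby with callcc, where the names setjmp and longjmp and the one-element array standing in for jmp_buf are our choices:

```ruby
require "continuation"

# Simulate C's setjmp/longjmp with a first-class continuation.
def setjmp(env)
  callcc { |k| env[0] = k; 0 }  # first return: 0, with the continuation saved
end

def longjmp(env, val)
  env[0].call(val)              # setjmp "returns again," this time with val
end

$env = []
$trace = []
x = setjmp($env)
$trace << x
longjmp($env, 5) if x == 0      # mirrors the simple setjmp/longjmp C example
# $trace == [0, 5]
```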
Nonetheless, the setjmp and longjmp functions are helpful for exception
handling within this limitation. The following is a common programming idiom
for using these functions for exception handling:
if (setjmp(env) == 0) {
    /* protected code block;
       call longjmp when an exception is encountered */
} else {
    /* exception handler;
       return point from a longjmp */
}
A return value of 0 for setjmp indicates a normal return, while a non-zero return
value indicates a return from longjmp. If longjmp is called anywhere within
the protected block, or in any function called within that block, then setjmp will
return (again), causing control to be transferred to the exception handler. Again, a
call to longjmp after the protected code block completes (and pops off the stack)
is undefined and generally results in a memory error.
Exercise 13.4.5 Write a C program with three functions: main, A, and B. The main
function calls A, which then calls B. Low-level computation that might result in
an error is performed in functions A and B. All error handling is done in main.
Use setjmp and longjmp for error handling. The main function must be able to
discern which of the other two functions (i.e., A or B) generated the error. Hint: Use
a switch statement.
Exercise 13.4.6 The Common Lisp functions catch and throw have nearly the
same semantics as setjmp and longjmp in C, respectively. Moreover, catch
and throw expressions in Common Lisp can be easily translated into equivalent
Scheme expressions involving (call/cc (lambda (k) ...)) and (k v),
respectively (Haynes and Friedman 1987, p. 11):
13.5. LEVELS OF EXCEPTION HANDLING: A SUMMARY 579
Exercise 13.4.8 Define the functions setjmp and longjmp in Scheme with the
same functional signatures as they have in C. Use a Scheme vector to store the
jmp_buf.
Exercise 13.4.9 Solve Programming Exercise 13.4.5 in Scheme using the Scheme
functions setjmp and longjmp defined in Programming Exercise 13.4.8. Do not
invoke the call/cc function outside of the setjmp function.
int B() {
/* perform some low-level computation */
/* return valid result from B or an error code from B */
}
int A() {
int result;
/* perform some computation */
if (error)
return error code from A;
else {
result = B();
int main() {
switch (A()) {
handlerForExceptionIn_B();
}
}
1 >>> i = 1
2 >>> while i <= 10:
3 ... print(i)
4 ... if i == 3:
5 ... break
6 ... i += 1
7 ...
8 1
9 2
10 3
The while loop on lines 2–6 executes only three times because the break on line
5 terminates the execution of the loop when i equals 3.
int B() {
/* perform some computation */
if (error)
longjmp(env, 2);
else
/* return normal result */
}
int A() {
/* perform some computation */
if (error)
longjmp(env, 1);
else
return B();
}
int main() {
switch (setjmp(env)) {
import traceback
class SampleException(Exception):
    def __init__(self, msg):
        self.msg = msg
    def getMessage(self):
        return self.msg
def A():
    print("Inside A.")
    B()
def B():
    print("Inside B.")
    try:
        C()
    except SampleException as e:
        print(e.getMessage())
        print(str(e))
        traceback.print_exc(limit=None)
        # jump back to D here so the computation can complete
def C():
    print("Inside C.")
    D()
def D():
    print("Inside D.")
    raise SampleException("D raised an exception.")
    # do more computation here
print("Main program.")
A()
Main program.
Inside A.
Inside B.
Inside C.
Inside D.
D raised an exception.
D raised an exception.
Traceback (most recent call last):
File "exception_example.py", line 17, in B
C()
File "exception_example.py", line 26, in C
D()
File "exception_example.py", line 30, in D
raise SampleException("D raised an exception.")
SampleException: D raised an exception.
nonlocal exits and, unlike first-class continuations, are limited in their use for
implementing other types of control structures (e.g., the breakpoint illustrated
in the factorial example in Section 13.3.2). In contrast, first-class continuations
with heap-allocated activation records can be used as the basis for a general, high-
level mechanism for exception handling in programming languages. For instance,
first-class continuations can be used to build an exception-handling system using
calling semantics. First-class continuations with heap-allocated activation records
have been referred to as reinvocable continuations or reentrant continuations, whereas
escape continuations can only be used to escape the current control context to a
surrounding one (e.g., exception-handling systems with terminating semantics
or setjmp/longjmp in C). The first-class continuation approach to simple
exception handling is encoded/sketched in Scheme in the following programming
idiom:
(call/cc
(lambda (break)
;; perform some computations
;; if an exception happens, invoke (break ...)
;; perform more computations
))
Table 13.4 summarizes the mechanisms for handling exceptions in the pro-
gramming languages discussed in this section.
Exercise 13.5.2 Rewrite the Python program in Section 13.5.4, which demonstrates
the programming idiom for the terminating semantics of exception-handling
systems, in Java.
and function calls. Moreover, the ability to leverage our understanding of how
control is fundamentally imparted to a program as a basis for building new control
structures facilitates an improved understanding of traditional control structures.
A criticism of facilities for capturing a first-class continuation (e.g., call/cc) is
that they are only necessary for dealing with the problems endemic to a functional
style of programming. In other words, they might mitigate the effects of recursion
for repetition in programming (e.g., breaking out of layers of recursion), but are
not really needed in languages whose control flows along a sequential execution
of statements and that support repetition through iterative control structures (e.g.,
a while loop). This perspective views call/cc primarily as a mechanism to break
out of recursion in a clean way (i.e., jumping several activation records down the
stack), which is unnecessary in languages whose primary mode of repetition is
iteration.
This criticism also highlights a fundamental difference between functional
and imperative programming. In imperative programming, we generally program
iteratively and, therefore, have no need for a first-class continuation. However,
the perspective just mentioned presumes that a first-class continuation is only
intended for nonlocal exits for exception handling or, more generally, for jumping
down the run-time stack. That is a limited view of a first-class continuation. While
using a first-class continuation for nonlocal exits is common practice, and we used
nonlocal exits to initially demonstrate the use of call/cc, exception handling
is only one instance of a much more general use of first-class continuations—
namely, for control abstraction. While some languages provide a variety of (typically
low-level) mechanisms to transfer control (e.g., gotos, function calls, stack
unwinding or crawling), other languages recognize the benefits of and provide
general facilities for control abstraction. In addition to nonlocal exits for exception
handling, first-class continuations can be used to create new control abstractions.
For instance, we demonstrate their use for implementing coroutines in the
following subsection.
13.6.1 Coroutines
A coroutine is a function whose execution can be suspended and resumed in
cooperation with other coroutines. Coroutines are an instance of cooperative
multitasking or nonpreemptive multitasking—the coroutine itself, and not some
external factor, decides when to suspend its execution. In this sense, coroutines
cooperate.
13.6. CONTROL ABSTRACTION 587
cooperate.
cooperate.
cooperate.
cooperate.
...
Note that these coroutines are cooperatively (i.e., nonpreemptively) scheduled be-
cause they suspend themselves after each atomic operation—in this case, printing
a character. Coroutines are nonpreemptive and exist at the program level; threads,
however, are preemptive and exist at the operating system level. Thus, unlike
threads, multiple coroutines in a program cannot utilize more than one of the
system’s cores. The Lua and Kotlin programming languages support coroutines.
Recall that Ruby supports first-class continuations. The following is a Ruby
analog of this Scheme implementation of coroutines:
$readyQ = Queue.new
def spawn_coroutine(coroutine)
$readyQ.push(coroutine)
end
def start_next_ready_coroutine()
# check for non-empty queue
coroutine = $readyQ.pop
coroutine.call()
end
def pause_coroutine()
callcc{|cc|
$readyQ.push(cc)
start_next_ready_coroutine() }
end
def new_coroutine(c)
f = Proc.new {
pause_coroutine()
print(c)
f.call()
}
end
spawn_coroutine(new_coroutine("c"))
spawn_coroutine(new_coroutine("o"))
spawn_coroutine(new_coroutine("o"))
spawn_coroutine(new_coroutine("p"))
spawn_coroutine(new_coroutine("e"))
spawn_coroutine(new_coroutine("r"))
spawn_coroutine(new_coroutine("a"))
spawn_coroutine(new_coroutine("t"))
spawn_coroutine(new_coroutine("e"))
spawn_coroutine(new_coroutine("."))
spawn_coroutine(new_coroutine("\n"))
start_next_ready_coroutine()
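Although Python lacks first-class continuations, the same cooperative round-robin scheduler can be sketched with generators, where yield plays the role of pause_coroutine. This is an illustrative analog; the rounds cutoff is an assumption added so that the program terminates (the original runs indefinitely):

```python
# An illustrative Python analog of the coroutine scheduler above, using
# generators instead of first-class continuations; yield plays the role
# of pause_coroutine. The rounds cutoff is an assumption added so that
# the program terminates (the original runs indefinitely).
from collections import deque

ready_q = deque()

def spawn_coroutine(gen):
    ready_q.append(gen)

def run():
    while ready_q:
        gen = ready_q.popleft()
        try:
            next(gen)                # resume until the next yield (pause)
            ready_q.append(gen)      # still alive: reschedule at the back
        except StopIteration:
            pass                     # coroutine finished; drop it

def new_coroutine(c, out, rounds=3):
    def body():
        for _ in range(rounds):
            yield                    # pause_coroutine()
            out.append(c)            # print(c)
    return body()

out = []
for ch in "cooperate.\n":
    spawn_coroutine(new_coroutine(ch, out))
run()
print("".join(out), end="")          # prints cooperate. three times
```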
The power of first-class continuations is derived from both their first-class nature
and the ability to call a continuation from outside of its stack lifetime.
“Coroutines, threads, and generators are all conceptually similar: they are
all mechanisms to create ‘many little stacks’ instead of having a single, global
stack” (Krishnamurthi 2017, p. 122). Further, notice that continuations, call-by-
name/need parameters (i.e., lazy evaluation), and coroutines conceptually share
common complementary operations for suspending and resuming computation
(Table 13.6). Both coroutines (Section 13.6.1) and call-by-name/need parameters
can be implemented with continuations (Wang 1990).
Exercise 13.6.2 (Dybvig 2009, Exercise 3.3.3, p. 77) Explain what happens if a
coroutine created by spawn-coroutine in the implementation of coroutines
in Section 13.6.1 terminates normally (i.e., simply returns without calling
pause-coroutine again) as demonstrated in the following program. Also,
explain why a is repeatedly printed twice on each line of output after the first
line of output from the following program:
> (start-next-ready-coroutine)
abc
cba
"end"
>
As observed in the output, this proposed solution is not correct. Explain why it
is incorrect. Also, explain why the second line of output is the first line of output
reversed. Hint: Use the Racket debugging facility.
Exercise 13.6.4 Consider the following Scheme program, which appears in Feeley
(2004) with minor modifications:
(define fail
(lambda () 'end))
592 CHAPTER 13. CONTROL AND EXCEPTION HANDLING
(define in-range
(lambda (a b)
(call/cc
(lambda (k)
(enumerate a b k)))))
(define enumerate
(lambda (a b k)
(if (> a b)
(fail)
(let ((save fail))
(set! fail
(lambda ()
;; restore fail to its immediate previous value
(set! fail save)
(enumerate (+ a 1) b k)))
(k a)))))
000
001
002
003
...
996
997
998
999
#include <stdio.h>
int main() {
int i, j, k;
Trace the Scheme program manually or use the tracing (step-through) feature in
the built-in Racket debugging facility to help develop an understanding of how
this program functions.
Provide an explanation of how the Scheme program works. Do not restate the
obvious (e.g., “the in-range function invokes call/cc with lambda (k) . . . ”).
Instead, provide insight into how this program works.
1 (define ns (make-base-namespace))
2 (eval '(define i 0) ns)
3
4 (define while-loop
5 (lambda (condition body)
6 ...))
The following call to while-loop prints the integers 0 through 9, one per line,
without recursion (e.g., letrec):
Include lines 1–2 in your program so that calls to eval (in the definition of
while-loop) find bindings for both the < function and the identifier i in the
environment from this example.
Exercise 13.6.8 The following are two coroutines that cooperate to print I love
Lucy.:
(define coroutine1
(lambda ()
(display "I ")
(pause)
(display "Lucy.")))
(define coroutine2
(lambda ()
(display "love ")
(pause)
(newline)))
The first coroutine prints I and Lucy. and the second coroutine prints love and
a newline. The activities of these coroutines are coordinated (i.e., synchronized) by
the use of the function pause, so that the interleaving of their output operations
writes an intelligible sentence to standard output: I love Lucy.
Use continuations to provide definitions for pause and resume, without using
recursion (e.g., letrec), so that the following main program prints I love
Lucy.:
Exercise 13.6.9 (Dybvig 2009, Exercise 3.3.3, p. 77) Define a function quit in the
implementation of coroutines in Section 13.6.1 that allows a coroutine to terminate
gracefully without affecting the other coroutines in the program. Be sure to handle
the case in which the only remaining coroutine terminates through quit.
Exercise 13.6.10 Modify the program from Conceptual Exercise 13.6.4 so that it
prints the x, y, and z values where 4 ≤ x, y, z ≤ 12, and x² = y² + z².
Exercise 13.6.11 Implement the program from Conceptual Exercise 13.6.4 in Ruby
using the callcc facility.
1 (define factorial
2 (lambda (n)
3 (cond
4 ((zero? n) 1) ; base case
5 (else (* n (factorial (- n 1))))))) ; inductive step
13.7. TAIL RECURSION 595
Each call to factorial is made with a promise to multiply the value returned
by n at the time of the call. Examining the run-time behavior of this function with
respect to the stack reveals the essence of recursive control behavior:
1 (factorial 5)
2 (* 5 (factorial 4))
3 (* 5 (* 4 (factorial 3)))
4 (* 5 (* 4 (* 3 (factorial 2))))
5 (* 5 (* 4 (* 3 (* 2 (factorial 1)))))
6 (* 5 (* 4 (* 3 (* 2 (* 1 (factorial 0)))))) ; base case
7 (* 5 (* 4 (* 3 (* 2 (* 1 1)))))
8 (* 5 (* 4 (* 3 (* 2 1))))
9 (* 5 (* 4 (* 3 2)))
10 (* 5 (* 4 6))
11 (* 5 24)
12 120
[Figure 13.7 plots growth of stack against time: the recursive version grows until the base case is reached and then jumps back down, while the iterative version stays flat.]
Figure 13.7 Recursive control behavior (left) vis-à-vis iterative control behavior
(right).
1 (define factorial
2 (lambda (n)
3 (letrec ((fact
4 (lambda (n a)
5 (cond
6 ((zero? n) a)
7 (else (fact (- n 1) (* n a))))))) ; a tail call
8 (fact n 1))))
This version defines a nested, recursive function fact that accepts an additional
parameter a, which serves as an accumulator. Unlike in the first definition, in this
version of factorial, successive calls to fact do not communicate through a
return value (i.e., the factorial resulting from each smaller instance of the problem).
Instead, the successive recursive calls now communicate through the additional
accumulator parameter.
On line 7, notice that no computation is waiting for each recursive call to fact
to return; that is, the recursive call to factorial is no longer in operand position.
In other words, when fact calls itself, it does so at the tail end of a call to fact.
Such a recursive call is said to be in tail position—in contrast to operand position in
which the recursive call to factorial is found in the first version—and referred
to as a tail call. A function call is a tail call if there is no promise to do anything
with the returned value. In this version of factorial, no promise is made to
do anything with the return value other than return it as the result of the current
call to fact. When the tail call invokes the same function in which it occurs, the
approach is referred to as tail recursion. Thus, the tail call in this revised version of
the factorial function uses tail recursion.
(factorial 5)
(fact 5 1)
(fact 4 5)
(fact 3 20)
(fact 2 60)
(fact 1 120)
(fact 0 120)
120
Figure 13.7 (right) illustrates this graph. Unlike with the execution pattern of the
first definition of factorial, rotating this textual depiction of the control context
90 degrees to the left reveals a straight line, which indicates the control context
remains constant as the function executes. That pattern is a result of iterative control
behavior, where a recursive function uses a bounded control context. In this case,
the function has the potential to run in constant memory space and without the use
of a run-time stack because a “procedure call that does not grow control context
is the same as a jump” (Friedman, Wand, and Haynes 2001, p. 262). (The strategy
used to define this revised version of factorial is introduced in Section 5.6.3—
through the definition of a list reverse function—as Design Guideline 7: Difference
Lists Technique.)
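In a language without TCO, such as standard Python, the same iterative control behavior can be expressed directly with a loop. The following sketch mirrors the accumulator version of factorial:

```python
# A sketch of the accumulator version in Python: since CPython does not
# perform tail-call optimization, the tail call is written as a loop,
# making the iterative control behavior explicit.
def factorial(n):
    a = 1                  # the accumulator parameter a
    while n > 0:           # each tail call (fact (- n 1) (* n a)) ...
        a = n * a          # ... becomes one loop iteration
        n = n - 1
    return a

print(factorial(5))        # prints 120
```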
The use of the word tail in this context is slightly deceptive because it refers
not to the lexical (i.e., textual) layout of the function, but rather to the run-time
context. In other words, a function that calls itself at the lexical tail end of its
definition is not necessarily making a tail call. For instance, consider
line 5 in the first definition of factorial in Section 13.7.1 (repeated here):
(else (* n (factorial (- n 1))))))). The recursive call to factorial
in this line of code appears to be the last step of the function because it is positioned
at the rightmost end of the function definition lexically, but it is not the
final step. The key to determining whether a call is in tail or operand position is
the pending continuation. If there is a continuation waiting for the recursive call
to return, then the call is in operand position; otherwise, it is in tail position.
As we conclude this section, let us examine two new (tail-recursive) definitions
of the product function from Section 13.3.1. The following definition is the tail-
recursive version of the definition without a nonlocal exit for the exceptional case
(i.e., a zero in the input list) from that section:
(define product
(lambda (lon)
(letrec ((P (lambda (a l)
(cond
;; base case
((null? l) a)
;; exceptional case; abnormal = normal flow of control
((zero? (car l)) 0)
;; inductive case; normal flow of control
(else (P (* (car l) a) (cdr l)))))))
(P 1 lon))))
While this function is tail recursive and exhibits iterative control behavior, it may
perform unnecessary multiplications if the input list contains a zero. The following
definition is the tail-recursive version of the definition using a continuation
captured with call/cc to perform a nonlocal exit in the exceptional case from
Section 13.3.1:
(define product
(lambda (lon)
(call/cc
;; break stores the current continuation
(lambda (break)
(letrec ((P (lambda (a l)
(cond
;; base case
((null? l) a)
;; exceptional case; abnormal != normal flow of control
((zero? (car l)) (break 0))
;; inductive case; normal flow of control
(else (P (* (car l) a) (cdr l)))))))
(P 1 lon))))))
This definition, like the first one, is tail recursive, exhibits iterative control
behavior, and may perform unnecessary multiplications if the input list contains a
zero. However, this version avoids returning through all of the activation records
built up on the call stack when a zero is encountered in the list.
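In an imperative language, the same trade-off is visible in loop form. The following Python sketch uses an early return to play the role of the nonlocal exit:

```python
# A Python sketch of the two behaviors above: the early return avoids
# both the unnecessary multiplications and any unwinding of pending work.
def product(lon):
    a = 1
    for x in lon:
        if x == 0:
            return 0       # exceptional case: exit immediately
        a *= x             # normal flow: accumulate as we go
    return a

print(product([1, 2, 0, 3]))   # prints 0
print(product([1, 2, 3, 4]))   # prints 24
```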
6. Tail-call optimization is also referred to as tail-call elimination. Since the caller jumps to the callee,
the tail call is essentially eliminated.
7. It is tail-call optimization, not tail-recursion optimization.
a language to support functions. Thus, TCO should be used not just in languages
where recursion is the primary means of repetition (e.g., Scheme and ML), but
in any language that has functions. Consider the following isodd and iseven
Python functions:
The call to isodd in the body of the definition of iseven is not tail recursion—
it is simply a tail call. The same is true for the call to iseven in the body of
isodd. Thus, neither of these functions is recursive independently of each other
(i.e., neither function has a call to itself). They are just mutually dependent on each
other or mutually recursive. Since Python does not use TCO on these non-recursive
functions, this program does not run in constant memory space or without a
stack.
The Scheme rendition of this Python program runs in constant space without a
stack:
Thus, not only can TCO be used to optimize non-recursive functions, but it
should be applied so that the programmer can use both individual non-recursive
functions and recursion without paying a performance penalty.
Tail-call optimization makes functions using only tail calls iterative (in run-
time behavior) and, therefore, more efficient. The revised definition of factorial
using tail recursion and exhibiting iterative control behavior does not have a
growing control context, so it now has the potential to be optimized to run in
constant space. However, it no longer mirrors the recursive specification of the
problem. By using tail recursion, we trade off function readability/writability for
the possibility of space efficiency. Even so, it is possible to make recursion iterative
while maintaining the correspondence of the code to the mathematical definition
of the function (Section 13.8). Table 13.7 summarizes the relationship between the
type of function call and the control behavior of a function.
The programming technique called trampolining (i.e., converting a program to
trampolined style) can be used to achieve the same effect as tail-call optimization
in a language that does not implement TCO. The underlying idea is to replace a
tail call to a function with a thunk that invokes that function. The thunk is
then subsequently applied in a loop. Consider the following trampolined version
of the previous odd/even program in Python that would not run:
>>> print(iseven(1000000000))
True
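The trampolined listing itself is elided in this excerpt; the following is one possible Python sketch of the technique just described, in which each tail call is replaced by a thunk that a driver loop repeatedly applies:

```python
# One possible Python sketch of the trampolined odd/even program the
# text describes (the original listing is elided in this excerpt). Each
# tail call is replaced by a thunk; trampoline applies thunks in a loop.
def isodd(n):
    return (lambda: iseven(n - 1)) if n != 0 else False

def iseven(n):
    return (lambda: isodd(n - 1)) if n != 0 else True

def trampoline(result):
    while callable(result):     # keep bouncing until a non-thunk value
        result = result()
    return result

# A smaller argument than the text's 1000000000 is used here for brevity.
print(trampoline(iseven(100000)))   # prints True
```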
Prelude> :{
Prelude| len [] acc = acc
Prelude| len (x:xs) acc = len xs (acc + 1)
Prelude| :}
len [1,2,3..20000] 0
len [2,3..20000] (0 + 1)
len [3..20000] (0 + 1 + 1)
len [..20000] (0 + 1 + 1 + 1)
len [20000] (0 + 1 + 1 + 1 ... + 1)
len [] (0 + 1 + 1 + 1 ... + 1)
20000
This function is tail recursive and appears to run in constant memory space—the
stack never grows beyond one frame. However, the size of the second argument
to len is expanding because of the lazy (as opposed to eager) evaluation strategy
used. Although the interpreter no longer must save the pending computations—
in this case, the additions—on the stack, the interpreter stores a new thunk for
the expression (acc + 1) for every recursive call to len. Forcing the evaluation
of the second parameter to len (i.e., making the second parameter to len strict)
prevents the stack overflow. We can force a parameter to be strict by prefacing it
with $! (as demonstrated in Section 12.5.5):
Prelude> :{
Prelude| len [] acc = acc
Prelude| len (x:xs) acc = len xs $! (acc + 1)
Prelude| :}
Prelude> len [1..1000000000] 0
1000000000
The following trace illustrates how the evaluation of the second parameter to len
is forced for each recursive call:
len [1,2,3..1000000000] 0
len [2,3..1000000000] (0 + 1)
len [2,3..1000000000] 1
len [3..1000000000] (1 + 1)
len [3..1000000000] 2
len [..1000000000] (2 + 1)
len [..1000000000] 3
Even though this definition of len uses the accumulator approach in the
combining function passed to foldr (i.e., its first parameter), its invocation results
in a stack overflow:
foldr f i [] = i
foldr f i (x:xs) = f x (foldr f i xs)
foldl f i [] = i
foldl f i (x:xs) = foldl f (f i x) xs
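For comparison with these Haskell definitions, Python's functools.reduce is a left fold and, because Python is eager, the combining function is evaluated at every step — the behavior foldl' achieves in Haskell by forcing its accumulator:

```python
# For comparison: Python's functools.reduce is a left fold and, because
# Python is eager, the combining function is evaluated at every step --
# the behavior foldl' achieves in Haskell by forcing its accumulator.
from functools import reduce

def len_fold(xs):
    return reduce(lambda acc, x: acc + 1, xs, 0)

print(len_fold(range(20000)))   # prints 20000
```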
Notice that in this definition of len, we must reverse the order of the parameters
to the combining function (i.e., acc and x). However, this version produces a stack
overflow:
While foldl does use tail recursion, it also uses lazy evaluation. Thus, this
invocation of len results in a stack overflow because a thunk is created for the
second parameter to foldl—that is, the evaluation of the combining function
(f i x)—for every recursive call and the second parameter continues to grow.
The invocation of len builds up a lengthy chain of thunks that will eventually
evaluate to the length of the list rather than maintaining a running length. Thus,
this version of len behaves the same as the first version of len in this subsection.
To solve this problem, we need a version of foldl that is both tail recursive
and strict in its second parameter:
Prelude> :{
Prelude| foldl' f i [] = i
Prelude| foldl' f i (x:xs) = (foldl' f $! f i x) xs
Prelude| :}

Prelude> :type foldl'
foldl' :: (a -> t -> a) -> a -> [t] -> a
While foldr should be avoided for computing the length of a list because it is
not defined using tail recursion, foldr should not be avoided in all cases. For
instance, consider the following function, which determines whether all elements
of a list are True:
Since (&&) is non-strict in its second parameter, use of foldr obviates further
exploration of the list as soon as a False is encountered:
In this case, foldr does not build up the ability to perform the remaining
computations. The same is not true of foldl'. For instance:
Even though this version runs in constant space because foldl' is defined using
tail recursion, it examines every element of the input list. Thus, foldr is preferred
in this case. Similarly, the built-in Haskell function concat uses foldr even
though foldr is not defined using tail recursion:
Tracing this invocation of concat reveals why foldr is used in its definition:
Unlike the expansion for the invocation of the definition of len using foldr
in this subsection, the expression on line 16 is as far as the interpreter will
evaluate the expression until the program seeks to examine an element in the
tail of the result. Since we can garbage collect the first cons cell of this result
before we traverse the second, concat not only runs in constant stack space,
but also accommodates infinite lists. By contrast, neither foldl' nor foldl can
handle infinite lists because the left-recursion in the definition of either would
lead to infinite recursion. For instance, the following invocation of foldl does
not terminate (until the stack overflows):
(Note that repeat e is an infinite list, where every element is e.) However, the
following invocation of foldr returns False immediately:
Since (&&) is non-strict in its second parameter, we do not have to evaluate the
rest of the foldr expression to determine the result of allTrue. Similarly, since
(++) is non-strict in its second parameter, we do not have to evaluate the rest of
the foldr expression to determine the head of the result of concat. However,
because the combining function (\acc x -> acc+1) in len must run on every
element of the list before a list length can be computed, we require the result of the
entire foldr to compute a final length. Thus, in that case, foldl' is a preferable
choice.
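The short-circuiting benefit of a non-strict foldr can be mimicked in an eager language with lazy sequences. In Python, a generator together with all() stops examining the input at the first False, even on a conceptually infinite input:

```python
# An eager-language analog of folding a non-strict (&&): all() consumes
# its argument lazily and stops at the first False, so even a
# conceptually infinite sequence of flags poses no problem.
def all_true(xs):
    return all(xs)              # short-circuits like foldr (&&) True

def flags():
    yield True
    yield False                 # all_true stops here ...
    while True:
        yield True              # ... so this infinite tail is never seen

print(all_true(flags()))        # prints False
```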
Table 13.8 summarizes these fold higher-order functions with respect to
evaluation strategy in eager and lazy languages. Defining tail-recursive functions
in languages with a lazy evaluation strategy requires more attention than doing so
in languages with an eager evaluation strategy. Using foldl' requires constant
stack space, but necessitates a complete expansion even for combining functions
that are non-strict in their second parameter. However, even though foldr is
not defined using tail recursion, it can run efficiently if the combining function
is non-strict in its second parameter. More generally, the space complexity of lazy
programs is complex.
Table 13.8 Summary of Higher-Order fold Functions with Respect to Eager and
Lazy Evaluation
We offer some general guidelines for when foldr, foldl, and foldl' are
most appropriate in designing functions (assuming the use of each function results
in the same value).
[Figure 13.8 depicts a decision tree: an eager programming language leads to "use foldl'"; a lazy programming language branches on the combining function — if strict and the input list is finite, "use foldl'"; if non-strict (which also accommodates infinite input lists), "use foldr".]
Figure 13.8 Decision tree for the use of foldr, foldl, and foldl' in designing
functions (assuming the use of each function results in the same value).
folding the same associative operator [e.g., (++)] across the same list with the
same initial value using foldl’ and foldr. Use a different associative operator
than any of those already given in this section. Use program comments to clarify
your demonstration. Hint: Use repeat in conjunction with take to generate finite
lists to be used as test lists in your example; use repeat to generate infinite lists to
be used as test lists in your example and take to generate output from an infinite
list that has been processed.
Exercise 13.7.3 Demonstrate how to overflow the control stack in Haskell using
foldr with a function that is made strict in its second argument with $!.
Exercise 13.7.4 Define a recursive Scheme function square using tail recursion
that accepts only a positive integer n and returns the square of n (i.e., n2 ). Your
definition of square must contain only one recursive helper function bound in a
letrec expression that does not require an unbounded amount of memory.
Exercise 13.7.6 The Fibonacci series 0, 1, 1, 2, 3, 5, 8, 13, 21, . . . begins with the
numbers 0 and 1 and has the property that each subsequent Fibonacci number
is the sum of the previous two Fibonacci numbers. The Fibonacci series occurs in
nature and, in particular, describes a form of a spiral. The ratio of the successive
Fibonacci numbers converges on a constant value of 1.618. . . . This number, too,
repeatedly occurs in nature and has been called the golden ratio or the golden
mean. Humans tend to find the golden mean aesthetically pleasing. Architects
often design windows, rooms, and buildings with a golden mean length/width
ratio.
Define a Scheme function fibonacci, using only one tail call, that accepts a
non-negative integer n and returns the nth Fibonacci number. Your definition of
fibonacci must run in O(n) and O(1) time and space, respectively. You may
define one helper function, but it also must use only one tail call. Do not use more
than 10 lines of code. Your function must be invocable.
Examples:
> (fibonacci 0)
0
> (fibonacci 1)
1
> (fibonacci 2)
1
> (fibonacci 3)
2
> (fibonacci 4)
3
> (fibonacci 5)
5
> (fibonacci 6)
8
> (fibonacci 7)
13
> (fibonacci 8)
21
> (fibonacci 20)
6765
Here, +cps and *cps are the CPS analogs of the + and * operators, respectively,
and each accepts an additional parameter representing the continuation. When
+cps is invoked on line 11, the third parameter specifies how to continue
the computation. Specifically, the third parameter is a lambda expression that
indicates what should be done with the return value of the invocation of +cps. In
this case, the return value is passed to *cps with 3. Notice that the continuation of
*cps is the identity function because we simply want to return the value. Consider
the following expression:
The function b calls the function a in tail position on line 3. As a result, the
continuation of a is the same as that of b. In other words, b does not perform
any additional work with the return value of a. The same is not true of the call
to the function inc in the function a on line 2. The call to inc on line 2 is in
operand position. Thus, when a receives the result of inc, the function a performs
an additional computation—in this case, a multiplication by 2—before returning
to its continuation. Here, the implicit continuation of the call to
We can rewrite this entire letrec expression in CPS by replacing these implicit
continuations with explicit lambda expressions:
(define factorial
(lambda (n)
(cond
((zero? n) 1)
(else (* n (factorial (- n 1)))))))
1 (define factorial
2 (letrec
3 ((fact-cps (lambda (n growing-k)
4 (cond
5 ((eqv? n 1) (growing-k 1))
6 (else (fact-cps (- n 1) ; a tail call
7 (lambda (rtnval) (* rtnval (growing-k n)))))))))
8 (lambda (n)
9 (cond
10 ((zero? n) 1)
11 (else (fact-cps n (lambda (x) x)))))))
The most critical lines of code in this definition are lines 6 and 7 where the
recursive call is made and the explicit continuation is passed, respectively. Lines
6–7 conceptually indicate: take the result of (n-1)! and continue the computation
by first continuing the computation of n! with n and then multiplying the result
by (n-1)!. In other words, when we call (growing-k n), we are passing the
input parameter to fact-cps in an unmodified state to its continuation. This
approach is tantamount to writing (lambda (x k) (k x)). The following
series of expansions demonstrates the unnaturalness of this approach:
8. The factorial functions presented in this section are not entirely defined in CPS because the
primitive functions (e.g., zero?, *, and -) are not defined in CPS. See Section 13.8.3 and Programming
Exercise 13.10.26.
13.8. CONTINUATION-PASSING STYLE 611
(factorial 3)
(* 1 (* 2 3))
(* 1 6)
While defined using tail recursion, this first version of fact-cps runs contrary
to the spirit of CPS. The definition does not embrace the naturalness of the
continuation of the computation.
Consider replacing lines 6–7 in this first version of fact-cps with the
following lines:
6 (else (fact-cps (- n 1) ; a tail call
7 (lambda (rtnval) (growing-k (* rtnval n)))))))))
((lambda (x) x) 6)
continuation of the computation. However, this second version grows the passed
continuation growing-k in each successive recursive call. (The first version of
fact-cps incidentally does too.) Thus, while the second version is more naturally
CPS, it is not space efficient. In an attempt to keep the run-time stack of constant
size (through the use of CPS), we have shifted the source of the space inefficiency
from a growing stack to a growing continuation. The continuation argument is a
closure representation of the stack (Section 9.8.2).
Thus, this second version of fact-cps demonstrates that use of tail recursion
is not sufficient to guarantee space efficiency at run-time. Even though both calls
to fact-cps and growing-k are in tail position (lines 6 and 7, respectively), the
run-time behavior of fact-cps is essentially the same as that of the non-CPS
version of factorial given at the beginning of this subsection—the expansion
of the run-time behavior of each function shares the same shape. The use of
continuation-passing style in this second version of fact-cps explicitly reifies
the run-time stack in the interpreter and passes it as an additional parameter to
each recursive call. Just as the stack grows when running a function defined using
recursive control behavior, in the fact-cps function the additional parameter
representing the continuation—the analog of the stack—is also growing because it
is encapsulating the continuation from the prior call.
Let us reconsider the definition of a factorial function using tail recursion
(in Section 13.7.2 and repeated here):
1 (define factorial
2 (lambda (n)
3 (letrec ((fact
4 (lambda (n a)
5 (cond
6 ((zero? n) a)
7 (else (fact (- n 1) (* n a))))))) ; a tail call
8 (fact n 1))))
This function is not written using CPS, but is defined using tail recursion. The
following is a CPS rendition of this version of factorial:
1 (define factorial
2 (lambda (n)
3 (letrec ((fact-cps
4 (lambda (n a constant-k)
5 (cond
6 ((zero? n) (constant-k a))
7 (else (fact-cps (- n 1) (* n a) constant-k)))))) ; a CPS tail call
8 (fact-cps n 1 (lambda (x) x)))))
Here, unlike the first version of the fact-cps function defined previously, this
third version does not grow the passed continuation constant-k. In this version,
the continuation passed is constant across the recursive calls to fact-cps (line 7):
(factorial 3)
((lambda (x) x) 6)
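A Python transcription of this third, accumulator-passing version may make the constant continuation easier to see:

```python
# A Python transcription of this third version: the continuation passed
# along (constant_k) never grows. Note that CPython does not perform
# TCO, so these tail calls still consume stack frames for large n.
def fact_cps(n, a, constant_k):
    if n == 0:
        return constant_k(a)
    return fact_cps(n - 1, n * a, constant_k)   # a CPS tail call

def factorial(n):
    return fact_cps(n, 1, lambda x: x)

print(factorial(5))   # prints 120
```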
Notice that the primitive operators used in the definition of remainder-cps (e.g.,
< on line 4 and - on line 5) are not written in CPS. To maximize the benefits of
CPS discussed in this chapter, all function calls in a program should use CPS. In
other words, continuation-passing style is an all-or-nothing proposition, especially
to obviate the need for a run-time stack of activation records. The following is a
complete CPS rendition of the remainder-cps function, including definitions of
Table 13.9 Properties of the four versions of fact-cps presented in Section 13.8.2.
the less than and subtraction operators in CPS (the <cps and -cps functions on
lines 1–3 and 5–7, respectively):
1 (define <cps
2 (lambda (x y k)
3 (k (< x y))))
4
5 (define -cps
6 (lambda (x y k)
7 (k (- x y))))
8
9 (define remainder-cps
10 (lambda (n d k)
11 (<cps n d (lambda (bool)
12 (cond
13 (bool (k n))
14 (else (-cps n d
15 (lambda (rtnval) (remainder-cps rtnval d k)))))))))
For purposes of clarity of presentation, the primitives used in this chapter are not
defined in CPS (Programming Exercise 13.10.26).
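The fully CPS remainder function, including the CPS comparison and subtraction operators, can also be transcribed into Python as a sketch (lt_cps and sub_cps mirror the <cps and -cps functions; like the Scheme version, Python still uses its own call stack since it lacks TCO):

```python
# A Python transcription of the fully CPS remainder function above,
# including CPS renditions of < and -; lt_cps and sub_cps mirror the
# <cps and -cps functions.
def lt_cps(x, y, k):
    return k(x < y)

def sub_cps(x, y, k):
    return k(x - y)

def remainder_cps(n, d, k):
    return lt_cps(n, d,
                  lambda smaller: k(n) if smaller
                  else sub_cps(n, d,
                               lambda r: remainder_cps(r, d, k)))

print(remainder_cps(17, 5, lambda x: x))   # prints 2
```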
1 (define product-cps
2 (lambda (lon k)
3 (let ((break k))
4 (letrec ((P
5 (lambda (l growing-k)
6 (cond
7 ((null? l) (growing-k 1))
8 ((zero? (car l)) (break
9 "Encountered a zero in the input list."))
10 (else (P (cdr l)
11 (lambda (x) (growing-k (* (car l) x)))))))))
12 (P lon k)))))
continuation—which we will not invoke until we have determined the input list
does not include a zero. This approach is time efficient, but not space efficient. In
contrast, if we desire the function to run in constant space, we must perform the
multiplications as the recursion proceeds (line 10 in the following definition):
1 (define product-cps
2 (lambda (lon k)
3 (let ((break k))
4 (letrec ((P
5 (lambda (l a constant-k)
6 (cond
7 ((null? l) (constant-k a))
8 ((zero? (car l)) (break
9 "Encountered a zero in the input list."))
10 (else (P (cdr l) (* (car l) a) constant-k))))))
11 (P lon 1 k)))))
(define product-cps
(lambda (lon k)
(letrec ((P (lambda (l a k)
(cond
((null? l) (k a))
((zero? (car l))
(k "Encountered a zero in the input list."))
(else (P (cdr l) (* (car l) a) k))))))
(P lon 1 k))))
Table 13.11 summarizes the similarities and differences in these three versions of a
product function.
We conclude our discussion of the time-space trade-off by stating:
• We can be time efficient by waiting until we know for certain that we will
not encounter any exceptions before beginning the necessary computation.
This requires us to store the pending computations on the call stack or in a continuation parameter.
Table 13.11 Properties Present and Absent in the call/cc and CPS Versions of
the product Function. Notice that we cannot be both time and space efficient.
13.8. CONTINUATION-PASSING STYLE 617
• The function can accept more than one continuation. Any function defined
using CPS can accept more than one continuation. For instance, we can define
product-cps as follows, rendering the normal and exceptional continuations
more salient:
1 (define product-cps
2 (lambda (lon k break)
3 (letrec ((P
4 (lambda (l normal-k)
5 (cond
6         ((null? l) (normal-k 1))
7 ((zero? (car l)) (break
8 "Encountered a zero in the input list."))
9 (else (P (cdr l)
10 (lambda (x)
11 (normal-k (* (car l) x)))))))))
12 (P lon k))))
In this version, the second and third parameters (k and break) represent the
normal and exceptional continuations, respectively:
In the last invocation to product-cps (line 11), break is bound to the built-
in Scheme list function at the time of the call.
618 CHAPTER 13. CONTROL AND EXCEPTION HANDLING
Figure 13.9 Both call/cc and CPS involve reification and support control
abstraction.
• Any continuation can accept more than one argument. Any continuation
passed to a function defined using CPS can accept more than one argument
because the programmer is defining the function that represents the
continuation (rather than the interpreter reifying and returning it as a unary
function, as is the case with call/cc). The same is not possible with
call/cc—though it can be simulated (Programming Exercise 13.10.15). For
instance, we can replace lines 7–8 in the definition of product-cps given in
this subsection with
Now break accepts two arguments: the result of the evaluation of the
product of the input list (i.e., here, 0) and an error message:
This approach helps us cleanly factor the code to handle successful execution
from that for unsuccessful execution (i.e., the exception).
13.9 Callbacks
A callback is a reference to a function that is passed to, and later invoked by,
another part of the program to transfer control back. The concept of a callback is
related to continuation-passing style. Consider the following Scheme program in
direct style:
The main program (lines 6–8) calls addWord (to add a word to the dictionary; line
7), followed by getDictionnaire (to get the dictionary; line 8). The following is
the rendering of this program in CPS using a callback:
9 (begin
10 ...
11 (installHandler handleClick) ; install callback function
12 (start-event-loop))))
This type of callback is called a deferred callback because its execution is deferred
until the event that triggers its invocation occurs. Sometimes callbacks used
this way are also referred to as asynchronous callbacks because the callback
(handleClick) is invoked asynchronously or “at any time” in response to the
(mouse click) event.
In an object-oriented context, the UI component is an object and its event
handlers are defined as methods in the class of which the UI component is an
instance. The methods to install/register custom event handlers (i.e., callback
installation methods) are also part of this class. When a programmer desires to
install a custom event handler, either (1) the programmer calls the installation
method and passes a callback to it, and the installation method stores a pointer
to that callback in an instance variable of the object, or (2) the programmer creates
a subclass and overrides the default event handler.
Programming with callbacks is an inversion of the traditional programming
practice with an API. Typically, an application program calls functions in a
language library or API to make use of the abstractions that the API provides
as they relate to the application. With callbacks, the API invokes callbacks the
programmer defines and installs.
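This inversion can be made concrete with a small sketch. The following is in Python rather than the chapter's Scheme, and the names install_handler, fire, and handle_click are hypothetical stand-ins for the installation method, event loop, and event handler described above, not an actual API:

```python
# Hypothetical sketch of callback registration and deferred invocation.
installed = {}

def install_handler(event, callback):
    # The "API" stores a reference to the programmer-supplied function.
    installed[event] = callback

def fire(event, payload):
    # Later, the event loop invokes the stored callback: the API calls
    # the application code, inverting the usual direction of control.
    if event in installed:
        return installed[event](payload)

clicks = []

def handle_click(position):
    clicks.append(position)

install_handler("click", handle_click)  # register the callback
fire("click", (10, 20))                 # the framework "calls back"
```

The callback is invoked only when (and if) the event fires, which is what makes it a deferred, asynchronous callback.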
The CPS conversion involves a set of rewrite rules from a variety of syntactic
constructs (e.g., conditional expressions or function applications) to their
equivalent forms in continuation-passing style (Feeley 2004). As a result of this
systematic conversion, all non-tail calls in the original program are translated into
tail calls in the converted program, where the continuation of the non-tail call
is packaged and passed as a closure, leaving the call in tail position. Since each
function call is in tail position, each function call can be translated as a jump using
tail-call optimization (see the right side of Figure 13.7).
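As a small illustration of this systematic conversion (sketched in Python rather than Scheme, and showing only the effect of the rewrite rules, not the rules themselves), the pending multiplication of a non-tail factorial can be packaged into a continuation closure, leaving every call in tail position:

```python
# Direct style: the recursive call is a non-tail call because the
# multiplication by n is still pending when the call returns.
def fact(n):
    return 1 if n == 0 else n * fact(n - 1)

# After CPS conversion: the pending multiplication is packaged into the
# continuation closure k, so every call is a tail call.
def fact_cps(n, k):
    if n == 0:
        return k(1)
    return fact_cps(n - 1, lambda v: k(n * v))

fact_cps(5, lambda x: x)  # same result as fact(5)
```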
13.10. CPS TRANSFORMATION 621
Figure 13.10 Program readability/writability vis-à-vis space complexity axes: (top left)
writable and space inefficient: programs using non-tail (recursive) calls; (bottom
left) unwritable and space inefficient: programs using tail calls, including CPS,
but without tail-call optimization (TCO); (bottom right) unwritable and space
efficient: programs using tail calls, including CPS, with TCO, exhibiting iterative
control behavior; and (top right) writable and efficient: programs using non-tail
(recursive) calls mechanically converted to the use of all tail calls through the CPS
transformation, with TCO, exhibiting iterative control behavior. The curved arrow
at the origin indicates the order in which these approaches are presented in the
text.
> (+ 1
(call/cc
(lambda (capturedk) (+ 2 (capturedk 3)))))
4
> (call/cc-cps
(lambda (capturedk normal-k)
(capturedk 3 (lambda (result) (+ 2 result))))
(lambda (result) (+ 1 result)))
4
1 (define call/cc-cps
2 (lambda (f normal-k)
3
4     (let ((reified-current-continuation
5            ;; replace the current continuation with the captured continuation:
6            ;; i.e., ignore the current continuation currentk_tobeignored and
7            ;; invoke the captured continuation normal-k of call/cc-cps
8 (lambda (result currentk_tobeignored)
9 (normal-k result))))
10
11 (f reified-current-continuation normal-k))))
The expression on line 11 calls the first argument to call/cc-cps (i.e., the
function f; step 1) and passes to it the reified continuation of the invocation
of call/cc-cps (i.e., reified-current-continuation; step 2) created on
lines 4–9. When call/cc-cps is invoked:
• f is
(lambda (capturedk normal-k)
(capturedk 3 (lambda (result) (+ 2 result))))
1 > (+ 1
2 (call/cc
Unlike in the prior example, the continuation captured through call/cc is never
invoked in this example; that is, the captured continuation capturedk is not
invoked on line 4. Translating this expression into CPS leads to
1 > (call/cc-cps
2   (let ((f (lambda (x normal-k) (normal-k (+ x 10)))))
3 (lambda (capturedk normal-k)
4 (normal-k (f 10 (lambda (result) (+ 2 result))))))
5 (lambda (result) (+ 1 result)))
6 23
(define call/cc-cps
(lambda (f k)
(f (lambda (return_value ignore)
(k return_value)) k)))
Here are two additional examples of invocations of call/cc-cps, along with the
analogous call/cc examples:
(define product
(lambda (l)
(letrec ((P (lambda (lon break)
(cond
((null? lon) (break 1))
((zero? (car lon)) (break 0))
(else (P (cdr lon)
(lambda (returnvalue)
(break (* (car lon) returnvalue)))))))))
(P l (lambda (x) x)))))
Exercise 13.10.4 Explain what CPS offers that call/cc does not.
(define stackBuilder
  (lambda (x)
    (cond
      ((eqv? 0 x) "DONE")
      (else (cons '() (stackBuilder (- x 1)))))))

(define stackBuilderCPS
  (lambda (x k)
    (let ((break k))
      (letrec ((helper (lambda (x k)
                         (cond
                           ((eqv? 0 x) (break "DONE"))
                           (else (helper (- x 1)
                                         (lambda (rv) (cons '() rv))))))))
        (helper x k)))))

(define stackBuilder-cc
  (lambda (x)
    (call/cc
      (lambda (k)
        (letrec ((helper (lambda (x)
                           (cond
                             ((eqv? 0 x) (k "DONE"))
                             (else (helper (- x 1)))))))
          (helper x))))))

(stackBuilder 10)
(stackBuilderCPS 10 (lambda (x) x))
(stackBuilder-cc 10)
Run this program in the Racket debugger and step through each of the three
different calls to stackBuilder, stackBuilderCPS, and stackBuilder-cc.
In particular, observe the growth, or lack thereof, of the stack in the upper right-
hand corner of the debugger. What do you notice? Report the details of your
observations of the behavior and dynamics of the stack.
Table 13.13 Mapping from the Greatest Common Divisor Exercises in This Section
(Exercises 13.10.17 through 13.10.24) to the Essential Aspects of
Continuation-Passing Style
Exercise 13.10.9 Define a recursive Scheme function member1 that accepts only
an atom a and a list of atoms lat and returns the integer position of a in lat
(using zero-based indexing) if a is a member of lat and #f otherwise. Your
definition of member1 must use continuation-passing style to compute the position
of the element, if found, in the list. Your definition must not use call/cc.
In addition, your definition of member1 must not return back through all the
recursive calls when the element a is not found in the list lat. Your function must
not perform any unnecessary operations, but need not return in constant space.
Use the following template for your function and include the missing lines of code
(represented as ...):
(define member1
(lambda (a lat)
(letrec ((member-cps (lambda (ll break)
(letrec ((M (lambda (l k)
(cond
...
...
...))))
...))))
(member-cps lat (lambda (x) x)))))
Exercise 13.10.10 Define a recursive Scheme function member1 that accepts only
an atom a and a list of atoms lat and returns the integer position of a in
lat (using zero-based indexing) if a is a member of lat and #f otherwise.
Your definition of member1 must use continuation-passing style, but the passed
continuation must not grow. Thus, the function must run in constant space. Your
definition must not use call/cc. In addition, your definition of member1 must
not return back through all the recursive calls when the element a is not found
in the list lat. Your function must run in constant space, but need not avoid all
unnecessary operations. Use the following template for your function and include
the missing lines of code (represented as ...):
(define member1
(lambda (a lat)
(letrec ((member-cps (lambda (l ... break)
(cond
...
...
...))))
(member-cps lat ... (lambda (x) x)))))
(define fibonacci
(lambda (n)
(letrec ((fibonacci-cps (lambda (n prev curr k)
(cond
...
...))))
(fibonacci-cps n 0 1 (lambda (x) x)))))
Do not use call/cc in your function definition. See the examples in Programming
Exercise 13.7.6.
Examples:
Exercise 13.10.13 Redefine the first version of the Scheme function product-cps
in Section 13.8.4 as product, a function that accepts a variable number of
arguments and returns the product of them. Define product using continuation-
passing style such that no multiplications are performed if any of the list elements
are zero. Your definition must not use call/cc. The nested function P from the
first version in Section 13.8.4 is named product-cps in this revised definition.
Examples:
> (product 1 2 3 4 5 6)
720
> (product 1 2 3 0 4 5 6)
"Encountered a zero in the input list."
must run in constant space. The nested function P from the version in Section 13.8.4
is named product-cps in this revised definition.
(define product-cps
(lambda (lon k break)
(letrec ((P
(lambda (l normal-k)
(cond
((null? l) (normal-k 1))
((zero? (car l)) (break 0
"Encountered a zero in the input list."))
(else (P (cdr l)
(lambda (x) (normal-k (* (car l) x)))))))))
(P lon k))))
When a zero is encountered in the input list, this function returns with two
values: 0 and a string. Recall that the ability to continue with multiple values is
an advantage of CPS over call/cc.
Redefine this function using direct style (i.e., in non-CPS fashion) with call/cc.
While it is not possible to pass more than one value to a continuation captured
with call/cc, figure out how to simulate this behavior to achieve the following
result when a zero is encountered in the list:
Exercise 13.10.16 Define a Scheme function product that accepts only a list of
numbers and returns the product of them. Your definition must not perform any
multiplications if any of the list elements is zero. Your definition must not use
call/cc or continuation-passing style. Moreover, the call stack may grow only once
to the length of the list plus one (for the original function).
(define gcd-lon
  (lambda (lon)
    (let ((main (lambda (ll break)
                  (letrec ((gcd-lon1
                            (letrec ((gcd-cps (lambda (u v k)
                                                (cond
                                                  ((zero? v) (k u))
                                                  (else (gcd-cps v (remainder u v) k))))))
                              (lambda (l k)
                                (cond
                                  (...)
                                  (...)
                                  (else ...))))))
                    (gcd-lon1 ll break)))))
      (main lon (lambda (x) x)))))
Examples:
(define gcd*
  (lambda (l)
    (let ((main (lambda (ll break)
                  (letrec ((gcd*1
                            (letrec ((gcd-cps (lambda (u v k)
                                                (cond
                                                  ((zero? v) (k u))
                                                  (else (gcd-cps v (remainder u v) k))))))
                              (lambda (l k)
                                (cond
                                  (...
                                   (cond
                                     (...)
                                     (else ...))
                                  (...)
                                  (else ...)))))))
                    (gcd*1 ll break)))))
      (main l (lambda (x) x)))))
Examples:
(define gcd-lon
  (lambda (lon)
    (let ((main (lambda (ll break)
                  (letrec ((gcd-lon1
                            (letrec ((gcd-cps (lambda (u v k)
                                                (cond
                                                  ((zero? v) (k u))
                                                  (else (gcd-cps v (remainder u v) k))))))
                              (lambda (l a k)
                                (cond
                                  (...)
                                  (...)
                                  (...)
                                  (else ...))))))
                    (gcd-lon1 ll ... break)))))
      (main lon (lambda (x) x)))))
Do not use call/cc in your function definition. See the examples in Programming
Exercise 13.3.13.
(define gcd-lon
  (lambda (lon)
    (let ((main (lambda (ll break)
                  (letrec ((gcd-lon1
                            (letrec ((gcd-cps (lambda (u v k)
                                                (cond
                                                  ((zero? v) (k u))
                                                  (else (gcd-cps v (remainder u v) k))))))
                              (lambda (l a k)
                                (cond
                                  (...)
                                  (...)
                                  (...)
                                  (...)
                                  (else ...))))))
                    (gcd-lon1 ll ... break)))))
      (main lon (lambda (x) x)))))
(define gcd*
  (lambda (l)
    (let ((main (lambda (ll break)
                  (letrec ((gcd*1
                            (letrec ((gcd-cps (lambda (u v k)
                                                (cond
                                                  ((zero? v) (k u))
                                                  (else (gcd-cps v (remainder u v) k))))))
                              (lambda (l a k)
                                (cond
                                  ((number? (car l))
                                   (cond
                                     (...)
                                     (...)
                                     (else ...)))
                                  (...)
                                  (else ...))))))
                    (gcd*1 ll ... break)))))
      (main l (lambda (x) x)))))
(define gcd*
  (lambda (l)
    (let ((main (lambda (ll break)
                  (letrec ((gcd*1
                            (letrec ((gcd-cps (lambda (u v k)
                                                (cond
                                                  ((zero? v) (k u))
                                                  (else (gcd-cps v (remainder u v) k))))))
                              (lambda (l a k)
                                (cond
                                  ((number? (car l))
                                   (cond
                                     (...)
                                     (...)
                                     (...)
                                     (else ...)))
                                  (...)
                                  (else ...))))))
                    (gcd*1 ll ... break)))))
      (main l (lambda (x) x)))))
(define while-loop
  (lambda (condition body)
    (let ((W (lambda (k)
               ...)))
      (W ...))))
Exercise 13.10.27 Consider the Scheme program in Section 13.6.1 that represents
an implementation of coroutines using call/cc. Rewrite this program using the
call/cc-cps function defined in Section 13.10.1 as a replacement of call/cc.
Table 13.14 The Approaches to Function Definition as Related to Control Presented in This Chapter Based on the Presence and
Absence of a Variety of Desired Properties. Theme: We cannot be both time and space efficient.
• call/cc with tail recursion: use captured k; use accumulator parameter; iterative control behavior; example: second version of product in Section 13.7.2.
• tail recursion without CPS or call/cc: must return through call stack; use accumulator parameter; iterative control behavior; example: first version of product in Section 13.7.2.
• CPS (implies tail call): use passed break (e.g., identity); use passed growing-k; example: first version of product in Section 13.8.4.
• CPS (implies tail call): use passed break (e.g., identity); use constant continuation and accumulator parameter, e.g., (constant-k a); iterative control behavior; example: second version of product in Section 13.8.4.
of the prior computation—the one value for which the continuation is waiting to
complete the next computation.
The call/cc function in Scheme captures the current continuation with a
representation of the environment, including the run-time stack, at the time call/cc
is invoked. The expression (k v), where k is a first-class continuation captured
through (call/cc (lambda (k) ...)) and v is a value, does not just transfer
program control. The expression (k v) transfers program control and restores the
environment, including the stack, that was active at the time call/cc captured k, even
if it is not active when k is invoked. This capture and restoration of the call stack
is the ingredient necessary for supporting the creation of a wide variety of new
control constructs. More specifically, it is the unlimited extent of lexical closures
that unleashes the power of first-class continuations for control abstraction: the
unlimited lifetime of closures enables control to be transferred to stack frames that
seemingly no longer exist, called heap-allocated stack frames.
Mechanisms for transferring control in programming languages are typically
used for handling exceptions. These mechanisms include
function calls, stack unwinding/crawling operators, exception-handling systems,
and first-class continuations. In the absence of heap-allocated stack frames, once
the stack frames between the function that caused/raised an exception and the
function handling that exception have been popped off the stack, they are gone
forever. For instance, the setjmp/longjmp stack unwinding/crawling functions
in C allow a programmer to perform a nonlocal exit from several functions
on the stack in a single jump. Without heap-allocated stack frames, these stack
unwinding/crawling operators transfer control down the stack, but not back up
it. Thus, these mechanisms are simply for nonlocal exits and, unlike first-class
continuations, are limited in their support for implementing other types of control
structures (e.g., breakpoints).
We have also defined recursive functions in a manner that maintains the
natural correspondence between the recursive specification or mathematical
definition of the function [e.g., n! = n ∗ (n − 1)!] and the program code
implementing the function (e.g., factorial). This congruence is a main theme
running throughout Chapter 5. When such a function runs, the activation records
for all of the recursive calls are pushed onto the run-time stack while building
up pending computations. Such functions typically require an ever-increasing
amount of memory and exhibit recursive control behavior. When the base case is
reached, the computation required to compute the function is performed as these
pending computations are executed while the activation records for the recursive
calls pop off the stack and the memory is reclaimed. In a function using tail
recursion, the recursive call is the last operation that the function must perform.
Such a recursive call is in tail position [e.g., (factorial (- n 1) (* n a))]
in contrast to operand position [e.g., (* n (factorial (- n 1)))]. A function
call is a tail call if there is no promise to do anything with the returned value.
Recursive functions using tail recursion exhibit iterative control behavior. However,
the structure of the program code implementing a function using tail recursion no
longer reflects the recursive specification of the function—the symmetry is broken.
13.12. CHAPTER SUMMARY 639
Thus, the use of tail recursion trades off function writability for improved space
complexity.
We can turn all function calls into tail calls by encapsulating any computation
remaining after each call—the “what to do next”—into an explicit, reified
continuation and passing that continuation as an extra argument in each tail call. In
other words, we can make the implicit continuation of each called function explicit
by packaging it as an additional argument passed in each function call. Functions
written in this manner use continuation-passing style (CPS). The continuation that
the programmer of a function using CPS manually reifies is the continuation that
the call/cc function automatically reifies. A function defined using CPS can
accept multiple continuations; this property helps us cleanly factor the various
ways a program might complete its computation. A function defined in CPS can
pass multiple results to its continuation; this property provides us with flexibility
in communicating results to continuations.
A desired result of CPS is that the recursive function defined in CPS run
in constant memory space. This means that no computations are waiting for
the return value of each recursive call, which in turn means the function that
made the recursive call can be popped off the stack. The growing stack of
pending computations can be transmuted through CPS as a growing continuation
parameter. We desire a function embracing the spirit of CPS, where, ideally, the
passed continuation is not growing. Continuation-passing style with a bounded
continuation parameter and tail-call optimization eliminates the run-time stack,
thereby ensuring the recursive function can run in constant space—and rendering
recursion as efficient as iteration.
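Python, unlike Scheme, does not guarantee tail-call optimization, so even a CPS-style function grows the host call stack there. A trampoline is one way to sketch what TCO buys us: in the following hedged sketch (not the chapter's Scheme code), each tail call is returned as a zero-argument thunk and a driver loop invokes thunks iteratively, so the host stack stays constant.

```python
# Sketch: CPS product with a trampoline standing in for TCO.
def product_cps(lon, k):
    if not lon:
        return lambda: k(1)
    if lon[0] == 0:
        return lambda: k(0)
    # Return the "tail call" as a thunk instead of making it directly.
    return lambda: product_cps(lon[1:], lambda v: lambda: k(lon[0] * v))

def trampoline(thunk):
    # Drive the computation iteratively: no thunk calls another thunk
    # directly, so the host stack does not grow with the input length.
    while callable(thunk):
        thunk = thunk()
    return thunk

trampoline(lambda: product_cps([1, 2, 3, 4, 5, 6], lambda x: x))  # 720
```

The driver distinguishes pending work from answers by callability, which suffices here because the final result is always a number.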
There is a trade-off between time complexity and space complexity in
programming. We can be either (1) time efficient, by waiting until we know for
certain that we will not encounter any exceptions before beginning the necessary
computation (which requires us to store the pending computations on the call stack
or in a continuation parameter), or (2) space efficient, by incrementally computing
intermediate results (in, for example, an accumulator parameter) in the presence
of the uncertainty of encountering any exceptional situations. It is challenging to
do both (Table 13.14).
Programming abnormal flows of control and running recursive functions in constant
space are two issues that can easily get conflated in the study of program
control. Continuation-passing style with tail-call optimization can be used to
achieve both. Tail-call optimization realizes the constant space complexity. Passing
and invoking the continuation parameter (e.g., the identity function) is used to
program abnormal flows of control. If the continuation parameter is growing, then
it is used to program the normal flow of control—albeit in a cluttered manner. In
contrast, call/cc is primarily used for programming abnormal flows of control.
For instance, the call/cc function can be used to unwind the stack in the case of
exceptional values (e.g., a 0 in the list input to a product function; see the versions
of product using call/cc in Sections 13.3.1 and 13.7.2). (Programming abnormal
flows of control with first-class continuations in this manner can be easily confused
with improving time complexity of a function by obviating the need to return through
the call stack.)
Technique                                    Purpose/Effect
continuation-passing style                   tail recursion
tail recursion + TCO                         space efficiency
CPS + TCO                                    space efficiency
first-class continuations (call/cc or CPS)   run-time efficiency for exception handling
Logic Programming
The more I think about language, the more it amazes me that people
ever understand each other at all.
— Kurt Gödel
For now, what is important is not finding the answer, but looking for it.
— Douglas R. Hofstadter, Gödel, Escher, Bach: An Eternal Golden
Braid (1979)
In contrast to an imperative style of programming, where the programmer
specifies how to compute a solution to a problem, in a declarative style of
programming, the programmer specifies what they want to compute, and the
system uses a built-in search strategy to compute a solution. A simple and
perhaps familiar example of declarative programming is the use of an embedded
regular expression language within a programming language. For instance, when
a programmer writes the Python expression ([a-z])([a-z])[a-z]\2\1, the
programmer is declaring what they want to match—in this case, strings consisting
of five lowercase alphabetic characters that form a palindrome, and not how to match
those strings using for loops and string manipulation functions. In this chapter,
we study the foundation of declarative programming3 in symbolic logic and
Prolog—the most classical and widely studied programming language supporting
a logic/declarative style of programming.
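The palindrome pattern above can be checked directly with Python's re module (a small sketch; the test strings are illustrative):

```python
import re

# The five-character palindrome pattern from the text: the backreferences
# \2 and \1 force the fourth and fifth characters to mirror the second
# and first characters.
pattern = re.compile(r"([a-z])([a-z])[a-z]\2\1")

pattern.fullmatch("level")   # matches: l-e-v-e-l reads the same reversed
pattern.fullmatch("world")   # no match: not a palindrome
```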
¬p ∨ q
¬p ∨ q ⊃ r
¬p ∨ ¬q ⊃ r
(¬p ∨ q) ≡ ((¬p) ∨ q)
(¬p ∨ q ⊃ r) ≡ ((¬p ∨ q) ⊃ r)
(¬p ∨ ¬q ⊃ r) ≡ ((¬p ∨ (¬q)) ⊃ r)
3. We use the terms logic programming and declarative programming interchangeably in this chapter.
14.2. PROPOSITIONAL CALCULUS 643
p   ¬p   q   p ⊃ q   ¬p ∨ q   (p ⊃ q) ⟺ (¬p ∨ q)
T   F    T     T       T              T
T   F    F     F       F              T
F   T    T     T       T              T
F   T    F     T       T              T
The truth table presented in Table 14.2 proves the logical equivalence between
p ⊃ q and ¬p ∨ q.
A model of a proposition in formal logic is a row of the truth table. Entailment,
which is a semantic concept in formal logic, means that all of the models that make
the left-hand side of the entailment symbol (⊨) true also make the right-hand side
true. For instance, p ∧ q ⊨ p ∨ q, which reads left to right "p ∧ q entails p ∨ q"
and reads right to left "p ∨ q is a semantic consequence of p ∧ q." Notice that
p ∨ q ⊨ p ∧ q is not true because some models that make the proposition on the
left-hand side true (e.g., the second and third rows of the truth table) do not make
the proposition on the right-hand side true.
While implication and entailment are different concepts, they are easily
confused. Implication is a function or connective operator that establishes a
conditional relationship between two propositions. Entailment is a semantic
relation that establishes a consequence relationship between a set of propositions
and a proposition.
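Because entailment is a claim about all models, it can be checked by brute-force enumeration of the truth table; the following Python sketch verifies the two examples above (the function name entails is ours, not standard notation):

```python
from itertools import product

# lhs entails rhs iff every model (truth-table row) that makes lhs true
# also makes rhs true.
def entails(lhs, rhs):
    return all(rhs(p, q)
               for p, q in product([True, False], repeat=2)
               if lhs(p, q))

entails(lambda p, q: p and q, lambda p, q: p or q)   # True
entails(lambda p, q: p or q, lambda p, q: p and q)   # False (e.g., p=T, q=F)
```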
p   q   p ∧ q   p ∨ q   (p ∧ q) ⊃ (p ∨ q)
T   T     T       T             T
T   F     F       T             T
F   T     F       T             T
F   F     F       F             T
This statement is called the deduction theorem and a proposition that is true for all
models is called a tautology (see rightmost column in Table 14.3).
The relationship between logical equivalence (≡) and entailment (⊨) is also
notable:
Biconditional and logical equivalence are also sometimes confused with each
other. Like implication, biconditional establishes a (bi)conditional relationship
between two propositions. Akin to entailment, logical equivalence is a semantic
relation that establishes a (bi)consequence relationship. While different concepts,
biconditional and logical equivalence (like implication and entailment) are
similarly related:
Figure 14.1 The theoretical foundations of functional and logic programming are
λ-calculus and first-order predicate calculus, respectively.
Philosopher(Pascal).
Friend(Lucia, Louise).
In the first example, Philosopher is called the functor. In the second example,
Lucia, Louise is the ordered list of arguments. When the functor and the ordered
list of arguments are written together in the form of a function as one element
of a relation, the result is called a compound term. The following are examples of
compound propositions in predicate calculus:
These two logical quantifiers have the highest precedence in predicate calculus.
The scope of a quantifier is limited to the atomic proposition that it precedes unless
it precedes a parenthesized compound proposition, in which case it applies to the
entire compound proposition.
Propositions are purely syntactic and, therefore, have no intrinsic semantics—
they can mean whatever you want them to mean. In Symbolic Logic and the Game of
Logic, Lewis Carroll wrote:
(t1 ∨ t2 ∨ t3) ∧ (t4 ∨ t5 ∨ t6 ∨ t7) ∧ (t8 ∨ t9)
Each parenthesized disjunction is a clause, and each ti (e.g., t5) is a term.
Law            Expression
Commutative    p ∨ q ≡ q ∨ p
               p ∧ q ≡ q ∧ p
Associative    (p ∨ q) ∨ r ≡ p ∨ (q ∨ r)
               (p ∧ q) ∧ r ≡ p ∧ (q ∧ r)
Distributive   (p ∧ q) ∨ r ≡ (p ∨ r) ∧ (q ∨ r)
               (p ∨ q) ∧ r ≡ (p ∧ r) ∨ (q ∧ r)
DeMorgan's     ¬(p ∨ q) ≡ ¬p ∧ ¬q
               ¬(p ∧ q) ≡ ¬p ∨ ¬q
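Each of these laws can be verified mechanically by checking that both sides agree on every model; a Python sketch for a representative few (the helper name equivalent is ours):

```python
from itertools import product

# Two propositions are logically equivalent iff they agree on all models.
def equivalent(f, g, vars_count):
    return all(f(*vs) == g(*vs)
               for vs in product([True, False], repeat=vars_count))

# De Morgan's laws:
equivalent(lambda p, q: not (p or q), lambda p, q: (not p) and (not q), 2)
equivalent(lambda p, q: not (p and q), lambda p, q: (not p) or (not q), 2)
# A distributive law:
equivalent(lambda p, q, r: (p and q) or r,
           lambda p, q, r: (p or r) and (q or r), 3)
```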
14.4 Resolution
14.4.1 Resolution in Propositional Calculus
There are multiple rules of inference in formal systems of logic that are used
to infer new propositions from given propositions. For instance, modus ponens
is a rule of inference: (p ∧ (p ⊃ q)) ⊃ q (if p implies q, and p, therefore q),
often written

    p, p ⊃ q
    ────────
        q

Application of modus ponens supports the elimination of
antecedents (e.g., p) from a logical proof and, therefore, is referred to as the rule of
detachment. Resolution is the primary rule of inference used in logic programming.
Resolution is designed to be used with propositions in CNF. It can be stated as
follows:
    p ∨ q, ¬q ∨ r
    ─────────────
        p ∨ r

Given propositions:
    p ∨ q
    ¬q ∨ r
After combining the two propositions,
cancel out the matching, negated terms (q and ¬q):
    p ∨ q ∨ ¬q ∨ r
Inferred proposition:
    p ∨ r
Given propositions:
    ¬siblings(angela, rosa) ∨ friends(angela, rosa)
    ¬friends(angela, rosa) ∨ talkdaily(angela, rosa)
After combining the two propositions, cancel out the matching, negated terms
[friends(angela, rosa) and ¬friends(angela, rosa)]:
    ¬siblings(angela, rosa) ∨ friends(angela, rosa) ∨ ¬friends(angela, rosa) ∨ talkdaily(angela, rosa)
Inferred proposition:
    ¬siblings(angela, rosa) ∨ talkdaily(angela, rosa)
At present, we are not concerned with any intended semantics of any propositions,
but are simply exploring the mechanics of resolution. Consider the example of an
application of resolution in Table 14.6.
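The mechanics just described (combine two clauses, then cancel one complementary pair) can be sketched in Python. The representation here, a clause as a frozenset of literal strings with "~" marking negation, is an assumption of this sketch, not the book's notation:

```python
# Sketch of a single resolution step on clauses in CNF.
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    # For each complementary pair, cancel it and union what remains.
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

# (p ∨ q) and (¬q ∨ r) resolve on q, inferring (p ∨ r):
resolve(frozenset({"p", "q"}), frozenset({"~q", "r"}))
```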
In the prior examples, the process of resolution started with the axioms
(i.e., the propositions assumed to be true), from which was produced a new,
inferred proposition. This approach to the application of resolution is called
forward chaining. The question being asked is: What new propositions can we
derive from the existing propositions? An alternative use of resolution is to test
a hypothesis represented as a proposition for validity. We start by adding the
negation of the hypothesis to the set of axioms and then run resolution. The process
of resolution continues as usual until a contradiction is found, which indicates that
the hypothesis is proved to be true (i.e., it is a theorem). This process produces a
proof by refutation. Consider a knowledge base of one axiom: commuter(lucia).
Given propositions:
    ¬grandfather(virgil, christina) ∨ ¬grandfather(virgil, maria) ∨ siblings(christina, maria) ∨ cousins(christina, maria)
    ¬siblings(maria, angela) ∨ ¬siblings(christina, maria) ∨ siblings(christina, angela)
After combining the two propositions, cancel out the matching, negated terms
[siblings(christina, maria) and ¬siblings(christina, maria)]:
Inferred proposition:
    ¬grandfather(virgil, christina) ∨ ¬grandfather(virgil, maria) ∨ ¬siblings(maria, angela) ∨ cousins(christina, maria) ∨ siblings(christina, angela)
Given propositions:
    commuter(lucia)
    negated hypothesis: ¬commuter(lucia)
Combining the two propositions results in a contradiction!
    commuter(lucia) ∨ ¬commuter(lucia)
14.5. FROM PREDICATE CALCULUS TO LOGIC PROGRAMMING 653
Knowledge Base
clause 1: ¬commuter(x) ∨ doesnothave(x, car) ∨ rides(x, bus)
clause 2: ¬commuter(x) ∨ doesnothave(x, bicycle) ∨ rides(x, train)
clause 3: commuter(lucia)
clause 4: ¬doesnothave(lucia, bicycle)
clause 5: ¬rides(lucia, train) (negated hypothesis)
Table 14.7 An Example of a Resolution Proof by Refutation, Where the Propositions Therein Are Represented in CNF

The As and Bs are called terms. The left-hand side (i.e., the expression before the ⊂
symbol) is called the consequent; the right-hand side (i.e., the expression after the
⊂ symbol) is called the antecedent. The intuitive interpretation of a proposition in
clausal form is as follows: If all of the As are true, then at least one of the Bs must be
true. When converting the individual clauses in an expression in CNF into clausal
form, we introduce implication based on the equivalence between ¬p ∨ q and q ⊂ p.
The clauses c1 and c2 given previously expressed in clausal form are
clause c1: B1 ∨ B2 ∨ ⋯ ∨ Bn ⊂ A1 ∧ A2 ∧ ⋯ ∧ Am
clause c2: t2 ⊂ t1
clause c3: t3
(¬A1 ∨ ¬A2 ∨ ⋯ ∨ ¬Am) ∨ (B1 ∨ B2 ∨ ⋯ ∨ Bn) ≡
¬(A1 ∧ A2 ∧ ⋯ ∧ Am) ∨ (B1 ∨ B2 ∨ ⋯ ∨ Bn) ≡
(B1 ∨ B2 ∨ ⋯ ∨ Bn) ⊂ ¬(¬(A1 ∧ A2 ∧ ⋯ ∧ Am)) ≡
(B1 ∨ B2 ∨ ⋯ ∨ Bn) ⊂ (A1 ∧ A2 ∧ ⋯ ∧ Am)
Factorial
• Natural language specification:
The factorial of zero is 1.
The factorial of a positive integer n is
n multiplied by the factorial of n ´ 1.
• Predicate calculus:
    factorial(0, 1)
    ∀n, ∀g.
        factorial(n, n ∗ g) ⊂ ¬zero(n) ∧ ¬negative(n) ∧ factorial(n − 1, g)
• Conjunctive normal form:
    (factorial(0, 1)) ∧
    (zero(n) ∨ negative(n) ∨ ¬factorial(n − 1, g) ∨ factorial(n, n ∗ g))
• Horn clauses:
    factorial(0, 1)
    factorial(n, n ∗ g) ⊂ ¬zero(n) ∧ ¬negative(n) ∧ factorial(n − 1, g)
Fibonacci

• Natural language specification:
The first Fibonacci number is 0.
The second Fibonacci number is 1.
Any Fibonacci number n, except for the first and second,
is the sum of the previous two Fibonacci numbers.
• Predicate calculus:
fibonacci(1, 0)
fibonacci(2, 1)
∀n, ∀g, ∀h.
fibonacci(n, g + h) ⇐ ¬negative(n) ∧ ¬zero(n) ∧ ¬one(n) ∧ ¬two(n) ∧
fibonacci(n − 1, g) ∧ fibonacci(n − 2, h)
• Horn clauses:
fibonacci(1, 0)
fibonacci(2, 1)
fibonacci(n, g + h) ⇐ ¬negative(n) ∧ ¬zero(n) ∧ ¬one(n) ∧ ¬two(n) ∧
fibonacci(n − 1, g) ∧ fibonacci(n − 2, h)
Commuter

• Natural language specification:
For all x, if x is a commuter, then x rides either a bus or a train.
• Predicate calculus:
∀x. (rides(x, bus) ∨ rides(x, train) ⇐ commuter(x))
• Conjunctive normal form:
(rides(x, bus) ∨ rides(x, train) ∨ ¬commuter(x))
• Horn clause:
rides(x, bus) ⇐ commuter(x) ∧ ¬rides(x, train)
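This Horn clause maps directly onto Prolog syntax (a sketch, not one of the chapter's listings; note that \+/1 is Prolog's negation-as-failure operator, discussed later in this chapter, rather than true logical negation):

```prolog
commuter(lucia).    % a hypothetical fact for illustration

rides(X, bus) :- commuter(X), \+ rides(X, train).
```

For example, the goal ?- rides(lucia, bus). succeeds because commuter(lucia) is provable and rides(lucia, train) is not.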
Sibling relationship

• Natural language specification:
x is a sibling of y if x and y have the same mother or the same father.
• Predicate calculus:
∀x, ∀y. ((∃m. sibling(x, y) ⇐ mother(m, x) ∧ mother(m, y)) ∨
(∃f. sibling(x, y) ⇐ father(f, x) ∧ father(f, y)))
Recall that the universal quantifier is implicit and the existential quantifier is
not required in Horn clauses: All variables on the left-hand side (lhs) of the ⇐
operator are universally quantified and those on the right-hand side (which do
not appear on the lhs) are existentially quantified.
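For instance, the sibling clauses above map onto two Prolog rules (a sketch; the mother/2 and father/2 facts shown are hypothetical):

```prolog
sibling(X, Y) :- mother(M, X), mother(M, Y).
sibling(X, Y) :- father(F, X), father(F, Y).

mother(mary, christina).   % hypothetical facts for illustration
mother(mary, maria).
father(nino, christina).
father(nino, maria).
```

Note that, as declared, everyone with a known parent is his or her own sibling; adding X \= Y to each body excludes that case.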
In summary, to prepare the propositions in a knowledge base for use with
Prolog, we must convert the wffs in the knowledge base to a set of Horn clauses:
We arrive at the final knowledge base of Horn clauses by applying the following
conversion process on each wff in the original knowledge base:
Since more than one Horn clause may be required to represent a single wff, the
number of propositions in the original knowledge base of wffs may not equal the
number of Horn clauses in the final knowledge base.
(q ⇐ p), (r ⇐ q)
r ⇐ p
This rule indicates that if p implies q and q implies r, then p implies r. The
mechanics of a resolution proof process over Horn clauses are slightly different
than those for propositions expressed in CNF, as detailed in Section 14.4.2. In
particular, given two Horn clauses X and Y , if we can match the head of X with
a term in the antecedent of clause Y , then we can replace the matched head of X
in the antecedent of Y with the antecedent of X . Consider the following two Horn
clauses X and Y :
X: p ⇐ p1 ∧ ... ∧ pn
Y: q ⇐ q1 ∧ ... ∧ qi−1 ∧ p ∧ qi+1 ∧ ... ∧ qm
Since the term p in the antecedent of clause Y matches the term p (i.e., the head of clause
X), we can infer the following new proposition:
Y′: q ⇐ q1 ∧ ... ∧ qi−1 ∧ p1 ∧ ... ∧ pn ∧ qi+1 ∧ ... ∧ qm
q ⇐ p
r ⇐ q
r ⇐ p
siblings(christina, maria) ∨ cousins(christina, maria) ⇐ grandfather(virgil, christina) ∧ grandfather(virgil, maria)
(If Virgil is the grandfather of Christina and Virgil is the grandfather of Maria, then Christina and Maria are either siblings or cousins.)

siblings(christina, angela) ⇐ siblings(maria, angela) ∧ siblings(christina, maria)
(If Maria and Angela are siblings and Christina and Maria are siblings, then Christina and Angela are siblings.)

Combining the two clauses:

siblings(christina, maria) ∨ cousins(christina, maria) ∨ siblings(christina, angela) ⇐
    grandfather(virgil, christina) ∧ grandfather(virgil, maria) ∧ siblings(maria, angela) ∧ siblings(christina, maria)

Canceling the matched term siblings(christina, maria), which appears in both the consequent and the antecedent:

cousins(christina, maria) ∨ siblings(christina, angela) ⇐
    grandfather(virgil, christina) ∧ grandfather(virgil, maria) ∧ siblings(maria, angela)
Table 14.9 An Example Application of Resolution, Where the Propositions Therein Are Represented in Clausal Form
The structure of this resolution proof is the same as the structure of the prior
example, but the propositions p, q, and r are represented as binary predicates.
The proof indicates that
“If Angela and Rosa are siblings, then Angela and Rosa are friends”; and
“if Angela and Rosa are friends, then Angela and Rosa talk daily”; then
“if Angela and Rosa are siblings, then Angela and Rosa talk daily.”
Backward Chaining
A goal in logic programming, which is called a hypothesis in Section 14.4.2, is
expressed as a headless Horn clause and is similarly pursued through a resolution
proof by contradiction: Assert the goal as a false fact in the database and then search
for a contradiction. In particular, resolution searches the database of propositions
for a known Horn clause P whose head unifies with a term in the antecedent
of the headless Horn goal clause G representing the negated goal. If a match is
found, the unified term in the antecedent of G is replaced with the antecedent of
the Horn clause P whose head matched it. This process continues until
a contradiction is found:

a rule: p ⇐ p1 ∧ ... ∧ pn
the goal: false ⇐ p
new subgoals: false ⇐ p1 ∧ ... ∧ pn

We unify a term in the body of the goal with the head of one of the known clauses, and
replace the matched term with the antecedent of that clause, creating a new list
of (sub-)goals. In this example, the resolution process replaces the original goal
p with the subgoals p1 ∧ ... ∧ pn. If, after multiple iterations of this process, a
contradiction (i.e., false ⇐ true) is derived, then the goal is satisfied.
Consider a database consisting of only one fact: commuter(lucia) ⇐ true.
To pursue the goal of determining if “Lucia is a commuter,” we add
a negation of this proposition expressed as the headless Horn clause
false ⇐ commuter(lucia) to the database and run the resolution algorithm:

a fact P: commuter(lucia) ⇐ true
a goal G: false ⇐ commuter(lucia)

Matching the head of P with the body of G,
and replacing the matched body of G with the body of P:

a contradiction: false ⇐ true
Type of Horn Clause   Example Horn Clause                  Prolog Concept   Prolog Syntax
headless              false ⇐ philosopher(pascal)          goal/query       philosopher(pascal).
headed                drinks(ray, earlgrey) ⇐ true         fact             drinks(ray, earlgrey).
headed                drinks(ray, earlgrey) ⇐              rule             drinks(ray, earlgrey) :-
                         drinks(ray, tea) ∧ tea(earlgrey)                       drinks(ray, tea),
                                                                                tea(earlgrey).
1 shape(circle). /* a fact */
2 shape(square). /* a fact */
3 shape(rectangle). /* a fact */
4
5 rectangle(X) :- shape(square). /* a rule */
6 rectangle(X) :- shape(rectangle). /* a rule */
The facts on lines 1–3 assert that a circle, square, and rectangle are shapes. The
two rules on lines 5–6 declare that shapes that are squares and rectangles are also
rectangles. Syntactically, Prolog programs are built from terms. A term is either a
constant, a variable, or a compound term (i.e., a structure).
atLeast35yearsOld(X) :- presidentOfUSA(X).
Notice that the implication Ă and conjunction ^ symbols are represented in Prolog
as :- and ,, respectively.
$ swipl first.pl
Welcome to SWI-Prolog (threaded, 64 bits, version 8.2.3)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.
?- make.
true.
?- halt.
$
7. https://www.swi-prolog.org
8. The number following the / indicates the arity of the predicate. The /⟨#⟩ is not part of the syntax
of the predicate name.
$ swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 8.2.3)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.
?- consult('first.pl').
true.
?- make.
true.
?- halt.
$
In either case, enter make. in the SWI-Prolog REPL to reconsult the loaded Prolog
program file (without exiting the interpreter) if (uncompiled) changes have been
made to the program. Enter halt. or the EOF character (e.g., ⟨ctrl-D⟩ on
Linux) to end your session with SWI-Prolog. Table 14.12 offers more information
on this process.
Table 14.12 Predicates for Interacting with the SWI-Prolog Shell (i.e., REPL)
Program output. The built-in predicates write, writeln, and nl (for newline),
with the implied semantics, write output. The programmer can include the
following goal in a program to prevent Prolog from abbreviating results with
ellipses:
set_prolog_flag(toplevel_print_options,
    [quoted(true), portray(true), max_depth(0)]).
The argument passed to max_depth indicates the maximum depth of the list to
be printed. The maximum depth is 10 by default. If this value is set to 0, then the
printing depth limit is turned off.
1 ?- shape(circle).
2 true.
3 ?- shape(X).
4 X = circle ;
5 X = square ;
6 X = rectangle.
7 ?- shape(triangle).
8 false.
The substitution that unifies a variable with a literal or term binds the literal
or term to the variable:
5 ?- X = mary.
6 X = mary.
7
8 ?- mary = X.
9 X = mary.
10
11 ?- X = mother(mary).
12 X = mother(mary).
13
14 ?- X = mary(X).
15 X = mary(X).
16
17 ?- X = mary(Y).
18 X = mary(Y).
19
20 ?- X = mary(name(Y)).
21 X = mary(name(Y)).
On lines 14–15, notice that a variable unifies with a term that contains an
occurrence of the variable (see the discussion of occurs-check in Conceptual
Exercise 14.8.3). A nested term can be unified with another term if the two
terms have the same (1) predicate name; (2) shape or nested structure; and
(3) number of arguments, which can be recursively unified:
Lines 27–28 and 40–41 are substitutions that unify the clauses on lines 25–26
and 38–39, respectively. Lastly, to unify two uninstantiated variables, Prolog
makes the variables aliases of each other, meaning that they point to the same
memory location:
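The following illustrative session (a sketch, not the chapter's original transcript) shows both behaviors:

```prolog
?- mother(mary, X) = mother(Y, anna).   % nested terms unify component-wise
X = anna,
Y = mary.

?- X = Y.                               % two uninstantiated variables become aliases
X = Y.
```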
• If Prolog cannot prove a goal, it assumes the goal to be false. For instance,
the goal shape(triangle) on line 7 in the first Prolog transcript given in
this subsection fails (even though a triangle is a shape) because the process
of resolution cannot prove it from the database—that is, there is neither a
shape(triangle). fact in the database nor a way to prove it from the set
of facts and rules. This aspect of the inference engine in Prolog is called the
closed-world assumption (Section 14.9.1).
The task of satisfying a goal is left to the inference engine, and not to the
programmer.
Notice that the comma in the body (i.e., right-hand side) of the rule on line 7
represents conjunction. Likewise, the :- in that rule represents implication. Thus,
?- path(b,c).
true .
?-
To prove this goal, Prolog uses resolution, which involves unification. When the
goal path(b,c) is given, Prolog runs its resolution algorithm with the following
steps:
During resolution, the term(s) in the body of the unified rule become subgoal(s).
Consider the goal path(X,c), which returns all the values of X that satisfy this
goal:
?- path(X,c).
X = b ;
X = a ;
false.
?-
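These answers are consistent with a database of roughly the following shape (a reconstruction for reference; the chapter's actual listing, whose path rules occupy lines 6 and 7, appears on an earlier page):

```prolog
edge(a,b).
edge(b,c).

path(X,Y) :- edge(X,Y).               % line 6 in the chapter's listing
path(X,Y) :- edge(X,Z), path(Z,Y).    % line 7 in the chapter's listing
```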
Prolog searches its database top-down and searches subgoals from left-to-right
during resolution; thus, it constructs a search tree in a depth-first fashion. A top-
down search of the database during resolution results in a unification between this
goal and the head of the rule on line 6 and leads to the new goal: edge(X,c). A
proof of this new goal leads to additional unifications and subgoals. The entire
search tree illustrating the resolution process is depicted in Figure 14.2. Source
nodes in Figure 14.2 denote subgoals, and target nodes represent the body of a
rule whose head unifies with the subgoal in the source. Edge labels in Figure 14.2
denote the line number of the rule involved in the unification from subgoal source
to body target.
Notice that satisfaction of the goal edge(X,c) involves backtracking to find
alternative solutions. In particular, the solution X=b is found first in the left subtree
and the solution X=a is found second in the right subtree. A source node with
more than one outgoing edge indicates backtracking (1) to find solutions because
searching for a solution in a prior subtree failed (e.g., see two source nodes in the
right subtree each with two outgoing edges) or (2) to find additional solutions (e.g.,
second outgoing edge from the root node leads to the additional solution X=a).
Consider transposing the rules on lines 6 and 7 constituting the path predicate
in the example database:
Figure 14.2 A search tree illustrating the resolution process used to satisfy the goal
path(X,c).
?- path(X,c).
X = a ;
X = b.
?-
The entire search tree illustrating the resolution process with this modified
database is illustrated in Figure 14.3. Notice the order of the terms in the body
of the rule path(X,Y) :- edge(X,Z), path(Z,Y). Left recursion is avoided
in this rule since Prolog uses a depth-first search strategy. Consider a transposition
of the terms in the body of the rule path(X,Y) :- edge(X,Z), path(Z,Y):
6 path(X,Y) :- edge(X,Y).
7 path(X,Y) :- path(Z,Y), edge(X,Z).
The left-to-right pursuit of the subgoals leads to an infinite use of the rule
path(X,Y) :- path(Z,Y), edge(X,Z) due to its left-recursive nature:
?- path(X,c).
X = b ;
Figure 14.3 An alternative search tree illustrating the resolution process used to
satisfy the goal path(X,c).
X = a ;
ERROR: Stack limit (1.0Gb) exceeded
ERROR: Stack sizes: local: 1.0Gb, global: 28Kb, trail: 1Kb
ERROR: Stack depth: 12,200,343, last-call: 0%, Choice points: 4
ERROR: Probable infinite recursion (cycle):
ERROR: [12,200,342] user:path(_7404, c)
ERROR: [12,200,341] user:path(_7424, c)
?-
Since the database is also searched in a top-down fashion, if we reverse the two
rules constituting the path predicate, the stack overflow occurs immediately and
no solutions are returned:
6 path(X,Y) :- path(Z,Y),edge(X,Z).
7 path(X,Y) :- edge(X,Y).
?- path(X,c).
ERROR: Stack limit (1.0Gb) exceeded
ERROR: Stack sizes: local: 1.0Gb, global: 23Kb, trail: 1Kb
ERROR: Stack depth: 6,710,271, last-call: 0%, Choice points: 6,710,264
ERROR: Probable infinite recursion (cycle):
The search tree for the goal path(X,c) illustrating the resolution process with this
modified database is presented in Figure 14.4. Since Prolog terms are evaluated
from left to right, Z will never be bound to a value. Thus, it is important to
Figure 14.4 Search tree illustrating an infinite expansion of the path predicate in
the resolution process used to satisfy the goal path(X,c).
ensure that variables can be bound to values during resolution before they are
used recursively.
Mutual recursion should also be avoided—to avert an infinite loop in the
search, not a stack overflow:
day_of_rain(X) :- day_of_umbrella_use(X).
day_of_umbrella_use(X) :- day_of_rain(X).
In summary, the order in which both the knowledge base in a Prolog program
and the subgoals are searched and proved, respectively, during resolution is
significant. While the order of the terms in the antecedent of a proposition in
predicate calculus is insignificant (since conjunction is a commutative operator),
Prolog pursues satisfaction of the subgoals in the body of a rule in a deterministic
order. Prolog searches its database top-down and searches subgoals left-to-
right during resolution and, therefore, constructs a search tree in a depth-first
fashion (Figures 14.2–14.4). A Prolog programmer must be aware of the order in
which the system searches both the database and the subgoals, which violates
a defining principle of declarative programming—that is, the programmer need
only be concerned with the logic and leave the control (i.e., inference methods
used to satisfy a goal) up to the system. Resolution comes free with Prolog—
the programmer need neither implement it nor be concerned with the details
of its implementation. The goal of logic/declarative programming is to make
programming entirely an activity of specification—programmers should not have
to impart control upon the program. On this basis, Prolog falls short of the ideal.
The language Datalog is a subset of Prolog. Unlike Prolog, the order of the clauses
in a Datalog program is insignificant and has no effect on program control.
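SWI-Prolog also supports tabling (SLG resolution), which memoizes the answers to a predicate and, as a side effect, makes even the left-recursive formulation of path terminate. A minimal sketch (assuming the same edge facts as in the earlier example):

```prolog
:- table path/2.

edge(a,b).
edge(b,c).

path(X,Y) :- path(Z,Y), edge(X,Z).
path(X,Y) :- edge(X,Y).
```

With tabling enabled, the goal ?- path(X,c). enumerates both solutions and terminates, though the order of the solutions may differ from that of the untabled, right-recursive version.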
While a depth-first search strategy for resolution is efficient, it is incomplete;
that is, DFS will not always result in solutions even if solutions exist. Thus,
Table 14.14 Example List Patterns in Prolog Vis-à-Vis the Equivalent List Patterns
in Haskell
1 fruit(apple).
2 fruit(orange).
3 fruit('Pear').
4
5 likes('Olimpia',tangerines).
6 likes('Lucia',apples).
7 likes('Georgeanna',grapefruit).
8
9 composer('Johann Sebastian Bach').
10 composer('Rachmaninoff').
11 composer(beethoven).
12
13 sweet(_x) :- fruit(_x).
14
15 soundsgood(X) :- composer(X).
16 soundsgood(orange).
17
18 ilike([apples,oranges,pears]).
19 ilike([classical,[music,literature,theatre]]).
20 ilike([truth]).
21 ilike([[2020,mercedes,c300],[2021,bmw,m3]]).
22 ilike([[lisp,prolog],[apples,oranges,pears],['ClaudeDebussy']]).
23 ilike(truth).
24 ilike(computerscience).
Notice the declarative nature of these predicates. Also, be aware that if we desire
to include data in a Prolog program beginning with an uppercase letter, we must
quote the entire string (lines 3, 5–10, and 22); otherwise, it will be treated as a
variable. Similarly, if we desire to use a variable name beginning with a lowercase
letter, we must preface the name with an underscore (_) (line 13). Consider the
following transcript of an interactive session with this database:
Notice the use of pattern matching and pattern-directed invocation with lists in the
queries on lines 67, 81, 96, and 103 (akin to their use in ML and Haskell in
Sections B.8.3 and C.9.3, respectively, in the online ML and Haskell appendices).
Moreover, notice the nature of some of the queries. For instance, the query on line
10 is called a cross-product or Cartesian product. A relation is a subset of the Cartesian
product of two or more sets. For instance, if A = {1, 2, 3} and B = {a, b}, then
a relation R ⊆ A × B = {(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)}. The query
on line 27 is also a Cartesian product, but one in which the pairs with duplicate
components are pruned from the resulting relation.
1 isempty([]).
2
3 islist([]).
4 islist([_|_]).
5
6 cons(H,T,[H|T]).
7
8 /* member is built-in */
9 member1(E,[E|_]).
10 member1(E,[_|T]) :- member1(E,T).
Notice the declarative nature of these predicates as well as the use of pattern-
directed invocation (akin to its use in ML and Haskell in Sections B.8.3 and C.9.3,
respectively, in the online ML and Haskell appendices). The second fact (line 4)
of the islist predicate indicates that a non-empty list consists of a head and a
tail, but uses an underscore (_), with the same semantics as in ML/Haskell, to
indicate that the contents of the head and tail are not relevant. The cons predicate
accepts a head and a tail and puts them together in the third list argument. The
cons predicate is an example of using an additional argument to simulate another
return value. However, the fact cons(H,T,[H|T]) is just a declaration—we need
not think of it as a function. For instance, we can pursue the following goal to
determine the components necessary to construct the list [1,2,3]:
?- cons(H,T,[1,2,3]).
H = 1,
T = [2, 3].
?-
Notice also that the islist and cons facts can be replaced with the
rules islist([_|T]) :- islist(T). and cons(H,T,L) :- L = [H|T].,
respectively, without altering the semantics of the program. The member1
predicate declares that an element of a list is either in the head position (line 9)
or a member of the tail (line 10):
?- member1(E, [1,2,3]).
E = 1 ;
E = 2 ;
E = 3 ;
false.
?- member1(2, L).
L = [2|_10094] .
?- member1(2, L).
L = [2|_11572] ;
L = [_12230, 2|_12238] ;
L = [_12230, _12896, 2|_12904] ;
L = [_12230, _12896, _13562, 2|_13570] .
?-
1 append1([],L,L).
2 append1(L,[],L).
3 append1([X|L1], L2, [X|L12]) :- append1(L1, L2, L12).
Notice that the fact on line 2 in the definition of the append1/3 predicate is
superfluous since the rule on line 3 recurses through the first list only. The append
predicate is a primitive construct that can be utilized in the definition of additional
list manipulation predicates:
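The listing described in the next paragraph is not reproduced in this excerpt; definitions consistent with the descriptions might look as follows (a sketch whose line numbering differs from the line references in the text, and which names the reversal predicate reverse1 to avoid SWI-Prolog's built-in reverse/2):

```prolog
append1([], L, L).
append1([X|L1], L2, [X|L12]) :- append1(L1, L2, L12).

member1(E, L) :- append1(_, [E|_], L).

sublist(S, L) :- append1(_, X, L), append1(S, _, X).

reverse1([], []).
reverse1([H|T], R) :- reverse1(T, RT), append1(RT, [H], R).
```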
We redefine the member1 predicate using append1 (line 2). The revised predicate
requires only one rule and declares that E is an element of L if any list can be
appended to any list with E as the head resulting in list L:
?- member1(4, [2,4,6,8]).
true.
The sublist predicate (line 5) is defined similarly using append1. The reverse
predicate declares that the reverse of an empty list is the empty list (line 8). The
rule (line 9) declares that the reverse of a list [H|T] is the reverse of list T—
the tail—appended to the list [H] containing only the head H. Again, notice the
declarative style in which these predicates are defined. We use lists to define
graphs and a series of graph predicates in Section 14.7.8. However, before doing so,
we discuss arithmetic predicates and the nature of negation in Prolog since those
graph predicates involve those two concepts.
edge(a,b).
edge(b,c).
edge(c,a).
?- trace.
true.
[trace] ?- path(a,c,[],PATH).
Call: (10) path(a, c, [], _7026) ? creep
Call: (11) edge(a, _7468) ? creep
Exit: (11) edge(a, b) ? creep
[trace] ?-
This trace is produced incrementally as the user presses the ăenterą key after each
line of the trace to proceed one step deeper into the proof process.
1 ?- X is 5-3.
2 X = 2.
3
4 ?- Y is X-1.
5 ERROR: Arguments are not sufficiently instantiated
6 ?-
The binding is held only during the satisfaction of the goal that produced the
instantiation/binding (lines 1–2). It is lost after the goal is satisfied (lines 4–5).
The following are the mathematical Horn clauses in Section 14.5.3 represented in
Prolog syntax for Horn clauses:
factorial(0,1).
factorial(N,F) :- N > 0, M is N-1, factorial(M,G), F is N*G.
fibonacci(1,0).
fibonacci(2,1).
fibonacci(N,P) :- N > 2, M is N-1, fibonacci(M,G),
                  L is N-2, fibonacci(L,H), P is G+H.
The factorial predicate binds its second parameter F to the factorial of the
integer represented by its first parameter N:
?- factorial(0,F).
F = 1 .
?- factorial(N,1).
N = 0 .
?- factorial(5,F).
F = 120 .
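The fibonacci predicate behaves analogously (an illustrative session; recall that in this formulation the first Fibonacci number is 0):

```prolog
?- fibonacci(2, F).
F = 1 .

?- fibonacci(7, F).
F = 8 .
```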
1 ?- mother(mary).
2 true.
3
4 ?- mother(M).
5 M = mary.
6
7 ?- \+(mother(M)).
8 false.
9
10 ?- \+(\+(mother(M))).
11 true.
12
13 ?- \+(\+(mother(mary))).
14 true.
Assume only the fact mother(mary) exists in the database. The predicate
\+(mother(M)) is asserting that “there are no mothers.” The response to the
query on line 8 (i.e., false) is indicating that “there is a mother,” and not
indicating that “there are no mothers.” In attempting to satisfy the goal on line
10, Prolog starts with the innermost term and succeeds with M = mary. It then
proceeds outward to the next term. Once a term becomes false, the instantiation
is released. Thus, on line 11, we do not see a substitution for M, which proves the
goal on line 10, but we are only given true. Consider the following goals:
1 ?- \+(M=mary).
2 false.
3
4 ?- M=mary, \+(M=elizabeth).
5 M = mary.
6
7 ?- \+(M=elizabeth), M=mary.
8 false.
9
10 ?- \+(\+(M=elizabeth)), M=mary.
11 M = mary.
12
13 ?- \+(M=elizabeth), \+(M=mary).
14 false.
Again, false is returned on line 2 without presenting a binding for M, which was
released. Notice that the goals on lines 4 and 7 are the same—only the order of
the subgoals is transposed. While the validity of the goal in logic is not dependent
on the order of the subgoals, the order in which those subgoals are pursued is
significant in Prolog. On line 5, we see that Prolog instantiated M to mary to prove
the goal on line 4. However, the proof of the goal on line 7 fails at the first subgoal
without binding M to mary.
14.7.8 Graphs
We can model graphs in Prolog using a list whose first element is a list of vertices
and whose second element is a list of directed edges, where each edge is a list
of two elements—the source and target of the edge. Using this list representation
of a graph, a sample graph is [[a,b,c,d],[[a,b],[b,c],[c,d],[d,b]]].
Using the append/3 and member/2 predicates (and others not defined here, such
as noduplicateedges/1 and makeset/2—see Programming Exercises 14.7.15
and 14.7.16, respectively), we can define the following graph predicates:
1 graph([Vertices,Edges]) :-
2 noduplicateedges(Edges),
3 flatten(Edges, X), makeset(X, Y), subset(Y, Vertices).
4
5 vertex([Vset,Eset], Vertex1) :- graph([Vset,Eset]), member(Vertex1, Vset).
6
7 edge([Vset,Eset], Edge) :- graph([Vset,Eset]), member(Edge, Eset).
The graph predicate (lines 1–3) tests whether a given list represents a valid
graph by checking if there are no duplicate edges (line 2) and confirming that the
defined edges do not use vertices that are not included in the vertex set (line 3).
The flatten/2 and subset/2 predicates (line 3) are built into SWI-Prolog. The
vertex predicate (line 5) accepts a graph and a vertex; it returns true if the graph
is valid and the vertex is a member of that graph’s vertex set, and false otherwise.
Similarly, the edge predicate (line 7) takes a graph and an edge; it returns true if
the graph is valid and the edge is a member of that graph’s edge set, and false
otherwise. The following are example goals:
?- graph([[a,b,c],[[a,b],[b,c]]]).
true .
?- graph([[a,b,c],[[a,b],[b,c],[d,a]]]).
false.
?- vertex([[a,b,c],[[a,b],[b,c]]], Vertex).
Vertex = a ;
Vertex = b ;
Vertex = c ;
false.
?- edge([[a,b,c],[[a,b],[b,c]]], [a,b]).
true .
?- edge([[a,b,c],[[a,b],[b,c],[d,a]]], [a,b]).
false.
These predicates serve as building blocks from which we can construct more
graph predicates. For instance, we can check if one graph is a subgraph of another
one:
?- subgraph([[a,b,c],[[a,b],[a,c]]], [[a,b,c],[[a,b],[a,c],[b,c]]]).
true .
?- subgraph([[a,b,c],[[a,b],[a,c],[b,c]]], [[a,b,c],[[a,b],[a,c]]]).
false.
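The subgraph predicate itself is not shown in this excerpt. A definition consistent with the goals above might be (a sketch using SWI-Prolog's built-in subset/2 and the graph/1 predicate defined earlier):

```prolog
subgraph([V1,E1], [V2,E2]) :-
    graph([V1,E1]),
    graph([V2,E2]),
    subset(V1, V2),    /* every vertex of the first graph is in the second */
    subset(E1, E2).    /* every edge of the first graph is in the second */
```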
We can also check whether a graph has a cycle, or a cycle containing a given
vertex. A cycle is a chain where the start vertex and the end vertex are the same
vertex. A chain is a path of directed edges through a graph from a source vertex to
a target vertex. Using a Prolog list representation, a chain is a list of vertices such
that there is an edge between each pair of adjacent vertices in the list. Thus, in that
representation of a chain, a cycle is a chain such that there is an edge from the final
vertex in the list to the first vertex in the list. Consider the following predicate to
test a graph for the presence of cycles:
Note that the cycle/2 predicate uses a chain/4 predicate (not defined here; see
Programming Exercise 14.7.19) that checks for the presence of a path from a start
vertex to an end vertex in a graph.
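Under that assumption, the cycle predicates might be sketched as follows (hypothetical definitions; chain(Graph, Start, End, Chain) is assumed to enumerate the chains from Start to End):

```prolog
/* V lies on a cycle if some edge (V,W) can be closed
   by a chain from W back to V. */
cycle([Vs,Es], V) :- edge([Vs,Es], [V,W]), chain([Vs,Es], W, V, _).

/* A graph contains a cycle if some vertex lies on one. */
cycle([Vs,Es]) :- member(V, Vs), cycle([Vs,Es], V).
```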
?- cycle([[a,b,c,d],[[a,b],[b,c],[c,d],[d,b]]], a).
false.
?- cycle([[a,b,c,d],[[a,b],[b,c],[c,d],[d,b]]], d).
true .
?- cycle([[a,b,c,d],[[a,b],[b,c],[c,d],[d,b]]]).
true .
In contrast, a complete graph has no self-edges (i.e., an edge from and to the
same vertex), but all other possible edges. A complete directed graph with n
vertices has exactly n × (n − 1) edges. Thus, we can check if a graph is complete
by verifying that it is a valid graph, that it has no self-edges, and that the
number of edges is described by the prior arithmetic expression. The following
are independent and complete predicates for these types of graphs—proper
is a helper predicate:
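Since the listing is not reproduced in this excerpt, definitions consistent with the goals below might be (a sketch; noselfedges/1 is a hypothetical name standing in for the helper predicate mentioned above):

```prolog
noselfedges([]).
noselfedges([[S,T]|Es]) :- S \== T, noselfedges(Es).

/* An independent graph has an empty edge set. */
independent([Vs,[]]) :- graph([Vs,[]]).

/* A complete directed graph has no self-edges and exactly n*(n-1) edges. */
complete([Vs,Es]) :-
    graph([Vs,Es]),
    noselfedges(Es),
    length(Vs, N),
    length(Es, M),
    M =:= N * (N - 1).
```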
The list length/2 predicate (line 32) is built into SWI-Prolog. The following are
goals involving independent and complete:
?- independent([[],[]]).
true.
?- independent([[a,b,c],[[a,b],[b,c]]]).
false.
?- independent([[a,b,c],[]]).
true.
?- complete([[],[]]).
true.
?- complete([[a,b,c],[[a,b],[a,c],[b,a], [b,c],[c,a],[c,b]]]).
true .
twentiethcennovels('1984','George Orwell',1949).
twentiethcennovels('Wise Blood','Flannery O\'Connor',1952).
Each of the five predicates in this Prolog program (each containing multiple facts)
is the analog of a table (or relation) in a database system. The following is a
mapping from some common types of queries in SQL to their equivalent goals
in Prolog.
Union
SELECT * FROM nineteenthcennovels
UNION
SELECT * FROM twentiethcennovels;
?- nineteenthcennovels(TITLE,AUTHOR,YEAR);
| twentiethcennovels(TITLE,AUTHOR,YEAR).
TITLE = 'Sense and Sensibility',
AUTHOR = 'Jane Austen',
YEAR = 1811 ;
TITLE = 'Pride and Prejudice',
AUTHOR = 'Jane Austen',
YEAR = 1813 ;
TITLE = 'Notes from Underground',
AUTHOR = 'Fyodor Dostoyevsky',
YEAR = 1864 ;
TITLE = 'Crime and Punishment',
AUTHOR = 'Fyodor Dostoyevsky',
YEAR = 1866 ;
TITLE = 'The Brothers Karamazov',
AUTHOR = 'Fyodor Dostoyevsky',
YEAR = 1879-80 ;
TITLE = '1984',
AUTHOR = 'George Orwell',
YEAR = 1949 ;
TITLE = 'Wise Blood',
AUTHOR = 'Flannery O\'Connor',
YEAR = 1952.
?-
While a comma (,) is the conjunction or the and operator in Prolog, a semicolon
(;) is the disjunction or the or operator in Prolog.
Intersection
SELECT * FROM twentiethcennovels
INTERSECT
SELECT * FROM read;
?- twentiethcennovels(TITLE,AUTHOR,YEAR), read(TITLE,AUTHOR,YEAR).
TITLE = '1984',
AUTHOR = 'George Orwell',
YEAR = 1949 ;
false.
?-
Difference
SELECT * FROM twentiethcennovels
EXCEPT
SELECT * FROM read;
?- twentiethcennovels(TITLE,AUTHOR,YEAR), \+(read(TITLE,AUTHOR,YEAR)).
TITLE = 'Wise Blood',
AUTHOR = 'Flannery O\'Connor',
YEAR = 1952.
?-
Projection
SELECT title
FROM nineteenthcennovels;
?- nineteenthcennovels(TITLE,_,_).
TITLE = 'Sense and Sensibility' ;
TITLE = 'Pride and Prejudice' ;
TITLE = 'Notes from Underground' ;
TITLE = 'Crime and Punishment' ;
TITLE = 'The Brothers Karamazov'.
?-
Selection
SELECT *
FROM nineteenthcennovels
WHERE author = "Fyodor Dostoyevsky" and year >= 1865;
?- nineteenthcennovels(TITLE,'Fyodor Dostoyevsky',YEAR), YEAR >= 1865.
TITLE = 'Crime and Punishment',
YEAR = 1866 ;
false.
?-
Natural Join
SELECT *
FROM nineteenthcennovels, authors
WHERE nineteenthcennovels.author = authors.name;
?- nineteenthcennovels(TITLE,AUTHOR,YEAR), authors(AUTHOR,DOB,BIRTHPLACE).
?-
Theta-Join
SELECT *
FROM nineteenthcennovels, authors
WHERE nineteenthcennovels.author = authors.name and year <= 1850;
?- nineteenthcennovels(TITLE,AUTHOR,YEAR),
| authors(AUTHOR,DOB,BIRTHPLACE), YEAR =< 1850.
TITLE = 'Sense and Sensibility',
AUTHOR = 'Jane Austen',
YEAR = 1811,
DOB = '16 Dec 1775',
BIRTHPLACE = 'Hampshire, England' ;
TITLE = 'Pride and Prejudice',
AUTHOR = 'Jane Austen',
YEAR = 1813,
DOB = '16 Dec 1775',
BIRTHPLACE = 'Hampshire, England' ;
TITLE = 'The Brothers Karamazov',
AUTHOR = 'Fyodor Dostoyevsky',
YEAR = 1879-80,
DOB = '11 Nov 1821',
BIRTHPLACE = 'Moscow, Russian Empire'.
?-
Adding the preceding queries in the form of rules creates what are called views
in database terminology, where the head of the headed Horn clause is the name of
the view:
% Union:
novels(TITLE,AUTHOR,YEAR) :-
nineteenthcennovels(TITLE,AUTHOR,YEAR);
twentiethcennovels(TITLE,AUTHOR,YEAR).
% Intersection:
readtwentiethcennovels(TITLE,AUTHOR,YEAR) :-
twentiethcennovels(TITLE,AUTHOR,YEAR), read(TITLE,AUTHOR,YEAR).
% Difference:
unread(TITLE,AUTHOR,YEAR) :-
twentiethcennovels(TITLE,AUTHOR,YEAR), \+(read(TITLE,AUTHOR,YEAR)).
% Projection:
nineteenthcennoveltitles(TITLE) :- nineteenthcennovels(TITLE,_,_).
% Selection:
latenineteenthcennovelsbyFD(TITLE,YEAR) :-
nineteenthcennovels(TITLE,'Fyodor Dostoyevsky',YEAR), YEAR >= 1865.
% Natural join:
nineteenthcennovelsauthors(TITLE,AUTHOR,YEAR,DOB,BIRTHPLACE) :-
   nineteenthcennovels(TITLE,AUTHOR,YEAR), authors(AUTHOR,DOB,BIRTHPLACE).
% Theta-join:
earlynineteenthcennovelsauthors(TITLE,AUTHOR,YEAR,DOB,BIRTHPLACE) :-
   nineteenthcennovels(TITLE,AUTHOR,YEAR),
   authors(AUTHOR,DOB,BIRTHPLACE), YEAR =< 1850.
Exercise 14.7.3 Explain why the \+/1 Prolog predicate is not a true logical NOT
operator. Provide an example to support your explanation.
Exercise 14.7.4 Does Prolog use short-circuit evaluation? Provide a Prolog goal (and
the response the interpreter provides in evaluating it) to unambiguously support
your answer. Note that the result of the goal ?- 3 = 4, 3 = 3. does not prove
or disprove the use of short-circuit evaluation in Prolog.
Exercise 14.7.5 Since the depth-first search strategy is problematic for reasons
demonstrated in Section 14.7.1, why does Prolog use depth-first search? Why is
breadth-first search not used instead?
RDBMS                         Prolog
relation                      predicate
attribute                     argument
tuple                         ground fact
table                         extensional definition of predicate (i.e., set of facts)
view                          intensional definition of predicate (i.e., a rule)
variable query evaluation     fixed query evaluation (i.e., depth-first search)
forward chaining              backward chaining
table/set at a time           tuple at a time
Exercise 14.7.6 In Section 14.7.1, we saw that left-recursion on the left-hand side
of a rule causes a stack overflow. Why is this not the case in the reverse predicate
in Section 14.7.4?
Exercise 14.7.9 Consider the following Prolog goal and its result:
?- X=0, \+(X=1).
X = 0.
Explain why the result of the following Prolog goal does not bind X to 1:
?- \+(X=0), X=1.
false.
append1([],L,L).
append1(L,[],L).
append1([X|L1], L2, [X|L12]) :- append1(L1, L2, L12).
This predicate has a bug—it produces duplicate solutions (lines 4–5, 8–9, 12–13,
14–15, and 16–17):
1 ?- append1(X,Y, [dostoyevsky,orwell,oconnor]).
2 X = [],
3 Y = [dostoyevsky, orwell, oconnor] ;
4 X = [dostoyevsky, orwell, oconnor],
5 Y = [] ;
6 X = [dostoyevsky],
7 Y = [orwell, oconnor] ;
8 X = [dostoyevsky, orwell, oconnor],
9 Y = [] ;
10 X = [dostoyevsky, orwell],
11 Y = [oconnor] ;
12 X = [dostoyevsky, orwell, oconnor],
13 Y = [] ;
14 X = [dostoyevsky, orwell, oconnor],
15 Y = [] ;
16 X = [dostoyevsky, orwell, oconnor],
17 Y = [] ;
18 false.
19
20 ?-
?- append1(X,Y, [dostoyevsky,orwell,oconnor]).
X = [],
Y = [dostoyevsky, orwell, oconnor] ;
X = [dostoyevsky],
Y = [orwell, oconnor] ;
X = [dostoyevsky, orwell],
Y = [oconnor] ;
X = [dostoyevsky, orwell, oconnor],
Y = [] ;
false.
?-
Exercise 14.7.13 Define a Prolog predicate sum that binds its second argument S
to the sum of the integers from 1 up to and including the integer represented by its
first parameter N.
Examples:
?- sum(N,0).
N = 0 .
?- sum(0,S).
S = 0 .
?- sum(4,S).
S = 10 .
?- sum(4,8).
false.
?- sum(5,Y).
Y = 15 .
?- sum(500,Y).
Y = 125250 .
?- sum(-100,Y).
false.
Exercise 14.7.14 Consider the following logical description of the Euclidean
algorithm for computing the greatest common divisor (gcd) of two positive
integers u and v:

gcd(u, v) = u                   if v = 0
gcd(u, v) = gcd(v, u mod v)     if v > 0
Examples:
?- noduplicateedges([[a,b],[b,c],[d,a]]).
true.
?- noduplicateedges([[a,b],[b,c],[d,a],[b,c]]).
false.
Exercise 14.7.16 Define a Prolog predicate makeset/2 that accepts a list and
removes any repeating elements—producing a set. The result is returned in
the second list parameter. Use no auxiliary predicates, except for not/1 and
member/2.
Examples:
?- makeset([],[]).
true.
?- makeset([a,b,c],SET).
SET = [a, b, c].
?- makeset([a,b,c,a],SET).
SET = [b, c, a] .
?- makeset([a,b,c,a,b],SET).
SET = [c, a, b] .
Exercise 14.7.17 Using only append, define a Prolog predicate adjacent that
accepts only three arguments and that succeeds if its first two arguments are
adjacent in its third list argument and fails otherwise.
Examples:
?- adjacent(1,2,[1,2,3]).
true.
?- adjacent(1,2,[3,1,2]).
true.
?- adjacent(1,2,[1,3,2]).
false.
?- adjacent(2,1,[1,2,3]).
true.
?- adjacent(2,3,[1,2,3]).
true.
?- adjacent(3,1,[1,2,3]).
false.
Exercise 14.7.18 Modify your solution to Programming Exercise 14.7.17 so that the
list is circular.
Examples:
?- adjacent(1,4,[1,2,3,4]).
true.
?- adjacent(4,1,[1,2,3,4]).
true.
?- adjacent(2,4,[1,2,3,4]).
false.
?- chain([[a,b,c,d],[[a,b],[b,c],[c,d],[d,b]]], b, d, CHAIN).
CHAIN = [b, c, d] .
?- chain([[a,b,c,d],[[a,b],[b,c],[c,d],[d,b]]], a, d, CHAIN).
CHAIN = [a, b, c, d] .
?- chain([[a,b,c,d],[[a,b],[b,c],[c,d],[d,b]]], a, a, CHAIN).
false.
?- chain([[a,b,c,d],[[a,b],[b,c],[c,d],[d,b]]], d, d, CHAIN).
CHAIN = [d, b, c, d] .
Exercise 14.7.20 Define a Prolog predicate sort that accepts two arguments, sorts
its first integer list argument, and returns the result in its second integer list
argument.
Examples:
?- sort([1],S).
S = [1] .
?- sort([1,2],S).
S = [1, 2] .
?- sort([5,4,3,2,1],S).
S = [1, 2, 3, 4, 5] .
?- 3 < 4.
true.
?- 4 < 3.
false.
?-
Exercise 14.7.21 Define a Prolog predicate last that accepts only two arguments
and that succeeds if its first argument is the last element of its second list argument
and fails otherwise.
Examples:
?- last(X,[1,2,3]).
X = 3 .
?- last(4,[1,2,3]).
false.
Exercise 14.7.22 Define a Prolog nand/3 predicate. The following table models a
nand gate:
p q p NAND q
0 0 1
1 0 1
0 1 1
1 1 0
numbers of days in different months and February in leap years. A leap year is
a year that is divisible by 400, or divisible by 4 but not also by 100. Therefore, 2000,
2012, 2016, and 2020 were leap years, while 1800 and 1900 were not. Your solution
must not use more than three user-defined predicates or exceed 20 lines of code.
Examples:
?- validdate(feb,29,2000).
true.
?- validdate(feb,30,2000).
false.
?- validdate(feb,29,2004).
true.
?- validdate(feb,29,1900).
false.
?- validdate(may,16,2007).
true.
?- validdate(jun,31,2007).
false.
?- validdate(apr,-10,3).
false.
?- validdate(apr,32,3).
false.
?- validdate(apr,30,-100).
false.
?- validdate(apr,30,0).
true.
?- validdate(jul,0,0).
false.
?- validdate(jul,1,0).
true.
?- validdate(fun,15,2020).
false.
?- !.
true.
However, the cut predicate has a side effect: It both freezes parts of
solutions already found and prevents multiple/alternative solutions from being
produced/considered. In this way, it prunes branches in the resolution search tree
and reduces the number of branches in the search tree considered.
The cut predicate can appear in a Prolog program in the body of a rule or in
a goal (as a subgoal). In either case, when the cut is encountered, it freezes (i.e.,
fixes) any prior instantiations of free variables bound during unification for the
remainder of the search and prevents backtracking. As a consequence, alternative
Figure 14.5 The branch (encompassed by a dotted box) of the resolution search
tree for the path(X,c) goal that the cut operator removes in the first path
predicate.
instantiations, which might lead to success, are not tried. Reconsider the path
predicate from Section 14.7.1, but with a cut included (line 6):
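The predicate with its instrumentation can be sketched as follows. This is a reconstruction consistent with the trace below and with Figure 14.5; the wording of the write/1 messages is approximate (in the trace, the matched edge's arguments are printed):

```prolog
/* edge(X,Y) declares there is a directed edge from vertex X to Y */
edge(a,b).
edge(b,c).

/* path(X,Y): there is a directed path from vertex X to vertex Y;
   the cut is the last term of the body of the first rule */
path(X,Y) :- write('Evaluate 1st path rule.'), nl,
             edge(X,Y),
             write('Finished evaluating edge fact.'), nl,
             !.
path(X,Y) :- write('Evaluate 2nd path rule.'), nl,
             edge(X,Z),
             write('Finished evaluating edge fact.'), nl,
             path(Z,Y).
```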
Output statements have been added to the body of the rules to assist in tracing the
search. Consider the goal path(X,c):
14.8. IMPARTING MORE CONTROL IN PROLOG: CUT 693
?- path(X,c).
Evaluate 1st path rule.
Finished evaluating edge(b,c) fact.
X = b.
?-
The search tree for this goal is shown in Figure 14.5. The edge labels in the figure
denote the line number from the Prolog program involved in the match from sub-
goal source to antecedent target. The left subtree corresponds to the rule on line
6, whose antecedent contains a cut. Here, the cut freezes the binding of X to b,
so that the right subtree is not considered. Once a cut has been encountered (i.e.,
evaluated to true), during backtracking the search of the subtrees of the parent
node of the node containing the cut stops, and the search resumes with the parent
node of the parent, if present. As a result, the cut prunes from the search tree all
siblings to the right of the node with the cut. Consider the following modification
to the path predicate:
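Following the transposition described next and the rule bodies shown in Figure 14.6, the modified predicate plausibly reads (instrumentation messages approximate):

```prolog
/* the recursive rule now comes first and carries the cut
   as the last term of its body */
path(X,Y) :- write('Evaluate 1st path rule.'), nl,
             edge(X,Z),
             write('Finished evaluating edge fact.'), nl,
             path(Z,Y),
             !.
path(X,Y) :- write('Evaluate 2nd path rule.'), nl,
             edge(X,Y),
             write('Finished evaluating edge fact.'), nl.
```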
The two rules constituting the prior path predicate are transposed and the cut is
shifted from the last predicate of the body of one of the rules to the last predicate
of the body of the other rule. Reconsider the goal path(X,c):
?- path(X,c).
Evaluate 1st path rule.
Finished evaluating edge(a,b) fact.
Evaluate 1st path rule.
Finished evaluating edge(b,c) fact.
Evaluate 1st path rule.
Evaluate 2nd path rule.
Evaluate 2nd path rule.
Finished evaluating edge(b,c) fact.
X = a.
?-
The search tree for this goal is presented in Figure 14.6. Notice that the output
statements trace the depth-first search of the resolution tree. In this example, the
failure in the left subtree occurs before the cut is evaluated, so the solution X=a is
found. Once the cut is evaluated (after X is bound to a), the solution X=a is frozen
and the right subtree is never considered. Now consider one last modification to
the path predicate:
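With the cut moved so that it immediately follows the edge/2 subgoal, as in the rule bodies shown in Figure 14.7, the predicate plausibly reads (instrumentation messages approximate):

```prolog
/* the cut now precedes the recursive call */
path(X,Y) :- write('Evaluate 1st path rule.'), nl,
             edge(X,Z),
             !,
             write('Finished evaluating edge fact.'), nl,
             path(Z,Y).
path(X,Y) :- write('Evaluate 2nd path rule.'), nl,
             edge(X,Y),
             write('Finished evaluating edge fact.'), nl.
```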
The cut predicate is shifted one term to the left on line 6. Reconsider the goal
path(X,c):
?- path(X,c).
Evaluate 1st path rule.
Figure 14.6 The branch (encompassed by a dotted box) of the resolution search
tree for the path(X,c) goal that the cut operator removes in the second path
predicate.
The search tree for the goal path(X,c) is presented in Figure 14.7. Unlike in the
prior example, here the failure in the left subtree occurs after the cut is evaluated,
so even the solution X=a is not found. Now no solutions are returned.
In the three preceding examples, the cut predicate is used in the body of a rule.
However, the cut predicate can also be used (as a subgoal) in a goal. Consider the
following database:
1 author(dostoyevsky).
2 author(orwell).
3 author(oconnor).
Figure 14.7 The branch (encompassed by a dotted box) of the resolution search tree
for the path(X,c) goal that the cut operator removes in the third path predicate.
We use the cut predicate in the goal on line 23 in the following transcript to prevent
consideration of alternative instantiations of X by freezing the first instantiation
(i.e., X=dostoyevsky):
1  ?- author(AUTHOR).
2  AUTHOR = dostoyevsky ;
3  AUTHOR = orwell ;
4  AUTHOR = oconnor.
5
6  ?- author(X), author(Y).
7  X = Y, Y = dostoyevsky ;
8  X = dostoyevsky,
9  Y = orwell ;
10 X = dostoyevsky,
11 Y = oconnor ;
12 X = orwell,
13 Y = dostoyevsky ;
14 X = Y, Y = orwell ;
15 X = orwell,
16 Y = oconnor ;
17 X = oconnor,
18 Y = dostoyevsky ;
19 X = oconnor,
20 Y = orwell ;
21 X = Y, Y = oconnor.
22
23 ?- author(X), !, author(Y).
24 X = Y, Y = dostoyevsky ;
25 X = dostoyevsky,
26 Y = orwell ;
27 X = dostoyevsky,
28 Y = oconnor.
Notice how the cut in the goal on line 23 froze the instantiation of X to
dostoyevsky, so that backtracking pursued only alternative instantiations of Y
(lines 26 and 28) to prove the goal. Consider replacing the second fact (line 2) with
the rule author(orwell) :- !:
1  ?- author(AUTHOR).
2  AUTHOR = dostoyevsky ;
3  AUTHOR = orwell.
4
5  ?- author(X), author(Y).
6  X = Y, Y = dostoyevsky ;
7  X = dostoyevsky,
8  Y = orwell ;
9  X = orwell,
10 Y = dostoyevsky ;
11 X = Y, Y = orwell.
12
13 ?- author(X), !, author(Y).
14 X = Y, Y = dostoyevsky ;
15 X = dostoyevsky,
16 Y = orwell.
The cut in the rule on line 2 affects the results of the goals on lines 1, 5, and 13.
In particular, once a variable is bound to orwell, no additional instantiations are
considered. The cut freezes the instantiations and prevents backtracking to the left
of the cut predicate in a line of code, while alternative instantiations are considered
to the right of the cut predicate:
T1, T2, ..., Tm, !, Tm+1, ..., Tn-1, Tn.

Instantiations are frozen, and thus backtracking is prevented, to the left of
the cut (T1 through Tm); alternative instantiations and backtracking occur to
the right of the cut (Tm+1 through Tn).
Consider the following definition of member1/2:

member1(E,[E|_]).
member1(E,[_|T]) :- member1(E,T).
This definition of member1/2 returns true as many times as there are occurrences
of the element in the input list:
?- member1(oconnor,[dostoyevsky,orwell,oconnor,austen,oconnor]).
true ;
true ;
false.
?-
Using a cut we can prevent multiple solutions from being produced such that
member1/2 returns true only once, even if the element occurs more than once
in the input list:
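One common way to do so is to place a cut in the body of the first clause; this is a sketch, and the exact placement in the original program may differ:

```prolog
/* succeed at most once: stop after the first occurrence is found */
member1(E,[E|_]) :- !.
member1(E,[_|T]) :- member1(E,T).
```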
?- member1(oconnor,[dostoyevsky,orwell,oconnor,austen,oconnor]).
true.

?-
?- member1(AUTHOR,[dostoyevsky,orwell,oconnor,austen,oconnor]).
AUTHOR = dostoyevsky.
?-
(a)
/* edge(X,Y) declares there is a directed edge from vertex X to Y */
edge(a,b).
edge(b,c).
(b)
/* edge(X,Y) declares there is a directed edge from vertex X to Y */
edge(a,b).
edge(b,c).
1 author(dostoyevsky).
2 author(orwell).
3 author(oconnor).
For each of the following goals, draw the search tree and indicate which parts of it
the cut prunes, as done in Figures 14.5–14.7:
?- complete([[a,b,c],[[a,b],[a,c],[b,a], [b,c],[c,a],[c,b]]]).
true ;
true ;
...
?- bubblesort([9,8,7,6,5,4,3,2,1],SL).
SL = [1, 2, 3, 4, 5, 6, 7, 8, 9] ;
SL = [2, 1, 3, 4, 5, 6, 7, 8, 9] ;
SL = [2, 3, 1, 4, 5, 6, 7, 8, 9] ;
SL = [2, 3, 4, 1, 5, 6, 7, 8, 9]
...
...
As can be seen, after producing the sorted list (line 2), the predicate produced
multiple spurious solutions. Modify the bubblesort predicate to ensure that it
does not return any additional results after it produces the first result—which is
always the correct one:
?- bubblesort([9,8,7,6,5,4,3,2,1],SL).
SL = [1, 2, 3, 4, 5, 6, 7, 8, 9].
?-
?- squarelistofints([1,2,3,4,5,6],SQUARES).
SQUARES = [1, 4, 9, 16, 25, 36].
?- squarelistofints([1,2,3.3,4,5,6],SQUARES).
SQUARES = [1, 4, 3.3, 16, 25, 36].
?- squarelistofints([1,2,"pas un entier",4,5,6],SQUARES).
SQUARES = [1, 4, "pas un entier", 16, 25, 36].
Complete this program. Specifically, define the bodies of the two rules constituting
the towers predicate. Hint: The body of the second rule requires four terms
(lines 3–6).
Example (with three discs):
?- towers(3,"A","B","C").
Move a disc from peg A to peg B.
Move a disc from peg A to peg C.
Move a disc from peg B to peg C.
Exercise 14.8.8 Define the \= predicate in Prolog using only the !, fail, and =
predicates. Name the predicate donotunify.
Exercise 14.8.9 Define the \== predicate in Prolog using only the !, fail, and ==
predicates. Name the predicate notequal.
parent(olimpia,lucia).
parent(olimpia,olga).
sibling(X,Y) :- parent(M,X), parent(M,Y).
1 ?- sibling(X,Y).
2 X = Y, Y = lucia ;
3 X = lucia,
4 Y = olga ;
5 X = olga,
6 Y = lucia ;
7 X = Y, Y = olga.
8
9 ?-
Thus, Prolog thinks that lucia is a sibling of herself (line 2) and that olga is a
sibling of herself (line 7). Modify the sibling rule so that Prolog does not produce
pairs of siblings with the same elements.
member1(E,[E|_]).
member1(E,[_|T]) :- member1(E,T).
Exercise 14.8.13 The following is the triple predicate, which triples a list (i.e., given
[3], it produces [3,3,3]):
that 4 is not a member of the list [1,2]; it just means that the system failed
to prove that 4 is not a member of the list.
• Limited Expressivity of Horn Clauses: Horn clauses are not expressive
enough to capture any arbitrary proposition in predicate calculus. For
instance, a proposition in clausal form with a disjunction of more than one
non-negated term cannot be expressed as a Horn clause. As an example,
the penultimate proposition in clausal form presented in Section 14.5.1,
represented here, contains a disjunction of two non-negated terms:
siblings(christina,maria) :- grandfather(virgil,christina),
grandfather(virgil,maria),
\+(cousins(christina,maria)).
cousins(christina,maria) :- grandfather(virgil,christina),
grandfather(virgil,maria),
\+(siblings(christina,maria)).
To cast this proposition, from Section 14.3.1, in clausal form, we can (1) negate
it, which declares that a value for X which renders the proposition true does
not exist, and (2) represent the negated proposition as a goal:
¬∃X.(transmission(X, manual))
(There are no cars with manual transmissions.)
14.9. ANALYSIS OF PROLOG 703
Table 14.16 Summary of the Mismatch Between Predicate Calculus and Prolog
rather than
¬∀X.(transmission(X, manual))
(Not all cars have a manual transmission.)
?- automobile(A).
A = bmw ;
A = mercedes.

?- asserta(automobile(honda)).
true.

?- assertz(automobile(toyota)).
true.

?- retract(automobile(bmw)).
true.

?- automobile(A).
A = honda ;
A = mercedes ;
A = toyota.
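Using clause/2, a metacircular Prolog interpreter can be written in three clauses. The version below is the classic textbook formulation (the predicate name interpret is illustrative, and built-in predicates and other corner cases are ignored):

```prolog
/* interpret(G) succeeds if goal G is provable from the database */
interpret(true).
interpret((G1,G2)) :- interpret(G1), interpret(G2).
interpret(G) :- clause(G,B), interpret(B).
```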
These three lines of code constitute the semantic part of the Prolog interpreter. Like
Lisp, Prolog is a homoiconic language—all Prolog programs are valid Prolog terms.
As a result, it is easy—again, as in Lisp—to write Prolog programs that analyze
Predicate Semantics
assert/1: Adds a fact to the end of the database.
assertz/1: Adds a fact to the end of the database.
asserta/1: Adds a fact to the beginning of the database.
retract/1: Removes a fact from the database.
var(⟨Term⟩): Succeeds if ⟨Term⟩ is currently a free variable.
nonvar(⟨Term⟩): Succeeds if ⟨Term⟩ is currently not a free variable.
ground(⟨Term⟩): Succeeds if ⟨Term⟩ holds no free variable.
clause/2: Matches the head and body of an existing clause
in the database; can be used to implement
a metacircular interpreter (i.e., an implementation
of call/1; see Section 14.9.3).
other Prolog programs. Thus, the Prolog interpreter shown here is not only a self-
interpreter, but a metacircular interpreter.
The Warren Abstract Machine (WAM) is a theoretical computer that defines an
execution model for Prolog programs; it includes an instruction set and memory
model (Warren 1983). A feature of WAM code is tail-call optimization (discussed in
Chapter 13) to improve memory usage. WAM code is a standard target for Prolog
compilers, and compiling to it improves the efficiency of subsequent execution.
A compiler from Prolog to C through the WAM, called WAMCC, has been constructed
and evaluated (Codognet and Diaz 1995).
(defrule ourrule
(weather raining)
=>
(assert (carry umbrella)))
(defrule rule_name
(pattern_1) ; IF Condition 1
(pattern_2) ; And Condition 2
.
.
(pattern_N) ; And Condition N
=> ; THEN
(action_1) ; Perform Action 1
(action_2) ; And Action 2
.
.
(action_N)) ; And Action N
The CLIPS shell can be invoked in UNIX-based systems with the clips
command. From within the CLIPS shell, the user can assert facts, defrules,
and (run) the inference engine. When the user issues the (run) command,
the inference engine pattern matches facts with rules. If all patterns are matched
within the rule, then the actions associated with that rule are fired. To load
facts and rules from an external file, use the -f option (e.g., clips -f
database.clp). Table 14.18 summarizes the commands accessible from within
the CLIPS shell and usable in CLIPS scripts. Next, we briefly discuss three language
concepts that are helpful in CLIPS programming.
14.10.2 Variables
Variables in CLIPS are prefixed with a ? (e.g., ?x). Variables need not be declared
explicitly, but they must be bound to a value before they are used. Consider the
following program that computes a factorial:
(defrule factorial
(factrun ?x)
=>
(assert (fact ?x 1)))
(defrule facthelper
(fact ?x ?y)
(test (> ?x 0))
=>
(assert (fact (- ?x 1) (* ?x ?y))))
When the facts for the rule facthelper are pattern matched, ?x and ?y are each
bound to a value. Next, the bound value for ?x is used to evaluate the validity of
the fact (test (> ?x 0)). When variables are bound within a rule, that binding
Command          Function
(run)            Run the inference engine.
(facts)          Retrieve the current fact-list.
(clear)          Restore CLIPS to its startup state.
(retract n)      Retract fact n.
(retract *)      Retract all facts.
(watch facts)    Observe facts entering or exiting memory.
(exit)           Exit the CLIPS shell.
exists only within that rule. For persistent global data, defglobal should be used
as follows:
(defglobal ?*var* = "" )
14.10.3 Templates
Templates are used to associate related data (e.g., facts) in a single package—
similar to structs in C. Templates are containers for multiple facts, where each
fact is a slot in the template. Rules can be pattern matched to templates based on
a subset of a template’s slots. Following is a demonstration of the use of pattern
matching to select specific data from a database of facts:
(deftemplate car
(slot make
(type SYMBOL)
(allowed-symbols
truck compact)
(default compact))
(multislot name
(type SYMBOL)
(default ?DERIVE)))
(deffacts cars
(car (make truck)
(name Tundra))
(car (make compact)
(name Accord))
(car (make compact)
(name Passat)))
(defrule compactcar
(car (make compact)
(name ?name))
=>
(printout t ?name crlf))
[Figure: a finite-state machine with states 1, 2, and 3 and transitions labeled a and b; diagram omitted.]
Reproduced from Arabnia, Hamid R., Leonidas Deligiannidis, Michael R. Grimaila, Douglas D.
Hodson, and Fernando G. Tinetti. 2019. CSC’19: Proceedings of the 2019 International Conference on
Scientific Computing. Las Vegas: CSREA Press.
Examples:
CLIPS> (run)
Input string: aaabba
Rejected
CLIPS> (reset)
CLIPS> (run)
Input string: aabbba
Accepted
Exercise 14.10.2 Rewrite the factorial program in Section 14.10.2 so that only the
fact with the final result of the factorial rule is stored in the fact list. Note that
retract can be used to remove facts from the fact list.
14.11. APPLICATIONS OF LOGIC PROGRAMMING 709
Examples:
noun_phrase(NP) :- noun_phrase_adj(NP).
noun_phrase_adj(NP) :- noun(NP).
verb_phrase(VP) :- verb(VP).
Conclusion
Well, what do you know about that! These forty years now, I’ve been
speaking in prose without knowing it!
— Monsieur Jourdain in Molière's The Bourgeois Gentleman (new verse
adaptation by Timothy Mooney)
This process has taught us how to use, compare, and build programming languages.
It has also made us better programmers and well-rounded computer scientists.
[Figure 15.1: a concept map relating recursion, tail calls, tail-call optimization without a run-time stack, lambda/anonymous functions, pass-by-name parameters (thunks), trampolines, first-class continuations (e.g., via call/cc), continuation-passing style, generators/iterators, coroutines, lazy evaluation, pass-by-need parameters, first-class functions, first-class closures allocated from the heap, objects, object-oriented programming, modular programming, currying/uncurrying, curried higher-order functions, and higher-order (functional) programming.]
Figure 15.1 The relationships between some of the concepts we studied. A solid
directed arrow indicates that the target concept relies only on the presence of the
source concept. A dotted directed arrow indicates that the target concept relies
partially on the presence of the source concept.
The abstraction baked into this expression is isolated in the fixed-point Y combinator,
which we implemented in JavaScript in Programming Exercise 6.10.15.
716 CHAPTER 15. CONCLUSION
[Figure 15.2: a concept map relating objects, run-time typing, object-oriented programming, higher-order functions, macros, metaprogramming, bottom-up programming, embedded languages, patterns/abstractions, and homoiconicity.]
you give someone Lisp, he has any language he pleases” (Friedman and Felleisen
1996b, Afterword, p. 207). For instance, support for object-oriented programming
can be built from the abstractions already available to the programmer in Lisp
(Graham 1993, p. ix). Lisp’s support for macros, closures, and dynamic typing lifts
object-oriented programming to another level (Graham 1996, p. 2). Figure 15.2
depicts the relationships between these advanced concepts of programming
languages. (Notice that macros are central in Figure 15.2, much as closures are
central in Figure 15.1.) Homoiconic languages with macros (e.g., Lisp and Clojure)
simplify metaprogramming and, thus, bottom-up programming (Figure 15.2).
We encourage readers to explore macros and bottom-up programming further,
especially in the works by Graham (1993, 1996) and Krishnamurthi (2003).
Lastly, let us reconsider some of the ideas introduced in Chapter 1. Over
the past 20 years or so, certain language concepts introduced in foundational
languages have made their way into more contemporary languages. Today,
language concepts conceived in Lisp and Smalltalk—first-class functions and
closures, dynamic binding, first-class continuations, and homoiconicity—are
increasingly making their way into contemporary languages. Heap-allocated, first-
class, lexical closures; first-class continuations; homoiconicity; and macros are
concepts and constructs for building language abstractions to make programming
easier.
Exercise 15.2 Identify a programming language with which you are unfamiliar.
Armed with your understanding of language concepts, design options, and styles
of programming as a result of formal study of language and language concepts,
describe the language through its most defining characteristics. If you completed
Conceptual Exercise 1.16 when you embarked on this course of study, revisit the
language you analyzed in that exercise. In which ways do your two (i.e., before
and after) descriptions of that language differ?
Exercise 15.3 Revisit the recurring book themes introduced in Section 1.6 and
reflect on the instances of these themes you encountered through this course of
study. Classify the following items using the themes outlined in Section 1.6.
• Comments cannot nest in C and C++.
• Scheme uses prefix notation for both operators and functions—there really is
no difference between the two in Scheme. Contrast with C, which uses infix
notation for operators and prefix notation for functions.
• The while loop in Camille.
• Static vis-à-vis dynamic scoping.
• Lazy evaluation enables the implementation of complex algorithms in a
concise way (e.g., quicksort in three lines of code, Sieve of Eratosthenes).
• C uses pass-by-name for the if statement, but pass-by-value for user-defined
functions.
• Deep, ad hoc, and shallow binding.
• All operators use lazy evaluation in Haskell.
• First version of Lisp used dynamic scoping, which is easier to implement than
lexical scoping but turned out to be less natural to use.
• In Smalltalk, everything is an object and all computation is described as
passing messages between objects.
• Conditional evaluation in Camille.
• Multiple parameter-passing mechanisms.
Exercise 15.4 Reflect on why some languages have been in use for more than
50 years (e.g., Fortran, C, Lisp, Prolog, Smalltalk), while others are either no longer
supported or rarely, if ever, used (e.g., APL, PL/I, Pascal). Write a short essay
discussing the factors affecting language survival.
Exercise 15.5 Write a short essay reflecting on how you met, throughout this
course of study, the learning outcomes identified in Section 1.8. Perhaps draw
some diagrams to aid your reflection.
15.5. FURTHER READING 719
Python Primer
programming, after having read this appendix can write intermediate programs in
Python.
A.2 Introduction
Python is a statically scoped language, uses an eager evaluation strategy,
incorporates functional features and a terse syntax from Haskell, and incorporates
data abstraction from Dylan and C++. One of the most distinctive features of
Python is its use of indentation to demarcate blocks of code. While Python
was developed and implemented in the late 1980s in the Netherlands by Guido
van Rossum, it was not until the early 2000s that the language’s use and
popularity increased. Python is now embraced as a general-purpose, interpreted
programming language and is available for a variety of platforms.
This appendix is not intended to be a comprehensive Python tutorial or
language reference. Its primary objective is to establish an understanding of
Python programming in a reader already familiar with imperative and some
functional programming as preparation for the use of Python, through which to
study concepts of programming languages and build language interpreters
in this text. Because of the multiple styles of programming it supports (e.g.,
imperative, object-oriented, and functional), Python is a worthwhile vehicle
through which to explore language concepts, including lexical closures, lambda
functions, iterators, dynamic type systems, and automatic memory management.
(Throughout this text, we explore closures (in Chapter 6), typing (in Chapter 7),
currying and higher-order functions (in Chapter 8), type systems (in Chapter 9),
and lazy evaluation (in Chapter 12) through Python. We also build language
interpreters in Python in Chapters 10–12.) We leave the use of Python for exploring
language concepts for the main text of this book.
This appendix is designed to be straightforward and intuitive for anyone
familiar with imperative and functional programming in another language, such
as Java, C++, or Scheme. We often compare Python expressions to their analogs in
Scheme. We use the Python 3.8 implementation of Python. Note that >>> is the
prompt for input in the Python interpreter used in this text.
>>> help(int)
Help on class int in module builtins:

class int(object)
 |  int([x]) -> integer
 |  int(x, base=10) -> integer
 |
A.3. DATA TYPES 723
To convert a value of one type to a value of another type, the constructor method
for the target type class can be called:
>>> str(123)
'123'
>>> str(int("123"))
'123'
Python has the following built-in types: numeric types (int, float, complex),
sequences (str, list, tuple, range, bytes, bytearray), sets (set, frozenset),
mappings (dict), files, classes, instances, exceptions, and bool:
>>> bool
<class 'bool'>
>>> type(True)
<class 'bool'>
>>> str
<class 'str'>
>>> type('a')
<class 'str'>
>>> int
<class 'int'>
>>> type(3)
<class 'int'>
>>> float
<class 'float'>
>>> type(3.3)
<class 'float'>
>>> list
724 APPENDIX A. PYTHON PRIMER
<class 'list'>
>>> type([2,3,4])
<class 'list'>
>>> type([2,2.1,"hello"])
<class 'list'>
>>> tuple
<class 'tuple'>
>>> type((2,3,4))
<class 'tuple'>
>>> set
<class 'set'>
>>> type({1,2,3,3,4})
<class 'set'>
>>> dict
<class 'dict'>
For a list of all of the Python built-in types, enter the following:
NAME
builtins - Built-in functions, exceptions, and other objects.
DESCRIPTION
Noteworthy: None is the `nil' object;
Ellipsis represents `...' in slices.
CLASSES
object
BaseException
Exception
ArithmeticError
...
Python does not use explicit type declarations for variables. A variable takes
on the type of the value most recently assigned to it (i.e., dynamic typing).
Memory for a value is allocated when the value is created and is automatically
garbage collected once the value is no longer reachable.
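For example, a name simply refers to whatever value was last assigned to it, and the type travels with the value, not the name:

```python
x = 3
print(type(x).__name__)    # int

x = "three"                # rebinding x to a str requires no declaration
print(type(x).__name__)    # str
```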
In Python, ' and " have the same semantics. When quoting a string containing
single quotes, use double quotes, and vice versa:
Alternatively, as in C, use \ to escape the special meaning of a " within double quotes:
• Character conversions. The ord and chr functions are used for character
conversions:
>>> ord('a')
97
>>> chr(97)
'a'
>>> chr(ord('a'))
'a'
• Numeric conversions.
>>> int(3.4) # type conversion
3
>>> float(3)
3.0
• String concatenation. The infix binary + operator concatenates two strings.
• Arithmetic. The infix binary operators +, -, and * have the usual semantics.
Python has two division operators: // and /. The // operator is a floor
division operator for integer and float operands:
>>> 10 // 3
3
>>> -10 // 3
-4
>>> 10.0 // 3.333
3.0
>>> -10.0 // 3.333
-4.0
>>> 4 // 2
2
>>> 1 // -2
-1
>>> 10 / 3
3.3333333333333335
>>> -10 / 3
-3.3333333333333335
>>> 10.0 / 3.333
3.0003000300030003
>>> -10.0 / 3.333
-3.0003000300030003
>>> 4 / 2
2.0
>>> 1 / -2
-0.5
• Comparison. The infix binary operators == (equal to), <, >, <=, >=, and !=
(not equal to) compare integers, floats, characters, strings, and values of other
types:
>>> 4 == 2
False
>>> 4 > 2
True
>>> 4 != 2
True
>>> 'b' > 'a'
True
>>> ['b'] > ['a']
True
• Boolean operators. The infix operators or and and, and the prefix operator not,
are used with the usual semantics. The operators or and and use short-circuit
evaluation (or lazy evaluation as discussed in Chapter 12):
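A small sketch of my own showing the short-circuit behavior (the helper boom is hypothetical):

```python
def boom():
    # Raises if ever called; used to prove the right operand is skipped.
    raise RuntimeError("should never be evaluated")

# and returns its left operand when that operand is falsy,
# without evaluating the right operand:
print(False and boom())  # False
# or returns its left operand when that operand is truthy,
# without evaluating the right operand:
print(True or boom())    # True
```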
>>> if 1 != 2:
...     "Python has a one-armed if statement"
...
'Python has a one-armed if statement'
>>>
>>> if 1 != 2:
...     "true branch"
... else:
...     "false branch"
...
'true branch'
>>> if 1 != 2:
...     "true branch"
...  else:
File "<stdin>", line 3
else:
^
IndentationError: unindent does not match any outer
indentation level
The indentation conventions enforced by Python are for the benefit of the
programmer—to avoid buggy code. As Bruce Eckel says:
• Comments.
– Single-line comments. A single-line comment begins with # and extends to the end of the line:

# This is a single-line comment.
– Multi-line comments. While Python does not have a special syntax for
multi-line comments, a multi-line comment can be simulated using a
multi-line string because Python ignores a string if it is not being used in
an expression or statement. The syntax for multi-line strings in Python
uses triple quotes—either single or double:
print("This is code.")
"""
This string will be ignored by the Python interpreter
because it is not being used in an expression or statement.
Thus, it functions as a multi-line comment.
"""
print("More code.")
"Regular strings can also function as comments,"
# but since Python has special syntax for a
# single-line comment, they typically are not used that way.
def a_function():
    """
    This is where docstrings reside for functions.
    A docstring can be a single- or multi-line string.
    Docstrings are used by the Python help system.
    """
• The list/split and join functions are Python’s analogs of the explode
and implode functions in ML, respectively:
>>> list("apple")
['a', 'p', 'p', 'l', 'e']
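To complete the analogy, join implodes a list of characters back into a string, and split breaks a string into words (a brief sketch of my own):

```python
chars = list("apple")          # explode: string -> list of characters
print(chars)                   # ['a', 'p', 'p', 'l', 'e']
print("".join(chars))          # implode: list of characters -> string
print("hello world".split())   # split: string -> list of words
```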
$ python
>>> 2 + 3
5
>>>
1 >>> answer = 2 + 3
2 >>> answer
3 5
4 >>> def f(x):
5 ...     return x + 1
6 ...
7 >>> f(1)
8 2
9 >>> ^D
10 $
Enter the EOF character [which is <ctrl-d> on UNIX systems (line 9) and
<ctrl-z> on Windows systems] or quit() to exit the interpreter.
– Enter python <filename>.py from the command prompt using file
I/O, which causes the program in <filename>.py to be evaluated line
by line by the interpreter:3
1 $ cat first.py
2
3 answer = 2 + 3
4
5 answer
6
7 def f(x):
8     return x + 1
9
10 f(1)
11
12 $ python first.py
13 $
2. The name of the executable file for the Python interpreter may vary across systems (e.g.,
python3.8).
3. The interpreter automatically exits once EOF is reached and evaluation is complete.
1 $ cat first.py
2
3 answer = 2 + 3
4
5 print(answer)
6
7 def f(x):
8     return x + 1
9
10 print(f(1))
11 $
12 $ python first.py
13 5
14 2
15 $
1 $ cat first.py
2
3 answer = 2 + 3
4
5 print(answer)
6
7 def f(x):
8     return x + 1
9
10 print(f(1))
11
12 $ python
13 Python 3.8.3 (default, May 15 2020, 14:33:52)
14 [Clang 10.0.1 (clang-1001.0.46.4)] on darwin
15 Type "help", "copyright", "credits" or "license"
16 for more information.
17 >>>
18 >>> import first
19 5
20 2
21 >>>
If the program is modified, enter the following lines into the interpreter
to reload it:

>>> import importlib
>>> importlib.reload(first)
– Redirect standard input into the interpreter from the keyboard to a file
by entering python < <filename>.py at the command prompt:4
$ cat first.py
answer = 2 + 3
print(answer)
def f(x):
    return x + 1
print(f(1))
A.5 Lists
As in Scheme, but unlike in ML and Haskell, lists in Python are heterogeneous,
meaning all elements of the list need not be of the same type. For example, the
list [2,2.1,"hello"] in Python is heterogeneous while the list [2,3,4] in
Haskell is homogeneous. Like ML and Haskell, Python is type safe. However,
Python is dynamically typed, unlike ML and Haskell. The expression [] denotes
the empty list. Tuples (Section A.6) are more appropriate for storing unordered
items of different types. Lists in Python are indexed using zero-based indexing.
The + operator concatenates two lists.
Examples:
>>> [1,2,3]
[1, 2, 3]
>>> [1.1,2,False,"hello"]
[1.1, 2, False, 'hello']
>>> []
[]
4. Again, the interpreter automatically exits once EOF is reached and evaluation is complete.
>>> [2,2.1,"2"][2].isdigit()
True
>>> "hello world"[2].isdigit()
False
>>> [1,2,3][2]
3
>>> [1,2,3]+[4,5,6]
[1, 2, 3, 4, 5, 6]
Lists in Python vis-à-vis lists in Lisp. There is not a direct analog of the cons
operator in Python. The list concatenation operator + can be used to simulate cons,
but its time complexity is O(n). For instance,
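For instance, a cons-like helper can be written with + (a sketch; the name cons is mine, not Python's):

```python
def cons(x, lst):
    # Prepend x to lst. Lisp's cons is O(1); list concatenation
    # copies lst, so this simulation is O(n).
    return [x] + lst

print(cons(1, [2, 3]))  # [1, 2, 3]
print(cons("a", []))    # ['a']
```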
Examples:
A.6 Tuples
A tuple is a sequence of elements of potentially mixed types. A tuple typically
contains unordered, heterogeneous elements akin to a struct in C with the
exception that a tuple is indexed by numbers (like a list) rather than by field names
(like a struct). Formally, a tuple is an element e of a Cartesian product of a given
number of sets: e ∈ (S1 × S2 × ⋯ × Sn). A two-element tuple is called a pair [e.g.,
e ∈ (A × B)]. A three-element tuple is called a triple [e.g., e ∈ (A × B × C)].
The difference between lists and tuples in Python, which has implications for
their usage, can be captured as follows. A tuple is a data structure whose fields
are unordered and have different meanings, and thus typically have different
types. A list, by contrast, is an ordered sequence of elements, typically of the same
type. For instance, a tuple is an appropriate data structure for storing an employee
record containing an id, name, rate, and a designation of promotion or not. In turn,
a company can be represented by a list of these employee tuples ordered by
employment date:
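A sketch of that representation (the field values here are hypothetical):

```python
# An employee record as a tuple: (id, name, rate, promoted).
# Fields have different types and are accessed by index.
employee = (4287, "Mary", 21.50, True)
print(employee[1])   # Mary

# A company as a list of employee tuples, ordered by employment date.
company = [(4287, "Mary", 21.50, True),
           (1371, "John", 18.75, False)]
print(len(company))  # 2
```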
Although this situation is rare, the need might arise for a tuple with only one
element. Suppose we tried to create a tuple this way:
>>> (1)
1
>>> ("Mary")
'Mary'
The expression (1) does not evaluate to a tuple; instead, it evaluates to the
integer 1. Otherwise, this syntax would introduce ambiguity with parentheses
in mathematical expressions. However, Python does have a syntax for making a
tuple with only one element—insert a comma between the element and the closing
parenthesis:
>>> (1,)
(1,)
>>> ("Mary",)
('Mary',)
When defining functions at the read-eval-print loop as shown here, a blank line is
required to denote the end of a function definition (lines 3 and 6).
Note that order matters if you omit the keyword in the call:
Note that the keyword arguments must be listed after all of the positional
arguments in the argument list.
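A sketch of positional versus keyword arguments (the function name and parameters are my own, not the book's):

```python
def raise_rate(name, rate, percent=10):
    # Returns the name paired with the rate increased by percent.
    return (name, rate * (100 + percent) / 100)

# Positional arguments are matched by order:
print(raise_rate("Mary", 20.0, 50))               # ('Mary', 30.0)
# Keyword arguments may appear in any order,
# but only after all positional arguments:
print(raise_rate("Mary", percent=50, rate=20.0))  # ('Mary', 30.0)
```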
• Mixture of positional and unnamed keyword arguments:5
5. We use sys.stdout.write here rather than print to suppress a space from being automatically
written between arguments to print.
These Python functions are the analogs of the following Scheme functions:
> (square 4)
16
> (add 3 4)
7
> (inc 5)
6
7
>>> add6(2)
8
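The definitions of square, add, inc, and add6 are not reproduced in this excerpt; a minimal reconstruction (with add6 built as a closure over a hypothetical make_adder; details may differ from the book's listing) is:

```python
def square(x):
    return x * x

def add(x, y):
    return x + y

def inc(x):
    return x + 1

# add6 as a closure: a function that remembers n = 6.
def make_adder(n):
    return lambda x: x + n

add6 = make_adder(6)

print(square(4))  # 16
print(add(3, 4))  # 7
print(inc(5))     # 6
print(add6(2))    # 8
```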
factorial
fibonacci
reverse
Note that reverse can reverse a list containing values of any type.
member
False
>>> 5 in [1,2,3,4]
False
Local Binding
These functions are the Python analogs of the following Scheme functions:
(define (powerset l)
  (cond
    ((null? l) '(()))
    (else
      (let ((y (powerset (cdr l))))
        (append (insertineach (car l) y) y)))))
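A Python analog of this Scheme definition (a sketch of mine; the book's own listing is not shown in this excerpt):

```python
def powerset(l):
    # insertineach prepends item to each list in a list of lists.
    def insertineach(item, lists):
        return [[item] + x for x in lists]

    if l == []:
        return [[]]
    # Local binding: compute the powerset of the tail only once.
    y = powerset(l[1:])
    return insertineach(l[0], y) + y

print(powerset([1, 2]))  # [[1, 2], [1], [2], []]
```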
Nested Functions
Note that more than two mutually recursive functions can be defined.
if lat == []:
    return ([], [])
elif len(lat) == 1:
    return ([], lat)
else:
    (left, right) = split(lat[2:])
    return ([lat[0]] + left, [lat[1]] + right)
return merge(leftsorted, rightsorted)

print(mergesort([9,8,7,6,5,4,3,2,1]))
$
$ python mergesort.py
[1, 2, 3, 4, 5, 6, 7, 8, 9]
if lat == []:
    return ([], [])
elif len(lat) == 1:
    return ([], lat)
else:
    (left, right) = split(lat[2:])
    return ([lat[0]] + left, [lat[1]] + right)
# split it
(left, right) = split(lat)
return merge(leftsorted, rightsorted)

print(mergesort([9,8,7,6,5,4,3,2,1]))
$
$ python mergesort.py
[1, 2, 3, 4, 5, 6, 7, 8, 9]
$ cat mergesort.py
import operator
if lat == []:
    return ([], [])
elif len(lat) == 1:
    return ([], lat)
else:
    (left, right) = split(lat[2:])
    return ([lat[0]] + left, [lat[1]] + right)

if lat == []:
    return []
elif len(lat) == 1:
    return lat
else:
    # split it
    (left, right) = split(lat)
    return merge(compop, leftsorted, rightsorted)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[9, 8, 7, 6, 5, 4, 3, 2, 1]
Final Version
The following is the final version of mergesort using nested, protected functions
and accepting a comparison operator as a parameter that is factored out to avoid
passing it between successive recursive calls. We also use a keyword argument for
the comparison operator:
import operator
if lat == []:
    return ([], [])
elif len(lat) == 1:
    return ([], lat)
else:
    (left, right) = split(lat[2:])
    return ([lat[0]] + left, [lat[1]] + right)

if lat == []:
    return []
elif len(lat) == 1:
    return lat
else:
    # split it
    (left, right) = split(lat)
    return merge(leftsorted, rightsorted)

return mergesort1(lat)

print(mergesort([9,8,7,6,5,4,3,2,1]))
print(mergesort([1,2,3,4,5,6,7,8,9], operator.gt))
$
$ python mergesort.py
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[9, 8, 7, 6, 5, 4, 3, 2, 1]
Notice also that we factored the argument compop out of the function merge in
this version, since it is visible from an outer scope.
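Assembled from the fragments above, a complete, runnable sketch of this final version (reconstructed; the book's exact listing may differ in details):

```python
import operator

def mergesort(lat, compop=operator.lt):
    # compop is factored out: the nested functions see it in the
    # enclosing scope, so it need not be passed on each recursive call.

    def split(lat):
        # Deal elements alternately into two halves.
        if lat == []:
            return ([], [])
        elif len(lat) == 1:
            return ([], lat)
        else:
            (left, right) = split(lat[2:])
            return ([lat[0]] + left, [lat[1]] + right)

    def merge(left, right):
        # Merge two sorted lists according to compop.
        if left == []:
            return right
        elif right == []:
            return left
        elif compop(left[0], right[0]):
            return [left[0]] + merge(left[1:], right)
        else:
            return [right[0]] + merge(left, right[1:])

    def mergesort1(lat):
        if lat == []:
            return []
        elif len(lat) == 1:
            return lat
        else:
            # split it
            (left, right) = split(lat)
            return merge(mergesort1(left), mergesort1(right))

    return mergesort1(lat)

print(mergesort([9,8,7,6,5,4,3,2,1]))
print(mergesort([1,2,3,4,5,6,7,8,9], operator.gt))
```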
>>> class new_counter:
...     def __init__(self, initial):
...         self.current = initial
...     def __call__(self):
...         self.current = self.current + 1
...         return self.current
...
>>> counter1 = new_counter(1)
>>> counter2 = new_counter(100)
>>>
>>> counter1
<__main__.new_counter object at 0x10c12b250>
>>> counter2
<__main__.new_counter object at 0x10c0f37f0>
>>>
>>> counter1()
2
>>> counter1()
3
>>> counter2()
101
>>> counter2()
102
>>> counter1()
4
>>> counter1()
5
>>> counter2()
103
While the object-oriented approach is perhaps more familiar to readers with
a traditional object-oriented programming background, it executes more slowly
due to the object overhead. However, the following approach permits multiple
callable objects to share their signature through inheritance:
>>> class new_counter:
...     def __init__(self, initial):
...         self.current = initial
...     def __call__(self):
...         self.current = self.current + 1
...         return self.current
...
>>>
>>> class custom_counter(new_counter):
...     # __init__ is inherited from parent class new_counter
...     def __call__(self, step):
...         self.current = self.current + step
...         return self.current
...
>>> counter1 = custom_counter(1)
>>> counter2 = custom_counter(100)
>>>
>>> counter1(1)
2
>>> counter1(2)
4
>>> counter2(3)
103
>>> counter2(4)
107
>>> counter1(5)
9
>>> counter1(6)
15
>>> counter2(7)
114
Notice that the callable object returned is bound to the environment in which it
was created. In traditional object-oriented programming, an object encapsulates
(or binds) multiple functions (called methods) to the same environment.
Thus, we can augment the class new_counter with additional methods:
>>> class new_counter:
...     current = 0
...     def initialize(self, initial):
...         self.current = initial
...     def increment(self):
...         self.current = self.current+1
...     def decrement(self):
...         self.current = self.current-1
...     def get(self):
...         return self.current
...     def write(self):
...         print(self.current)
...
>>> counter1 = new_counter()
>>> counter2 = new_counter()
>>> counter1.initialize(1)
>>> counter2.initialize(100)
>>> counter1.increment()
>>> counter2.increment()
>>> counter1.increment()
>>> counter2.increment()
>>> counter1.increment()
>>> counter2.increment()
>>> counter1.write()
4
>>> counter2.write()
103
>>> counter1.decrement()
>>> counter2.decrement()
>>> counter1.decrement()
>>> counter2.decrement()
>>> counter1.decrement()
>>> counter2.decrement()
>>> counter1.write()
1
>>> counter2.write()
100
>>> divisor = 1
>>> integer = integer / divisor
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'integer' is not defined
In executing the syntactically valid second line of code, the interpreter raises a
NameError because integer is used before it has been defined.
Because this exception is not handled by the programmer, it is fatal. However, the
exception may be caught and handled:
>>> divisor = 1
>>> try:
...     integer = integer / divisor
... except:
...     print("An exception occurred. Proceeding anyway.")
...     integer = 0
...
An exception occurred. Proceeding anyway.
>>> print(integer)
0
This example catches any exception that may occur within the try block. The
except block executes only if an exception occurs in the try block.
Python also permits programmers to catch specific exceptions and define a
unique except block for each exception:
>>> divisor = 1
>>> try:
...     integer = integer / divisor
... except NameError:
...     print("Caught a name error.")
... except ZeroDivisionError:
...     print("Caught a divide by 0 error.")
... except Exception as e:
...     print("Caught something else.")
...     print(e)
...
Caught a name error.
Lastly, programmers may raise their own exceptions to force an exception to occur:
>>> try:
...     raise NameError
... except NameError:
...     print("Caught my own exception!")
...
Caught my own exception!
Examples:
Exercise A.2 Define a recursive Python function called makeset without using a
set. The makeset function accepts only a list as input and returns the list with
any repeating elements removed. The order in which the elements appear in the
returned list does not matter, as long as there are no duplicate elements. Do not
use any user-defined auxiliary functions, except member.
Examples:
>>> makeset([1,3,4,1,3,9])
[4, 1, 3, 9]
>>> makeset([1,3,4,9])
[1, 3, 4, 9]
>>> makeset(["apple","orange","apple"])
['orange', 'apple']
Exercise A.3 Solve Programming Exercise A.2, but this time use a set in your
definition. The function must still accept and return a list. Hint: This can be done
in one line of code.
Exercise A.4 Define a recursive Python function cycle that accepts only a list and
an integer i as arguments and cycles the list i times. Do not use any user-defined
auxiliary functions.
Examples:
Exercise A.5 Define a recursive Python function transpose that accepts a list as
its only argument and returns that list with adjacent elements transposed.
Specifically, transpose accepts an input list of the form [e1, e2, e3, e4, e5, e6, ..., en]
>>> transpose([1,2,3,4])
[2, 1, 4, 3]
>>> transpose([1,2,3,4,5,6])
[2, 1, 4, 3, 6, 5]
>>> transpose([1,2,3])
[2, 1, 3]
Exercise A.6 Define a recursive Python function oddevensum that accepts only a
list of integers as an argument and returns a pair consisting of the sum of the odd
and even positions of the list, in that order. Do not use any user-defined auxiliary
functions.
Examples:
>>> oddevensum([])
(0, 0)
>>> oddevensum([6])
(6, 0)
>>> oddevensum([6,3])
(6, 3)
>>> oddevensum([6,3,8])
(14, 3)
>>> oddevensum([1,2,3,4])
(4, 6)
>>> oddevensum([1,2,3,4,5,6])
(9, 12)
>>> oddevensum([1,2,3])
(4, 2)
Exercise A.7 Define a recursive Python function member that accepts only an
element and a list of values of the type of that element as input and returns True
if the item is in the list and False otherwise. Do not use in within the definition
of your function. Hint: This can be done in one line of code.
Exercise A.8 Define a recursive Python function permutations that accepts only
a list representing a set as an argument and returns a list of all permutations of that
list as a list of lists. You will need to define some nested auxiliary functions. Pass a
λ-function to map where applicable in the bodies of the functions to simplify their
definitions.
Examples:
>>> permutations([])
[]
>>> permutations([1])
[[1]]
>>> permutations([1,2])
[[1,2],[2,1]]
>>> permutations([1,2,3])
[[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]
>>> permutations([1,2,3,4])
[[1,2,3,4],[1,2,4,3],[1,3,2,4],[1,3,4,2], [1,4,2,3],
[1,4,3,2],[2,1,3,4],[2,1,4,3], [2,3,1,4],[2,3,4,1],
[2,4,1,3],[2,4,3,1], [3,1,2,4],[3,1,4,2],[3,2,1,4],
[3,2,4,1], [3,4,1,2],[3,4,2,1],[4,1,2,3],[4,1,3,2],
[4,2,1,3],[4,2,3,1],[4,3,1,2],[4,3,2,1]]
The + operator concatenates two lists. Python supports anonymous/λ-functions
and both positional and keyword arguments to functions.
Introduction to ML
B.2 Introduction
ML (historically, MetaLanguage) is, like Scheme, a language supporting primarily
functional programming with some imperative features. It was developed by A. J.
Robin Milner and others in the early 1970s at the University of Edinburgh. ML is a
general-purpose programming language in that it incorporates functional features
from Lisp, rule-based programming (i.e., pattern matching) from Prolog, and data
abstraction from Smalltalk and C++. ML is an ideal vehicle through which to
explore the language concepts of type safety, type inference, and currying. The
objective here, however, is elementary programming in ML. ML also, like Scheme,
is statically scoped. We leave the use of ML to explore these language concepts to
the main text.
1 - 3;
2 val it = 3 : int
3 - 3.33;
4 val it = 3.33 : real
5 - true;
6 val it = true : bool
7 - #"a";
8 val it = #"a" : char
9 - "hello world";
10 val it = "hello world" : string
Notice that ML uses type inference. The colon symbol (:) associates a value with a
type and is read as "is of type." For instance, the expression 3 : int indicates
that 3 is of type int. This explains the responses of the interpreter on lines 2, 4, 6,
8, and 10 when an expression is entered on the preceding line.
- ord(#"a");
val it = 97 : int
- chr(97);
val it = #"a" : char
- chr(ord(#"a"));
val it = #"a" : char
- 4.2 / 2.1;
val it = 2.0 : real
- 4 div 2;
val it = 2 : int
- ~1;
val it = ~1 : int
• Comparison. The infix binary operators = (equal to), <, >, <=, >=, and <>
(not equal to) compare ints, reals, chars, or strings with one exception:
reals may not be compared using = or <>. Instead, use the prefix functions
Real.== and Real.!=. For now, we can think of Real as an object (in an
object-oriented program), == as a message, and the expression Real.==
as sending the message == to the object Real, which in turn executes
the method definition of the message. Real is called a structure in ML
(Section B.10). Structures are used again in Section B.12.
- 4 = 2;
val it = false : bool
- 4 > 2;
val it = true : bool
- 4 <> 2;
val it = true : bool
- Real.==(2.1, 4.1);
val it = false : bool
- Real.!=(4.1, 2.1);
val it = true : bool
1. Technically, all operators in ML are unary operators, in that each accepts a single argument that is
a pair. However, generally, though not always, there is no problem interpreting a unary operator that
only accepts a single pair as a binary operator.
- true orelse false;
val it = true : bool
- false andalso false;
val it = false : bool
- not false;
val it = true : bool
- explode;
val it = fn : string -> char list
- explode("apple");
val it = [#"a",#"p",#"p",#"l",#"e"] : char list
- implode;
val it = fn : char list -> string
- implode([#"a", #"p", #"p", #"l", #"e"]);
val it = "apple" : string
- implode(explode("apple"));
val it = "apple" : string
$ sml
Standard ML of New Jersey (64-bit) v110.98
- 2 + 3;
val it = 5 : int
- ^D
$
Using this method of execution, the programmer can define new functions
at the prompt of the interpreter:
- fun f(x) = x + 1;
val f = fn : int -> int
- f(1);
val it = 2 : int
Use the EOF character (which is <ctrl-d> on UNIX systems and <ctrl-z> on
Windows systems) to exit the interpreter.
• Enter sml <filename>.sml from the command prompt using file I/O,
which causes the program in <filename>.sml to be evaluated:
0 $ cat first.sml
1
2 2 + 3;
3
4 fun inc(x) = x + 1;
5
6 $ sml first.sml
7 Standard ML of New Jersey (64-bit) v110.98
8 [opening first.sml]
9 val it = 5 : int
10 val inc = fn : int -> int
11 -
12 - inc(1);
13 val it = 2 : int
14 -
0 $ cat first.sml
1
2 2 + 3;
3
4 fun inc(x) = x + 1;
5
6 $ sml
7 Standard ML of New Jersey (64-bit) v110.98
8 -
9 - use "first.sml";
10 [opening first.sml]
11 val it = 5 : int
12 val it = () : unit
13 -
14 - inc(1);
15 val it = 2 : int
16 -
• Redirect standard input into the interpreter from the keyboard to a file by
entering sml < <filename>.sml at the command prompt:2
2. The interpreter automatically exits once EOF is reached and evaluation is complete.
B.6 Lists
The following are some important points about lists in ML.
Examples:
- [1,2,3];
val it = [1,2,3] : int list
- nil;
val it = [] : 'a list
- [];
val it = [] : 'a list
- 1::2::[3];
val it = [1,2,3] : int list
- 1::nil;
val it = [1] : int list
- 1::[];
val it = [1] : int list
- 1::2::nil;
val it = [1,2] : int list
- hd(1::2::[3]);
val it = 1 : int
- tl(1::2::[3]);
val it = [2,3] : int list
- hd([1,2,3]);
val it = 1 : int
- tl([1,2,3]);
val it = [2,3] : int list
- [1,2,3]@[4,5,6];
val it = [1,2,3,4,5,6] : int list
B.7 Tuples
A tuple is a sequence of elements of potentially mixed types. Formally, a
tuple is an element e of a Cartesian product of a given number of sets:
e ∈ (S1 × S2 × ⋯ × Sn). A two-element tuple is called a pair [e.g., e ∈ (A × B)]. A
three-element tuple is called a triple [e.g., e ∈ (A × B × C)]. A tuple typically contains
unordered, heterogeneous elements akin to a struct in C with the exception that a
tuple is indexed by numbers (like a list) rather than by field names (like a struct).
While tuples can be heterogeneous, in a list of tuples, each tuple in the list must be
of the same type. Elements of a tuple are accessible by prefacing the tuple with #n,
where n is the number of the element, starting with 1:
The response from the interpreter when (1, "Mary", 3.76) (line 1) is entered
is (1,"Mary",3.76) : int * string * real (line 2). This response
indicates that the tuple (1,"Mary",3.76) consists of an instance of type int,
an instance of type string, and an instance of type real. The response from the
interpreter when a tuple is entered (e.g., int * string * real) demonstrates
that a tuple is an element of a Cartesian product of a given number of sets.
Here, the *, which is not intended to mean multiplication, is the analog of the
Cartesian-product operator ×, and the data types are the sets involved in the
Cartesian product. In other words, int * string * real is a type defined by
the Cartesian product of the set of all ints, the set of all strings, and the set of
all reals. An element of the Cartesian product of the set of all ints, the set of all
strings, and the set of all reals has the type int * string * real:
Here, the type of square is a function int -> int or, in other words, a
function that maps an int to an int. Similarly, the type of add is a function
int * int -> int or, in other words, a function that maps a tuple of type
int * int to an int. Notice that the interpreter prints the domain of a function
that accepts more than one parameter as a Cartesian product using the notation
described in Section B.7. These functions are the ML analogs of the following
Scheme functions:
Notice that the ML syntax involves fewer lexemes than Scheme (e.g., define is not
included). Without excessive parentheses, ML is also more readable than Scheme.
The first version (defined on line 2) does not use pattern-directed invocation; that
is, there is only one definition of the function. The second version (defined on
lines 6–7) uses pattern-directed invocation. If the literal 0 is passed as the second
argument to the function gcd, then the first definition of gcd is used (line 6);
otherwise, the second definition (line 7) is used.
Pattern-directed invocation is not identical to operator/function overloading.
Overloading involves determining which definition of a function to invoke based
on the number and types of arguments it is passed at run-time. With pattern-
directed invocation, no matter how many definitions of the function exist, all have
the same type signature (i.e., number and type of parameters).
Native support for pattern-directed invocation is one of the most convenient
features of user-defined functions in ML because it obviates the need for an
if–then–else expression to differentiate between the various inputs to a
function. Conditional expressions are necessary in languages without built-in
pattern-directed invocation (e.g., Scheme). The following are additional examples
of pattern-directed invocation:
- fun factorial(0) = 1
= | factorial(n) = n * factorial(n-1);
val factorial = fn : int -> int
- fun fibonacci(0) = 1
= | fibonacci(1) = 1
1 int f(int a, int b) {
2    return (a+b);
3 }
4
5 int main() {
6    return f(2+3, 4);
7 }
Here, the expression 2+3 is the first argument to the function f that is called on
line 6. Since C uses an eager evaluation parameter-passing strategy, the expression
2+3 is evaluated as 5 and then 5 is passed to f. However, in the body of f, there
is no way to conveniently decompose 5 back to 2+3.
Pattern-directed invocation allows ML to support the decomposition of an
argument from within the signature itself by using a pattern in a parameter. For
instance, consider these three versions of a reverse function:
1 $ cat reverse.sml
2 (* without pattern-directed invocation we need
3 an if-then-else and calls to hd and tl *)
4 fun reverse(lst) =
5    if null(lst) then nil
6    else reverse(tl(lst)) @ [hd(lst)];
7
8 (* with pattern-directed invocation and
9 calls to hd and tl *)
10 fun reverse(nil) = nil
11 | reverse(lst) = reverse(tl(lst)) @ [hd(lst)];
12
13 (* with pattern-directed invocation,
14 calls to hd and tl are unnecessary *)
15 fun reverse(nil) = nil
16 | reverse(x::xs) = reverse(xs) @ [x];
17 $
18 $ sml reverse.sml
19 Standard ML of New Jersey (64-bit) v110.98
20 [opening reverse.sml]
21 [autoloading]
22 [library $MLNJ-BASIS/basis.cm is stable]
23 [autoloading done]
24 val reverse = fn : 'a list -> 'a list
25 val reverse = fn : 'a list -> 'a list
26 val reverse = fn : 'a list -> 'a list
While the pattern-directed invocation in the second version (lines 10–11) obviates
the need for the if–then–else expression (lines 5–6), the functions hd and tl
(lines 6 and 11) are required to decompose lst into its head and tail. Calls to
the functions hd and tl are obviated by using the pattern x::xs (line 16) in
the parameter to reverse. When the third version of reverse is called with a
non-empty list, the second definition of it is executed (line 16), the head of the list
passed as the argument is bound to x, and the tail of the list passed as the argument
is bound to xs.
The cases form in the EOPL extension to Racket Scheme, which may be used
to decompose the constituent parts of a variant record as described in Chapter 9
(Friedman, Wand, and Haynes 2001), is the Racket Scheme analog of the use of
patterns in parameters to decompose arguments to a function. Pattern-directed
invocation, including the use of patterns for decomposing arguments, and the
pattern-action style of programming, is common in the programming language
Prolog.
Anonymous Parameters
The underscore (_) pattern on lines 1 and 2 of the definition of the
konsMinHeadtoOther function represents an anonymous parameter—a param-
eter whose name is unnecessary to the definition of the function. As an additional
example, consider the following definition of a list member function:
Type Variables
While some functions, such as square and add, require arguments of a particular
type, others, such as reverse and member, accept arguments of any type or
arguments whose types are partially restricted. For instance, the type of the
function reverse is ’a list -> ’a list. Here, the ’a means “any type.”
Therefore, the function reverse accepts a list of any type ’a and returns a list
of the same type. The ’a is called a type variable. In programming languages,
the ability of a single function to accept arguments of different types is called
polymorphism because poly means “many” and morph means “form.” Such a
function is called polymorphic. A polymorphic type is a type expression containing
type variables. The type of polymorphism discussed here is called parametric
polymorphism, where a function or data type can be defined generically so that it
can handle values identically without depending on their type. (The type variable
''a means "any type that can be compared for equality.")
Neither pattern-directed invocation nor operator/function overloading (some-
times called ad hoc polymorphism) is identical to (parametric) polymorphism.
Overloading involves using the same operator/function name to refer to different
definitions of a function, each of which is identifiable by the different number or
types of arguments to which it is applied. Parametric polymorphism, in contrast,
involves only one operator/function name referring to only one definition of the
function that can accept arguments of multiple types. Thus, ad hoc polymorphism
typically only supports a limited number of such distinct types, since a separate
implementation must be provided for each type.
Local Binding
Lines 8–12 of the following example demonstrate local binding in ML:
0 $ cat powerset.sml
1 fun insertineach(_, nil) = nil
2 | insertineach(item, x::xs) =
3 (item::x)::insertineach(item, xs);
4
5 (* use of "let" prevents recomputation of powerset(xs) *)
6 fun powerset(nil) = [nil]
7 | powerset(x::xs) =
8 let
9 val y = powerset(xs)
10 in
11 insertineach(x, y)@y
12 end;
13 $
14 $ sml powerset.sml
15 Standard ML of New Jersey (64-bit) v110.98
16 [opening powerset.sml]
17 val insertineach = fn : 'a * 'a list list -> 'a list list
18 val powerset = fn : 'a list -> 'a list list
Nested Functions
Since the function insertineach is intended to be visible, accessible,
and callable only from within the powerset function, we can also use a
let ... in ... end expression to nest it within the powerset function
(lines 3–11 in the next example):
0 $ cat powerset.sml
1 fun powerset(nil) = [nil]
2 | powerset(x::xs) =
3 let
4 fun insertineach(_, nil) = nil
5 | insertineach(item, x::xs) =
6 (item::x)::insertineach(item, xs);
7
8 val y = powerset(xs)
9 in
10 insertineach(x, y)@y
11 end;
12 $
13 $ sml powerset.sml
14 Standard ML of New Jersey (64-bit) v110.98
15 [opening powerset.sml]
16 val powerset = fn : 'a list -> 'a list list
17
18 - powerset([1]);
19 val it = [[1],[]] : int list list
20
21 - powerset([1,2]);
22 val it = [[1,2],[1],[2],[]] : int list list
23
24 - powerset([1,2,3]);
25 val it = [[1,2,3],[1,2],[1,3],[1],[2,3],[2],[3],[]] : int list list
$ cat reverse.sml
fun reverse(nil) = nil
| reverse(l) =
let
fun reverse1(nil, l) = l
Note that the polymorphic type of reverse, 'a list -> 'a list, indicates
that reverse can reverse a list of any type.
In ML, a function must be defined before it is used. This makes the definition
of mutually recursive functions (i.e., functions that call each other) problematic
without direct language support. Mutually recursive functions in ML must be
defined with the and reserved word between each definition. For instance,
consider the functions isodd and iseven, which rely on each other to determine
whether an integer is odd or even, respectively:
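The two definitions, joined by the and reserved word, might be written as follows (a sketch consistent with the transcript below, assuming nonnegative arguments; the book's exact base cases may differ):

```sml
(* isodd and iseven are mutually recursive; "and" makes both
   names visible to both bodies *)
fun isodd(0) = false
  | isodd(n) = iseven(n - 1)
and iseven(0) = true
  | iseven(n) = isodd(n - 1);
```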
- isodd(9);
val it = true : bool
- isodd(100);
val it = false : bool
- iseven(100);
val it = true : bool
- iseven(1000000000);
val it = true : bool
Note that more than two mutually recursive functions can be defined: each
definition except the last is joined to the next with and, and the last is terminated
with a semicolon (;). ML performs tail-call optimization, which is why the deeply
recursive call iseven(1000000000) above completes without exhausting the stack.
$ cat mergesort.sml
fun split(nil) = (nil, nil)
| split([x]) = (nil, [x])
| split(x::y::excess) =
let
val (l, r) = split(excess)
in
(x::l, y::r)
end;
$ cat mergesort.sml
fun mergesort(nil) = nil
| mergesort([x]) = [x]
| mergesort(lat) =
let
fun split(nil) = (nil, nil)
| split([x]) = (nil, [x])
| split(x::y::excess) =
let
val (l, r) = split(excess)
in
(x::l, y::r)
end;
(* split it *)
val (left, right) = split(lat);
$ cat mergesort.sml
fun mergesort(_, nil) = nil
| mergesort(_, [x]) = [x]
| mergesort(compop, lat) =
let
fun split(nil) = (nil, nil)
| split([x]) = (nil, [x])
| split(x::y::excess) =
let
val (l, r) = split(excess)
in
(x::l, y::r)
end;
(* split it *)
val (left, right) = split(lat);
Since the closing lexeme for a comment in ML is *), we must add a whitespace
character after the * when converting the infix multiplication operator to a prefix
operator:
- (op *) (4,5);
stdIn:1.5 Error: unmatched close comment
stdIn:1.8-1.11 Error: syntax error: deleting LPAREN INT COMMA
- (op * ) (4,5);
val it = 20 : int
Final Version
The following code is the final version of mergesort using nested, protected
functions and accepting a comparison operator as a parameter, which is factored
out to avoid passing it between successive recursive calls:
$ cat mergesort.sml
fun mergesort(_, nil) = nil
| mergesort(_, [x]) = [x]
| mergesort(compop, lat) =
let
(x::l, y::r)
end;
(* split it *)
val (left, right) = split(lat1);
Notice also that we factored the argument compop out of the function merge in
this version since it is visible from an outer scope.
- [1,2,3];
val it = [1,2,3] : int list
- (1, "Mary", 3.76);
val it = (1,"Mary",3.76) : int * string * real
- fun square(x) = x*x;
val square = fn : int -> int
B.10 Structures
The ML module system consists of structures, signatures, and functors. A structure
in ML is a collection of related data types and functions akin to a class from
object-oriented programming. (Structures and functors in ML resemble classes and
templates in C++, respectively.) Multiple predefined ML structures are available:
TextIO, Char, String, List, Math. A function within a structure can be invoked
with its fully qualified name (line 1) or, once the structure in which it resides has
been opened (line 8), with its unqualified name (line 29):
1 - Int.toString(3);
2 [autoloading]
3 [library $SMLNJ-BASIS/basis.cm is stable]
4 [library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
5 [autoloading done]
6 val it = "3" : string
7 -
8 - open Int;
9 opening Int
10 type int = ?.int
11 val precision : Int.int option
12 val minInt : int option
To prevent a function from one structure from shadowing a function with the
same name from another structure in the same program, use fully qualified names
[e.g., Int.toString(3)].
B.11 Exceptions
The following code is an example of an exception.
- exception NegativeInt;
- fun power(e,0) = if (e < 0) then raise NegativeInt else 0
= | power(e,1) = if (e > 0) then raise NegativeInt else 1
= | power(0,b) = 1
= | power(1,b) = b
= | power(e,b) = if (e > 0) then raise NegativeInt else b*power(e-1, b);
exception NegativeInt
val power = fn : int * int -> int
-
- power(3,~2);
uncaught exception NegativeInt
raised at: stdIn:6.40-6.54
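A raised exception can also be caught with a handle clause, which supplies a value for the expression when the named exception is raised; for instance (a sketch, assuming the power function above is in scope):

```sml
(* evaluates to 0 because power(3,~2) raises NegativeInt,
   which the handle clause catches *)
power(3,~2) handle NegativeInt => 0;
```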
B.12.1 Input
The option data type has two value constructors: NONE and SOME. Use isSome()
to determine whether a variable of type option holds a value. Use valOf() to
extract the value from a variable of type option. Note that a string option
list is not the same type as a string list.
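For example, the following session sketches these Option operations:

```sml
val x = SOME "hello";   (* x : string option *)
isSome(x);              (* true *)
valOf(x);               (* "hello" *)
isSome(NONE);           (* false *)
```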
Standard Input
The standard input stream generally does not need to be opened and closed.
- TextIO.inputLine(TextIO.stdIn);
get this line of text
val it = SOME "get this line of text\n" : string option
File Input
The following example demonstrates file input in ML.
$ cat input.txt
the quick brown fox ran slowly.
totally kewl
$
$ sml
Standard ML of New Jersey (64-bit) v110.98
-
- open TextIO;
< ... snipped ... >
- val ourinstream = openIn("input.txt");
val ourinstream = - : instream
- val line = inputLine(ourinstream);
val line = SOME "the quick brown fox ran slowly.\n" : string option
- isSome(line);
val it = true : bool
- val line = inputLine(ourinstream);
val line = SOME "totally kewl\n" : string option
- isSome(line);
val it = true : bool
- val line = inputLine(ourinstream);
val line = NONE : string option
- isSome(line);
val it = false : bool
- closeIn(ourinstream);
val it = () : unit
$ cat input.txt
This is certainly a
a file containing
multiple lines of text.
Each line is terminated with a
newline character. This file
will be read

by an ML

program.
$
$ cat input.sml
fun makeStringList(NONE) = nil
| makeStringList(SOME str) = (String.tokens (Char.isSpace)) (str);
fun readInput(infile) =
if TextIO.endOfStream(infile) then nil
else TextIO.inputLine(infile)::readInput(infile);
val infile = TextIO.openIn("input.txt");
map makeStringList (readInput(infile));
TextIO.closeIn(infile);
$
$ sml input.sml
Standard ML of New Jersey (64-bit) v110.98
[opening input.sml]
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[autoloading done]
val makeStringList = fn : string option -> string list
[autoloading]
[autoloading done]
val readInput = fn : TextIO.instream -> string option list
val infile = - : TextIO.instream
val it =
[["This","is","certainly","a"],["a","file","containing"],
["multiple","lines","of","text."],
["Each","line","is","terminated","with","a"],
["newline","character.","This","file"],["will","be","read"],[],
["by","an","ML"],[],["program."]] : string list list
val it = () : unit
B.12.3 Output
Standard Output
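The built-in function print writes a string to standard output; for example:

```sml
- print("hello world\n");
hello world
val it = () : unit
```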
File Output
The following transcript demonstrates file output in ML.
$ cat output.txt
hello world
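A session producing this file might resemble the following sketch, using the TextIO functions openOut, output, and closeOut:

```sml
- val outstream = TextIO.openOut("output.txt");
val outstream = - : TextIO.outstream
- TextIO.output(outstream, "hello world\n");
val it = () : unit
- TextIO.closeOut(outstream);
val it = () : unit
```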
- remove;
val it = fn : int * 'a list -> 'a list
- remove(1, [9,10,11,12]);
val it = [10,11,12] : int list
- remove(2, [9,10,11,12]);
val it = [9,11,12] : int list
- remove(3, [9,10,11,12]);
val it = [9,10,12] : int list
- remove(4, [9,10,11,12]);
val it = [9,10,11] : int list
- remove(5, [9,10,11,12]);
val it = [9,10,11,12] : int list
Exercise B.2 Define a recursive ML function called makeset that accepts only a
list of integers as input and returns the list with any repeating elements removed.
The order in which the elements appear in the returned list does not matter, as
long as there are no duplicate elements. Do not use any user-defined auxiliary
functions, except member.
Examples:
- makeset;
val it = fn : ''a list -> ''a list
- makeset([1,3,4,1,3,9]);
val it = [4,1,3,9] : int list
- makeset([1,3,4,9]);
val it = [1,3,4,9] : int list
- makeset(["apple","orange","apple"]);
val it = ["orange","apple"] : string list
Exercise B.3 Define a recursive ML function cycle that accepts only a list and an
integer i as arguments and cycles the list i times. Do not use any user-defined
auxiliary functions.
Examples:
- cycle;
val it = fn : int * 'a list -> 'a list
- cycle(0, [1,4,5,2]);
val it = [1,4,5,2] : int list
- cycle(1, [1,4,5,2]);
val it = [4,5,2,1] : int list
- cycle(2, [1,4,5,2]);
val it = [5,2,1,4] : int list
- cycle(4, [1,4,5,2]);
val it = [1,4,5,2] : int list
- cycle(6, [1,4,5,2]);
val it = [5,2,1,4] : int list
- cycle(10, [1]);
val it = [1] : int list
- cycle(9, [1,4]);
val it = [4,1] : int list
Exercise B.4 Define an ML function transpose that accepts a list as its only
argument and returns that list with adjacent elements transposed. Specifically,
transpose accepts an input list of the form [e1, e2, e3, e4, e5, e6, ..., en] and
returns a list of the form [e2, e1, e4, e3, e6, e5, ..., en, en-1] as output. If n is
odd, en will continue to be the last element of the list. Do not use any user-defined
auxiliary functions and do not use @ (i.e., append).
Examples:
- transpose;
val it = fn : 'a list -> 'a list
- transpose ([1,2,3,4]);
val it = [2,1,4,3] : int list
- transpose ([1,2,3,4,5,6]);
val it = [2,1,4,3,6,5] : int list
- transpose ([1,2,3]);
val it = [2,1,3] : int list
Exercise B.5 Define a recursive ML function oddevensum that accepts only a list
of integers as an argument and returns a pair consisting of the sum of the odd and
even positions of the list. Do not use any user-defined auxiliary functions.
Examples:
- oddevensum;
val it = fn : int list -> int * int
- oddevensum([]);
val it = (0,0) : int * int
- oddevensum([6]);
val it = (6,0) : int * int
- oddevensum([6,3]);
val it = (6,3) : int * int
- oddevensum([6,3,8]);
val it = (14,3) : int * int
- oddevensum([1,2,3,4]);
val it = (4,6) : int * int
- oddevensum([1,2,3,4,5,6]);
val it = (9,12) : int * int
Examples:
- permutations;
val it = fn : 'a list -> 'a list list
- permutations([1]);
val it = [[1]] : int list list
- permutations([1,2]);
val it = [[1,2],[2,1]] : int list list
- permutations([1,2,3]);
val it = [[1,2,3],[1,3,2],[2,1,3],
[2,3,1],[3,1,2],[3,2,1]] : int list list
- permutations([1,2,3,4]);
val it = [[1,2,3,4],[1,2,4,3],[1,3,2,4],[1,3,4,2],
[1,4,2,3],[1,4,3,2],[2,1,3,4],[2,1,4,3],
[2,3,1,4],[2,3,4,1],[2,4,1,3],[2,4,3,1],
[3,1,2,4],[3,1,4,2],[3,2,1,4],[3,2,4,1],
[3,4,1,2],[3,4,2,1],[4,1,2,3],[4,1,3,2],
[4,2,1,3],[4,2,3,1],[4,3,1,2],[4,3,2,1]] : int list list
- permutations(["oranges", "and", "tangerines"]);
val it = [["oranges","and","tangerines"],
["oranges","tangerines","and"],
["and","oranges","tangerines"],
["and","tangerines","oranges"],
["tangerines","oranges","and"],
["tangerines","and","oranges"]] : string list list
Introduction to Haskell
C.2 Introduction
Haskell is named after Haskell B. Curry, a pioneer of combinatory logic and of the
fixed-point (Y) combinator in the λ-calculus—the mathematical theory of functions
on which functional programming is based. Haskell is a useful general-purpose
programming language in that it
incorporates functional features from Lisp, rule-based programming (i.e., pattern
matching) from Prolog, a terse syntax, and data abstraction from Smalltalk
and C++. Haskell is a (nearly) pure functional language with some declarative
features including pattern-directed invocation, guards, list comprehensions, and
mathematical notation. It is an ideal vehicle through which to explore lazy
evaluation, type safety, type inference, and currying. The objective here, however,
is elementary programming in Haskell. We leave the use of the language to explore
concepts to the main text.
This appendix is an example-oriented avenue to get started with Haskell
programming and is intended to get a programmer who is already familiar with
Notice from lines 1–10 that Haskell uses type inference. The :: double-colon
symbol associates a value with a type and is read as “is of type.” For instance,
the expression 'a' :: Char indicates that 'a' is of type Char. This explains the
Table C.1 Conceptual Equivalence in Type Mnemonics Between Java and Haskell
Figure C.1 A portion of the Haskell type class inheritance hierarchy. The types
in brackets are the types that are members of the type class. The functions in
parentheses are required by any instance (i.e., type) of the type class.
General:
e :: C a => a means “If type a is in type class C, then e has type a.”
Example:
3 :: Num a => a means “If type a is in type class Num, then 3 has type a.”
Table C.2 The General Form of a Qualified Type or Constrained Type and an
Example
3 has the type a. In other words, 3 is of some type in the Num class. Such a type
is called a qualified type or constrained type (Table C.2). The left-hand side of the =>
symbol—which here is in the form C a—is called the class constraint or context,
where C is a type class and a is a type variable:
• Comparison. The infix binary operators == (equal to), <, >, <=, >=, and /=
(not equal to) compare two integers, floating-point numbers, characters, or
strings:
Prelude > 4 == 2
False
Prelude > 4 > 2
True
Prelude > 4 /= 2
True
• Boolean operators. The infix binary operators || (or) and && (and), and
the unary function not, are the boolean operators with their usual semantics.
The operators || and && use short-circuit evaluation (or lazy evaluation, as
discussed in Chapter 12):
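For instance, because || and && short-circuit, the right operand need not be evaluated at all:

```haskell
Prelude > True || error "right operand never evaluated"
True
Prelude > False && error "right operand never evaluated"
False
```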
• Multi-line comments:
{- this is
a
multi-line
comment -}
{- this is
a
{- nested
multi-line -}
comment -}
$ ghci
Prelude > 2 + 3
5
Prelude > ^D
Leaving GHCi.
$
Using this method of execution, the programmer can create bindings and
define new functions at the prompt of the interpreter:
Enter the EOF character (which is <ctrl-d> on UNIX systems and <ctrl-z> on
Windows systems) or :quit (or :q) to quit the interpreter.
• Enter ghci <filename>.hs from the command prompt, which causes the
program in <filename>.hs to be loaded and evaluated:
$ cat first.hs
answer = 2 + 3
inc(x) = x + 1
$ ghci first.hs
*Main> answer
5
*Main> inc(1)
2
*Main>
$ cat first.hs
2 + 3
$ ghci first.hs
GHCi, version 8.10.1: https://www.haskell.org/ghc/
:? for help
[1 of 1] Compiling Main ( first.hs, interpreted )
first.hs:1:1: error:
Parse error: module header, import declaration
or top-level declaration expected.
|
1 | 2 + 3
| ^^^^^
Failed, no modules loaded.
0 $ cat first.hs
1
2 answer = 2 + 3
3
4 inc(x) = x + 1
5
6 $ ghci
7 Prelude > :load first.hs
8
9 *Main> answer
10 5
11 *Main>
C.7 Lists
The following are some important points about lists in Haskell.
1. The interpreter automatically exits once EOF is reached and evaluation is complete.
• The built-in function elem is a list membership predicate: it returns True if
its first argument is a member of its second (list) argument and False otherwise.
Examples:
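For instance (elem may also be written infix with backticks, as in 3 `elem` [1,2,3]):

```haskell
Prelude > elem 3 [1,2,3]
True
Prelude > elem 5 [1,2,3]
False
```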
C.8 Tuples
A tuple is a sequence of elements of potentially mixed types. Formally, a
tuple is an element e of a Cartesian product of a given number of sets:
e ∈ (S1 × S2 × ··· × Sn). A two-element tuple is called a pair [e.g., e ∈ (A × B)]. A
three-element tuple is called a triple [e.g., e ∈ (A × B × C)]. A tuple typically contains
unordered, heterogeneous elements akin to a struct in C with the exception that a
tuple is indexed by numbers (like a list) rather than by field names (like a struct).
While tuples can be heterogeneous, in a list of tuples, each tuple in the list must be
of the same type. Elements of a pair (i.e., a 2-tuple) are accessible with the functions
fst and snd:
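For instance:

```haskell
Prelude > fst (1, "Mary")
1
Prelude > snd (1, "Mary")
"Mary"
```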
The response from the interpreter when :type (1, "Mary", 3.76) is entered
(line 7) is (1, "Mary", 3.76) :: (Fractional c, Num a) => (a, [Char], c)
(line 8). The expression (Fractional c, Num a) => (a, [Char], c)
is a qualified type. Recall that the a means “any type” and is called a
type variable; the same holds for type c in this example. The expression
(1, "Mary", 3.76) :: (Fractional c, Num a) => (a, [Char], c)
(line 8) indicates that if type c is in the class Fractional and type a is in the
class Num, then the tuple (1,"Mary",3.76) has type (a,[Char],c). In other
words, the tuple (1,"Mary",3.76) consists of an instance of type a, a list of
Characters, and an instance of type c.
The right-hand side of the response from the interpreter when a tuple is entered
[e.g., (a,[Char],c)] demonstrates that a tuple is an element of a Cartesian
product of a given number of sets. Here, the comma (,) is the analog of the
Cartesian-product operator ×, and the data types a, [Char], and c are the sets
involved in the Cartesian product. In other words, (a,[Char],c) is a type
defined by the Cartesian product of the set of all instances of type a, where a
is a member of the Num class; the set of all lists of type Char; and the set of all
instances of type c, where c is a member of the Fractional class. Any element of
that Cartesian product has the type (a,[Char],c):
7 Prelude >
8 Prelude > :type add
9 add :: Num a => (a, a) -> a
Here, when :type square is entered (line 3), the response of the interpreter is
square :: Num a => a -> a (line 4), which is a qualified type. Recall that the
a means “any type” and is called a type variable. To promote flexibility, especially
in function definitions, Haskell has type classes, which are collections of types.
Also, recall that the types Int and Integer belong to the Num type class. The
expression square:: Num a => a -> a indicates that if type a is in the class
Num, then the function square has type a -> a. In other words, square is a
function that maps a value of type a to a value of the same type a. If the argument
to square is of type Int, then square is a function that maps an Int to an Int.
Similarly, when :type add is entered (line 8), the response of the interpreter is
add :: Num a => (a,a) -> a (line 9); this indicates that if type a is in the
class Num, then the type of the function add is (a,a) -> a. In other words, add
is a function that maps a pair (a,a) of values, both of the same type a, to a value
of the same type a. Notice that the interpreter prints the domain of a function
that accepts more than one parameter as a tuple (using the notation described
in Section C.8). These functions are the Haskell analogs of the following Scheme
functions:

(define square
(lambda (x) (* x x)))

(define add
(lambda (x y) (+ x y)))
Notice that the Haskell syntax involves fewer lexemes than Scheme (e.g., define
is not included). Without excessive parentheses, Haskell is also more readable than
Scheme.
1 -- (gcd is in Prelude.hs)
2 -- first version without pattern-directed invocation
3 gcd1(u,v) = if v == 0 then u else gcd1(v, (mod u v))
4
5 -- second version with pattern-directed invocation
6 gcd1(u,0) = u
7 gcd1(u,v) = gcd1(v, (mod u v))
The first version (defined on line 3) does not use pattern-directed invocation; that
is, there is only one definition of the function. The second version (defined on
lines 6–7) uses pattern-directed invocation. If the literal 0 is passed as the second
argument to the function gcd1, then the first definition of gcd1 is used (line 6);
otherwise the second definition is used (line 7).
Pattern-directed invocation is not identical to operator/function overloading.
Overloading involves determining which definition of a function to invoke based
on the number and types of arguments it is passed at run-time. With pattern-
directed invocation, no matter how many definitions of the function exist, all
have the same type signature (i.e., number and type of parameters). Overloading
implies that the number and types of arguments are used to select the applicable
function definition from a collection of function definitions with the same name.
factorial(0) = 1
factorial(n) = n * factorial(n-1)
fibonacci(0) = 1
fibonacci(1) = 1
fibonacci(n) = fibonacci(n-1) + fibonacci(n-2)
int f(int a, int b) {
   return (a+b);
}

int main() {
   return f(2+3, 4);
}
Here, the expression 2+3 is the first argument to the function f. Since C uses
an eager evaluation parameter-passing strategy, the expression 2+3 is evaluated
as 5 and then 5 is passed to f. However, in the body of f, there is no way to
conveniently decompose 5 back to 2+3.
Pattern-directed invocation allows Haskell to support the decomposition of an
argument from within the signature itself by using a pattern in a parameter. For
instance, consider these three versions of a reverse function:
1 Prelude > :{
2 Prelude | -- without pattern-directed invocation need
3 Prelude | -- an if-then-else and calls to head and tail
4 Prelude | -- reverse is built-in Prelude.hs
5 Prelude | reverse1(lst) =
6 Prelude | if lst == [] then []
7 Prelude | else reverse1(tail(lst)) ++ [head(lst)]
8 Prelude |
9 Prelude | -- with pattern-directed invocation;
10 Prelude | -- still need calls to head and tail
11 Prelude | reverse2([]) = []
12 Prelude | reverse2(lst) = reverse2(tail(lst)) ++ [head(lst)]
13 Prelude |
14 Prelude | -- with pattern-directed invocation;
15 Prelude | -- calls to head and tail unnecessary
16 Prelude | reverse3([]) = []
17 Prelude | reverse3(x:xs) = reverse3(xs) ++ [x]
18 Prelude | :}
19 Prelude >
20 Prelude > :type reverse1
21 reverse1 :: Eq a => [a] -> [a]
22 Prelude >
23 Prelude > :type reverse2
24 reverse2 :: [a] -> [a]
25 Prelude >
26 Prelude > :type reverse3
27 reverse3 :: [a] -> [a]
Functions can be defined at the Haskell prompt as shown here. If a function or set
of functions requires multiple lines, use the :{ and :} lexemes (as shown on lines
1 and 18, respectively) to identify to the interpreter the beginning and ending of a
block of code consisting of multiple lines.
While the pattern-directed invocation in reverse2 (lines 11–12) obviates the
need for the if–then–else expression (lines 6–7) in reverse1, the functions
head and tail are required to decompose lst into its head and tail. Calls to the
functions head and tail (lines 7 and 12) are obviated by using the pattern x:xs in
the parameter to reverse3 (line 17). When reverse3 is called with a non-empty
list, the second definition of it is executed (line 17), the head of the list passed as
the argument is bound to x, and the tail of the list passed as the argument is bound
to xs.
The cases form in the EOPL extension to Racket Scheme, which may be used
to decompose the constituent parts of a variant record as described in Chapter 9
(Friedman, Wand, and Haynes 2001), is the Racket Scheme analog of the use of
patterns in parameters to decompose arguments to a function. Pattern-directed
invocation, including the use of patterns for decomposing parameters, and the
pattern-action style of programming, is common in the programming language
Prolog.
1 Prelude > :{
2 Prelude | konsMinHeadtoOther ([], _) = []
3 Prelude | konsMinHeadtoOther (_, []) = []
4 Prelude | konsMinHeadtoOther (l1@(x:xs), l2@(y:ys)) =
5 Prelude | if x < y then x:l2 else y:l1
6 Prelude | :}
7 Prelude >
8 Prelude > :type konsMinHeadtoOther
9 konsMinHeadtoOther :: Ord a => ([a], [a]) -> [a]
10 Prelude >
11 Prelude > konsMinHeadtoOther ([1,2,3,4], [5,6,7,8])
12 [1,5,6,7,8]
13 Prelude >
Anonymous Parameters
The underscore (_) pattern on lines 2 and 3 of the definition of the
konsMinHeadtoOther function represents an anonymous parameter—a param-
eter whose name is unnecessary to the definition of the function. As an additional
example, consider the following definition of a list member function:
Prelude > :{
Prelude | -- elem is the Haskell member function in Prelude.hs
Prelude | member(_, []) = False
Prelude | member(e, x:xs) = (x == e) || member(e,xs)
Prelude | :}
Prelude >
Prelude > :type member
member :: Eq a => (a, [a]) -> Bool
Using anonymous parameters (lines 1–3), we can also define functions to access
the elements of a tuple:
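For instance, the following sketch defines pair accessors with anonymous parameters (the names myfst and mysnd are hypothetical):

```haskell
Prelude > :{
Prelude | myfst((x, _)) = x
Prelude | mysnd((_, y)) = y
Prelude | :}
Prelude >
Prelude > myfst((1, "Mary"))
1
```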
Polymorphism
While some functions, including square and add, require arguments of a
particular type, others, including reverse3 and member, accept arguments of any
type or arguments whose types are partially restricted. For instance, the type of
the function reverse3 is [a] -> [a]. Here, the a means “any type.” Therefore,
the function reverse3 accepts a list of a particular type a and returns a list of the
same type. The a is called a type variable. In programming languages, the ability
of a single function to accept arguments of different types is called polymorphism
because poly means “many” and morph means “form.” Such a function is called
polymorphic. A polymorphic type is a type expression containing type variables. The
type of polymorphism discussed here is called parametric polymorphism, where
a function or data type can be defined generically so that it can handle values
identically without depending on their type.
Neither pattern-directed invocation nor operator/function overloading
(sometimes called ad hoc polymorphism) is identical to (parametric) polymorphism.
Overloading involves using the same operator/function name to refer to different
definitions of a function, each of which is identifiable by the different number or
types of its parameters.
Local Binding
Lines 8–11 of the following example demonstrate local binding in Haskell:
1 Prelude > :{
2 Prelude | insertineach(_, []) = []
3 Prelude | insertineach(item, x:xs) = (item:x):insertineach(item,xs)
4 Prelude |
5 Prelude | -- use of "let" prevents recomputation of powerset xs
6 Prelude | powerset([]) = [[]]
7 Prelude | powerset(x:xs) =
8 Prelude | let
9 Prelude | temp = powerset(xs)
10 Prelude | in
11 Prelude | (insertineach(x, temp)) ++ temp
12 Prelude | :}
13 Prelude >
14 Prelude > :type insertineach
15 insertineach :: (a, [[a]]) -> [[a]]
16 Prelude >
17 Prelude > :type powerset
18 powerset :: [a] -> [[a]]
The powerset function can be similarly defined with a where clause:
powerset([]) = [[]]
powerset(x:xs) = (insertineach(x, temp)) ++ temp
where temp = powerset(xs)
These functions are the Haskell analogs of the following Scheme functions:

(define insertineach
(lambda (item l)
(if (null? l)
'()
(cons (cons item (car l)) (insertineach item (cdr l))))))

(define powerset
(lambda (l)
(if (null? l)
'(())
;; use of let prevents recomputation of (powerset (cdr l))
(let ((temp (powerset (cdr l))))
(append (insertineach (car l) temp) temp)))))
Nested Functions
1 Prelude > :{
2 Prelude | powerset([]) = [[]]
3 Prelude | powerset(x:xs) =
4 Prelude | let
5 Prelude | insertineach(_, []) = []
6 Prelude | insertineach(item, x:xs) = (item:x):insertineach(item,xs)
7 Prelude |
8 Prelude | temp = powerset(xs)
9 Prelude | in
10 Prelude | (insertineach(x, temp)) ++ temp
11 Prelude |
12 Prelude | {-
13 Prelude| -- powerset can be similarly defined with where
14 Prelude| powerset([]) = [[]]
15 Prelude| powerset(x:xs) = (insertineach(x, temp)) ++ temp
16 Prelude| where
17 Prelude| insertineach(_, []) = []
18 Prelude| insertineach(item, x:xs) =
19 Prelude| (item:x):insertineach(item,xs)
20 Prelude|
21 Prelude| temp = powerset(xs)
22 Prelude| -}
23 Prelude | :}
24 Prelude >
25 Prelude > :type powerset
26 powerset :: [a] -> [[a]]
27 Prelude >
28 Prelude > powerset([])
29 [[]]
30 Prelude >
31 Prelude > powerset([1])
32 [[1],[]]
33 Prelude > powerset([1,2])
34 [[1,2],[1],[2],[]]
35 Prelude >
36 Prelude > powerset([1,2,3])
37 [[1,2,3],[1,2],[1,3],[1],[2,3],[2],[3],[]]
Prelude > :{
Prelude | reverse51([], m) = m
Prelude | reverse51(x:xs, ys) = reverse51(xs, x:ys)
Prelude |
Prelude | reverse5(lst) = reverse51(lst, [])
Prelude | :}
Prelude >
Prelude > :type reverse51
reverse51 :: ([a], [a]) -> [a]
Prelude >
Prelude > :type reverse5
reverse5 :: [a] -> [a]
Prelude > :{
Prelude | reverse5([]) = []
Prelude | reverse5(l) =
Prelude | let
Prelude | reverse51([], l) = l
Prelude | reverse51(x:xs, ys) = reverse51(xs, x:ys)
Prelude | in
Prelude | reverse51(l, [])
Prelude | :}
Prelude >
Prelude > :type reverse5
reverse5 :: [a] -> [a]
Note that the polymorphic type of reverse5, [a] -> [a], indicates that
reverse5 can reverse a list of any type.
Prelude > :{
Prelude | f(x,y) = square(x+y)
Prelude | square(x) = x*x
Prelude | :}
Prelude >
Prelude > f(3,4)
49
Prelude > :{
Prelude | isodd(1) = True
Prelude | isodd(0) = False
Prelude | isodd(n) = iseven(n-1)
Prelude |
Prelude | iseven(0) = True
Prelude | iseven(n) = isodd(n-1)
Prelude | :}
Prelude >
Prelude > :type isodd
isodd :: (Eq a, Num a) => a -> Bool
Prelude >
Prelude > :type iseven
iseven :: (Eq a, Num a) => a -> Bool
Prelude >
Prelude > isodd(9)
True
Prelude >
Prelude > isodd(100)
False
Prelude > iseven(100)
True
Prelude > iseven(1000000000)
True
Note that more than two mutually recursive functions can be defined.
Prelude > :{
Prelude | split([]) = ([], [])
Prelude | split([x]) = ([], [x])
Prelude | split(x:y:excess) =
Prelude | let
Prelude | (left, right) = split(excess)
Prelude | in
Prelude | (x:left, y:right)
Prelude |
Prelude | merge(l, []) = l
Prelude | merge([], l) = l
Prelude | merge(l:ls, r:rs) =
Prelude | if l < r then l:merge(ls, r:rs)
Prelude | else r:merge(l:ls, rs)
Prelude |
Prelude | mergesort([]) = []
Prelude | mergesort([x]) = [x]
Prelude | mergesort(lat) =
Prelude | let
Prelude | -- split it
Prelude | (left, right) = split(lat)
Prelude |
Prelude | -- mergesort each side
Prelude | leftsorted = mergesort(left)
Prelude | rightsorted = mergesort(right)
Prelude | in
Prelude | -- merge
Prelude | merge(leftsorted, rightsorted)
Prelude |
Prelude | {-
Prelude| -- alternatively
Prelude| mergesort([]) = []
Prelude| mergesort([x]) = [x]
Prelude| mergesort(lat) =
Prelude| -- merge
Prelude| merge(leftsorted, rightsorted)
Prelude| where
Prelude| -- split it
Prelude| (left, right) = split(lat)
Prelude|
Prelude| -- mergesort each side
Prelude| leftsorted = mergesort(left)
Prelude| rightsorted = mergesort(right)
Prelude| -}
Prelude | :}
Prelude >
Prelude > :type split
split :: [a] -> ([a], [a])
Prelude > :{
Prelude | mergesort([]) = []
Prelude | mergesort([x]) = [x]
Prelude | mergesort(lat) =
Prelude | let
Prelude | split([]) = ([], [])
Prelude | split([x]) = ([], [x])
Prelude | split(x:y:excess) =
Prelude | let
Prelude | (left, right) = split(excess)
Prelude | in
Prelude | (x:left, y:right)
Prelude |
Prelude | merge(l, []) = l
Prelude | merge([], l) = l
Prelude | merge(l:ls, r:rs) =
Prelude | if l < r then l:merge(ls, r:rs)
Prelude | else r:merge(l:ls, rs)
Prelude |
Prelude | -- split it
Prelude | (left, right) = split(lat)
Prelude |
Prelude | -- mergesort each side
Prelude | leftsorted = mergesort(left)
Prelude | rightsorted = mergesort(right)
Prelude | in
Prelude | -- merge
Prelude | merge(leftsorted, rightsorted)
Prelude | :}
Prelude > :type mergesort
mergesort :: Ord a => [a] -> [a]
1 Prelude > :{
2 Prelude | mergesort(_, []) = []
3 Prelude | mergesort(_, [x]) = [x]
4 Prelude | mergesort(compop, lat) =
5 Prelude |
6 Prelude | let
7 Prelude | split([]) = ([], [])
8 Prelude | split([x]) = ([], [x])
9 Prelude | split(x:y:excess) =
10 Prelude | let
11 Prelude | (left, right) = split(excess)
12 Prelude | in
This changes the type of mergesort from ((a, a) -> Bool, [a]) -> [a]
to (a -> a -> Bool, [a]) -> [a].
Of course, unlike the previous version, this new definition of mergesort cannot
accept an uncurried function as its first argument.
Final Version
The following code is the final version of mergesort using nested, protected
functions and accepting a comparison operator as a parameter, which is factored
out to avoid passing it between successive recursive calls:
Prelude > :{
Prelude | mergesort(_, []) = []
Prelude | mergesort(_, [x]) = [x]
Prelude | mergesort(compop, lat) =
Prelude | let
Prelude | mergesort1([]) = []
Prelude | mergesort1([x]) = [x]
Prelude | mergesort1(lat1) =
Prelude | let
Prelude | split([]) = ([], [])
Prelude | split([x]) = ([], [x])
Prelude | split(x:y:excess) =
Prelude | let
Prelude | (left, right) = split(excess)
Prelude | in
Prelude | (x:left, y:right)
Prelude |
Prelude | merge(l, []) = l
Prelude | merge([], l) = l
Prelude | merge(l:ls, r:rs) =
Prelude | i f compop(l, r) then l:merge(ls, r:rs)
Prelude | e l s e r:merge(l:ls, rs)
Prelude |
Prelude | -- split it
Prelude | (left, right) = split(lat1)
Prelude |
Prelude | -- mergesort each side
Prelude | leftsorted = mergesort1(left)
Prelude | rightsorted = mergesort1(right)
Prelude | in
Prelude | -- merge
Prelude | merge(leftsorted, rightsorted)
Prelude | in
Prelude | mergesort1(lat)
Prelude | :}
Prelude >
Prelude > :type mergesort
mergesort :: ((a, a) -> Bool, [a]) -> [a]
Notice also that we factored the argument compop out of the function merge in
this version since it is visible from an outer scope.
Prelude > :{
Prelude | ans1 :: [Integer]
Prelude | ans1 = [1,2,3]
Prelude | :}
Prelude >
Prelude > :type ans1
ans1 :: [Integer]
Prelude >
Prelude > :{
Prelude | ans2 :: (Integer, String, Float)
Prelude | ans2 = (1, "Mary", 3.76)
Prelude | :}
Prelude >
Prelude > :type ans2
ans2 :: (Integer, String, Float)
Prelude >
Prelude > :{
Prelude | square :: Int -> Int
Prelude | square(x) = x*x
Prelude | :}
Prelude >
Prelude > :type square
square :: Int -> Int
Prelude >
Prelude > :type square(2)
square(2) :: Int
Prelude >
Prelude > :type square(2.0)
<interactive>:1:8: error:
No instance for (Fractional Int) arising from the literal '2.0'
In the first argument of 'square', namely '(2.0)'
In the expression: square (2.0)
Prelude >
Prelude > :{
Prelude | reverse3 :: [Int] -> [Int]
Prelude | reverse3([]) = []
Prelude | reverse3(h:t) = reverse3(t) ++ [h]
Prelude | :}
Prelude >
Prelude > :type reverse3
reverse3 :: [Int] -> [Int]
Prelude >
Prelude > reverse3([1,2,3,4,5])
[5,4,3,2,1]
Prelude >
Prelude > reverse3([1.1,2.2,3.3,4.4,5.5])
<interactive>:37:11: error:
No instance for (Fractional Int) arising from the literal '1.1'
In the expression: 1.1
In the first argument of 'reverse3', namely
'([1.1, 2.2, 3.3, 4.4, 5.5])'
In the expression: reverse3 ([1.1, 2.2, 3.3, 4.4, 5.5])
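Generalizing the declared type removes this restriction. The following sketch (not from the text; reverse4 is a hypothetical name) gives the same definition a polymorphic signature, so it accepts lists of any element type:

```haskell
-- reverse4 is a hypothetical, more general variant of reverse3:
-- the type variable a places no constraint on the element type.
reverse4 :: [a] -> [a]
reverse4([])  = []
reverse4(h:t) = reverse4(t) ++ [h]

main :: IO ()
main = do
  print (reverse4([1,2,3,4,5]))   -- [5,4,3,2,1]
  print (reverse4([1.1,2.2,3.3])) -- [3.3,2.2,1.1]
```

With this signature, the call that previously produced the Fractional Int error type checks, because no Int is ever demanded of the list elements.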
Exercise C.2 Define a Haskell function called makeset that accepts only a list of
integers as input and returns the list with any repeating elements removed. The
order in which the elements appear in the returned list does not matter, as long as
there are no duplicate elements. Do not use any user-defined auxiliary functions,
except elem.
Examples:
Exercise C.3 Define a Haskell function cycle1 that accepts only a list and an
integer i as arguments and cycles the list i times. Do not use any user-defined
auxiliary functions.
Examples:
Exercise C.4 Define a Haskell function transpose that accepts a list as its only
argument and returns that list with adjacent elements transposed. Specifically,
transpose accepts an input list of the form [e1, e2, e3, e4, e5, e6, ..., en] and
returns a list of the form [e2, e1, e4, e3, e6, e5, ..., en, en-1] as output. If n is
odd, en will continue to be the last element of the list. Do not use any user-defined
auxiliary functions and do not use ++ (i.e., append).
Examples:
Exercise C.5 Define a Haskell function oddevensum that accepts only a list of
integers as an argument and returns a pair consisting of the sum of the odd and
even positions of the list. Do not use any user-defined auxiliary functions.
Examples:
Exercise C.6 Define a Haskell function permutations that accepts only a list
representing a set as an argument and returns a list of all permutations of that
list as a list of lists. You will need to define some nested auxiliary functions. Try to
define only one auxiliary function and pass a λ-function to map within the body
of that function and within the body of the permutations function to simplify
their definitions. Hint: Use the built-in Haskell function concat.
Examples:
D.2 Grammar
The grammar in EBNF for Camille (version 4.0) is given in Figure D.1.
Comments in Camille programs begin with three consecutive dashes (i.e.,
---) and continue to the end of the line. Multi-line comments are not
supported. Comments are ignored by the Camille scanner. Camille can be used
for functional or imperative programming, or both. To use it for functional
programming, use the ⟨program⟩ ::= ⟨expression⟩ grammar rule; to use
it for imperative programming, use the ⟨program⟩ ::= ⟨statement⟩ rule.
Figure D.1 The grammar in EBNF for the Camille programming language (Perugini
and Watkin 2018).
User-defined functions are first-class entities in Camille. This means that a function
can be the return value of an expression (i.e., an expressed value), can be
bound to an identifier and stored in the environment of the interpreter (i.e.,
a denoted value), and can be passed as an argument to a function. As the
production rules in Figure D.1 indicate, Camille supports side effects (through
variable assignment) and arrays. The primitives array, arrayreference,
and arrayassign create an array, dereference an array, and update an array,
respectively. While we have multiple versions of Camille, each supporting varying
concepts, in version 4.0
Thus, akin to Java or Scheme, all denoted values are references, but are implicitly
dereferenced. For more details of the language, we refer the reader to Perugini and
Watkin (2018). See Appendix E for the individual grammars for the progressive
versions of Camille.
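For example, first-class functions allow a Camille program such as the following, in which one function is bound to an identifier and passed as an argument to another (an illustrative program written against the grammar in Figure D.1, not taken from the text):

```
let
   square = fun (x) *(x, x)
   twice = fun (f, x) (f (f x))
in
   (twice square, 3)  --- evaluates to 81
```

Here square is a denoted value bound in the environment, an argument to twice, and the value applied twice to 3.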
D.3 Installation
To install the environment necessary for running Camille, follow these steps:
there, students add support for non-recursive functions, which raises the issue
of how to represent a function; there is a host of options from which to
choose.
In what follows, each directory corresponds to a different (progressive)
version of the interpreter:
Each individual interpreter directory contains its own README.md describing the
highlights of the particular version of the interpreter in that directory.
$ pwd
camille-interpreter-in-python-release
$ cd pass-by-value-recursive
$ cat camilleconfig.py
...
...
closure_closure = 0 # static scoping; our closure representation of closures
asr_closure = 1 # static scoping; our ASR representation of closures
python_closure = 2 # dynamic scoping; Python representation of closures
__closure_switch__ = asr_closure # for lexical scoping
#__closure_switch__ = python_closure # for dynamic scoping
closure = 1
asr = 2
lovr = 3
__env_switch__ = lovr
Design Choices (columns: four progressive versions of Camille, left to right)

Representation of Closures     N/A             ASR | CLS        ASR | CLS               ASR | CLS
Representation of References   N/A             N/A              ASR                     ASR
Local Binding                  ↑ let, let* ↑   ↑ let, let* ↑    ↑ let, let* ↑           ↑ let, let* ↑
Conditionals                   ↓ if/else ↓     ↓ if/else ↓      ↓ if/else ↓             ↓ if/else ↓
Non-recursive Functions        ✗               ↑ fun ↑          ↑ fun ↑                 ↑ fun ↑
Recursive Functions            ✗               ↑ letrec ↑       ↑ letrec ↑              ↑ letrec ↑
Scoping                        N/A             lexical          lexical                 lexical
Environment Bound to Closure   N/A             deep             deep                    deep
References                     ✗               ✗                ✓                       ✓
Parameter Passing              N/A             ↑ by value ↑     ↑ by reference/lazy ↑   ↑ by value ↑
Table D.2 Design Choices and Implemented Concepts in Progressive Versions of Camille. The symbol ↓ indicates that the concept
is supported through its implementation in the defining language (here, Python). The Python keyword included in each cell,
where applicable, indicates which Python construct is used to implement the feature in Camille. The symbol ↑ indicates that the
concept is implemented manually. The Camille keyword included in each cell, where applicable, indicates the syntactic construct
through which the concept is operationalized. (Key: ASR = abstract-syntax representation; CLS = closure; and LOLR = list-of-list
representation. Cells in boldface font highlight the enhancements across the versions.) Reproduced from Perugini, S., and
J. L. Watkin. 2018. "ChAmElEoN: A Customizable Language for Teaching Programming Languages." Journal of Computing Sciences
in Colleges 34(1): 44–51.
$ pwd
camille-interpreter-in-python-release
$
$ cd pass-by-value-non-recursive
$
$ # running the interpreter non-interactively
$
$ cat recursionUnbound.cam
let
sum = fun (x) if zero?(x) 0 else +(x, (sum dec1(x)))
in
(sum 5)
$
$ ./run recursionUnbound.cam
Runtime Error: Line 2: Unbound Identifier 'sum'
$
$ cat recursionBound.cam
let
sum = fun (s, x) if zero?(x) 0 else +(x, (s s,dec1(x)))
in
(sum sum, 5)
$
$ ./run recursionBound.cam
15
$
$ # running the interpreter interactively (CLI)
$
$ ./run
Camille> let
sum = fun (x) if zero?(x) 0 else +(x, (sum dec1(x)))
in
(sum 5)
Camille> let
sum = fun (s, x) if zero?(x) 0 else +(x, (s s,dec1(x)))
in
(sum sum, 5)
15
Table D.3 Solutions to the Camille Interpreter Programming Exercises in Chapters 10–12
Identifiers
Identifiers in Camille are described by the following regular expression:
[_a-zA-Z][_a-zA-Z0-9*?!]*. However, an identifier cannot be a reserved
word in the language (e.g., let).
Syntax
The following is a context-free grammar in EBNF for version 1.0 of the
Camille programming language through Chapter 10:
ntNumber
⟨expression⟩ ::= ⟨number⟩
ntPrimitive_op
⟨expression⟩ ::= ⟨primitive⟩ ({⟨expression⟩}+(,))
ntPrimitive
⟨primitive⟩ ::= + | - | * | inc1 | dec1 | zero? | eqv?
Semantics
Currently,
ntNumber
⟨expression⟩ ::= ⟨number⟩
ntIdentifier
⟨expression⟩ ::= ⟨identifier⟩
ntPrimitive_op
⟨expression⟩ ::= ⟨primitive⟩ ({⟨expression⟩}+(,))
ntPrimitive
⟨primitive⟩ ::= + | - | * | inc1 | dec1 | zero? | eqv?
ntIfElse
⟨expression⟩ ::= if ⟨expression⟩ ⟨expression⟩ else ⟨expression⟩
ntLet
⟨expression⟩ ::= let {⟨identifier⟩ = ⟨expression⟩}+ in ⟨expression⟩
ntLetStar
⟨expression⟩ ::= let* {⟨identifier⟩ = ⟨expression⟩}+ in ⟨expression⟩
Semantics
expressed value = integer
denoted value = integer
ntNumber
⟨expression⟩ ::= ⟨number⟩
ntIdentifier
⟨expression⟩ ::= ⟨identifier⟩
ntPrimitive_op
⟨expression⟩ ::= ⟨primitive⟩ ({⟨expression⟩}+(,))
ntPrimitive
⟨primitive⟩ ::= + | - | * | inc1 | dec1 | zero? | eqv?
ntIfElse
⟨expression⟩ ::= if ⟨expression⟩ ⟨expression⟩ else ⟨expression⟩
ntLet
⟨expression⟩ ::= let {⟨identifier⟩ = ⟨expression⟩}+ in ⟨expression⟩
ntLetStar
⟨expression⟩ ::= let* {⟨identifier⟩ = ⟨expression⟩}+ in ⟨expression⟩
ntFuncDecl
⟨expression⟩ ::= fun ({⟨identifier⟩}*(,)) ⟨expression⟩
ntFuncCall
⟨expression⟩ ::= (⟨expression⟩ {⟨expression⟩}*(,))
ntLetRec
⟨expression⟩ ::= letrec {⟨identifier⟩ = ⟨function⟩}+ in ⟨expression⟩
Semantics
We desire user-defined functions to be first-class entities in Camille. This means
that a function can be the return value of an expression (altering the expressed
values) and can be bound to an identifier and stored in the environment of the
interpreter (altering the denoted values). Adding user-defined, first-class functions
to Camille alters its expressed and denoted values:
Thus,
Syntax
ntAssignment
⟨expression⟩ ::= assign! ⟨identifier⟩ = ⟨expression⟩
⟨primitive⟩ ::= + | - | * | inc1 | dec1 | zero? | eqv? |
array | arrayreference | arrayassign
Semantics
With the addition of references, now in Camille
Thus,
Also, the array creation, access, and modification primitives have the following
semantics:
ntAssignmentStmt
⟨statement⟩ ::= ⟨identifier⟩ = ⟨expression⟩
ntOutputStmt
⟨statement⟩ ::= writeln (⟨expression⟩)
ntCompoundStmt
⟨statement⟩ ::= { {⟨statement⟩}*(;) }
ntIfElseStmt
⟨statement⟩ ::= if ⟨expression⟩ ⟨statement⟩ else ⟨statement⟩
ntWhileStmt
⟨statement⟩ ::= while ⟨expression⟩ do ⟨statement⟩
ntBlockStmt
⟨statement⟩ ::= variable {⟨identifier⟩}*(,) ; ⟨statement⟩
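Taken together, these statement forms permit small imperative Camille programs. The following sketch (illustrative, written against the rules above, and assuming a nonzero condition is treated as true) declares variables, assigns, loops, and prints:

```
variable x, sum;
{
   x = 5;
   sum = 0;
   while x do
   {
      sum = +(sum, x);
      x = dec1(x)
   };
   writeln (sum)  --- sums 5+4+3+2+1
}
```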
Semantics
Thus far Camille is an expression-oriented language. We now implement the
Camille interpreter to define a statement-oriented language. We want to retain:
Graham, P. 2002. The Roots of Lisp. Accessed July 19, 2018. http://lib.store.yahoo
.net/lib/paulgraham/jmc.ps.
Graham, P. 2004a. “Beating the Averages.” In Hackers and Painters: Big Ideas from
the Computer Age. Beijing: O’Reilly. Accessed July 19, 2018. http://www
.paulgraham.com/avg.html.
Graham, P. 2004b. Hackers and Painters: Big Ideas from the Computer Age. Beijing:
O’Reilly.
Graham, P. n.d. [Haskell] Pros and Cons of Static Typing and Side Effects?
http://paulgraham.com/lispfaq1.html; https://mail.haskell.org/pipermail
/haskell/2005-August/016266.html.
Graham, P. n.d. LISP FAQ. Accessed July 19, 2018. http://paulgraham.com
/lispfaq1.html.
Graunke, P., R. Findler, S. Krishnamurthi, and M. Felleisen. 2001. “Automatically
Restructuring Programs for the Web.” In Proceedings of the Sixteenth IEEE
International Conference on Automated Software Engineering (ASE), 211–222.
Harbison, S. P., and G. L. Steele Jr. 1995. C: A Reference Manual. 4th ed. Englewood
Cliffs, NJ: Prentice Hall.
Harmelen, F. van, and A. Bundy. 1988. “Explanation-Based Generalisation =
Partial Evaluation.” Artificial Intelligence 36 (3): 401–412.
Harper, R. n.d.a. "Teaching FP to Freshmen." Accessed July 19, 2018. http://
existentialtype.wordpress.com/2011/03/15/teaching-fp-to-freshmen/.
Harper, R. n.d.b. "What Is a Functional Language?" Accessed July 19, 2018.
http://existentialtype.wordpress.com/2011/03/16/what-is-a-functional
-language/.
Haynes, C. T., and D. P. Friedman. 1987. “Abstracting Timed Preemption with
Engines.” Computer Languages 12 (2): 109–121.
Haynes, C. T., D. P. Friedman, and M. Wand. 1986. “Obtaining Coroutines with
Continuations.” Computer Languages 11 (3/4): 143–153.
Heeren, B., D. Leijen, and A. van IJzendoorn. 2003. “Helium, for Learning
Haskell.” In Proceedings of the ACM SIGPLAN Workshop on Haskell, 62–71. New
York, NY: ACM Press.
Hieb, R., K. Dybvig, and C. Bruggeman. 1990. “Representing Control in the
Presence of First-Class Continuations.” In Proceedings of the ACM SIGPLAN
Conference on Programming Language Design and Implementation (PLDI). New
York, NY: ACM Press.
Hoare, T. 1980. The 1980 ACM Turing Award Lecture. https://www.cs.fsu.edu
/~engelen/courses/COP4610/hoare.pdf.
Hofstadter, D. R. 1979. Gödel, Escher, Bach: An Eternal Golden Braid. New York, NY:
Basic Books.
Hughes, J. 1989. “Why Functional Programming Matters.” The Computer
Journal 32 (2): 98–107. Also appears as: Hughes, J. 1990. “Why Functional
Programming Matters.” In Research Topics in Functional Programming, edited
by D. A. Turner, 17–42. Boston, MA: Addison-Wesley.
Hutton, G. 2007. Programming in Haskell. Cambridge, UK: Cambridge University
Press.
Rich, E., K. Knight, and S. B. Nair. 2009. Artificial Intelligence. 3rd ed. India:
McGraw-Hill India.
Robinson, J. A. 1965. “A Machine-Oriented Logic Based on the Resolution
Principle.” Journal of the ACM 12 (1): 23–41.
Savage, N. 2018. “Using Functions for Easier Programming.” Communications of the
ACM 61 (5): 29–30.
Scott, M. L. 2006. Programming Languages Pragmatics. 2nd ed. Amsterdam: Morgan
Kaufmann.
Sinclair, K. H., and D. A. Moon. 1991. “The Philosophy of Lisp.” Communications of
the ACM 34 (9): 40–47.
Somogyi, Z., F. Henderson, and T. Conway. 1996. "The Execution Algorithm of
Mercury, an Efficient Purely Declarative Logic Programming Language." The
Journal of Logic Programming 29:17–64.
Sperber, M., R. K. Dybvig, M. Flatt, A. van Straaten, R. Findler, and J. Matthews,
eds. 2010. Revised 6 Report on the Algorithmic Language Scheme. Cambridge, UK:
Cambridge University Press.
Sussman, G. J., and G. L. Steele Jr. 1975. “Scheme: An Interpreter for Extended
Lambda Calculus.” AI Memo 349. Accessed May 22, 2020. https://dspace.mit
.edu/handle/1721.1/5794.
Sussman, G. J., G. L. Steele Jr., and R. P. Gabriel. 1993. “A Brief Introduction to
Lisp.” ACM SIGPLAN Notices 28 (3): 361–362.
Swaine, M. 2009. “It’s Time to Get Good at Functional Programming: Is It Finally
Functional Programming’s Turn?” Dr. Dobb’s Journal 34 (1): 14–16.
Thompson, S. 2007. Haskell: The Craft of Functional Programming. 2nd ed. Harlow,
UK: Addison-Wesley.
Ullman, J. 1997. Elements of ML Programming. 2nd ed. Upper Saddle River, NJ:
Prentice Hall.
Venners, B. 2003. Python and the Programmer: A Conversation with Bruce Eckel,
Part I. Accessed July 28, 2021. https://www.artima.com/articles/python-and
-the-programmer.
Wang, C.-I. 1990. “Obtaining Lazy Evaluation with Continuations in Scheme.”
Information Processing Letters 35 (2): 93–97.
Warren, D. H. D. 1983. “An Abstract Prolog Instruction Set,” Technical Note 309.
Menlo Park, CA: SRI International.
Watkin, J. L., A. C. Volk, and S. Perugini. 2019. “An Introduction to Declarative
Programming in CLIPS and PROLOG.” In Proceedings of the 17th International
Conference on Scientific Computing (CSC), edited by H. R. Arabnia, L. Deligian-
nidis, M. R. Grimaila, D. D. Hodson, and F. G. Tinetti, 105–111. Computer
Science Research, Education, and Applications Press (Publication of the
World Congress in Computer Science, Computer Engineering, and Applied
Computing (CSCE)). CSREA Press. https://csce.ucmss.com/cr/books/2019
/LFS/CSREA2019/CSC2488.pdf.
Webber, A. B. 2008. Formal Languages: A Practical Introduction. Wilsonville, OR:
Franklin, Beedle and Associates.
Weinberg, G. M. 1988. The Psychology of Computer Programming. New York, NY: Van
Nostrand Reinhold.
Wikström, Å. 1987. Functional Programming Using Standard ML. United Kingdom:
Prentice Hall International.
Wright, A. 2010. “Type Theory Comes of Age.” Communications of the ACM 53 (2):
16–17.
Index
Note: Page numbers followed by f and t indicate figures and tables respectively.
A
abstract data type (ADT), 337, 366
abstract syntax, 356–359
  programming exercises for, 364–365
  representation in Python, 372–373
abstract-syntax tree, 115
  for arguments lists, 401–403
  for Camille, 359
  parser generator with tree builder, 360–364
  programming exercises for, 364–365
  TreeNode, 359–360
abstraction, 104
  binary search, 151–152
  binary tree, 150–151
  building blocks as, 174–175
  programming exercises for, 152–153
activation record, 201
actual parameters, 131
ad hoc binding, 236–238
ad hoc polymorphism. See overloading; operator/function overloading
addcf function, 298
ADT. See abstract data type (ADT)
aggregate data types
  arrays, 338
  discriminated unions, 343
  programming exercises for, 343–344
  records, 338–340
  undiscriminated unions, 341–343
agile methods, 25
all-or-nothing proposition, 613–614
alphabet, 34
ambiguity, 52
ambiguous grammar, 51
ancestor blocks, 190
antecedent, definition of, 651
ANTLR (ANother Tool for Language Recognition), 81
append, primitive nature of, 675–676
applicative-order evaluation, 493, 512
apply_environment_reference function, 462
arguments. See actual parameters
Armstrong, Joe, 178
arrays, 338
assembler, 106
assignment statement, 457–458
  conceptual and programming exercises for, 465–467
  environment, 462–463
  illustration of pass-by-value in Camille, 459–460
  reference data type, 460–461
  stack object, 463–465
  use of nested lets to simulate sequential evaluation, 458–459
associativity, 50
  of operators, 57–58
asynchronous callbacks, 620
atom?, list-of-atoms?, and list-of-numbers?, 153–154
atomic proposition, 642
attribute grammar, 66
automobile concepts, 7

B
backtracking, 651
Backus–Naur Form (BNF), 40–41
backward chaining, 659–660
balanced pairs of lexemes, 43
bash script, 404
β-reduction, 492–495
  examples of, 495–499
biconditional, 644
binary search tree abstraction, 151–152
binary tree abstraction, 150–151
binary tree example, 667–672
binding and scope
  deep, shallow, and ad hoc binding, 233–234
    ad hoc binding, 236–238
    conceptual exercises for, 239–240
    deep binding, 234–235
    programming exercises for, 240
    shallow binding, 235–236
  dynamic scoping, 200–202
    vs. static scoping, 202–207
  free or bound variables, 196–198
    programming exercises for, 198–199
  FUNARG problem, 213–214
    addressing, 226–228
    closures vs. scope, 224–225
    conceptual exercises for, 228
    downward, 214
    programming exercises for, 228–233
    upward, 215–224
    upward and downward FUNARG problem in single function, 225–226
    uses of closures, 225
  using let and letrec to define, 158–161
  other languages supporting, 161–164
  programming project for, 178–179
  recursive-descent parsers, Scheme predicates as, 153
    atom?, list-of-atoms?, and list-of-numbers?, 153–154
    list-of pattern, 154–156
    programming exercise for, 156
  Scheme
    conceptual exercise for, 134
    homoiconicity, 133–134
    interactive and illustrative session with, 129–133
    programming exercises for, 134–135
functions, 126
  non-recursive functions
    adding support for user-defined functions to Camille, 423–426
    augmenting evaluate_expr function, 427–430
    closures, 426–427
    conceptual exercises for, 431–432
    programming exercises for, 432–440
    simple stack object, 430–431
  recursive functions
    adding support for recursion in Camille, 440–441
    augmenting evaluate_expr with new variants, 445–446
    conceptual exercises for, 446–447
    programming exercises for, 447–450
    recursive environment, 441–445
functions on lists
  append and reverse, 141–144
  difference lists technique, 144–146
  list length function, 141
  programming exercises for, 146–149
functor, 645

G
generate-filter style of programming, 507
generative construct, 41
global transfer of control with continuations
  breakpoints, 560–562
  conceptual exercises for, 564–565
  first-class continuations in Ruby, 562–563
  nonlocal exits, 556–560
  programming exercises for, 565–570
  other mechanisms for, 570
    conceptual exercises for, 578
    goto statement, 570–571
    programming exercises for, 578–579
    setjmp and longjmp, 571–578
goal. See headless Horn clause
goto statement, 570–571
grammars, 40–41
  conceptual exercises for, 61–64
  context-free languages and, 42–44
  disambiguation
    associativity of operators, 57–58
    classical dangling else problem, 58–60
    operator precedence, 57
  generate sentences from, 44–46
  language recognition, 46f, 47–48
  regular, 41–42
growing continuation, 610–613
growing stack, 610–613

H
handle, 48
hardware description languages, 17
Haskell languages, 162, 258–259
  all built-in functions in, 301–307
  analysis, 385
  applications, 383–385
  comparison of, 383
  curry and uncurry functions in, 295–297
  folding lists in, 319
  sections in, 316–319
  summaries, 382–383
  variant records in, 348–352
headed Horn clause, 653, 656
headless Horn clause, 653, 656, 665
higher-order functions (HOFs), 155, 716
  analysis, 334–335
  conceptual exercises for, 329–330
  crafting cleverly conceived functions with curried, 324–328
  folding lists, 319–324
  functional composition, 315–316
  functional mapping, 313–315
  programming exercises for, 330–334
  sections in Haskell, 316–319
Hindley–Milner algorithm, 270
HOFs. See higher-order functions (HOFs)
homoiconic language, 133, 540
homoiconicity, 133–134
Horn clauses, 653–654
  limited expressivity of, 702
  in Prolog syntax, casting, 663
host language, 115
hybrid language implementations, 109
hybrid systems, 112
hypothesis, 656

I
imperative programming, 10
implication function, 643
implicit conversion, 248–252
implicit currying, 301
implicit typing, 268
implode function, 325–326
independent set, 680
inductive data types, 344–347
instance variables, 216
instantiation, 651
interactive or incremental testing, 146
interactive top-level. See read-eval-print loop
interface polymorphism, 267
interpretation vis-à-vis compilation, 103–109
interpreter, 103
  advantages and disadvantages of, 115t
  vs. compilers, 114–115
introspection, 703
iterative control behavior, 596–598, 596f

J
JIT. See Just-in-Time (JIT) implementations
join functions, 728
Just-in-Time (JIT) implementations, 111
This book was typeset with LaTeX 2ε and BibTeX using a 10-point Palatino font.
Figures were produced using Xfig (X11 diagramming tool) and Graphviz with the
DOT language.