
02 Ocaml and Lambda Calculus

The document introduces the concepts of syntax and semantics in programming languages, specifically focusing on arithmetic expressions and their evaluation. It discusses the grammar for a simple arithmetic language and highlights the ambiguity in its syntax, which affects its semantics. Additionally, it explains the foundations of functional programming through lambda calculus, emphasizing the characteristics of functional languages and their relation to computation.

Uploaded by

arshdeeps1805
Copyright
© © All Rights Reserved

Functional Thinking

Introduction to 𝜆-Calculus and OCaml

Programming Abstractions
Dr. Ritwik Banerjee
Computer Science
Stony Brook University
Syntax and Semantics

• Let us try to build a very simple language with a very simple purpose: arithmetic expressions and their computations
• For example, an expression like 1 + 4 * 4/2
• This expression is a “program” in our programming
language. It evaluates to a single value (in this case, 9)
• To describe and create this language, we need to define
• its syntax, i.e., the “structures” that are deemed as valid
• its semantics, i.e., how such expressions are evaluated (in
other words, what such an expression means; in this
context, the meaning of 1 + 4 * 4/2 is 9)

Spring 2025 © 2025 Ritwik Banerjee 2


Syntax

• To describe the syntax of a language, we use a grammar
• Simply put, a grammar is a set of rules (often recursive) that define what ‘forms’ or ‘structures’ are permitted in the expressions of the language
• For example, a general rule in the grammar for English is subject-verb-object, and not subject-object-verb
• “John eats chocolate”, and not “John chocolate eats”
• A very simple grammar for our language of arithmetic expressions:

Int n ::= ... | -1 | 0 | 1 | ...
BinOp ⊕ ::= + | - | * | /
Expression e ::= n | e ⊕ e

• This grammar defines three types of syntactic constructs: integers, binary operators, and expressions
• Each type is further defined in terms of terminals (constants like 0, 1, +) and non-terminals (variables like n, e, etc.)
• Grammars are often defined using the Backus-Naur Form (BNF), which is a metalanguage, i.e., a language meant to define other languages.


Syntax

• We can think of the grammar as something that (1) defines how to decompose a program into its components; or (2) generates, i.e., inductively defines, all valid expressions of the language
• The inductive approach says that if we have infinite time, the grammar will generate every possible valid expression in the language, without generating any invalid expression
• The decomposition approach says “give me any valid expression, and I will be able to tell you how to obtain its components”
• 1 + 4 * 4/2 = e1 ⊕ e2, where e1 = 1 and e2 = 4 * e3, and e3 = 4/2
• Our grammar has a problem, however.
• It is ambiguous! The same expression can be decomposed in more than one way; e.g., 1 + 4 * 4/2 is also e1 ⊕ e2 with e1 = 1 + 4 and e2 = 4/2

Int n ::= ... | -1 | 0 | 1 | ...
BinOp ⊕ ::= + | - | * | /
Expression e ::= n | e ⊕ e


Semantics

• Syntax tells us the structure of valid programs or expressions,
but it doesn’t say anything about the meaning of a valid
program
• What is the meaning of 1 + 4 * 4/2? We know it is 9, but
how have we gone from the initial program to its final
meaning/value?
• Note: the final meaning is also an expression in that same
language, but specifically, it is a terminal expression
• An expression that can be further evaluated is a reducible
expression. Otherwise, it is irreducible
• We want to keep reducing an expression until we can’t! The
process of doing this is what we call the semantics of the
language
• In our arithmetic-expression language, the grammar was
ambiguous. How do you think that affects the semantics?



Semantics

• For a formal definition of semantics, we need to borrow from
elementary Boolean logic and binary relations, so that we can
derive meaning from an expression.
• First, two relations to establish axiomatic facts:
1. “e val” is a unary relation, telling us that e is a value
2. “e1 ↦ e2” is a binary relation telling us that in a single step of
evaluation, we obtain e2 from e1
Why are we doing this?
• Because we want to establish semantics so that there is a logical
path from any valid reducible expression to the final irreducible
“meaning”
• Such a path is defined in terms of inference rules
• e1 ↦ e2 ⇒ e1 ⊕ e3 ↦ e2 ⊕ e3
Note: these variables are not defined anywhere. The rules use an
implicit universal quantification, i.e., it is saying
• ∀ e1,e2,e3, e1 ↦ e2 ⇒ e1 ⊕ e3 ↦ e2 ⊕ e3



Semantics

The semantics of our arithmetic-expression language can be defined as:
1. n val (an axiom with no premise: every integer n is a value)
2. e1 ↦ e2 ⇒ e1 ⊕ e3 ↦ e2 ⊕ e3
3. (e1 val) ∧ (e2 ↦ e3) ⇒ e1 ⊕ e2 ↦ e1 ⊕ e3
4. n = n1 ⊕ n2 ⇒ n1 ⊕ n2 ↦ n
Using these rules, we can evaluate any valid (i.e., syntactically correct)
arithmetic expression
Notes:
• We are being somewhat lazy in defining the syntax and semantics of our
language. Strictly speaking, we are borrowing external things from the
language of arithmetic to use symbols like “=”, and we are borrowing from
the metalanguage of logic to use ∧, ∀, etc.
• Since we are going to study programming languages as a whole, and not
just individual languages, it is very important to pay attention to what
concepts are internal to a specific language and what comes from
elsewhere!
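
The four rules can be turned into a small evaluator. The following is only a sketch under some assumptions: the type and function names (expr, step, eval, and so on) are hypothetical and do not come from the slides, and the expression is supplied as an already-parsed tree, which sidesteps the ambiguity of the grammar.

```ocaml
(* Abstract syntax for the grammar: e ::= n | e ⊕ e *)
type binop = Add | Sub | Mul | Div

type expr =
  | Int of int
  | BinOp of expr * binop * expr

(* Rule 1: integers are the only values *)
let is_val = function
  | Int _ -> true
  | BinOp _ -> false

(* Borrowed from the language of arithmetic: how ⊕ acts on two integers.
   Division by zero raises an exception; the rules above do not cover it. *)
let apply op n1 n2 =
  match op with
  | Add -> n1 + n2
  | Sub -> n1 - n2
  | Mul -> n1 * n2
  | Div -> n1 / n2

(* One step of the relation e1 ↦ e2; None means e is irreducible *)
let rec step e =
  match e with
  | Int _ -> None
  | BinOp (Int n1, op, Int n2) -> Some (Int (apply op n1 n2))   (* rule 4 *)
  | BinOp (e1, op, e2) when not (is_val e1) ->                  (* rule 2 *)
    (match step e1 with
     | Some e1' -> Some (BinOp (e1', op, e2))
     | None -> None)
  | BinOp (e1, op, e2) ->                                       (* rule 3 *)
    (match step e2 with
     | Some e2' -> Some (BinOp (e1, op, e2'))
     | None -> None)

(* Keep reducing until no rule applies *)
let rec eval e =
  match step e with
  | Some e' -> eval e'
  | None -> e

(* Two parse trees for the same text 1 + 4 * 4/2 *)
let parse_a = BinOp (Int 1, Add, BinOp (BinOp (Int 4, Mul, Int 4), Div, Int 2))
let parse_b = BinOp (BinOp (BinOp (Int 1, Add, Int 4), Mul, Int 4), Div, Int 2)
```

Here eval parse_a gives Int 9 while eval parse_b gives Int 10, which is one way to see how an ambiguous grammar leaks into the semantics.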



Birth of imperative and functional paradigms

• Before the age of electronic computers, there was research on formally defining what
“computing” should mean (1920s – 1930s)
• Over time, it was shown that many of those different formalisms were equivalent in terms of
computability, i.e., anything that could be computed using one model could also be computed with
another (and vice versa)
Two main models/formalisms emerged:
1. The Turing machine, a finite-state automaton with an unbounded storage tape, created by Alan Turing. This model is the inspiration behind imperative programming.
2. The other was lambda calculus, developed by Alonzo Church, based on parametrized expressions. This model is the basis of all functional programming.
• Each parametrized expression is introduced by the Greek letter 𝜆, which gives the calculus its name


Functional programming

• Programs viewed as functions transforming input to output


• Complex transformations are “compositions” of simpler functions, (i.e., applying one
function to the output of another)
• In purely functional languages, values given to variables do not change when a program is
evaluated
• A “variable” is a name for a value, not for a storage location
• Functions have referential transparency:
• the value of a function depends only on the values of its arguments
• There are no “side effects”
• The order of evaluation of arguments has no effect on the value of the function’s output



Functional programming languages

• Support for complex (recursive) data types with automatic memory allocation and de-allocation, i.e., automatic
garbage collection
• No loops! Recursion is how repeated computations are performed
• Functions are treated as values
• A function can be bound to a variable name
• Functions can use other functions as arguments, or return a function as output value (higher order functions)
• Functions are “first-class” values, i.e., no arbitrary restrictions separate functions from other data types (like, say, integer or string)
• All these features arise naturally from lambda calculus
• Lambda calculus is to functional programming what algebra is to mathematics
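
A few of these features can be seen in a short OCaml sketch. All names below (square, twice, compose, sum_to) are hypothetical, chosen only for illustration.

```ocaml
(* A function bound to a name, exactly like any other value *)
let square = fun x -> x * x

(* A higher-order function: takes a function f and returns a new function *)
let twice f = fun x -> f (f x)

(* Composition: apply g first, then f *)
let compose f g = fun x -> f (g x)

(* Recursion instead of a loop: sum of the integers 1..n *)
let rec sum_to n = if n <= 0 then 0 else n + sum_to (n - 1)

(* Functions built out of other functions *)
let fourth_power = twice square
let square_then_negate = compose (fun x -> -x) square
```

For instance, fourth_power 3 evaluates to 81 and sum_to 4 to 10.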



𝜆 calculus

It is the smallest universal programming language

• “universal” means that it can be used to express any computation that can be
performed by a Turing machine
• “smallest” because it’s a language with three constructs: functions, function
applications, and names, i.e., variables

This is the language used to study types and type systems in programming languages



Syntax

• Expression: name | application | abstraction
• Name: an identifier, i.e., a variable
• Abstraction: 𝜆 𝑛𝑎𝑚𝑒 . 𝑒𝑥𝑝𝑟𝑒𝑠𝑠𝑖𝑜𝑛
• Application: 𝑒𝑥𝑝𝑟𝑒𝑠𝑠𝑖𝑜𝑛 𝑒𝑥𝑝𝑟𝑒𝑠𝑠𝑖𝑜𝑛

This is the complete syntax of the untyped lambda calculus
• And yet, it’s a language powerful enough to be able to do everything we consider to be “computation” in the modern world


Semantics

• 𝜆-expressions are expressions in a functional language
• The abstraction 𝜆𝑥. 𝐸 is a function with 𝑥 as its formal parameter, and which returns (the value of) 𝐸
• 𝜆𝑥. 𝑥 is the identity function: it returns its input value 𝑥

(* OCaml *)
let identity = fun x -> x;;

# Python
def identity(x):
    return x

# or ...
identity = lambda x : x


Semantics

• 𝜆-expressions are expressions in a functional language
• The abstraction 𝜆𝑥. 𝐸 is a function with 𝑥 as its formal parameter, and which returns (the value of) 𝐸
• 𝜆𝑥. 𝜆𝑦. 𝑥 is a function that takes 𝑥 as its argument and returns another function … and that function takes 𝑦 as its argument, and then returns 𝑥

(* OCaml *)
let nested =
  fun x -> (fun y -> x);;

# Python
def nested(x):
    def inner(y):
        return x
    return inner

# or ...
def nested(x):
    return lambda y : x


Semantics

• 𝜆-expressions are expressions in a functional language
• The application 𝐸1 𝐸2 is a function call, where 𝐸1 is a function and 𝐸2 is its argument
• (𝜆𝑥. 𝑥) 𝑦 denotes that 𝑦 is provided as the argument to the identity function
• Therefore, we know the final irreducible “meaning” of this lambda expression should be 𝑦

(* OCaml *)
let identity = fun x -> x;;
let y = 5;;
identity y;;

# Python
def identity(x):
    return x

y = 5
identity(y)




Data representations

All kinds of data (including functions) can be represented in this simple language
For example, natural numbers are represented using what are called Church numerals
• 0 = 𝜆𝑓. 𝜆𝑥. 𝑥
• 1 = 𝜆𝑓. 𝜆𝑥. 𝑓 𝑥
• 2 = 𝜆𝑓. 𝜆𝑥. 𝑓 (𝑓 𝑥)
• 3 = 𝜆𝑓. 𝜆𝑥. 𝑓 (𝑓 (𝑓 𝑥))
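
Church numerals can be written directly as OCaml functions. This is a minimal sketch: church_succ and to_int are hypothetical helpers added for illustration and are not part of the slides.

```ocaml
(* Church numerals: the numeral n applies f to x exactly n times *)
let zero  = fun _f x -> x
let one   = fun f x -> f x
let two   = fun f x -> f (f x)
let three = fun f x -> f (f (f x))

(* Successor: one more application of f *)
let church_succ n = fun f x -> f (n f x)

(* Convert a Church numeral to an OCaml int by counting the applications *)
let to_int n = n (fun k -> k + 1) 0
```

For example, to_int (church_succ two) evaluates to 3, the same as to_int three.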



Syntax conventions

• The parentheses can be dropped based on these conventions:
• Application is left-associative: e.g., 𝑓 𝑓 𝑥 is the same as (𝑓 𝑓) 𝑥, but these are not the same as 𝑓 (𝑓 𝑥)
• The lambda binds as much as possible to its right: e.g., 𝜆𝑓. 𝜆𝑥. 𝑓 (𝑓 𝑥) is the same as 𝜆𝑓. (𝜆𝑥. (𝑓 (𝑓 𝑥)))
• Multiple consecutive abstractions can be combined: e.g., 𝜆𝑓. 𝜆𝑥. 𝑓 (𝑓 𝑥) is the same as 𝜆𝑓𝑥. 𝑓 (𝑓 𝑥)

Understanding 𝜆 expressions

• 𝜆𝑥. 𝐸 denotes a function that takes 𝑥 as its parameter, and returns (the value of) 𝐸
• In the expression 𝜆𝑥. 𝐸,
• 𝑥 is the formal parameter
• 𝐸 is the function body
• A formal parameter is a variable defined in the function’s declaration or definition. It receives values when the function is called.
• It exists during the function’s definition only, and acts as a placeholder for the actual data on which the function will operate.
• An argument is the actual value or expression that is passed to the function when it is called.
• It exists at the time of the function call, and provides the actual data which the function uses for computation after being called.

# Python
def f(x):
    return x + 1

(* OCaml *)
let f x = x + 1;;


Understanding 𝜆 expressions

𝐸1 𝐸2 denotes calling the function 𝐸1 with 𝐸2 as its argument; e.g., consider
(𝜆𝑤. 𝜆𝑥. 𝑤) (𝜆𝑦. 𝜆𝑧. 𝑧).
In this application, the second expression, 𝜆𝑦. 𝜆𝑧. 𝑧, is passed as the argument to the first expression, 𝜆𝑤. 𝜆𝑥. 𝑤.
Remember: a formal parameter is a placeholder for the actual argument at the time of a function call.
So, an “application” means that we replace every occurrence of the formal parameter in the body of the function with the argument being passed.
In the above application, the result will be …
𝜆𝑥. 𝜆𝑦. 𝜆𝑧. 𝑧

# Python
def f(x):
    return x + 1

# What is f(5 + 12*319)?
# Replace every occurrence of x with 5 + 12*319 in the function body


Evaluating 𝜆 expressions

• The key step in evaluation is to replace every occurrence of the formal parameter in the body of the function with the argument.
• In an expression (𝜆𝑥. 𝐸1) 𝐸2, we write this step as [𝑥 ↦ 𝐸2] 𝐸1
But these replacements must be done very carefully!

Consider (𝜆𝑥. 𝜆𝑧. 𝑥 𝑧) 𝑦. After replacement, we get 𝜆𝑧. 𝑦 𝑧, and everything’s great!

But what if the original expression was (𝜆𝑥. 𝜆𝑦. 𝑥 𝑦) 𝑦 and we applied the same replacement step? We end up with 𝜆𝑦. 𝑦 𝑦.
We only changed the name of a formal parameter (from 𝑧 to 𝑦), and the meaning of the program changed. That should never happen … our replacement was wrong!
• One 𝑦 is a local name, while the other is not. And we failed to distinguish between the two identical names in two different scopes.


Names, Scopes, and Binding

• A name is a mnemonic (i.e., assists memory) string used to represent something
• Usually, they are alphanumeric tokens
• They allow a programmer to refer to values, operations, types, functions, classes, etc.
• Without names, we would have to use very un-abstract low-level concepts like addresses for everything!
• Names are absolutely crucial for abstraction
• A programmer can associate a name with a potentially complex fragment of a program.
• This name usually reflects the ‘purpose’ or ‘meaning’ of that fragment of code, and hides low-level details and reduces the conceptual complexity for a programmer.
• Subroutines have names. Thus, names offer control abstractions.
• Data can be bundled together with operations (e.g., a class in Java or Python; a struct in C; etc.). Thus, names offer data abstractions.


Bindings

There are three fundamental questions we can ask about names:
1. How does a name get associated with the thing it represents?
2. Can this association change through the course of a program?
3. If it can change, what are the limitations and rules (if any)?

This “association” is known as binding.



Bindings and Binding Times

The time at which a binding is created is called the binding time.


• Language design/implementation time: The design of specific program constructs (syntax), primitive types,
and meaning (semantics), etc. are decided when the language is designed. Many issues are left to the
implementer. These may include numeric precision (i.e., the number of bits), run time memory sizes, etc.
• Program writing time: e.g., the choice of algorithms, data structures, names.
• Compile time: compilers choose (i) how to map high-level constructs to machine code, and (ii) the memory
layout for things used in the program.
• Link time: the time at which multiple object codes (machine code files) and libraries are combined into one executable. For
complex programs, there may be names in one module that refer to things in another module. The binding, then, cannot be
done until link time.
• Load time: the time at which the OS loads the executable into memory so that it can run. The OS usually distinguishes
between physical/virtual addresses. Virtual addresses are chosen at link time, but physical addresses can change at run time:
the processor’s memory management hardware translates the virtual addresses into physical addresses during individual
instructions at run time.
• Run time: many language-specific decisions may be taken during run time; e.g., the binding of values to
variables may occur at run time.





Bindings and Binding Times

Early binding time leads to greater efficiency


• Compilers try to fix decisions that can be taken at compile time, so there’s less decision making at
run time.
• Checking of syntax and static semantics is performed only once at compile time, so there’s no
runtime overhead.
Later binding leads to greater flexibility.
• Interpreters allow programs to be modified at runtime.
• Some languages like Smalltalk and Java allow variable names to refer to objects of multiple types
at runtime (runtime polymorphism).
Objects bound before runtime are said to be statically bound.
• Hence the concept of static binding.
Objects bound at runtime are said to be dynamically bound.
• Hence the concept of dynamic binding.



Binding lifetime

In general, the lifetime of a binding may be different from the lifetime of the
corresponding object.

An object may be retained and remain accessible even when a given name can no
longer be used.

For example, when an object is passed to a subroutine, the lifetime of the binding
inside that subroutine may be shorter than the object’s lifetime (the object may be
retained by the code that called this subroutine).

On the other hand, if the binding’s lifetime is longer than the life of the object
(i.e., the name exists, but the object does not), it is usually a bug in the program!
This is called a dangling reference, dead reference, stale pointer, etc.



Scope

The textual region of a program in which a binding is active is its scope

In most languages, the scope of a binding is determined at compile-time (i.e., it is statically defined)
• For example, in C/Java/Python, when we enter a subroutine/method, a new scope is introduced where local object bindings are created
• When we exit the subroutine/method, those local bindings are destroyed
• These are compile-time decisions (hence, they are called statically scoped languages)
• It is also called lexical scoping, as it is determined based on the text of the program

We often say “scope” instead of talking about the scope of a binding
• This indicates the maximal region of a program where no bindings are changed or destroyed

Typically, a scope is the body of a class, subroutine, etc.

These are what we call a block
• Delimited by the language syntax; e.g., { … } in C/Java, or by indentation in Python


Scope, Binding, and Shadowing

• A variable is shadowed when a variable declared within a scope has the same name as a variable declared in an outer scope
• Equivalently, the name in an inner scope is said to mask the outer name

// Java
public class Shadow {
    private int var = 0;
    private void method() {
        int var = 5; // has the same name as the outer object field, so it shadows that field inside this method
        System.out.println(var); // prints 5
        System.out.println(this.var); // prints 0 (‘this’ moves the namespace to the outer scope)
    }
}

# Python
class Shadow:
    def __init__(self):
        self.var = 0 # Instance variable

    def method(self):
        var = 5 # shadows the instance variable
        print(var) # prints 5
        print(self.var) # prints 0 (‘self’ accesses the instance variable by moving to the outer scope)


Scopes, Binding, and Shadowing

• The variable 𝑥 in 𝜆𝑥. 𝐸 is bound
• 𝑥 in 𝜆𝑥. 𝑥 is a bound variable
• In 𝜆𝑥. (𝑥 𝑦), 𝑥 is bound but 𝑦 is not
• Informally, this means that parameters are local to a function definition
• A variable that is not bound is said to be free or unbound
• In 𝜆𝑥. 𝑦, 𝑦 is a free variable
• Informally, this means free variables in a function definition are nonlocal; they are either defined in an outer scope, or they are undefined
• A variable binds to the closest parameter declaration
• In 𝜆𝑥. 𝜆𝑥. 𝑥, the 𝑥 in the inner function body is bound to the inner parameter; it shadows the parameter 𝑥 declared in the outer function


Scopes, Binding, and Shadowing

(𝜆𝑥. 𝑥)(𝜆𝑤. (𝑥 𝑤)): the 𝑥 in (𝑥 𝑤) is a free occurrence

(𝜆𝑎. (𝜆𝑏. 𝑎 (𝑏 𝑏)) (𝜆𝑏. 𝑎 (𝑏 𝑏)))


Formal definition of bindings

Let bv(𝐸) denote the set of all bound variables in 𝐸

If 𝐸 is an abstraction, i.e., of the form 𝜆𝑥. 𝐸2, then
bv(𝐸) = bv(𝐸2) ∪ {𝑥}
If 𝐸 is an application, i.e., of the form 𝐸1 𝐸2, then
bv(𝐸) = bv(𝐸1) ∪ bv(𝐸2)
If 𝐸 is a variable, then
bv(𝐸) = ∅


Formal definition of bindings

Will these programs compile?



Formal definition : free variables

Let fv(𝐸) denote the set of all free/unbound variables in 𝐸

If 𝐸 is an abstraction, i.e., of the form 𝜆𝑥. 𝐸2, then
fv(𝐸) = fv(𝐸2) − {𝑥}
If 𝐸 is an application, i.e., of the form 𝐸1 𝐸2, then
fv(𝐸) = fv(𝐸1) ∪ fv(𝐸2)
If 𝐸 is a variable of the form 𝑥, then
fv(𝐸) = {𝑥}
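
Both definitions translate almost line by line into recursive OCaml functions. The sketch below assumes a hypothetical term type for untyped 𝜆-expressions; only Set.Make and String come from the OCaml standard library.

```ocaml
module S = Set.Make (String)

(* Untyped lambda-calculus terms: name | abstraction | application *)
type term =
  | Var of string
  | Abs of string * term      (* 𝜆x. E *)
  | App of term * term        (* E1 E2 *)

(* bv: bound variables, one case per clause of the definition *)
let rec bv = function
  | Var _ -> S.empty
  | Abs (x, e) -> S.add x (bv e)
  | App (e1, e2) -> S.union (bv e1) (bv e2)

(* fv: free variables, one case per clause of the definition *)
let rec fv = function
  | Var x -> S.singleton x
  | Abs (x, e) -> S.remove x (fv e)
  | App (e1, e2) -> S.union (fv e1) (fv e2)

(* (𝜆x. x) (𝜆w. x w): the x in the second abstraction is free *)
let example = App (Abs ("x", Var "x"), Abs ("w", App (Var "x", Var "w")))
```

Note that the same name can appear in both sets: in this example, x is bound in the first abstraction and free in the second.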



Replacements (revisited)

• Consider (𝜆𝑥. 𝜆𝑦. 𝑥 𝑦) 𝑦
• The 𝑦 inside the abstraction is a bound variable, while the 𝑦 passed as the argument is free
• The naïve replacement we saw previously will result in 𝜆𝑦. 𝑦 𝑦
• A free/unbound variable has been “captured”
• The correct procedure is to rename all bound occurrences of 𝑦 to a new (yet unused) name, and then carry out the replacements
(𝜆𝑥. 𝜆𝑦. 𝑥 𝑦) 𝑦
≡𝛼 (𝜆𝑥. 𝜆𝑡. 𝑥 𝑡) 𝑦
→𝛽 𝜆𝑡. 𝑦 𝑡, obtained as [𝑥 ↦ 𝑦] (𝜆𝑡. 𝑥 𝑡)
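
The rename-then-replace procedure can be sketched as a capture-avoiding substitution. Everything here is an illustrative assumption rather than the course's own code: the term type, the list-based fv, and the fresh helper (which simply tries t0, t1, ... until it finds an unused name).

```ocaml
type term =
  | Var of string
  | Abs of string * term
  | App of term * term

(* Free variables as a plain list (duplicates are harmless here) *)
let rec fv = function
  | Var x -> [x]
  | Abs (x, e) -> List.filter (fun y -> y <> x) (fv e)
  | App (e1, e2) -> fv e1 @ fv e2

(* Hypothetical fresh-name supply: t0, t1, ... skipping names in avoid *)
let fresh avoid =
  let rec go i =
    let name = "t" ^ string_of_int i in
    if List.mem name avoid then go (i + 1) else name
  in
  go 0

(* subst x arg body computes [x ↦ arg] body without capturing
   the free variables of arg *)
let rec subst x arg body =
  match body with
  | Var y -> if y = x then arg else body
  | App (e1, e2) -> App (subst x arg e1, subst x arg e2)
  | Abs (y, _) when y = x -> body                 (* x is shadowed: stop *)
  | Abs (y, e) when List.mem y (fv arg) ->        (* would capture: rename y first *)
    let y' = fresh (x :: fv arg @ fv e) in
    Abs (y', subst x arg (subst y (Var y') e))
  | Abs (y, e) -> Abs (y, subst x arg e)

(* One beta step at the root, if the term is an application of an abstraction *)
let beta = function
  | App (Abs (x, body), arg) -> Some (subst x arg body)
  | _ -> None

(* (𝜆x. 𝜆y. x y) y *)
let example = App (Abs ("x", Abs ("y", App (Var "x", Var "y"))), Var "y")
```

On this example, beta renames the bound y to t0 before substituting, giving 𝜆t0. y t0 instead of the incorrect 𝜆y. y y.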



Functional programming languages

After LISP (1960) and around the same time as Scheme (1975), ML was developed originally as a metalanguage for proving mathematical theorems.
There were two dialects: Standard ML (SML), and Categorical Abstract Machine Language (CAML)


OCAML

• Objective CAML
• Generally, a good OCaml programmer can write OCaml code that runs about as efficiently as C code written by a good C programmer; i.e., in terms of efficiency, OCaml is comparable to imperative programming languages
• Relatively recent language, developed in 1996
• Inspired other languages like Scala and F#
• We will use OCaml as our exemplar for functional programming
• We will not get into OCaml’s object-oriented features
• We will shift our focus to object-oriented programming
using Java and Python, during the second half of this course



Installing OCaml

• Go to https://ocaml.org and click on the ‘long term support release’ (right now, that’s version 4.14.2); this official website also includes documentation, useful ‘cheat sheets’, code examples, and a really nice tutorial
• Do NOT install version 5.3.0 (which includes experimental features not yet fully tested)
• Installation is straightforward, but has multiple steps and may
take some time
• Follow the installation guide shared on the course website



Running OCaml

• Online OCaml interactive shell
• https://try.ocamlpro.com/
• File names have the .ml extension
• OCaml interactive toplevel
• prompts with #
• user can enter new function/value definitions or evaluate expressions
• OCaml compiler
• ocamlc to compile programs to bytecode
• ocamlopt to compile programs to native code

$ ocaml
OCaml version 4.14.2
# print_endline "Hello World!";;
Hello World!
- : unit = ()
# (* two semicolons end an expression *)
# (* this is a comment *)
# (* below: terminate interactive shell *)
# exit 0;; (* or ctrl + D *)


OCaml Resources
• Programming Language Pragmatics. Michael Scott. Ch. 11 for concepts and overview
• Real World OCaml. Yaron Minsky, Anil Madhavapeddy, and Jason Hickey. https://realworldocaml.org/
• The official OCaml manual, documentation, tutorial, and reference. https://ocaml.org/


Simple Expressions

# 2 * 3;;
- : int = 6

Reading the response “- : int = 6”: the last expression entered (-) is of the type integer (: int) with value 6 (= 6).

# -2 + 3 * 4;;
- : int = 10
# 4.0 ** 2.0;;
- : float = 16.
# 42.0 + 17.5;;
Error: This expression has type float but an expression was expected of type int
# 42.0 +. 17.5;;
- : float = 59.5
# 42 + int_of_float 17.5;;
- : int = 59
# true || (3 > 4) && not false;;
- : bool = true
# "hello " ^ "world";;
- : string = "hello world"
# String.contains "hello" 'o';;
- : bool = true
# ();;
- : unit = ()
# print_endline "hello world";;
hello world
- : unit = ()


()

01 In OCaml, ( ) represents the unit type. This is a data type with only one value: ( )
02 Conceptually, it is the same as the void type in C or Java; it indicates the absence of any value/information
03 Unlike void, however, unit is an actual type with a single, concrete value … so, it’s more like Python’s None

C:
void log() { printf("Hello world.\n"); }

Python:
def log():
    print("Hello world.")

result = log() # result is explicitly ‘None’

OCaml:
let log () = print_endline "Hello world"
(* the argument and the return value are both (), denoting that there is no data *)


Conditional expressions

if 𝑬𝟏 then 𝑬𝟐 else 𝑬𝟑
• The “else” part is mandatory (OCaml lets you omit it only when 𝐸2 has type unit)
• The type of 𝐸1 must be Boolean
• The type of 𝐸2 and 𝐸3 must match
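
These typing rules can be seen in a short sketch; the function names sign and abs_diff are hypothetical and chosen only for illustration.

```ocaml
(* Every branch has type string, so the whole expression has type string *)
let sign n =
  if n < 0 then "negative"
  else if n = 0 then "zero"
  else "positive"

(* if-then-else is an expression, so its value can be returned directly *)
let abs_diff a b = if a > b then a - b else b - a

(* Rejected by the type checker, because the branches have types int and string:
   let bad n = if n > 0 then 1 else "non-positive" *)
```

Since the condition must be Boolean, something like if 1 then ... is also a type error in OCaml, unlike in C or Python.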



Operators

Integer arithmetic:         + - * / mod
Floating-point arithmetic:  +. -. *. /. **
Boolean operators:          not && ||
Comparisons:                = <> == != < > <= >=


OCaml comparison operators

Operator | Meaning | Description | Similar to …
= | Structural equality | Compares values for content equality | Python’s ==; C’s ==
<> | Structural inequality | Opposite of =, checks if values differ | Python’s !=; C’s !=
== | Physical equality | Checks if two references point to the same object in memory | Comparing object IDs in Python, or using Java’s ==
!= | Physical inequality | Opposite of ==, checks if references point to different objects | Python’s “is not” (object identity)
<, >, <=, >= | Relational operators | Standard numerical or lexicographical comparisons | Same as Python, C, or Java


OCaml comparison operators

• = vs ==
• = compares content (deep); use it to
compare values (e.g., numbers, lists,
strings)
• == compares memory address (shallow);
use it to check if two variables refer to
the exact same object
• Don’t confuse equality (=) with identity (==)
• Unlike imperative languages, OCaml
does not use = for assignment
• Assignments are handled differently in
functional programming
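
The difference between = and == is easiest to observe on mutable cells, where physical equality is well defined. The sketch below uses ref, which these slides have not introduced; it is an assumption made only to get a predictable demonstration.

```ocaml
let r1 = ref 5
let r2 = ref 5      (* same contents, but a distinct mutable cell *)
let r3 = r1         (* another name for the very same cell *)

let same_contents = (r1 = r2)    (* structural equality: true *)
let same_cell     = (r1 == r2)   (* physical equality: false *)
let aliases       = (r1 == r3)   (* physical equality: true *)
```

For immutable values such as lists or strings, the result of == may depend on how the compiler allocates them, which is one more reason to use = when comparing contents.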



== in OCaml and Java

 | OCaml == (physical equality) | Java == (reference equality for objects; value equality for primitives)
It compares … | memory addresses | references for objects; actual values for primitives
Works on … | all values | objects (reference equality) and primitives (value equality)
Use with data structures: | use = for deep equality based on the contents | use .equals() for deep equality based on the contents


Functions

The syntax of binding a name to a function is the same as that of binding a name to any other type of value
let <name> <space-separated arguments> = <expression>;;
Notes
1. If a function can work with any data type, then the data type is
itself treated as a parameter
• In the functions f and g, the type of the argument can be
anything, and the function remains well defined
• This is denoted by the parameter ‘a (OCaml syntax is a single
quote followed by a lower-case letter)
2. Comma-separated items are automatically treated as a tuple
• That single tuple is the only argument of the sum function
• The tuple syntax is (m, n) but the type is denoted by the *
symbol; thus, a tuple of two integers is int * int, a tuple
of a float, a string, and an integer is float * string *
int, and so on
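
The definitions of f, g, and sum referred to in these notes are not reproduced in this text. The following are hypothetical stand-ins with the properties described (polymorphic arguments for f and g, a single tuple argument for sum), not the original functions from the slide.

```ocaml
(* Stand-ins for f and g: they never inspect their argument,
   so its type stays a parameter *)
let f x = x                 (* 'a -> 'a *)
let g x = [x; x]            (* 'a -> 'a list *)

(* One argument: a tuple of two integers *)
let sum (m, n) = m + n      (* int * int -> int *)

(* A tuple of a float, a string, and an integer *)
let triple : float * string * int = (3.14, "pi", 3)
```

Because the type is a parameter, f 1 and f "a" are both well typed.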



Functions

let max x y = if x < y then y else x;;

let rec mult x y =
  if y = 0 || x = 0
  then 0
  else x + mult x (y-1);;

let rec mult x y =
  if y = 0 || x = 0
  then 0
  else let k = mult x (y-1)
       in
       x + k;;


let there be bindings …

• let name = expr binds the name to the expression for the
rest of the program
• let name = expr in other_expr binds name to expr
only within the scope of other_expr
• Each binding is local (just like in 𝜆 calculus), and so, the let
keyword can be used in succession
• let x = 4 in let x = ...

• But let is not an assignment!


• It does not modify the value of an existing variable as done in imperative
languages (e.g., C, Java, or Python)
• It creates a new binding within the specified scope



let there be bindings …

OCaml:
let x = 5 in (* outer scope *)
(let x = 10 in (* shadows outer x *)
 print_int x; (* prints 10 *)
 x + 2) + x;; (* evaluates to (10 + 2) + 5 = 17 *)

let x = 5;; (* global scope *)
let x = 10;; (* shadows the previous x *)
print_int x;; (* prints 10 *)

Concept:
[Diagram: two boxes, both labeled “x”; one contains 5 and the other contains 10]
• Both boxes are labeled “x”
• If the question being raised is “what’s in box x?”, the answer depends on where we are
• But no matter the answer, the contents of the boxes never change
• Values in OCaml are immutable
• let creates bindings, but we are neither re-assigning the same variable, nor modifying the value associated with the old variable


let there be applications …

let name = expr in other_expr is semantically equivalent to (fun name -> other_expr) expr
• Both mean “in other_expr, replace name with expr”
• But let is usually easier to read
# let a = 3 in a + 2;;
- : int = 5
# (fun a -> a + 2) 3;;
- : int = 5
• So, let is like a function application (this is what we have seen in 𝜆 calculus as well)
• (𝜆𝑎. 𝑎 + 2) 3


Currying

• Let’s revisit the type of the addition function: fun x y -> x + y;;
• Ordinarily, we would expect the type to be (int, int) -> int
• But recall from lambda-calculus that a function can only take a single argument. So, we have the following expansion:
• fun only takes the argument x and returns a function (let’s call it fun_x)
• fun_x then takes the argument y and adds y to x, yielding the final result
• Thus, the type of fun is given by int -> int -> int
• We may read this as int -> (int -> int), following the syntax conventions described in lambda calculus
• That is, a function that takes an int, and returns a function of the type int -> int.
• This type of evaluation of a function (that takes multiple arguments) as a sequence of functions, each with a single argument, is called Currying (named after the mathematician Haskell Curry)

[Diagram: “add” takes x and y together and produces x+y; the curried “add” takes x and produces “add to x”, which takes y and produces x+y]
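
Currying is what makes partial application possible. In this short sketch the names add, add_nested, add_to_five, and add_pair are hypothetical.

```ocaml
let add = fun x y -> x + y                   (* int -> int -> int *)

(* The same function with the nesting written out explicitly *)
let add_nested = fun x -> (fun y -> x + y)   (* int -> (int -> int) *)

(* Partial application: supplying only x yields the function called fun_x above *)
let add_to_five = add 5                      (* int -> int *)

(* For contrast, a version whose single argument is a tuple *)
let add_pair (x, y) = x + y                  (* int * int -> int *)
```

So add 2 3 is really (add 2) 3: the first application returns a function, and the second application supplies y.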



Data structures

Tuples



Data structures

Lists



Pattern matching

• A powerful technique to deconstruct data structures


• In this example, we are pattern matching to add all the integers in a list
• Given the argument [1; 3; 5] (lst is the formal parameter, a mere placeholder), the list matches h::t,
• … which in turn binds h to 1 and t to [3; 5]
• … and evaluates to 1 + add_all [3; 5]
• … and so on, recursively, until in the base case, the argument matches the empty list
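The code for this example is not reproduced in the text, so the following is a sketch that is consistent with the bullets above (a recursive add_all with formal parameter lst); the exact layout on the original slide may differ.

```ocaml
let rec add_all lst =
  match lst with
  | [] -> 0                     (* base case: the empty list sums to 0 *)
  | h :: t -> h + add_all t     (* h is bound to the head, t to the tail *)

(* Evaluation trace for the argument [1; 3; 5]:
     add_all [1; 3; 5]
   = 1 + add_all [3; 5]
   = 1 + (3 + add_all [5])
   = 1 + (3 + (5 + add_all []))
   = 1 + (3 + (5 + 0))
   = 9 *)
let total = add_all [1; 3; 5]
```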



Exercise

Mimic add_all to write a function called contains that checks if a list contains a specific given element (return true if present, and false otherwise)



Pattern matching

The match is analogous to a switch statement


• Each case describes
i. a pattern on the left side of ->
ii. an expression to be evaluated if that pattern is matched (the right side of ->)
• A pattern can be a constant, or it can be a term comprising constants and variables
• The cases are separated by |
• A matching pattern is found by searching through the cases in the order in which they are written
• Once the first matching case is selected, all the other cases become irrelevant
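The rules above can be illustrated with a small sketch that mixes constant patterns and a variable pattern. The function describe is a hypothetical example, not taken from the slides.

```ocaml
let describe n =
  match n with
  | 0 -> "zero"                 (* constant pattern *)
  | 1 -> "one"                  (* constant pattern *)
  | k ->                        (* variable pattern: matches anything, binds k *)
    if k < 0 then "negative" else "many"

(* The cases are tried from top to bottom. The value 0 would also match
   the variable pattern k, but the first case is selected and the
   remaining cases are never considered. *)
let for_zero = describe 0
let for_seven = describe 7
```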



Pattern matching
• The underscore “_” in a pattern is a wildcard
• Each wildcard is treated as a new anonymous variable
• In functions with one argument, there are two ways of defining the pattern matching

let rec add_all l = match l with
    [] -> 0
  | h::t -> h + add_all t;;

let rec add_all = function
    [] -> 0
  | h::t -> h + add_all t;;
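The add_all definitions do not use the wildcard, so here is a minimal sketch of it, written in the function form. The names my_length and is_origin are hypothetical.

```ocaml
(* The head of the list is not needed, so it is matched with _ *)
let rec my_length = function
  | [] -> 0
  | _ :: t -> 1 + my_length t

(* Each _ is a separate anonymous variable, so the two wildcards
   in (_, _) may match different values. *)
let is_origin = function
  | (0, 0) -> true
  | (_, _) -> false
```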



Pattern matching and list functions

• This is a polymorphic function
• It takes on different forms based on the type of items in the list provided as its first argument
• This kind of polymorphism (poly = many; morph = form/shape) depends on a parameter, and it is hence called parametric polymorphism
• In OCaml, the type parameter is denoted by a single quote followed by a lowercase letter (e.g., 'a)
• The append function shown here behaves like the builtin @ operator in OCaml
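The append function itself is not reproduced in the text. A plausible sketch, assuming the usual recursive definition by pattern matching on the first list, is:

```ocaml
(* Inferred type: 'a list -> 'a list -> 'a list *)
let rec append l1 l2 =
  match l1 with
  | [] -> l2
  | h :: t -> h :: append t l2

(* The same definition works for lists of any element type *)
let ints = append [1; 2] [3]
let strings = append ["a"] ["b"; "c"]
```

The type variable 'a stands for the element type; both arguments and the result must use the same one.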



List functions

Available in the List module


• List.length – returns the number of elements in a list
• List.hd – returns the first element (i.e., the ‘head’) of a list
• List.tl – returns the list without its head (this is called the ‘tail’)
• List.rev – reverses a list
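A brief usage sketch of these four functions; the list xs and the result names are chosen here for illustration.

```ocaml
let xs = [1; 3; 5]

let n = List.length xs          (* 3 *)
let first = List.hd xs          (* 1 *)
let rest = List.tl xs           (* [3; 5] *)
let backwards = List.rev xs     (* [5; 3; 1] *)

(* List.hd and List.tl raise the Failure exception on an empty list,
   which is one reason pattern matching is often preferred. *)
let hd_fails_on_empty =
  try ignore (List.hd []); false with Failure _ -> true
```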



List functions
Using map and reduce operations, we can replace traditional loops or iterations:
• Apply a function f to each element of a list to transform the entire list:
• List.map f [a1; a2; …; an] = [f a1; f a2; …; f an]
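For instance, a loop that squares every element can be written as a single call to List.map. The helper square is a hypothetical name.

```ocaml
let square x = x * x

(* [square 1; square 2; square 3] *)
let squares = List.map square [1; 2; 3]

(* The result may have a different element type than the input *)
let lengths = List.map String.length ["a"; "bc"; "def"]
```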



List functions
• Apply a function f recursively to the current result together with an element of a list, to finally produce a single element:
• List.fold_left f a [b1; b2; …; bn] = f (… (f (f a b1) b2) …) bn
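As a sketch of how the accumulated result threads through the list from left to right, consider the following folds. The subtraction example is included because its result depends on the order of evaluation.

```ocaml
(* ((0 + 1) + 3) + 5 *)
let sum = List.fold_left (fun acc b -> acc + b) 0 [1; 3; 5]

(* ((10 - 1) - 2) - 3 *)
let diff = List.fold_left (fun acc b -> acc - b) 10 [1; 2; 3]

(* The accumulator need not be a number: here it is a list, and
   consing each element onto it reverses the input. *)
let reversed = List.fold_left (fun acc b -> b :: acc) [] [1; 2; 3]
```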



List functions
• Apply a function f to each element of a list, to produce unit as the result (used for side effects):
• List.iter f [a1; a2; …; an] = begin f a1; …; f an; () end
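A minimal sketch of List.iter used purely for its side effects. The Buffer is only a hypothetical device for observing the order in which the elements are visited.

```ocaml
(* Side effect on standard output: prints 1, 2 and 3 on separate lines *)
let () = List.iter (fun n -> print_int n; print_newline ()) [1; 2; 3]

(* The result of List.iter is always unit *)
let result : unit = List.iter print_int []

(* Collect the visited elements, in order, into a buffer *)
let buf = Buffer.create 16
let () = List.iter (fun n -> Buffer.add_string buf (string_of_int n)) [1; 2; 3]
let visited = Buffer.contents buf
```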



Bindings in pattern matching
• Pattern matching involves two tasks. These are to determine
1. whether a pattern matches a value, and
2. what parts of the value are to be bound to which variable names in the pattern
• The 2nd task requires careful attention
1. The variable pattern x matches any value v, and binds x -> v
2. The pattern _ matches any value v, and doesn’t bind to anything (this is a wildcard, and it represents an anonymous variable)
3. The nil pattern [] only matches the nil value [], and does not bind to anything
4. If a pattern p1 matches a value v1, and produces a set of bindings b1; and a pattern p2 matches v2, and produces a set of bindings b2, then p1::p2 matches v1::v2 and produces the set of bindings b1 ∪ b2 (similarly for tuples)
Based on these rules, we can evaluate general pattern matching expressions of the type
match e with p1 -> e1 | p2 -> e2 | … | pn -> en;;
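To make the binding rules concrete, here is a small sketch with a pattern that combines a variable, a wildcard, and the cons pattern. The function skip_second is a hypothetical example.

```ocaml
let skip_second lst =
  match lst with
  | [] -> None                         (* rule 3: [] matches, binds nothing *)
  | x :: _ :: rest -> Some (x, rest)   (* rule 4 applied twice, with rules 1 and 2 *)
  | x :: [] -> Some (x, [])            (* rules 1, 3 and 4 *)

(* For [1; 2; 3; 4], the pattern x :: (_ :: rest) matches:
     x    matches 1       and binds x -> 1
     _    matches 2       and binds nothing
     rest matches [3; 4]  and binds rest -> [3; 4]
   so the set of bindings is {x -> 1} U {rest -> [3; 4]}. *)
let example = skip_second [1; 2; 3; 4]
```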



Bindings in pattern matching
• In the first two cases, the output is the same as the second item in
the pair, and in the last two cases, the output is the negation of
the second item in the pair
• We can use variable binding here to combine multiple patterns
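The example these bullets refer to is not reproduced in the text. One function that fits the description is exclusive-or on a pair of booleans; the sketch below is an assumed reconstruction, and the names xor_verbose and xor are hypothetical.

```ocaml
(* Four constant cases: in the first two the result equals the second
   item of the pair, in the last two it is its negation. *)
let xor_verbose p =
  match p with
  | (false, false) -> false
  | (false, true) -> true
  | (true, false) -> true
  | (true, true) -> false

(* Binding the second item to a variable y merges each pair of cases *)
let xor p =
  match p with
  | (false, y) -> y
  | (true, y) -> not y
```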



Static and dynamic pattern matching

• The semantics we have described so far are dynamic semantics
• That is, every name is associated with a value at runtime
• OCaml also performs static semantic checks
• Exhaustiveness: statically analyzes the types and checks whether the cases guarantee that at least one of them will match any valid input expression
• Unused cases: checks if there is any redundancy (i.e., a case can never be selected because the earlier cases already match every value it could match)
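The two static checks can be sketched as follows. The problematic versions are kept inside a comment so that the block compiles cleanly; the exact wording of the compiler warnings is not reproduced here, and the function names are hypothetical.

```ocaml
(* Not exhaustive: the empty list is not covered, so the compiler
   warns at compile time, and the function fails at runtime on [].

     let head_or_zero = function
       | h :: _ -> h

   Unused case: the second case already matches every remaining list,
   so the compiler warns that the third case can never be selected.

     let is_empty = function
       | [] -> true
       | _ -> false
       | _ :: _ -> false
*)

(* Exhaustive, with no redundant cases *)
let head_or_zero = function
  | [] -> 0
  | h :: _ -> h

let is_empty = function
  | [] -> true
  | _ -> false
```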



Higher-order functions



Exercise
1. Write your own map function instead of using the builtin List.map
2. Write your own reduce function instead of using the builtin List.fold_left

