
Lecture Notes on Knowledge Representation and Processing

Florian Rabe and Michael Kohlhase

2020

These notes were originally prepared for our CS course at University Erlangen-Nuremberg (FAU) in Summer 2020.
They are directed at 3rd semester CS undergraduates and master's students but should be intelligible even for earlier
students and could be interesting also for PhD students and for students from adjacent majors. The course is
recommended both as a first course in the specialization area Artificial Intelligence and as a one-off overview of
knowledge representation.
The course was developed in Summer 2020 from scratch, and materials were built along the way. It integrated current
directions and recent results in research on knowledge representation, pulling together materials in an entirely new
and original way.
Contents

1 Meta-Remarks

2 Fundamental Concepts
2.1 Abbreviations
2.2 Motivation
2.2.1 Knowledge
2.2.2 Representation and Processing
2.3 Components of Knowledge
2.3.1 Syntax and Semantics, Data and Knowledge
2.3.2 Semantics as Syntax Transformation
2.3.3 Heterogeneity of Semantics and Knowledge
2.4 The Tetrapod Model of Knowledge
2.4.1 Five Aspects of Knowledge
2.4.2 Relations between the Aspects

3 Overview of This Course
3.1 Structure
3.2 Exercises and Running Example

4 Representing Syntax and Semantics
4.1 Context-Free Syntax
4.1.1 Context-Free Grammars
4.1.2 Inductive Data Types
4.1.3 Merged Definition
4.1.4 Contexts
4.2 Implementation
4.2.1 Functional Programming Languages
4.2.2 Object-Oriented Programming Languages
4.2.3 Combining Paradigms
4.3 Context-Sensitive Syntax
4.4 Absolute Semantics: By an Inference System
4.5 Relative Semantics: By Translation
4.5.1 General Definition
4.5.2 Compositional Semantics
4.5.3 Non-Compositional Semantics

5 Representing Data
5.1 Overview

6 Ontologies
6.1 General Principles
6.2 A Basic Ontology Language
6.3 Representing Ontologies as Triples
6.4 Writing Ontologies
6.4.1 The OWL Language
6.4.2 The Protege Tool
6.4.3 Exercise 1

7 Semantics for BOL
7.1 Overview
7.2 Deductive Semantics
7.2.1 A Basic Semantic Language: SFOL
7.2.2 Semantics
7.3 Concretized Semantics
7.3.1 An SQL-Inspired Basic Database Language
7.3.2 Semantics
7.4 Computational Semantics
7.4.1 A Scala-Inspired Basic Programming Language
7.4.2 Semantics
7.5 Narrative Semantics
7.6 Exercise 2

8 Type Systems for Ontology Languages
8.1 Intrinsic vs. Extrinsic Typing
8.1.1 Overview
8.1.2 Combined Definition
8.2 Abstract Data Types
8.2.1 Motivation
8.2.2 Examples
8.2.3 Abstract vs. Concrete
8.2.4 Rigorous Definition
8.3 Database Schemas as Typed Ontologies
8.3.1 Exercise 3

9 Querying via a Semantics
9.1 Overview
9.2 Deductive Querying
9.2.1 Method
9.2.2 Challenges
9.3 Concretized Querying
9.4 Computational Querying
9.5 Narrative Querying

10 Conclusion
Chapter 1

Meta-Remarks

Important stuff that you should read carefully!

State of these notes I constantly work on my lecture notes. Therefore, keep in mind that:
• I am developing these notes in parallel with the lecture — they can grow or change throughout the semester.
• These notes are neither a subset nor a superset of the material discussed in the lecture. On the one hand,
they may contain more details than mentioned in the lectures. On the other hand, important material such
as background, diagrams, and examples may be part of the lecture but not mentioned in these notes.
• Unless mentioned otherwise, all material in these notes is exam-relevant (in addition to all material discussed
in the lectures).

Collaboration on these notes I am writing these notes using LaTeX and storing them in a git repository on
GitHub at https://github.com/florian-rabe/Teaching. As an experiment in teaching, I am inviting all of you
to collaborate on these lecture notes with me. This would require familiarity with LaTeX as well as Git and GitHub
— that is not part of this lecture, but it is an essential skill for you. Ask in the lecture if you have difficulty figuring
it out on your own.
By forking and by submitting pull requests for this repository, you can suggest changes to these notes. For example,
you are encouraged to:
• Fix typos and other errors.
• Add examples and diagrams that I develop on the board during lectures.
• Add solutions for the homeworks if I did not provide any (of course, I will only integrate solutions after the
deadline).
• Add additional examples, exercises, or explanations that you came up with or found in other sources. If you
use material from other sources (e.g., by copying a diagram from some website), make sure that you have the
license to use it and that you acknowledge sources appropriately!
I will review and approve or reject the changes. If you make substantial contributions, I will list you as a contributor
(which is something you can put in your CV).
Any improvement you make will not only help your fellow students, it will also increase your own understanding of
the material. Make sure your git commits carry a user name that I can connect to you.

Other Advice I maintain a list of useful advice for students at https://github.com/florian-rabe/Teaching/blob/master/general/advice_for_students.pdf. It is mostly targeted at older students who work in individual
projects with me (e.g., students who work on their BSc thesis). But much of it is useful for you already now or will
become useful soon. So have a look.

Chapter 2

Fundamental Concepts

2.1 Abbreviations
knowledge representation and processing (KRP): the general area of this course
knowledge representation language (KRL): a language used in KRP
knowledge representation tool (KRT): a tool implementing a KRL and processing algorithms for it

2.2 Motivation
2.2.1 Knowledge
Human knowledge pervades all sciences including computer science, mathematics, natural sciences and engineering.
That is not surprising: “science” is derived from the Latin word “scire” meaning “to know”. Similarly, philosophy,
from which all sciences derive, is named after the Greek words “philo” meaning loving and “sophia” meaning
wisdom, and the common ending “-logy” is derived from Greek “logos” meaning word (i.e., a representation of
knowledge).
In regards to knowledge, computer science is special in two ways: Firstly, many branches of computer science
need to understand KRP as a prerequisite for teaching computers to do knowledge-based tasks. In some sense,
KRP is the foundation and ultimate goal of all artificial intelligence.1 Secondly, modern information technology
enables all sciences to apply computer-based KRP in order to vastly expand on the domain-specific tasks that can
be automated. Currently all sciences are becoming more and more computerized, but most non-CS scientists (and
many computer scientists for that matter) lack a systematic education and understanding of IT-KRP. That often
leads to bad solutions when domain experts cannot see which KRP solutions are applicable or how to apply them.

2.2.2 Representation and Processing


It is no coincidence that this course uses the phrase “Representation and Processing”. In fact, this is an instance
of a universal duality. Consider the following table of analogous concept pairs, which could be extended with many
more examples:

Representation Processing
Static Dynamic
Situation Change
Be Become
Data Structures Algorithms
Set Function
State Transition
Space Time
1 Indeed, a major problem with the currently very successful machine learning-based AI technology is that it remains unclear when

and how it does KRP. That can be dangerous because it leads to AI systems recommending decisions without being able to explain
why that decision should be trusted.


Again and again, we distinguish a static concept that describes/represents what a situation/state is and a
dynamic concept that describes how it changes. If that change is a computer doing something with or acting on
that representation, we speak of “processing”.
It is particularly illuminating to contrast KRP with the standard CS course on Data Structures and Algorithms (DA).2
Generally speaking, DA teaches the methods, and KRP teaches how to apply them. Data structures are a critical
prerequisite for representing knowledge. But data structures alone do not capture what the data means (i.e., the
knowledge) or if a particular representation makes any sense. Similarly, algorithms are the critical prerequisite
for processing knowledge. But while algorithms can be systematically analyzed for efficiency, it is much harder to
analyze if an algorithm processes knowledge correctly. The latter requires understanding what the input and output
data means.
Capturing knowledge in computers is much harder than developing data structures and algorithms. It is ultimately
the same challenge as figuring out if a computer system is working correctly — a problem that is well-known to be
undecidable in general and very difficult in each individual case.

2.3 Components of Knowledge


2.3.1 Syntax and Semantics, Data and Knowledge
Four concepts are of particular relevance to understanding knowledge. They form a 2 × 2-quadruple of concepts:

Syntax Data
Semantics Knowledge

All four concepts are primitive, i.e., they cannot be defined in simpler terms. All sciences have a few carefully
chosen primitives on which everything builds. This is done most systematically in mathematics (where primitives
include set or function). While mathematical primitives as well as some primitives in physics or CS are specified
formally, the above four concepts can only be described informally, ultimately appealing to pre-existing human
understanding. Moreover, this description is not standardized — different courses may use very different descriptions
even though they ultimately try to capture the same elusive ideas.
Data (in the narrow sense of computer science) is any object that can be stored in a computer, typically combined
with the ability to input/output, transfer, and change the object. This includes bits, strings, numbers, files, etc.
Data by itself is useless because we would have no idea what to do with it. For example, the object
O = ((49.5739143, 11.0264941), "2020-04-21T16:15:00 CEST") is useless data without additional information
about its syntax and semantics. Similarly, a file is useless data unless we know which file format it uses.
Syntax is a system of rules that describes which data is well-formed. For O above the syntax could be “a pair of
(a pair of two IEEE double precision floating point numbers) and a string encoding of a time stamp”. For a file,
the syntax is often indicated by the file name extension, e.g., the syntax of an html file is given in Section 12 of the
current HTML standard3.
Syntax alone is useless unless we know the semantics, i.e., what the data means and thus how to correctly interpret
and process the data. For example, the syntax of O allows us to check that O is well-formed, i.e., indeed contains
two numbers and a timestamp string. That allows rejecting ill-formed data such as ((49.5739143, 11.0264941), "foo").
The HTML syntax allows us to check that a file conforms to the standard.
Semantics is a system of rules that determines the meaning of well-formed data. For example, ISO 8601 specifies
that the timestamp string refers to a particular date and time in a particular time zone. Further semantics for O might
be implicit in the algorithms that produce and consume it, such as “the first component of the pair contains two
numbers between 0 and 180 resp. 0 and 360 indicating latitude resp. longitude of a location on earth”. Semantics
might be multi-staged, and further semantics about O might be that O indicates the location and time of the first
lecture of this course. Similarly, Section 14 of the HTML standard specifies the semantics of well-formed HTML
files by describing how they are to be rendered in a web browser.
Knowledge is the combination of some data with its syntax and semantics. That allows applying the semantics to
obtain the meaning of the data (if it is syntactically well-formed, signaling an error otherwise). In computer systems,
2 The course is typically called “Algorithms and Data Structures”, but that is arguably awkward because algorithms can only exist
if there are data structures to work with. Compare my notes on that course in this repository, where I emphasize data structures much
more than is commonly done in that course.
3 https://html.spec.whatwg.org/multipage/

• data is represented using primitive data (ultimately the bits provided by the hardware) and encodings of more
complex data (bytes, arrays, strings, etc.) in terms of simpler ones,
• syntax is theoretically specified using grammars and practically implemented in programming languages using
data structures,
• semantics is represented using algorithms that process syntactically well-formed data,
• knowledge is elusive and often emerges from executing the semantics, e.g., rendering of an HTML file.
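To make these four concepts concrete, the following is a minimal sketch (not from the notes) in Scala for the example object O from above; the names (Data, wellFormed, GeoTime, interpret) are ours, chosen purely for illustration:

object DataSyntaxSemantics {
  // data: a pair of a coordinate pair and a string
  type Data = ((Double, Double), String)
  val O: Data = ((49.5739143, 11.0264941), "2020-04-21T16:15:00 CEST")

  // syntax: a rule that decides which data is well-formed, here a
  // regular expression check on the timestamp component
  def wellFormed(d: Data): Boolean =
    d._2.matches("""\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2} [A-Z]+""")

  // semantics: interpret well-formed data as a location and a time
  case class GeoTime(latitude: Double, longitude: Double, time: String)
  def interpret(d: Data): GeoTime =
    if (wellFormed(d)) GeoTime(d._1._1, d._1._2, d._2)
    else sys.error("ill-formed data") // e.g., ((49.57, 11.02), "foo")

  // knowledge: the data combined with the result of applying the semantics
  val knowledge: (Data, GeoTime) = (O, interpret(O))
}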

2.3.2 Semantics as Syntax Transformation


In order to capture knowledge better in computer systems, we often use two syntax levels: one to represent the data
itself and another to represent the knowledge. These can be seen as input and output data. In that case, semantics
is a function that translates from the data syntax to the knowledge syntax, and knowledge is the pair of the data
and the result of applying the semantics. The following table gives some examples.

Data syntax Semantics function Knowledge syntax


SPARQL query evaluation result set
SQL query evaluation result table
program compiler binary code
program expression interpreter result value
logical formula interpretation in a model mathematical object
HTML document rendering graphical representation

Thus, the role of syntax vs. semantics may depend on the context: just like one function’s output can be another
function’s input, one interpretation’s knowledge can be another one’s syntax. For example, we can first compile a
program into binary and then execute it to return its value.
Such hierarchies of evaluation levels are very common in computer systems. In fact, most state-of-the-art compilers
are subdivided into multiple phases each further interpreting the output of the previous one. Thus, if knowledge is
represented in computers, it is invariably data itself but relative to a different syntax.
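As a minimal sketch (not from the notes) of such an evaluation hierarchy, the following Scala program uses two syntax levels: a compiler translates arithmetic expressions (the data syntax) into instructions for a tiny stack machine (the knowledge syntax), and those instructions in turn serve as the data syntax of a second semantics, the interpreter; all names are illustrative:

object TwoLevels {
  // data syntax: arithmetic expressions
  sealed trait Expr
  case class Lit(n: Int) extends Expr
  case class Sum(l: Expr, r: Expr) extends Expr

  // first semantics: compile expressions to stack-machine instructions
  sealed trait Instr
  case class Push(n: Int) extends Instr
  case object Add extends Instr
  def compile(e: Expr): List[Instr] = e match {
    case Lit(n)    => List(Push(n))
    case Sum(l, r) => compile(l) ++ compile(r) ++ List(Add)
  }

  // second semantics: interpret the instructions; its input is the
  // output (the "knowledge") of the first semantics
  def run(p: List[Instr], stack: List[Int] = Nil): Int = (p, stack) match {
    case (Nil, v :: _)              => v
    case (Push(n) :: rest, s)       => run(rest, n :: s)
    case (Add :: rest, a :: b :: s) => run(rest, (a + b) :: s)
    case _                          => sys.error("ill-formed program")
  }

  def main(args: Array[String]): Unit =
    println(run(compile(Sum(Lit(1), Sum(Lit(2), Lit(3)))))) // prints 6
}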

2.3.3 Heterogeneity of Semantics and Knowledge


While it is easy to design languages to represent data in general, it is very difficult to design KRLs that capture
the human-level quality of knowledge. Over the last few decades, the KRP area in computer science has diversified
into different subareas that approach this research problem in fundamentally different ways. In fact, KRP in the
very general sense of this course is usually not even studied by itself — instead the subareas are so different,
specialized, and large that they each sustain their own university courses and research conferences.
This is related to the fact that the data naturally comes in fundamentally different forms such as graphs, arrays,
tables in the sense of relational databases, programs in a programming language, logical formulas, or natural
language texts. We speak of heterogeneous data. These different forms of data are supported by highly specialized
KRTs: graph databases, array databases, relational databases, package databases for programming languages,
theorem databases for logics (e.g., the Isabelle Archive of Formal Proofs), databases of research papers (such as the
arXiv), and so on.
All of these are very successful for their respective kind of data. And all of them include specifications of semantics
and KRP algorithms that implement this semantics. But it can vary massively how the semantics is specified and
implemented. This has caused major practical problems for tool interoperability: many projects require data in
multiple formats and algorithms from multiple tools. But the respective tools are often islands that assume that
all data is represented in the tool’s language and users do not use outside tools. Therefore, the import/export
capabilities of the tools are often limited.
Moreover, transporting data across systems is usually ignorant of the semantics: while each tool takes relatively
good care to implement the semantics correctly, there is much less certainty that the semantics is preserved when
exchanging data across tools. For a trivial example, consider a tool that measures length in inches vs. a tool that
uses centimeters, both using floating point numbers for the data: if they exchange the data, i.e., just the numbers,
they may miscommunicate the semantics.4
This problem is not easy to fix though. The heterogeneity of data and semantics is so extreme that it is, in some
cases, an open theoretical problem how knowledge can be shared at all across tools. The basic idea — exchange the
4 Problems like this have been involved in major disasters such as the Mars Climate Orbiter.

data in a way that preserves semantics — can be difficult to implement if both tools use entirely different paradigms
to specify semantics.

2.4 The Tetrapod Model of Knowledge


The Tetrapod model of knowledge is an ongoing research project by the instructors of this course. A first publication
was made in [CFKR20]. The structure of this course will draw heavily on the Tetrapod model to get an overview
of the different approaches to KRP and their interoperability problems.

2.4.1 Five Aspects of Knowledge


The Tetrapod model distinguishes five basic aspects of knowledge and KRP as described below. For each aspect,
there is a variety of dedicated KRLs supported by highly optimized KRTs as indicated in the following table:

Aspect KRLs (examples) KRTs (examples)


ontologization ontology languages (OWL), description logics (ALC) reasoners, SPARQL engines (Virtuoso)
concretization relational databases (SQL, JSON) databases (MySQL, MongoDb)
computation programming languages (C) interpreters, compilers (gcc)
deduction logics (HOL) theorem provers (Isabelle)
narration document languages (HTML, LaTeX) editors, viewers

Ontologization focuses on developing and curating a coherent and comprehensive ontology of concepts. This
focuses on identifying the central concepts in a domain and their relations. For example, a medical ontology would
define concepts for every symptom, disease, and medication and then define relations for which symptoms and
medications are related to which disease.
Ontologies typically abstract from the knowledge: they standardize identifiers for the concepts and spell out some
properties and relations but do not try to capture all details of the knowledge. Well-designed ontologies can capture
exactly the knowledge that different KRTs must share and can thus serve as interoperability layers between them.
While ontologization can use ontology languages such as OWL or RDF, the inherent complexity of formal objects in
computer science and mathematics usually requires going beyond general purpose ontology languages (similar to how
the programming languages underlying computer algebra systems usually go beyond general purpose programming
languages).
Concretization uses languages based on numbers, strings, lists, and records to obtain concrete representations of
datasets in order to store and query their properties efficiently. Because concrete objects are so simple and widely
used, it is possible and common to build concrete datasets on top of general purpose data representation languages
and tools such as JSON or SQL.
Computation uses specification and programming languages to represent algorithmic knowledge.
Deduction uses logics and theorem provers to obtain verifiable correctness.
Narration uses natural language to obtain texts that are easy to understand for humans. Because narrative
languages are not well-standardized (apart from general purpose languages such as free text or LaTeX), it is common
to develop narrative libraries on top of ad-hoc languages that impose some formal structure on top of informal text,
such as a fixed tree structure whose leaves are free text or a particular set of LaTeX macros that must be used.
Narrative libraries can be classified based on whether entries are derived from publications (e.g., one abstract per
paper in zbMATH) or mathematical concepts (e.g., one page per concept in nLab).

2.4.2 Relations between the Aspects


The aspects can be visualized as the corners of a tetrahedron with ontologization in the center and edges and faces
representing solutions that mix two or three aspects as seen in Figure 2.1.
Most approaches try to incorporate all or multiple aspects. But all languages and tools tend to be heavily biased
towards and optimized for a single one of the four corner aspects. This is not due to ignorance but because each
aspect provides characteristic advantages that are extremely hard to capture at once. In fact, every combination of
aspects shares characteristic advantages and disadvantages as sketched in Figure 2.2. For example, deductive and
narrative definitions of a function involve well-definedness arguments, and a function defined by a concrete table
is trivially well-defined, but a computational definition of a function may throw exceptions when running; but only
the latter can store and compute functions efficiently. Consequently, dedicated and mostly disjoint communities
have evolved that have produced large aspect-specific datasets.

Figure 2.1: Tetrapod model of knowledge — a tetrahedron with Deduction, Computation, Concretization, and Narration at the corners and Ontologization in the center

Aspect          objects           characteristic advantage   joint advantage of the other aspects   application
deduction       formal proofs     correctness                ease of use                            verification
computation     programs          efficiency                 well-definedness                       execution
concretization  concrete objects  tangibility                abstraction                            storage/retrieval
narration       texts             flexibility                formal semantics                       human understanding

Aspect pair  characteristic advantage
ded./comp.   rich meta-theory
narr./conc.  simple languages
ded./narr.   theorems and proofs
comp./conc.  normalization
ded./conc.   decidable well-definedness
comp./narr.  Turing completeness

Figure 2.2: Shared properties and advantages of aspects

Chapter 3

Overview of This Course

3.1 Structure
The subsequent parts of this course follow the Tetrapod model with one part per aspect. Each of these will describe
the concepts, languages, and tools of the respective aspect as well as their relation to other aspects.
The aspects of the Tetrapod are typically handled in individual courses, which describe highly specialized languages
and tools in depth. In contrast, the overall goal of this course is to see all of them as different approaches
to semantics and knowledge representation. The course will focus on universal principles and their commonalities
and differences as well as their advantages and disadvantages.
The subsequent chapters of this first part will be dedicated to aspect-independent material. These will not
necessarily be taught in the order in which they appear in these notes. Instead, some of them will be discussed in
connection to how they are relevant in individual aspects.

3.2 Exercises and Running Example


Typical practical projects, e.g., the ones that a strong CS graduate might be put in charge of, involve heterogeneous
data and knowledge that must be managed using a variety of optimized aspect-specific languages and tools.
Interoperability between these is often a major source of inefficiency and bugs.
The exercises accompanying the course will mimic this situation: they will be designed around a single large project
that requires choosing and integrating methods, languages, and tools from all aspects.
Concretely, this project will be the development of a UnivIS-like system for a university. It will involve heterogeneous
data such as course and program descriptions, legal texts, websites, grade tables, and transcript generation code.
Over the course of the semester students will implement a completely functional system applying the lessons of
the course. This is very unusual and often impossible for other courses: as any university course must teach many
different things from a wide area, it is rarely possible to find a project that requires many lessons, and only lessons,
from a single course. Here KRP is special because its material pervades all aspects of system development.

Chapter 4

Representing Syntax and Semantics

4.1 Context-Free Syntax


Abstractly, context-free syntax is specified using grammars. Concretely, it is implemented using inductive types.
In the sequel, we will start with the standard definitions and then make a series of variations to each of these
definitions until they become equivalent. The intended equivalence is as follows:

CFG IDT
non-terminal type
production constructor
non-terminal on left of production return type of constructor
non-terminals on right of production argument types of constructor
terminals on right of production notation of constructor
words derived from non-terminal N expressions of type N

4.1.1 Context-Free Grammars


We start with the usual definition:

Definition 4.1 (Context-Free Grammar). Given a set Σ of characters (containing the terminal symbols), a
context-free grammar consists of
• a set N of names called non-terminal symbols
• a set of productions each consisting of
– an element of N , called the left-hand side
– a word over Σ ∪ N , called the right-hand side

Example 4.2. Let Σ = {0, 1, +, ·, ≐, ≤}. We give a grammar for arithmetic expressions and formulas about them:

E ::= 0
    | 1
    | E + E
    | E · E
F ::= E ≐ E
    | E ≤ E

Here we use the BNF style of writing grammars, where the productions are grouped by their left-hand side and
written with ::= and |. We have N = {E, F}.

First, we give a name to each production of a CFG:


Definition 4.3 (Context-Free Grammar with Named Productions). Given a set Σ of characters (containing the
terminal symbols), a context-free grammar consists of
• a set N of names called non-terminal symbols
• a set of productions each consisting of
– a name
– an element of N , called the left-hand side
– a word over Σ ∪ N , called the right-hand side

Example 4.4. The grammar from above with names written to the right of each production:

E ::= 0       zero
    | 1       one
    | E + E   sum
    | E · E   product
F ::= E ≐ E   equality
    | E ≤ E   lessOrEqual

This is not common BNF anymore.

Then we add base types to the productions:

Definition 4.5 (Context-Free Grammar with Named Productions and Base Types). Given a set Σ of characters
(containing the terminal symbols) and a set T of names (containing the base types allowed in productions), a
context-free grammar consists of
• a set N of names called non-terminal symbols
• a set of productions each consisting of
– a name
– an element of N , called the left-hand side
– a word over Σ ∪ T ∪ N , called the right-hand side

The intuition behind base types is that we commonly like to delegate some primitive parts of the grammar to
be defined elsewhere. A typical example is literals such as the numbers 0, 1, 2, . . .: We could give a regular expression
syntax for digit-strings. Instead, it is nicer to just assume we have a set of base types that we can use to insert an
infinite set of literals into the grammar.

Example 4.6. Let Nat be the type of natural numbers and let T = {Nat}. Then we can improve the grammar
from above as follows:

E ::= Nat     literal
    | E + E   sum
    | E · E   product
F ::= E ≐ E   equality
    | E ≤ E   lessOrEqual

4.1.2 Inductive Data Types


We start with the usual definition:

Definition 4.7 (Inductive Data Type). Given a set of names T (containing the types known in the current
context), an inductive data type consists of
• a name n, called the type,
• a set of constructors each consisting of
– a name
– a list of elements of T ∪ {n}, called the argument types

Example 4.8. Let Nat be the type of natural numbers and T = {Nat}. We give an inductive type for arithmetic
expressions:

E = literal of Nat | sum of E ∗ E | product of E ∗ E

Here we use ML-style notation for inductive data types, which separates constructors by | and writes them as
name of argument-type-product.

First we generalize to mutually inductive types:

Definition 4.9 (Mutually Inductive Data Types). Given a set T of names (containing the types known in the
current context), a family of mutually inductive data types consists of
• a set N of names, called the types,
• a set of constructors each consisting of
– a name
– an element of N, called the return type
– a list of elements of N ∪ T, called the argument types

Example 4.10. We extend the type definition from above by adding a second type for formulas. Thus, N = {E, F}.

E = literal of Nat | sum of E ∗ E | product of E ∗ E
F = equality of E ∗ E | lessOrEqual of E ∗ E

Then we add notations to the constructors:

Definition 4.11 (Mutually Inductive Data Types with Notations). Given a set Σ of characters (containing the
terminal symbols) and a set T of names (containing the types known in the current context), a family of mutually
inductive data types with notations consists of
• a set N of names, called the types,
• a set of constructors each consisting of
– a name
– an element of N, called the return type
– a list of elements of T ∪ N, called the argument types
– a word over the alphabet Σ ∪ T ∪ N containing the argument types in order and only elements from Σ
otherwise, called the notation of the constructor

The intuition behind notations is that it can get cumbersome to write all constructor applications as Name(arguments).
It is more convenient to attach a notation to them, as in the following example.

Example 4.12. We extend the type definitions from above by adding notations to each constructor. We use the
set Σ = {+, ·, ≐, ≤} as terminals in the notations.

E = literal of Nat # Nat | sum of E ∗ E # E + E | product of E ∗ E # E · E
F = equality of E ∗ E # E ≐ E | lessOrEqual of E ∗ E # E ≤ E

Here we write the constructors as name of argument-type-product # notation. It is easy to see that this has
introduced redundancy: we can infer the argument types from the notation. So we can just drop the argument
types:

E = literal # Nat | sum # E + E | product # E · E
F = equality # E ≐ E | lessOrEqual # E ≤ E

4.1.3 Merged Definition


With the variations from above we have arrived at the following equivalence:

Theorem 4.13. Given a set Σ of characters and a set T of names, the following notions are equivalent:
• a family of mutually inductive data types in the context of types T with notations using characters from Σ,
• a context-free grammar with named productions, terminal symbols from Σ, and base types T .

Proof. The key idea is that


• the types and constructors of the former correspond to the non-terminals and productions of the latter
• for each constructor-production pair
– the right-hand side of the latter corresponds to the notation of the former,
– the argument types of the former correspond to the non-terminals occurring on the right-hand side of
the latter.

In implementations in programming languages, we often drop the notations. Instead, those are handled, if needed,
by special parsing and serialization functions.
However, in an implementation, it is often helpful to additionally give names to each argument of a
production/constructor. That yields the following definition:

Definition 4.14 (Context-Free Syntax). Given a set Σ of characters and a set T of names, a context-free syntax
consists of
• a set N of names, called the non-terminals/types,
• a set of productions/constructors each consisting of
– a name
– an element of N , called the left-hand side/return type
– a sequence of objects, called the right-hand side/arguments, which are one of the following
∗ an element of Σ
∗ a pair written (n : t) of a name n, called the argument name, and an element t ∈ T ∪ N called
the argument type.

Example 4.15. Using an ad hoc language to write the constructors, our example from above as a context-free syntax
could look as follows:

E = literal # (value : Nat) | sum # (left : E) + (right : E) | product # (left : E) · (right : E)
F = equality # (left : E) ≐ (right : E) | lessOrEqual # (left : E) ≤ (right : E)

This uses an ad hoc notation that writes each argument as a pair (name : type).
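This named-argument form is easy to represent as a data structure. The following is a minimal sketch (not from the notes) in Scala; all names are illustrative:

object SyntaxAsData {
  // a context-free syntax in the sense of Definition 4.14
  case class Syntax(types: Set[String], productions: List[Production])
  case class Production(name: String, returnType: String, rhs: List[RhsItem])

  sealed trait RhsItem
  case class Terminal(symbol: String) extends RhsItem // an element of Σ
  case class Argument(name: String, argType: String) extends RhsItem // a pair (n : t)

  // the production/constructor  sum # (left : E) + (right : E)
  val sum = Production("sum", "E",
    List(Argument("left", "E"), Terminal("+"), Argument("right", "E")))
}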

4.1.4 Contexts
We assume a context-free language l.

Definition 4.16 (Context). A context is a list of
• grammar terminology: productions N ::= x
• type terminology: declarations x : N
where each x is a unique name and each N is a non-terminal symbol.
The x are called variables.

Remark 4.17. Sometimes the grammar itself has specific productions for contexts and variables. In that case, we
speak of meta-variable contexts and meta-variables to distinguish them from those of the language.

Definition 4.18 (Expressions in Context). Given a context Γ, a word E derived from non-terminal N that may
additionally use the productions of the context is called an expression of type N in context Γ.
We write this as Γ ⊢_l E : N.

Definition 4.19 (Substitution). Given two contexts ⊢_l Γ and ⊢_l ∆, a substitution γ from Γ = x1 : N1, . . . , xn : Nn
to ∆ is a list x1 := e1, . . . , xn := en where every ei is an expression of type Ni in context ∆ (i.e., ∆ ⊢_l ei : Ni).
We write this as ∆ ⊢_l γ : Γ or as ⊢_l γ : Γ → ∆.

Definition 4.20 (Substitution Application). Given an expression Γ ⊢_l E : N and a substitution ⊢_l γ : Γ → ∆
where Γ = x1 : N1, . . . , xn : Nn and γ = x1 := e1, . . . , xn := en, we write E[γ] for the result of replacing every xi
in E with ei.

Theorem 4.21. If Γ ⊢_l E : N and ⊢_l γ : Γ → ∆, then ∆ ⊢_l E[γ] : N.

We often want to substitute only a single variable x : N even though E may be defined in a larger context Γ. This
is often written E[x := e]. That is just an abbreviation for E[γ], where γ contains x := e as well as y := y for
every other variable y of Γ.
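As a sketch (not from the notes), substitution application becomes a one-line-per-case recursion once the syntax has a case for variables; Var and Subst are our illustrative names:

object Substitution {
  sealed trait E
  case class Literal(value: Int) extends E
  case class Sum(left: E, right: E) extends E
  case class Var(name: String) extends E // variables declared by a context

  // a substitution x1 := e1, . . . , xn := en as a map
  type Subst = Map[String, E]

  // E[γ]: replace every variable with its assigned expression
  // (assumes gamma assigns an expression to every variable of E)
  def apply(e: E, gamma: Subst): E = e match {
    case Var(x)     => gamma(x) // base case: x := e_i
    case Literal(n) => Literal(n)
    case Sum(l, r)  => Sum(apply(l, gamma), apply(r, gamma))
  }

  // (x + 1)[x := 2 + 3] = (2 + 3) + 1
  val example = apply(Sum(Var("x"), Literal(1)),
                      Map("x" -> Sum(Literal(2), Literal(3))))
}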

4.2 Implementation
Context-free syntax can be implemented systematically in all programming languages. But, depending on the style
of the language, the implementations look drastically different. We give the two most important paradigms as examples.

4.2.1 Functional Programming Languages


In a functional programming language, inductive data types are a primitive feature. However, notations and named
arguments are not available. So helper functions must be used.
The basic recipe is as follows:
• The types and constructors (without the notations and named arguments) are implemented as a family of
mutually inductive data types.
• For each argument of each constructor, a partial projection function is defined.
• A set of mutually recursive string rendering functions is defined, one for each type and with one case per
constructor, implementing the notations.

Example 4.22. We define our example syntax in ML.


First the inductive types (assuming a type Nat already exists in the context):

datatype E = literal of Nat | sum of E * E | product of E * E
     and F = equality of E * E | lessOrEqual of E * E

Now the projection functions:

fun literal_value (literal v) = SOME v
  | literal_value _ = NONE
fun sum_left (sum (x, _)) = SOME x
  | sum_left _ = NONE
fun sum_right (sum (_, x)) = SOME x
  | sum_right _ = NONE

and so on for each constructor argument.

Finally, the string rendering functions (assuming a function natToString already exists in the context):

fun E_toString (literal v) = natToString v
  | E_toString (sum (x, y)) = E_toString x ^ " + " ^ E_toString y
  | E_toString (product (x, y)) = E_toString x ^ " · " ^ E_toString y
and F_toString (equality (x, y)) = E_toString x ^ " ≐ " ^ E_toString y
  | F_toString (lessOrEqual (x, y)) = E_toString x ^ " ≤ " ^ E_toString y

Because ML has inductive data types as primitives, pattern-matching on our syntax comes for free. We will get
back to that when defining the semantics.

4.2.2 Object-Oriented Programming Languages


In an object-oriented programming language, inductive data types are not available. Therefore, they must be
mimicked using classes. On the positive side, this supports argument names, and notations are a bit easier.
The basic recipe is as follows:
• Each type is implemented as an abstract class.
• Each constructor of type t is implemented as a concrete class that extends the abstract class t.
• The argument names and types of each constructor c are exactly the argument names and types of the class
c. The constructor arguments are stored as fields in the class.
• The abstract classes require a toString method, which is implemented in every concrete class according to
its notation.

Example 4.23. We define our example syntax in a generic OO-language somewhat similar to Scala.
In particular, we assume that a class Nat already exists.
abstract class E {
  def toString: String
}
class literal extends E {
  field value: Nat
  constructor(value: Nat) {
    this.value = value
  }
  def toString = value.toString
}
class sum extends E {
  field left: E
  field right: E
  constructor(left: E, right: E) {
    this.left = left
    this.right = right
  }
  def toString = left.toString + " + " + right.toString
}
class product extends E {
  field left: E
  field right: E
  constructor(left: E, right: E) {
    this.left = left
    this.right = right
  }
  def toString = left.toString + " · " + right.toString
}

abstract class F {
  def toString: String
}
class equality extends F {
  field left: E
  field right: E
  constructor(left: E, right: E) {
    this.left = left
    this.right = right
  }
  def toString = left.toString + " ≐ " + right.toString
}
class lessOrEqual extends F {
  field left: E
  field right: E
  constructor(left: E, right: E) {
    this.left = left
    this.right = right
  }
  def toString = left.toString + " ≤ " + right.toString
}

Because OO-languages do not have inductive data types as primitives, pattern-matching on our syntax requires
awkward switch statements. We will get back to that when defining the semantics.

4.2.3 Combining Paradigms

The Scala language combines ideas from functional and OO-programming. That makes its representation of context-
free syntax particularly elegant.
In Scala, the constructor arguments are listed right after the class name. These are automatically fields of the class,
and a default constructor always exists that defines those fields. That gets rid of a lot of boilerplate.
If we want to make those fields public (and we do, because they play the role of the projection functions), we add
the keyword val in front of them. But even that is too much boilerplate. So Scala defines a convenience modifier:
if we put case in front of the classes corresponding to constructors of our syntax, Scala puts in the val automatically.
It also generates a default implementation of toString, which we have to override if we want to implement notations, too.
Finally, Scala also generates pattern-matching functions so that we can pattern-match in the same way as in ML.
Then our example becomes (as usual, assuming a class Nat already exists):

abstract class E {
  def toString: String
}
case class literal(value: Nat) extends E {
  override def toString = value.toString
}
case class sum(left: E, right: E) extends E {
  override def toString = left.toString + " + " + right.toString
}
case class product(left: E, right: E) extends E {
  override def toString = left.toString + " · " + right.toString
}

abstract class F {
  def toString: String
}
case class equality(left: E, right: E) extends F {
  override def toString = left.toString + " ≐ " + right.toString
}
case class lessOrEqual(left: E, right: E) extends F {
  override def toString = left.toString + " ≤ " + right.toString
}

4.3 Context-Sensitive Syntax


It is common to define a language as the set of words that can be produced from the syntax, i.e., from a distinguished
non-terminal (the start symbol) of the context-free grammar. It is common to define a context-sensitive language
as a special case: the set of words that can be produced from a context-sensitive grammar.
This is, however, not helpful in practice. While the above remains the official definition of what the context-sensitive
languages are, all practical definitions are entirely different. In fact, context-sensitive grammars are virtually never
used to define a specific language. Instead, more restrictive definitions are used that capture more properties of
practical languages. In the sequel, we give one possible definition.2

Definition 4.24. A language system consists of
• a context-free syntax,
• a distinguished non-terminal symbol V, whose words are called vocabularies,
• a set of distinguished non-terminal symbols E, whose words are called E-expressions,
• a unary predicate wft(Θ) on vocabularies Θ,
• for every vocabulary Θ and every E, a unary predicate wff_Θ^E(E) on E-expressions E.
In case of wft(Θ), we call Θ well-formed. In case of wff_Θ^E(E), we call E a well-formed E-expression over Θ.

Remark 4.25 (Terminology). “language system” is not a standard term. We usually just say “language”.
“Well-formed E-expression over Θ” can be a mouthful. Therefore, it is common to simply say that E is an
E-expression, or that E is a Θ-expression, and expect readers to fill in the details.
It is also common to give the non-terminals E names, such as “term”, “type”, or “formula”. Then we simply say
“term” instead of “term-expression” and so on.

The vocabularies are typically lists of (typically named) declarations. They introduce the names that can be used to
form expressions. The expression kinds almost always include formulas.
Often declarations contain additional expressions, most importantly types or definitions. In general, all expressions
may occur in declarations, but many language systems do not use all of them.
Very different names are used for the vocabularies in different communities. The following table gives an overview:

Aspect vocabulary Θ expression kinds E


Ontologization ontology individual, concept, relation, property, formula
Concretization database schema cell, row, table, formula
Computation program term, type, object, class, . . .
Logic theory term, type, formula, . . .
Narration dictionary phrases, sentences, texts

In practice, it is most useful to think of a language system as a family of languages: one language (containing the
expressions) for every vocabulary.
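As a sketch (not from the notes), Definition 4.24 can be read as the following Scala interface, where the type parameters stand for the words derived from V and from one expression non-terminal E; the names are ours:

trait LanguageSystem[Voc, Expr] {
  // wft(Θ): the vocabulary Θ is well-formed
  def wft(theta: Voc): Boolean
  // wff_Θ^E(E): E is a well-formed E-expression over Θ
  def wff(theta: Voc, e: Expr): Boolean
}

A concrete instance would, e.g., take ontologies as the vocabularies and formulas as the expressions.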

4.4 Absolute Semantics: By an Inference System


To define a language system, we need to define the well-formedness predicates. One way to do that is by translating
the context-free syntax into another language and then using existing definitions of well-formedness there.
But we also need to be able to get off the ground, i.e., to define a semantics from scratch when we do not have
another language available. This is usually done by giving an inference system for the predicates for well-formedness,
also called a type system.
We can also do both: if we have a semantics via an inference system and another one via translation, we can show
that the latter respects the former. That leads to the concepts of soundness (everything well-formed is translated
to something well-formed) and its dual completeness.
2 This definition works well in the context of this lecture but is non-standard. There is no standard definition at all; instead, various
similar definitions exist.
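As a sketch (not from the notes) of a semantics given directly by an inference system, the following Scala function derives the type of untyped syntax trees for a fragment of our running example; the rule set and all names are ours:

object InferenceSystem {
  // untyped syntax trees: a constructor name applied to subtrees
  case class Term(head: String, args: List[Term] = Nil)

  // one inference rule per constructor: derive a term's type (E or F)
  // from the types of its subterms; None means "no derivation exists"
  def typeOf(t: Term): Option[String] = t match {
    case Term("zero", Nil) | Term("one", Nil)        => Some("E")
    case Term("sum", List(l, r)) if bothE(l, r)      => Some("E")
    case Term("equality", List(l, r)) if bothE(l, r) => Some("F")
    case _                                           => None
  }
  private def bothE(l: Term, r: Term): Boolean =
    typeOf(l).contains("E") && typeOf(r).contains("E")

  // t is a well-formed formula iff ⊢ t : F is derivable
  def wffF(t: Term): Boolean = typeOf(t).contains("F")
}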



4.5 Relative Semantics: By Translation


4.5.1 General Definition
The correspondence for the syntax between context-free grammars and inductive data types can be extended to the
semantics. Now we have a correspondence between case-based function definitions and inductive functions.

Definition 4.26. A semantics by translation consists of the following parts:
• syntax: a formal language l
• semantic language: a formal language L (from a different or the same aspect as l)
• semantic prefix: a vocabulary P in L that is prefixed to the translation of all vocabularies of l
• interpretation: a function that translates every l-vocabulary T to an L-vocabulary P, ⟦T⟧

Critically, the semantic language (which is itself a formal language and can thus have a semantics itself) must be a
language whose semantics we already know. Therefore, it is often important to give multiple equivalent semantics
— choosing a different semantics for different audiences, who might be familiar with different languages.
The role of the semantic prefix P is to define once and for all the L-material that we need in general to interpret
l-theories (in our case: ontologies). It occurs at the beginning of all interpretations of ontologies. In particular, it
is equal to the interpretation of the empty ontology.

4.5.2 Compositional Semantics


Compositionality There are some general principles shared by all translations:
• Every l-declaration is translated to an L-declaration for the same name, and ontologies are translated
declaration-wise.
• For every non-terminal N of l, there is one inductive function ⟦−⟧_N mapping complex l-expressions derived
from N to L-expressions.
• The base cases of references to declared l-identifiers are translated to themselves, i.e., to the identifiers of the
same name declared in L.
• The other cases are compositional: every case for a complex l-expression recurses only into the semantics of
the direct subexpressions.
The notion of compositionality captures these properties. An interpretation function is compositional if the
interpretation of any kind of expression E(e1, . . . , en) with subexpressions ei only depends on E and the interpretations
of the ei, i.e.,

⟦E(e1, . . . , en)⟧ = ⟦E⟧(⟦e1⟧, . . . , ⟦en⟧)

for some semantic operation ⟦E⟧. Compositionality is also called the substitution property or the homomorphism
property. See also Def. 7.5.
More rigorously, we define a compositional translation as follows:

Definition 4.27 (Compositional Semantics). Consider a semantics for syntax grammar l and interpretation
function ⟦−⟧.
⟦−⟧ is compositional if it is defined as follows:
• a family of functions ⟦−⟧_N, one for every non-terminal N of l
• for every expression E derived from N, we put ⟦E⟧ = ⟦E⟧_N
• each ⟦−⟧_N is defined by induction on the productions for N
• for each production N ::= ∗(N1, . . . , Nr) and all expressions ei derived from Ni

⟦∗(e1, . . . , er)⟧_N = ⟦∗⟧(⟦e1⟧_N1, . . . , ⟦er⟧_Nr)

for some L-expression ⟦∗⟧

Without loss of generality, we can assume that every production is of the form N ::= ∗(N1, . . . , Nr) where the Ni
are all the non-terminals on the right-hand side and ∗ is a stand-in for all the terminal symbols.

Compositional Translations of Contexts We can extend every compositional translation to contexts, substitutions,
and expressions in contexts:

Definition 4.28. Given a translation ⟦−⟧ as above, for a non-terminal N, we define ⟦N⟧ as the non-terminal
from which the translations of N-expressions are derived.
Then we define:

⟦x1 : N1, . . . , xn : Nn⟧ := x1 : ⟦N1⟧, . . . , xn : ⟦Nn⟧
⟦x1 := w1, . . . , xn := wn⟧ := x1 := ⟦w1⟧, . . . , xn := ⟦wn⟧
⟦x⟧ := x

The requirement of compositionality is critical for two reasons:
• A non-compositional translation could translate l-expressions derived from the same non-terminal N to
L-expressions derived from different non-terminals. Then we would not be able to define ⟦N⟧.
• The definition ⟦x⟧ := x adds a case to the case distinction in the compositional translation function. Without
compositionality, this would not make sense.

Theorem 4.29 (Type Preservation). For a compositional translation as above, we have

Γ ⊢_l w : N implies ⟦Γ⟧ ⊢_L ⟦w⟧ : ⟦N⟧

Substitution Theorem The main value of compositionality is the following:

Theorem 4.30 (Substitution Theorem). Consider a compositional semantics.
For every context Γ = x1 : N1, . . . , xr : Nr, every syntax expression Γ ⊢_l E : N, and every substitution ⊢_l γ : Γ → ∆

⟦E[γ]⟧_N = ⟦E⟧[⟦γ⟧]

Formulated without substitutions, this means that for every syntax expression E(e1, . . . , er) derived from N, where
the ei are subexpressions derived from non-terminal Ni, we have

⟦E(e1, . . . , er)⟧_N = ⟦E⟧(⟦e1⟧_N1, . . . , ⟦er⟧_Nr)

Simply put, a semantics is compositional iff it is defined by mutually inductive translation functions with only
compositional cases. The latter is very easy to check by inspecting the shape of the finitely many cases of the
definition. The former is a powerful property because it applies to any of the infinitely many expressions of the
syntax.
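For instance, here is a minimal sketch (not from the notes) of a compositional semantics of the running example, interpreting E-expressions as Scala integers and F-expressions as Booleans; every case applies a semantic operation to the interpretations of the direct subexpressions only:

object CompositionalSemantics {
  sealed trait E
  case class Literal(value: Int) extends E
  case class Sum(left: E, right: E) extends E
  case class Product(left: E, right: E) extends E

  sealed trait F
  case class Equality(left: E, right: E) extends F
  case class LessOrEqual(left: E, right: E) extends F

  // one inductive function ⟦−⟧_N per non-terminal
  def interpE(e: E): Int = e match {
    case Literal(n)    => n
    case Sum(l, r)     => interpE(l) + interpE(r) // ⟦+⟧ is Scala's +
    case Product(l, r) => interpE(l) * interpE(r) // ⟦·⟧ is Scala's *
  }
  def interpF(f: F): Boolean = f match {
    case Equality(l, r)    => interpE(l) == interpE(r)
    case LessOrEqual(l, r) => interpE(l) <= interpE(r)
  }
}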

4.5.3 Non-Compositional Semantics


It is highly desirable but not always possible to give a compositional translation. Sometimes a feature of the
syntactic language cannot be directly interpreted in the semantic language. In that case, it may still be possible to
give a non-compositional translation.

Example 4.31 (Non-Compositional Translation via Sub-Induction). A simple example of non-compositionality is
the translation of natural numbers based on zero, one, and addition (i.e., N ::= 0 | 1 | N + N) into natural numbers
based on zero and successor (i.e., N ::= 0 | succ(N)): It is straightforward to translate zero and one compositionally:

⟦0⟧ = 0    ⟦1⟧ = succ(0)

Now we would like to translate

⟦m + n⟧ = ⟦+⟧(⟦m⟧, ⟦n⟧),

but there is no way to define J+K in terms of zero and successor. Instead, we need subcases:

Jm + nK = JmK   if n = 0
Jm + nK = succ(JmK)   if n = 1
Jm + nK = J(m + n1 ) + n2 K   if n = n1 + n2

This corresponds to the usual definition of addition, i.e., J+K, by induction.
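A Scala sketch (our own rendering of this example) makes the sub-induction explicit: the case for + must pattern-match on the shape of its second argument instead of only using the translations of the direct subexpressions:

sealed trait Num                             // syntax based on zero, one, and addition
case object Zero extends Num
case object One extends Num
case class Plus(m: Num, n: Num) extends Num

sealed trait SNum                            // semantics based on zero and successor
case object SZero extends SNum
case class Succ(n: SNum) extends SNum

// non-compositional: the Plus case sub-inducts on its second argument
def translate(e: Num): SNum = e match {
  case Zero                  => SZero
  case One                   => Succ(SZero)
  case Plus(m, Zero)         => translate(m)
  case Plus(m, One)          => Succ(translate(m))
  case Plus(m, Plus(n1, n2)) => translate(Plus(Plus(m, n1), n2))
}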

Other common examples of non-compositional translations are


• several important logical theorems such as
– cut elimination, which is the translation from sequent calculus with cut to sequent calculus without cut,
– the deduction theorem, which is the translation from natural deduction to Hilbert calculus,
• almost anything done by an optimizing compiler, e.g., loop unrolling or function inlining,
• query optimization done by a database, e.g., turning a WHERE of a join into a join of WHEREs,
• almost all translations between natural languages, e.g., when words are ambiguous and a different translation
must be chosen for the same word based on the context (The introduction of richer intermediate structures
like ASTs and functions as values into the translation can recover some compositionality here).
Typical sources of non-compositionality in formal language translations are:
• A case in the translation function requires subcases which inspect the ei and treat them differently.
• A case in the translation function requires subcases which translate an expression differently based on the
context in which it occurs.
• The translation function requires nested inductions, i.e., a case in the translation function (which is already
inductive) requires a sub-induction on one of the sub-expressions.
• The semantic prefix is not fixed but depends on the translated object, i.e., the top-level case of the translation
scans through the entire argument X to collect all occurrences of a particular feature and then custom-builds
the semantic prefix of JXK.
See also Ex. 7.6.
Such non-compositional translations are undesirable for multiple reasons:
• The implementation is more complicated and error-prone.
• Reasoning about the translation is more difficult.
• The custom semantic prefix can be large.
But most importantly, non-compositional translations are less robust. Firstly, if we add a production to the syntax,
a compositional translation is easy to extend: just add a case to the translation. But a non-compositional translation
may additionally require a new subcase wherever subcases/subinductions are used. Moreover, if a custom semantic
prefix is used, its definition may have to be amended, or at least rechecked.
Secondly, in practice there are two sources of complex expressions: the ones already mentioned in the language,
and the ones used later for other reasons. For example, some complex expressions occur already statically in the
definition of a vocabulary V . But others might appear dynamically later, e.g., when talking about V , proving
properties of V , or running queries on V . Thus, the definition of V and the use of complex expressions are decoupled:
V is defined statically once and for all, and complex expressions relative to V can be created and used dynamically.
But if a custom semantic prefix is used, only the static occurrences inside V can be considered for building the
prefix. Thus, it is not possible to translate expressions dynamically unless the semantic prefix is extended all the
time while V is used.
Chapter 5

Representing Data

5.1 Overview
This chapter presents ongoing research on developing infrastructure for the semantic representation and
interchange of data across systems. It is covered on the slides and in the lecture videos.
This includes Exercises 4 and 5.

Chapter 6

Ontologies

6.1 General Principles


Motivation An ontology is an abstract representation of the main concepts in some domain. Here domain refers
to any area of the real world such as mathematics, biology, diseases and medications, human relationships, etc.
Many examples can be found at https://bioportal.bioontology.org/, including the Gene Ontology, one of the
biggest.
Contrary to the other four aspects, ontological knowledge representations do not aim at capturing the entire
semantics of the domain objects. Instead, they focus on defining unique identifiers for those objects and describing
some of their properties and relations to each other.
We use the word ontologization to refer to the process of organizing the knowledge of a domain in ontologies.
Ontologies are most valuable when they are standardized (either sanctioned through a formal body or a quasi-
standard because everyone uses it). A standard ontology allows everybody in the domain to use the identifiers
defined by the ontology in a way that avoids misunderstandings. Thus, in the simplest form, an ontology can be
seen as a dictionary defining the technical terms of a domain. For example, the Gene ontology defines identifier
GO:0000001 to have the formal name ”mitochondrion inheritance” and the informal definition ”The distribution
of mitochondria, including the mitochondrial genome, into daughter cells after mitosis or meiosis, mediated by
interactions between mitochondria and the cytoskeleton.”.

Ontology Languages An ontology is written in an ontology language. Common ontology languages are
• description logics such as ALC,
• the W3C ontology language OWL, which is the standard ontology language of the semantic web,
• the entity-relationship model, which focuses on modeling rather than formal syntax,
• modeling languages like UML, which is the main ontology language used in software engineering.
Ontology languages are not committed to a particular domain — in the Tetrapod model, they correspond to
programming languages and logics, which are similarly uncommitted. Instead, an ontology language is a formal
language that standardizes the syntax of how ontologies can be written as well as their semantics.

Ontologies The details of the syntax vary between ontology languages. But as a general rule, every ontology
declares
• individuals — concrete objects that exist in the real world, e.g., ”Florian Rabe” or ”WuV”
• concepts — abstract groups of individuals, e.g., ”instructor” or ”course”
• relations — binary relations between two individuals, e.g., ”teach”
• properties — binary relations between an individual and a concrete value (a number, a date, etc.), e.g.,
”creditValue”
• concept assertions — the statement that a particular individual is an instance of a particular concept
• relation assertions — the statement that a particular relation holds about two individuals
• property assertions — the statement that a particular individual has a particular value for a particular
property
• axioms — statements about relations between concepts, typically in the form subconcept of statements like


”instructor” v ”person”
All assertions can be understood and spoken as subject-predicate-object triples as follows:

Assertion Triple
Subject Predicate Object
concept assertion ”Florian Rabe” is-a ”instructor”
relation assertion ”Florian Rabe” ”teach” ”WuV”
property assertion ”WuV” ”creditValue” 7.5

This uses a special relation is-a between individuals and concepts. Some languages group is-a with the other
binary relations between individuals for simplicity although it is technically a little different.
The possible values of properties must be fixed by the ontology language. Typically, it includes at least standard
types such as integers, floating point numbers, and strings. But arbitrary extensions are possible such as dates,
RGB-colors, lists, etc. In advanced languages, it is possible that the ontology even introduces its own basic types
and values.
Ontologies are often divided into two parts:
• The abstract part contains everything that holds in general, independently of the choice of individuals: concepts,
relations, properties, and axioms. It describes the general rules of how the world works without committing to
a particular set of inhabitants of the world. This part is commonly called the TBox (T for terminological).
• The concrete part contains everything that depends on the choice of individuals: individuals and assertions.
It populates the world with inhabitants. This part is commonly called the ABox (A for assertional).
A separate division into two parts is the following:
• The signature part contains everything that introduces a named entity: individuals, concepts, relations,
and properties.
• The theory part contains everything that describes which statements about the named entities are true:
assertions and axioms.

Synonyms Because these principles pervade all formal languages, many competing synonyms are used in different
domains. Common synonyms are:

Here OWL Description logics ER model UML semantics via logics


individual instance individual entity object, instance constant
concept class concept entity-type class unary predicate
relation object property role role association binary predicate
property data property (not common) attribute field of base type binary predicate

In particular, the individual-concept relation occurs everywhere and is known under many names:

domain individual concept


type theory, logic constant, term type
set theory element set
database row table
philosophy object property
grammar proper noun common noun

6.2 A Basic Ontology Language



Vocabularies: Ontologies
O ::= D∗

Declarations
D ::= individual ID atomic individual
| concept ID atomic concept
| relation ID atomic relation
| property ID : T atomic property
| I is-a C concept assertion
| I R I relation assertion
| I P V property assertion
| F other axioms

Formulas
F ::= C ≡ C concept equality
| C v C concept subsumption
| I is-a C concept formula
| I R I relation formula
| I P V property formula

Individual expressions
I ::= ID atomic individuals

Concept expressions
C ::= ID atomic concepts
| C t C union of concepts
| C u C intersection of concepts
| ∀R.C universal relativization
| ∃R.C existential relativization
| dom R domain of a relation
| rng R range of a relation
| dom P domain of a property

Relation expressions
R ::= ID atomic relations
| R ∪ R union of relations
| R ∩ R intersection of relations
| R ; R composition of relations
| R∗ transitive closure of a relation
| R−1 dual relation
| ∆C identity relation of a concept

Property expressions
P ::= ID atomic properties

Identifiers
ID ::= alphanumeric string

Basic types and values


T ::= int | float | bool | string types
V ::= (omitted) values

Figure 6.1: Syntax of BOL



We could study practical ontology languages like ALC or OWL now. But those feature a lot of other details that
can block the view onto the essential parts. Therefore, we first define a basic ontology language ourselves in order
to have full control over the details.

Definition 6.1 (Syntax of BOL). A BOL-ontology is given by the grammar in Fig. 6.1. It is well-formed if
• no identifier is declared twice,
• every property assertion assigns a value of the type required by the property declaration,
• every reference to an atomic individual/concept/relation/property is declared as such.

The above grammar exhibits some general structure that we find throughout formal KR languages. In particular,
an ontology consists of named declarations of four different kinds of entities as well as some assertions and
axioms about them. Each entity declaration clarifies which kind it is (in our case by starting with a keyword) and
introduces a new entity identifier. For each kind, there are complex expressions. These are anonymous and built
inductively; their base cases are references to the corresponding identifiers. Sometimes (in our case: individuals and
properties), the references are the only expressions of the kind. Sometimes (in our case: concepts and relations),
there can be many productions for complex expressions. The complex expressions are used to build axioms; in our
case, these are the three kinds of assertions and other formulas.
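As a preview of the implementation task in Exercise 2, the following Scala sketch (names are our own; only a fragment of the grammar is shown) illustrates how such a grammar is typically implemented: one type per non-terminal, one case class per production, with identifier references as the base cases:

case class Ontology(decls: List[Decl])       // O ::= D*

sealed trait Decl
case class IndividualDecl(id: String) extends Decl
case class ConceptDecl(id: String) extends Decl
case class ConceptAssertion(i: Ind, c: Concept) extends Decl

sealed trait Ind
case class AtomicInd(id: String) extends Ind // reference to a declared identifier

sealed trait Concept
case class AtomicConcept(id: String) extends Concept
case class Union(c1: Concept, c2: Concept) extends Concept        // C t C
case class Intersection(c1: Concept, c2: Concept) extends Concept // C u C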

Remark 6.2 (Formulas vs. Assertions). In Fig. 6.1, the three productions in gray are duplicated: they occur both
as assertions and as formulas.
We could remove the three productions for assertions and treat them as special cases of axioms. But we keep the
duplication here because assertions are often treated differently from the other axioms. They are grouped with
the individuals in the ABox whereas the other axioms are seen as part of the TBox. Moreover, when used as
assertions, they may have to be interpreted differently than when used as formulas as we will see in Ch. 7.
Alternatively, we could remove the three productions in gray. But then we would lose the ability to talk about
formulas that are not true. That will become relevant in Ch. 9.

Axioms The role of the axioms is two-fold:


• They can be used to perform consequence closure: a formula may express a closure operation that defines
assertions that are automatically added to the ontology. That can be difficult as some kind of exhaustive
reasoning is needed. For example, if there is a subconcept axiom ”instructor” v ”person” and a concept
assertion ”FlorianRabe” is-a ”instructor”, we have to add the implied concept assertion ”Florian Rabe” is-a
”person”.
• They can be used to express consistency conditions that must not be violated by the ontology. For example,
ontologies may contain contradictory assertions or violations of uniqueness constraints such as a person should
only have one father or fathers should be male. The axioms exclude such cases. That should succeed if the
assertions are already consequence-closed.
But spelling out how that works is part of the semantics, not the syntax.

Example 6.3. We give a simple ontology that could be used to represent knowledge in the context of a university:
individual FlorianRabe
individual WuV
concept person
concept male
concept instructor
concept course
relation teach
property creditValue : float
FlorianRabe is-a instructor u male
WuV is-a course
FlorianRabe teach WuV
WuV creditValue 7.5
male v person
instructor v person
dom teach v instructor
rng teach v course
dom creditValue ≡ course
course v ∃ teach−1 . instructor
The axioms are meant to state that males and instructors are persons, teaching is done by instructors to courses,
exactly the courses have credits, and (the last axiom) every course is taught by at least one instructor. Whether
they actually do mean that, depends on the semantics.
The consequence closure (as defined by the semantics) should add the assertion FlorianRabe is-a person.
Alternatively, if we use the axioms for consistency checking, we should add that assertion from the beginning.
Otherwise, the axioms would not be true.
If we use axioms for the consequence closure, we can even omit the two concept assertions — they should be
inferred using the domain and range axioms for the relation.
The assertion FlorianRabe is-a instructor u male could also be split into two assertions
FlorianRabe is-a instructor and FlorianRabe is-a male. That will be important as some semantics might
have difficulties handling all cases. So it can be helpful to use a variant that does not need the u operator.

6.3 Representing Ontologies as Triples


It is common to represent an entire ontology as a set of subject-predicate-object triples. That makes handling
ontologies very simple and efficient. This is the preferred representation of the semantic web.
However, while, e.g., relation assertions are naturally triples, not all declarations are, and some tricks may be
necessary.

Inferring the Entity Declarations The entity declarations are not naturally triples. But we can usually infer
them from the assertions as follows: any identifier that occurs in a position where an entity of a certain kind is
expected is assumed to be declared as an entity for that kind.
For example, the individuals are what occurs as the subject of a concept, relation, or property assertion or as the
object of a relation assertion. It is conceivable that there are individuals that occur in none of these. But that is
unusual because they would be disconnected from everything in the ontology.
If we give TBox and ABox together, this inference approach usually works well. But if we only give a TBox, this
would often not allow inferring all entities. The only place where they could occur in the TBox is in the axioms,
and it is quite possible to have concept, relation, and property declarations that are not used in the axioms. In
fact, it is not unusual not to have any axioms.
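A sketch of this inference in Scala (with a hypothetical Triple type of our own): every subject is an individual, and so is every object of a non-is-a predicate. Note that a real implementation would have to filter out property assertions, whose objects are values rather than individuals:

case class Triple(subj: String, pred: String, obj: String)

// infer the individual declarations from a set of assertion triples
def inferIndividuals(triples: List[Triple]): Set[String] =
  triples.flatMap {
    case Triple(s, "is-a", _) => List(s)     // subject of a concept assertion
    case Triple(s, _, o)      => List(s, o)  // subject/object of a relation assertion
  }.toSet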

Special Predicates To turn declarations into triples, we can use reflection, i.e., the process of talking about our
language constructs as if they were data.
Reflection requires introducing some built-in entities that represent the features of the language. In the semantic
web area, this is performed using the following entities:
• ”rdfs:Resource”: a built-in concept of which all individuals are an instance and thus of which every concept
is a subconcept
• ”rdf:type”: a special predicate that relates an entity to its type:
– an individual to its concept (corresponding to is-a above)
– other entities to their special type (see below)
• ”rdfs:Class”: a special class to be used as the type of classes
• ”rdf:Property”: a special class to be used as the type of properties
• ”rdfs:subClassOf”: a special relation that relates a subconcept to a superconcept
• ”rdfs:domain”: a special relation that relates a relation to the concepts of its subjects
• ”rdfs:range”: a special relation that relates a relation/property to the concept/type of its objects
Here ”rdf” and ”rdfs” refer to the RDF (Resource Description Framework) and RDFS (RDF Schema) namespaces,
which correspond to W3C standards defining those special entities.
Thus, we can represent many and in particular the most important entity declarations as triples:

Assertion Triple
Subject Predicate Object
individual individual ”rdf:type” ”rdfs:Resource”
concept concept ”rdf:type” ”rdfs:Class”
relation relation ”rdf:type” ”rdf:Property”
property property ”rdf:type” ”rdf:Property”
concept assertion individual ”rdf:type” concept
relation assertion individual relation individual
property assertion individual property value
for special forms of axioms
c v d c ”rdfs:subClassOf” d
dom r ≡ c r ”rdfs:domain” c
rng r ≡ c r ”rdfs:range” c

This is subject to the restriction that only atomic concepts and relations can be handled. For example, only concept
assertions can be handled that make an individual an instance of an atomic concept. This is particularly severe for
axioms, where complex expressions occur most commonly in practice. Here, the special relations allow capturing
the most common axioms as triples.

Problems Reflection is subtle and can easily lead to inconsistencies. We can see this in how the approach of
RDF(S) special entities breaks the semantics via FOL.
For example, it treats classes both as concepts (when they occur as the object of a concept assertion) and as
individuals (when they occur as subject or object of a ”rdfs:subClassOf” relation assertion). Similarly, ”rdfs:Class”
is used both as an individual and as a class. In fact, the standard prescribes that ”rdfs:Class” is an instance of
itself.
In practice, this is handled pragmatically by using ontologies that make sense. A formal way to disentangle this is
to assume that there are two variants of ”rdfs:Class”, one as an individual and one as a class. The translation must
then translate ”rdfs:Class” differently depending on how it is used.
It would be better if RDFS were described in a way that is consistent under the implicitly intended FOL semantics.
But the more pragmatic approach has the advantage of being more flexible. For example, being able to treat every
class, relation, or property also as an individual makes it easy to annotate metadata to them. Metadata is a set of
properties such as ”rdfs:seeAlso” or ”owl:versionInfo”, whose subjects can be any entity.

Subject-Centered Representations When giving a set of triples, there are usually a lot of triples with the
same subject. For example, we could use a simple concrete syntax with one triple per line and whitespace separating
subject, predicate, and object:
”FlorianRabe” is-a ”instructor”
”FlorianRabe” is-a ”male”
”FlorianRabe” ”teach” ”WuV”
”FlorianRabe” ”teach” ”KRMT”
”FlorianRabe” ”age” 40
”FlorianRabe” ”office” ”11.137”
It is more human-friendly to group these triples in such a way that the subject only has to be listed once. For
example, we could use a concrete syntax like this, where the subject occurs first and then predicate-object pairs
occur on indented lines:
”FlorianRabe”
  is-a ”instructor”
  is-a ”male”
  ”teach” ”WuV”
  ”teach” ”KRMT”
  ”age” 40
  ”office” ”11.137”
If the same predicate occurs with multiple values, we can group those as well. For example, we could give the
objects for the same predicates as a list following the predicate:

”FlorianRabe”
  is-a ”instructor” ”male”
  ”teach” ”WuV” ”KRMT”
  ”age” 40
  ”office” ”11.137”

Concrete syntaxes based on the triple representation of ontologies will usually adopt some kind of structure like
this. The details may vary.
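One possible implementation of such a grouping in Scala (reusing the hypothetical Triple type from the sketch above) collects, for every subject, the list of objects per predicate:

// group triples by subject, then by predicate, collecting the objects
def groupBySubject(triples: List[Triple]): Map[String, Map[String, List[String]]] =
  triples.groupBy(_.subj).map { case (s, ts) =>
    s -> ts.groupBy(_.pred).map { case (p, ps) => p -> ps.map(_.obj) }
  }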

6.4 Writing Ontologies


6.4.1 The OWL Language
Abstract Syntax and Semantics Due to their central role in knowledge representation, a number of languages for
writing ontologies exist. Most importantly, the syntax and semantics of OWL, including several sublanguages, are
standardized by the W3C.
OWL includes a number of built-in special entities. Most importantly, ”owl:Thing” corresponds to ”rdfs:Resource”
as the concept of all individuals.

Concrete Syntax Several concrete syntaxes have been defined and are commonly used for OWL. The OWL2
primer2 systematically describes examples in five different concrete syntaxes.
APIs for OWL implement the abstract syntax along with good support for reading/writing ontologies in any of the
concrete syntaxes.

6.4.2 The Protege Tool


A widely used tool for writing ontologies in OWL is Protege3 .
To get started with Protege without getting confused, we need to understand how its key terminology
maps to other contexts.

Here Protege Edited in WebProtege via


individual individual listed in ”Individuals” tab
concept class listed in ”Classes” tab
relation object property listed in ”Properties” tab
property data property listed in ”Properties” tab
concept assertion Type detail area of the individual in ”Individuals” tab
relation assertion Relationship detail area of the subject in ”Individuals” tab
property assertion Relationship detail area of the subject in ”Individuals” tab

Protege’s interface treats some parts of the ontology specially:


• The ”Classes” tab organizes concepts using a tree view based on the subconcept relationship. Superclasses of
a class can also be edited directly by listing parents.
• The ”Properties” tab organizes properties using a tree view based on the subproperty (i.e., implication, subset)
relationship.
• Axioms describing the domain and range of a property can be given directly in its details view.
Note that classes can be in relationships with other classes as well even though that was not considered in the
course so far.

6.4.3 Exercise 1
The topic of Exercise 1 is to use Protege to write an OWL ontology for a university.
2 https://www.w3.org/TR/2012/REC-owl2-primer-20121211/
3 https://protege.stanford.edu/

Protege is a graphical editor for the abstract syntax of OWL. Familiarize yourself with the various concrete syntaxes
of OWL by writing an ontology that uses every feature once, downloading it in all available concrete syntaxes, and
comparing those.
The minimal goal of the exercise session is to get a Hello World example going, at which point the task transitions
into homework. There will be no homework submission, but you will use your ontology throughout the course.
You should make sure you understand and set up the process in a way that supports you when you revisit and
change your ontology many times throughout the semester.
Other than that, the task is deliberately unconstrained to mimic the typical situation at the beginning of a big
project, where it is unclear what the ultimate requirements will be.
Chapter 7

Semantics for BOL

7.1 Overview
In the sequel, we give four different semantics of BOL — using the four other aspects:

Section Aspect kind of semantic language semantic language


7.2 deduction logic SFOL
7.3 concretization database language SQL
7.4 computation programming language Scala
7.5 narration natural language English

All semantics will be compositional.


Note that we could also give an ontological semantics of BOL, e.g., by using OWL as the semantic language. BOL
and OWL are already so similar that the translation would be rather straightforward. Therefore, we omit it.

7.2 Deductive Semantics


We fix one language that we have already understood and define an interpretation function that maps all complex
expressions of BOL to the semantic language. For simple ontology languages like BOL, ALC, OWL, etc., it is
common to use first-order logic (FOL) as the deductive semantic language. More specifically, we use SFOL, the
typed variant of FOL:

7.2.1 A Basic Semantic Language: SFOL


We give typed first-order logic (SFOL) as a language system.

Definition 7.1. Fig. 7.1 gives the context-free grammar. The vocabulary symbol is T hy. The expression symbols
are Y , T , and F .

7.2.2 Semantics

Definition 7.2 (Deductive Semantics of BOL). The semantic prefix is the SFOL-theory containing
• a type ι (for individuals),
• additional types and constants corresponding to base types and values of BOL.
Every BOL-ontology O is interpreted as the SFOL-theory P, JOK, where JOK is defined in Fig. 7.2.

As foreshadowed above, we can observe some general principles: Every BOL-declaration is translated to an SFOL-
declaration for the same name, and ontologies are translated declaration-wise. For every kind of complex BOL-
expression, there is one inductive function mapping BOL-expressions to SFOL-expressions. The base cases of


Vocabularies: theories
T hy ::= D∗

Declarations
D ::= type ID : type type declaration
| fun ID : Y ∗ → Y function symbol declaration
| pred ID : Y ∗ → prop predicate symbol declaration
| axiom F axiom

type expressions
Y ::= ID atomic type

term expressions
T ::= ID(T ∗ ) function symbol applied to arguments
| ID term variables

formula expressions
F ::= ID(T ∗ ) predicate symbol applied to arguments
| T ≐Y T equality of terms at a type
| > truth
| ⊥ falsity
| F ∧F conjunction
| F ∨F disjunction
| F ⇒F implication
| ¬F negation
| ∀ID : Y.F universal quantification at a type
| ∃ID : Y.F existential quantification at a type

Identifiers
ID ::= alphanumeric string

Figure 7.1: Syntax of SFOL

references to declared BOL-identifiers are translated to themselves, i.e., to the identifiers of the same name declared
in the SFOL-theory. The other cases are compositional: every case for a complex BOL-expression recurses only
into the semantics of the direct subexpressions.
The consequence closure of SFOL, using the usual semantics of SFOL, induces the desired consequence closure for
BOL:

Definition 7.3 (Consequence Closure). We say that a BOL-statement F is a consequence of an ontology O if


JF K is an SFOL-theorem of P, JOK.

Example 7.4. We interpret the example ontology from Ex. 6.3. Excluding the semantic prefix, it results in

FlorianRabe : ι, WuV : ι, person ⊆ ι, male ⊆ ι, instructor ⊆ ι, course ⊆ ι, teach ⊆ ι × ι, creditValue ⊆ ι × float


instructor(FlorianRabe) ∧ male(FlorianRabe), course(WuV), teach(FlorianRabe, WuV), creditValue(WuV, 7.5),
∀x : ι.male(x) ⇒ person(x),
∀x : ι.instructor(x) ⇒ person(x),
∀x : ι.(∃y : ι.teach(x, y)) ⇒ instructor(x),
∀x : ι.(∃y : ι.teach(y, x)) ⇒ course(x),
∀x : ι.(∃y : float.creditValue(x, y)) ⇔ course(x),
∀x : ι.course(x) ⇒ ∃y : ι.teach(y, x) ∧ instructor(y)

BOL Syntax X Semantics JXK in SFOL


ontology SFOL theory
D1 , . . . , Dn JD1 K, . . . , JDn K
BOL declaration FOL declaration
individual i nullary function symbol i : ι
concept i unary predicate symbol i ⊆ ι
relation i binary predicate symbol i ⊆ ι × ι
property i : T binary predicate symbol i ⊆ ι × T
I is-a C axiom JCK(JIK)
I1 R I2 axiom JRK(JI1 K, JI2 K)
IP V axiom JP K(JIK, JV K)
F axiom JF K
Formula Formula without free variables
C1 ≡ C2 ∀x : ι.JC1 K(x) ⇔ JC2 K(x)
C1 v C2 ∀x : ι.JC1 K(x) ⇒ JC2 K(x)
I is-a C JCK(JIK)
I1 R I2 JRK(JI1 K, JI2 K)
IP V JP K(JIK, JV K)
Individual Terms of type ι
i i
Concept Formula with free variable x : ι
i i(x)
C1 t C2 JC1 K(x) ∨ JC2 K(x)
C1 u C2 JC1 K(x) ∧ JC2 K(x)
∀R.C ∀y : ι.JRK(x, y) ⇒ JCK(y)
∃R.C ∃y : ι.JRK(x, y) ∧ JCK(y)
dom R ∃y : ι.JRK(x, y)
rng R ∃y : ι.JRK(y, x)
dom P ∃y : T.JP K(x, y) (T is type of P )
Relation Formula with free variables x : ι, y : ι
i i(x, y)
R1 ∪ R2 JR1 K(x, y) ∨ JR2 K(x, y)
R1 ∩ R2 JR1 K(x, y) ∧ JR2 K(x, y)
R1 ; R2 ∃m : ι.JR1 K(x, m) ∧ JR2 K(m, y)
R−1 JRK(y, x)
R∗ (tricky, omitted)
∆C x ≐ y ∧ JCK(x)
Property of type T Formula with free variables x : ι, y : T
i i(x, y)

Figure 7.2: Interpretation Function for BOL into SFOL



Example 7.5 (Compositionality). The interpretation of BOL is compositional.


For example, consider the case of composition of relations:

JR1 ; R2 K = ∃m : ι.JR1 K(x, m) ∧ JR2 K(m, y)

Here we have an expression E(e1 , . . . , en ) with n = 2 and E is the ;-operator mapping (e1 , e2 ) 7→ e1 ; e2 , i.e., R1
and R2 are the direct subexpressions of R1 ; R2 . The semantics is a relatively complicated FOL-formula, but it
only depends on JR1 K and JR2 K — everything else is fixed. We have J; K = (p1 , p2 ) 7→ ∃m : ι.p1 (x, m) ∧ p2 (m, y),
i.e., the interpretation of the ;-operator is the function that maps two predicates p1 , p2 to the formula ∃m :
ι.p1 (x, m) ∧ p2 (m, y). Then we have
JR1 ; R2 K = J; K(JR1 K, JR2 K).

Example 7.6 (Non-Compositional Translation via Custom Semantic Prefix). In Fig. 7.2, we omitted the case for
the transitive closure. That was because it is not possible to translate it compositionally into FOL. We can only
do it non-compositionally with a custom semantic prefix:
We define the FOL-interpretation of an ontology O by JOK = PO , JOK, where PO is a custom semantic prefix. PO
is different for every ontology O and is defined as follows:
1. We scan through O and collect all occurrences of R∗ for any (not necessarily atomic) relation R.
2. PO contains the following declarations for each R:
• A binary predicate symbol CR ⊆ ι × ι. Note that R may be a complex expression; so we have to
generate a fresh name CR here.
• The axiom ∀x : ι, y : ι. R(x, y) ⇒ CR (x, y), i.e., CR extends R.
• The axiom ∀x : ι, y : ι, z : ι. CR (x, y) ∧ CR (y, z) ⇒ CR (x, z), i.e., CR is transitive.
3. We add the case JR∗ K = CR (x, y) to the interpretation function.
Intuitively, every occurrence of the ∗ -operator is removed from the language and replaced with a fresh name that
is axiomatized to have the needed properties. All of these axioms are added to the semantic prefix.
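The following Scala sketch (our own; it assumes a relation AST with a constructor for R∗ , as one would have after implementing the BOL syntax) shows how such a custom semantic prefix could be generated:

sealed trait Rel
case class AtomicRel(name: String) extends Rel
case class TransClosure(r: Rel) extends Rel  // R*

def render(r: Rel): String = r match {
  case AtomicRel(n)    => n
  case TransClosure(s) => render(s) + "*"
}

// generate the declarations of the custom semantic prefix P_O from all
// relations R that occur under a transitive closure in the ontology
def prefixFor(closedRels: List[Rel]): List[String] =
  closedRels.distinct.zipWithIndex.flatMap { case (r, i) =>
    val c = s"C$i"                           // fresh name for the closure of r
    List(
      s"pred $c : ι × ι → prop",
      s"axiom ∀x:ι,y:ι. ${render(r)}(x,y) ⇒ $c(x,y)",
      s"axiom ∀x:ι,y:ι,z:ι. $c(x,y) ∧ $c(y,z) ⇒ $c(x,z)"
    )
  }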

7.3 Concretized Semantics


We give an alternative semantics using a semantic language for concrete data. Specifically we use the database
language SQL.

7.3.1 An SQL-Inspired Basic Database Language


We give an SQL-like database language as a language system.

Definition 7.7. Fig. 7.3 gives the context-free grammar. The vocabulary symbol is S. The expression symbols
are T , R, V , and F .

7.3.2 Semantics
Even though this is a very different knowledge aspect, the general principles of the semantics are the same: Every
BOL-declaration is translated to an SQL declaration, and ontologies are translated declaration-wise. For every kind
of complex expression, there is one inductive function mapping BOL-expressions to SQL-expressions.
In SQL, we can nicely see the difference between declarations and expressions: the former are translated to side
effect-ful statements, the latter to side effect-free queries.

Definition 7.8 (Concretized Semantics of BOL). The semantic prefix consists of the following SQL-statements
• a type ID of identifiers (if not already supported anyway by the underlying database)
• declarations of all base types and values of BOL (if not already supported anyway by the underlying database)
• CREATE TABLE individuals (id ID, name string), where the id field is unique and automatically generated
when inserting values

Every BOL-ontology O is interpreted as the sequence P, JOK of SQL statements, where JOK is defined in Fig. 7.4.

Remark 7.9 (Limitations). Our interpretation of BOL in SQL is restricted to assertions using only atomic expres-
sions. For example, in the case for I is-a C, we assume that I and C are names. Thus, we have already created
an individual for I and a table for C, and we can thus insert the former into the latter. The general case would
be more complicated but is much less important in practice. But other expressions very quickly become more
difficult.
The interpretation of formulas into SQL is less obvious because SQL is not a logic and therefore does not define a
consequence closure. Thus, we can only use axioms for consistency checks in SQL. But that requires first carrying
out an explicit consequence closure that adds all implied assertions to the database.

Example 7.10. We interpret the example ontology from Ex. 6.3. Excluding the semantic prefix, the entity decla-
rations and assertions result in the following
INSERT INTO individuals (name) VALUES (”FlorianRabe”)
INSERT INTO individuals (name) VALUES (”WuV”)
CREATE TABLE person (id ID)
CREATE TABLE male (id ID)
CREATE TABLE instructor (id ID)
CREATE TABLE course (id ID)
CREATE TABLE teach (subject ID, object ID)
CREATE TABLE creditValue (subject ID, object float)
INSERT INTO course VALUES (2)
INSERT INTO teach VALUES (1, 2)
INSERT INTO creditValue VALUES (2, 7.5)
Here we assume that inserting into the table individuals has automatically assigned the ids 1 and 2 to our two
individuals.
The concept assertion about FlorianRabe using u cannot be handled by this semantics. Therefore, we skip that
assertion. The two missing assertions
INSERT INTO instructor VALUES (1)
INSERT INTO male VALUES (1)
must then be provided by performing the consequence closure.
Moreover, the axioms result in the following consistency checks, i.e., queries that should be empty:
SELECT * FROM male \ SELECT * FROM person
SELECT * FROM instructor \ SELECT * FROM person
SELECT * FROM (SELECT DISTINCT subject FROM teach) \ SELECT * FROM instructor
SELECT * FROM (SELECT DISTINCT object FROM teach) \ SELECT * FROM course
(SELECT * FROM (SELECT DISTINCT subject FROM creditValue) \ SELECT * FROM course)
UNION (SELECT * FROM course \ SELECT DISTINCT subject FROM creditValue)
SELECT * FROM course \
(SELECT DISTINCT subject
 FROM (SELECT object AS subject, subject AS object FROM teach), instructor
 WHERE object=id)

Some of these checks will only succeed after performing the consequence closure. In particular, the table person
misses the entry 1 for the individual FlorianRabe because the assertion FlorianRabe is-a person is only present
as a consequence.

7.4 Computational Semantics


We give an alternative semantics using computation, i.e., by using a programming language as the semantic language.
Specifically, we use the programming language Scala.

7.4.1 A Scala-Inspired Basic Programming Language


We give a simple programming language as a language system.

Definition 7.11. Fig. 7.5 gives the context-free grammar. The vocabulary symbol is P . The expression symbols
are Y , V , and F .

7.4.2 Semantics
Again, the general principles are the same: Every BOL-declaration is translated to a Scala-declaration, and ontolo-
gies are translated declaration-wise to Scala-programs. For every kind of complex expression, there is one inductive
function mapping BOL-expressions to Scala-objects.

Definition 7.12 (Computational Semantics of BOL). The semantic prefix consists of the following Scala state-
ments
• classes for all BOL-base types and values for them (if not already present in Scala)
• classes for individuals and hash sets of objects:
import scala.collection.mutable.HashSet
val individuals = new HashSet[String]
Every BOL-ontology O is interpreted as the Scala program P, JOK, where JOK is defined in Fig. 7.6.

Remark 7.13 (Scala Syntax). In Scala, val x = e evaluates e and stores the result in x. {d1 ; . . . ; dn } is evaluated
by executing all di in order and returning the result of dn .
(A, B) is the product type A × B with pairing operator (x, y) and projection functions _1 and _2. x ⇒ F (x) is
λx.F (x).
The class HashSet is part of the standard library and offers functions += and -= to add/remove elements, contains
to test elementhood, and forall, foreach to quantify/iterate over elements.
Types of variables are inferred if omitted.
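For instance, the following small usage example (ours) exercises the HashSet operations used below:

import scala.collection.mutable.HashSet

val s = new HashSet[String]
s += "a"                          // add an element
s += "b"
val b1 = s.contains("a")          // true: test elementhood
val b2 = s.forall(_.nonEmpty)     // true: quantify over all elements
s -= "b"                          // remove an element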

Remark 7.14 (Limitations). Our interpretation of BOL in Scala has similar problems as the one in SQL. We
restrict entities in assertions to be atomic. And we assume that all assertions implied by the consequence closure
have already been obtained and added to the ontology.
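A naive sketch (our own, not part of the definition) of how such a consequence closure could be computed for subconcept axioms C1 v C2 , with each axiom represented as a pair of hash sets: repeatedly propagate members along the axioms until a fixed point is reached.

import scala.collection.mutable.HashSet

// each axiom (c1, c2) reads: every member of c1 must also be in c2;
// assumes c1 and c2 are distinct sets
def close(axioms: List[(HashSet[String], HashSet[String])]): Unit = {
  var changed = true
  while (changed) {
    changed = false
    for ((c1, c2) <- axioms; x <- c1 if !c2.contains(x)) {
      c2 += x
      changed = true
    }
  }
}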

Example 7.15. We interpret the example ontology from Ex. 6.3. Excluding the semantic prefix, the entity decla-
rations and assertions result in the following
individuals += ”FlorianRabe”
individuals += ”WuV”
val person = new HashSet[String]
val male = new HashSet[String]
val instructor = new HashSet[String]
val course = new HashSet[String]
val teach = new HashSet[(String, String)]
val creditValue = new HashSet[(String, float)]
course += ”WuV”
teach += (”FlorianRabe”, ”WuV”)
creditValue += (”WuV”, 7.5)

Like for SQL, the two statements


instructor += ”FlorianRabe”
male += ”FlorianRabe”

must be obtained by consequence closure because we cannot handle the u assertion. Note that we could easily
compute the hash set instructor.inter(male) and add to it. But that would not add anything to the two
constituent sets.
If we think of the axioms as consistency checks, we can translate them to assertions, i.e., Boolean expressions that
must be true. We only give some examples:
{ val c1 = male; val c2 = person; c1.forall(x ⇒ c2.contains(x)) }

{
  val c1 = course;
  val c2 = {
    val c = instructor;
    val r = {
      val r = new HashSet[(String, String)];
      teach.foreach(x ⇒ r += (x._2, x._1));
      r
    }
    val e = new HashSet[String];
    r.foreach(x ⇒ if (c.contains(x._2)) e += x._1);
    e
  };
  c1.forall(x ⇒ c2.contains(x))
}

7.5 Narrative Semantics


We give an alternative semantics using narration, i.e., by using a natural language as the semantic language.
Specifically, we use the natural language English.
Again, the general principles are the same: Every BOL-declaration is translated to an English sentence, and
ontologies are translated declaration-wise to English texts. For every kind of complex expression, there is one
inductive function mapping BOL-expressions to English phrases.

Definition 7.16 (Narrative Semantics of BOL). The semantic prefix consists of English statements explaining
• the base types of BOL (if they are not universally known),
• that we rely on a lexicon to correctly form plurals (indicated by -s) and verb forms (indicated by -s, -ing,
-ed).
Every BOL-ontology O is interpreted as the English text P, JOK, where JOK is defined in Fig. 7.7.

Natural language defines a consequence closure by appealing to consequence in natural language. That is well-
defined as long as we express ourselves precisely enough.

Definition 7.17 (Consequence Closure). We say that a BOL-statement F is a consequence of an ontology O if


JF K is a consequence of P, JOK.

Example 7.18. We interpret the example ontology from Ex. 6.3. Excluding the semantic prefix and the lexicon
lookup, it results in the following text:
FlorianRabe is a proper noun.
WuV is a proper noun.
person is a common noun.
male is a common noun.
instructor is a common noun.
course is a common noun.
teach is a transitive verb.

creditValue is a common noun for a property that can take float-values.


FlorianRabe is a instructor and is a male.
WuV is a course.
FlorianRabe teachs WuV.
WuV has creditValue 7.5.
everything that is a male also is a person.
everything that is a instructor also is a person.
everything that teachs is a instructor.
everything that is teached by something is a instructor.
has some creditValue is the same as is a course.
everything that is a course also is teached by something that is a instructor.
This English is very clunky of course. Multiple tweaks would be needed to get the grammar right:
• It is ”teaches” and ”taught” instead of ”teachs” and ”teached”,
• It is ”an instructor” instead of ”a instructor”,
• Sentences start with upper case letters.
• Proper nouns often have different names in the ontology than in reality, e.g., it should be ”Florian Rabe”
and ”credit value” instead of ”FlorianRabe” and ”creditValue”.
Moreover, the language could be polished in many places. For example, ”is a instructor and is a male” could
become ”is a instructor and a male” with a relatively easy special case treatment, or it could become ”is a male
instructor” with a more complex semantics that interprets some concepts via nouns and some via adjectives.

Remark 7.19 (Variants of English). We are relatively open as to what kind of English we want to use as the
semantic language. The simplest choice would be to use plain English as you could find in a novel or newspaper
article. But for many applications (e.g., formal ontologies in the STEM fields), we would rather use STEM English,
i.e., English interspersed with formulas, diagrams, and epistemic cues like “Definition”, “Theorem”, and “Proof”.
For this kind of English, LaTeX is a good target format. We can even use special LaTeX dialects like
sTeX [Koh08] where we can capture more of the semantic properties.

Remark 7.20 (Better Language Generation). While the target languages in the other translations are formal
languages engineered for regularity and simplicity (in terms of language primitives), natural languages have evolved
in practical human communication. As a consequence, the translation in Def. 7.16 results in English that is clumsy
at best and non-grammatical in general. We can think of the result as BOL-pidgin English.
Let us have a look at some of the problems that appear in both translations:

• We would need a lexicon to obtain inflection information; the translation tries to remedy its absence by appending
“s” in various places. This works in some cases but not in others.
• There are many linguistic devices that serve an important role in natural language, but which we are not
targeting. An example is plural objects for aggregation. Say we have P is-a C and M is-a C; this would
translate to “P is a JCK, M is a JCK” in BOL-pidgin, whereas in natural English we would aggregate this to
“P and M are JCKs”.

A way out is to utilize special systems for dealing with the surface structure of natural language. An example of this
is the Grammatical Framework (GF, [Ran11]): it allows specifying a rich formal language of abstract syntax
trees for natural language (ASTs) together with language-specific linearizations, which amount to recursive
functions that translate ASTs to language-specific strings. GF comes with a large resource library that provides
a comprehensive, language-independent AST specification and linearizations for over 35 languages. We will not
pursue this here, but there is a special course “Logic-based Natural Language Semantics” at FAU in the Winter
Semesters that covers these and related topics. One of the major issues that need to be addressed there and here
is the notion of compositionality, which is central to all processing and semantics. We will address it next, and
come back to it time and again later.

7.6 Exercise 2
Implement the syntax and semantics of BOL.

You can choose the programming language to use. We will use Scala in our examples.
You can choose which semantics to implement. The ones that translate directly to Scala or to English are easier
because they do not require implementing the syntax of the target language as well. The ones that translate to FOL
or to SQL require an implementation of the syntax of the respective target language. You have to implement that
as well or use a library for it.
We recommend not focusing on implementing the syntax and semantics in their entirety. It is more instructive to
save time by choosing a sublanguage of BOL (by omitting some productions) and to use the time to implement a
second semantics.

Vocabularies: Schemas
S ::= D∗

Declarations
D ::= TABLE ID {CT ∗ } table
| INSERT R INTO ID row in a table
CT ::= ID : Y column type

table expressions (i.e., table-valued queries)


T ::= ID atomic tables
| JOIN T ∗ join of tables
| UNION T ∗ union of tables
| INTER T ∗ intersection of tables
| SELECT ID∗ FROM T WHERE F subtable
| T AGGREGATE CA∗ aggregation
CA ::= COUNT(ID) count values in a column
| MAX(ID) maximum value in a column
| ... minimum, sum, etc.

row expressions (i.e., single row–valued queries)


R ::= VALUES (CD∗ ) explicit row
| T table, but may only contain one row
CD ::= ID = V column definition

cell expressions (i.e., single column, single row–valued queries)


V ::= R row, but may only contain one column
| ID(V ∗ ) built-in function symbol applied to values

formulas
F ::= V boolean value
| V =V equality of values
| R IN T containment of rows in tables
| ... boolean operators

Identifiers
ID ::= alphanumeric string

Figure 7.3: Syntax of BDL



BOL Syntax X Semantics JXK in SQL


ontology SQL statements
D1 , . . . , Dn JD1 K, . . . , JDn K
BOL declaration (I, C, R, P atomic) SQL statement
individual i INSERT INTO individuals (name) VALUES (”i”)
concept i CREATE TABLE i (id ID)
relation i CREATE TABLE i (subject ID, object ID)
property i : T CREATE TABLE i (subject ID, object T )
I is-a C INSERT INTO C VALUES (JIK)
I1 R I 2 INSERT INTO R (subject, object) VALUES (JI1 K, JI2 K)
IP V INSERT INTO P (subject, object) VALUES (JIK, V )
F consistency check, consequence closure (omitted)
Formula Query that returns empty result iff formula is true
C1 ≡ C2 (JC1 K \ JC2 K) UNION (JC2 K \ JC1 K)
C1 v C2 JC1 K \ JC2 K
I is-a C JIK IN C
I1 R I2 (JI1 K, JI2 K) IN R
IP V (JIK, V ) IN P
Individual an identifier from the table individuals
i SELECT id FROM individuals WHERE name=”i”
Concept SQL query for one-column table
i SELECT * FROM i
C1 t C2 JC1 K UNION JC2 K
C1 u C2 JC1 K INTERSECT JC2 K
∀R.C individuals \ (SELECT subject FROM JRK WHERE object NOT IN JCK)
∃R.C SELECT DISTINCT subject FROM JRK, JCK WHERE object=id
dom R SELECT DISTINCT subject FROM JRK
rng R SELECT DISTINCT object FROM JRK
dom P SELECT DISTINCT subject FROM JP K
Relation SQL query for two-column table
i SELECT * FROM i
R1 ∪ R2 JR1 K UNION JR2 K
R1 ∩ R2 JR1 K INTERSECT JR2 K
R1 ; R2 SELECT DISTINCT l.subject, r.object FROM JR1 K AS l, JR2 K AS r
WHERE l.object = r.subject
R−1 SELECT object AS subject, subject AS object FROM JRK
R∗ (tricky, omitted)
∆C SELECT id AS subject, id AS object FROM JCK
Property of type T SQL query for two-column table
i SELECT * FROM i

Using the abbreviation: S \ T = SELECT id FROM S WHERE id NOT IN T

Figure 7.4: Interpretation Function for BOL into SQL



Vocabularies: Programs
P ::= D∗

Declarations
D ::= class ID[ID∗ ] extends ID∗ {d∗ } class definition
| object ID extends ID∗ {CF ∗ } object definition
d ::= val ID : Y [ = T ] immutable field in a class/object, possibly abstract
| var ID : Y = T mutable field in a class, with initial value

type expressions
Y ::= ID atomic type (class)
| ID built-in type (booleans, int, etc.)
| Y (Y ∗ ) type operator applied to type arguments

term expressions
T ::= ID atomic value (class, value, variable)
| ID built-in value (boolean operators, etc.)
| T ≐Y T equality of terms at a type
| T :Y instance check
| T (T ∗ ) function applied to values
| new ID {d∗ } new instance of class
| T.ID field access in an object
| T.ID = T assignment to a mutable field in an object
| {T ∗ } sequencing (;-operator)
| d local declaration
| ID = T assignment to a local variable
| if (T ) T else T if-then-else
| while (T ) T while-loop

Formulas
F ::= T terms of boolean type

Identifiers
ID ::= alphanumeric string

Figure 7.5: Syntax of BPL



BOL Syntax X Semantics JXK in Scala


ontology Scala program
D1 , . . . , Dn JD1 K, . . . , JDn K
BOL declaration (I, C, R, P atomic) Scala declaration
individual i val i = ”i”; individuals += i
concept i val i = new HashSet[String]
relation i val i = new HashSet[(String,String)]
property i : T val i = new HashSet[(String,T)]
I is-a C JCK += JIK
I1 R I2 JRK += (JI1 K, JI2 K)
IP V JP K += (JIK, JV K)
F assertions, consequence closure (omitted)
Formula Program that evaluates the formula to a Boolean
C1 ≡ C2 {val c1 = JC1 K; val c2 = JC2 K;
c1 .forall(x ⇒ c2 .contains(x)) && c2 .forall(x ⇒ c1 .contains(x))}
C1 v C2 {val c1 = JC1 K; val c2 = JC2 K; c1 .forall(x ⇒ c2 .contains(x))}
I is-a C JCK.contains(JIK)
I1 R I2 JRK.contains((JI1 K, JI2 K))
IP V JP K.contains((JIK, JV K))
Individual String object
i i
Concept HashSet[String] object
i i
C1 t C2 JC1 K.union(JC2 K)
C1 u C2 JC1 K.inter(JC2 K)
∀R.C {val c = JCK; val r = JRK; val e = individuals.clone;
r.foreach(x ⇒ if (!c.contains(x._2)) e -= x._1); e}
∃R.C {val c = JCK; val r = JRK; val e = new HashSet[String];
r.foreach(x ⇒ if (c.contains(x._2)) e += x._1); e}
dom R {val c = new HashSet[String]; JRK.foreach(x ⇒ c += x._1); c}
rng R {val c = new HashSet[String]; JRK.foreach(x ⇒ c += x._2); c}
dom P {val c = new HashSet[String]; JP K.foreach(x ⇒ c += x._1); c}
Relation HashSet[(String,String)] object
i i
R1 ∪ R2 JR1 K.union(JR2 K)
R1 ∩ R2 JR1 K.inter(JR2 K)
R1 ; R2 {val r1 = JR1 K; val r2 = JR2 K; val e = new HashSet[(String,String)];
r1.foreach(x ⇒ r2.foreach(y ⇒ if (x._2 == y._1) e += (x._1, y._2))); e}
R−1 {val r = new HashSet[(String,String)]; JRK.foreach(x ⇒ r += (x._2, x._1)); r}
R∗ (omitted)
∆C {val r = new HashSet[(String,String)]; JCK.foreach(x ⇒ r += (x,x)); r}
Property of type T HashSet[(String,T)] object
i i

Figure 7.6: Interpretation Function for BOL into Scala



BOL Syntax X Semantics JXK in English


ontology English text
D1 , . . . , Dn JD1 K, . . . , JDn K
BOL declaration dictionary entry or true sentence
individual i i is a proper noun.
concept i i is a common noun.
relation i i is a transitive verb.
property i : T i is a common noun for a property that can take T -values.
I is-a C JIK JCK.
I1 R I2 JI1 K JRKs JI2 K.
IP V JIK has JP K JV K.
F JF K.
Formula sentence
C1 ≡ C2 JC1 K is the same as JC2 K.
C1 v C2 everything that JC1 Ks also JC2 Ks.
I is-a C JIK JCK.
I1 R I2 JI1 K JRKs JI2 K.
IP V JIK has JP K JV K.
Individual noun phrase (to be used as a subject or object)
i i
Concept intransitive verb phrase (to be plugged after a subject)
i is a i
C1 t C2 JC1 K or JC2 K
C1 u C2 JC1 K and JC2 K
∀R.C JRKs only things that JCK
∃R.C JRKs something that JCKs
dom R JRKs something
rng R is JRKed by something
dom P has some JP K
Relation transitive verb phrase (to be plugged between subject and object)
i i
R1 ∪ R2 JR1 Ks or JR2 Ks
R1 ∩ R2 JR1 Ks and JR2 Ks
R1 ; R2 JR1 Ks something that JR2 Ks
R−1 is JRKed by
R∗ JRKs something that JRKs something and so on that JRKs
∆C JCKs and is the same as
Property of type T property phrase
i i

Figure 7.7: Interpretation Function for BOL into English (intransitive VP version)
Chapter 8

Type Systems for Ontology Languages

8.1 Intrinsic vs. Extrinsic Typing


8.1.1 Overview
We write x : A to say that x has type A. There are two fundamentally different methods for introducing the types
A, which are summarized in the following table:

intrinsic extrinsic
goes back to λ-calculus by Church Curry
general idea objects carry their type with them types are designated by the environment
typing is a function from objects to types relation between objects and types
objects have unique type any number of types
types often interpreted as disjoint sets unary predicates on a universal set
type inference for x uniquely infer A from x try to find minimal A such that x : A
type checking compare inferred and expected type prove x : A
subtyping A <: B mimicked by casting from A to B defined by x : A implies x : B for all x
typing decidable yes unless too expressive no unless expressivity restricted
typing errors are detected usually statically (compile-time) dynamically (run-time)
type of name introduced as part of declaration additional axiom
example individual ”WuV”:”course” individual ”WuV”, ”WuV” is-a ”course”
advantages easy flexible
unique type inference allows subtyping
examples SFOL, SQL OWL, Scala, English
most logics, functional PLs ontology, OO, natural languages
many type theories set theories

Example 8.1 (Extrinsically Typed Ontology Language). In BOL, the objects are the individuals, the types are the
concepts, and is-a is the typing relation between them. The typing is extrinsic:
• Individuals and their concept assertions are introduced in separate declarations.
• An individual may be an instance of any number of concepts.
• There is no primary concept that could be returned as the inferred type of an individual.
• Concepts are subject to subtyping C v C 0 .
• Whether an individual is an instance of a concept, must be checked by reasoning about the is-a relation.
Therefore, all semantics must interpret individuals as elements of a universal collection, and types as unary
predicates on that. Specifically, we have
semantics in universal collection unary predicate typing relation i is-a c
FOL type ι predicate c ⊆ ι c(i) true
SQL table Individuals table containing ids id of i in table c
Scala String hash set of strings c.contains(i)
English proper nouns common nouns ”i is a c” is true
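For instance, in the Scala semantics the extrinsic typing relation is just set membership; a small sketch (ours) mirroring the table above:

import scala.collection.mutable.HashSet

val individuals = new HashSet[String]   // universal collection
val course = new HashSet[String]        // concept as a unary predicate on it

individuals += "WuV"
course += "WuV"                         // concept assertion: WuV is-a course
val check = course.contains("WuV")      // extrinsic type check, true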


We can also think of relations as objects. However, BOL cannot express relation types at all, and there is no
intrinsic typing. Instead, the domain and range of a relation r are given extrinsically via axioms about dom r and
rng r. Like for individuals, this allows flexibility, as the same relation may have multiple types.

Example 8.2 (Intrinsically Typed Ontology Language). We could define TOL, a typed ontology language that
arises as a variant of BOL. The main differences would be
• Individuals are declared with a concept that serves as their type: individual i : C.
• Concept assertions are dropped. They are now part of the individual declarations.
• Relations are declared with two concepts for their domain D and range R: relation r <: D × R.
• Properties are declared with a concept for their domain C: property p <: C × T .
TOL would make many ontologies more concise. For example, we could simply write
concept instructor
concept course
individual FlorianRabe : instructor
relation teach <: instructor × course

However, we would lose flexibility. If we want to add the concept "male", it would be difficult to make
FlorianRabe have both types. We might be able to remedy that by allowing intersections and declaring
individual FlorianRabe : instructor ⊓ male. But even then, we would have to commit to the type of each in-
dividual right away: we cannot add different concept assertions for the same individual in different places, a
common occurrence in building large ontologies.
Allowing ⊓ would also introduce subtyping. If we are careful in the design of TOL, that may still result in an
elegant scalable language. In particular, typing may remain decidable (depending on what other operations we
allow). But if we go too far, it may end up so complex that it would have been easier to go with extrinsic typing.
That is why we use intrinsic typing only in two related places in BOL:
• The base types and values use an intrinsic type system (whose details we omitted).
• The range of properties is given intrinsically by a base type.

Remark 8.3 (Subtyping). Languages with subtyping usually have to use extrinsic type systems. Typical sources
of subtyping are
• explicit subtyping as in N <: Z
• comprehension/refinement as in {x : N | x ≠ 0}
• operations like union and intersection on types
• inheritance between classes, in which case subclass = subtype
• anonymous record types as in {x : N, y : Z} <: {x : N}
Two of these sources are illustrated in the sketch below.
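
A small Scala sketch (our example): inheritance between classes induces subtyping, and mixing in a trait at instantiation time acts like an intersection type.

// Sketch: inheritance and trait mixins as sources of subtyping.
class Shape
trait Colored { def color: String }
class Circle extends Shape

object SubtypingDemo extends App {
  // inheritance: Circle <: Shape, so this assignment type-checks
  val s: Shape = new Circle

  // mixing in a trait refines the type, similar to an intersection type:
  // the value below has both types Circle and Colored
  val c: Circle with Colored = new Circle with Colored { val color = "red" }
  println(c.color)
}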

8.1.2 Combined Definition


Neither intrinsic nor extrinsic typing is strictly better than the other. The choice of type system is a very difficult
trade-off when designing a language.
Many practical languages even combine both methods. In that case, an intrinsic system is used for the most
important high-level types and an extrinsic system is used to refine (some of) the high-level types:

Definition 8.4 (Type System). A type system consists of


• a collection, whose elements are called objects,
• a collection, whose elements are called intrinsic types,
• a function assigning to every object x its intrinsic type I, in which case we write x : I,
• for some intrinsic types I
– an intrinsic type EI
– a relation ∈I between objects with intrinsic types I and EI , called the extrinsic typing relation for I.

We can now recover the intuitions from above as special cases:


• A purely intrinsic type system is one in which EI and ∈I are not given for any I. Thus, only objects and
their intrinsic types remain.
• A purely extrinsic type system has two intrinsic types, namely O (for objects) and EO (for types). ∈O is the
extrinsic typing relation between objects and types.
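
As a sketch (our encoding, with made-up names), the components of Definition 8.4 can be written as a Scala interface:

// Sketch of Definition 8.4 as a Scala trait; all names here are ours.
trait CombinedTypeSystem {
  type Obj                              // the collection of objects
  type IType                            // the collection of intrinsic types
  def intrinsic(x: Obj): IType          // every object has a unique intrinsic type

  // for some intrinsic types I, the intrinsic type E(I);
  // None if I has no extrinsic typing
  def extrinsicOf(i: IType): Option[IType]

  // the extrinsic typing relation x ∈_I a, where I = intrinsic(x) and
  // intrinsic(a) = E(I); callers are expected to respect that constraint
  def hasType(x: Obj, a: Obj): Boolean
}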

Example 8.5. We can think of BOL as a combined type system. The objects are all complex expressions. The
intrinsic types are the non-terminals I, C, R, P , and F , which separate the objects into the five kinds of individuals,
concepts, relations, properties, and formulas.
An extrinsic typing relation exists only for I: we have EI = C and ∈I is the is-a relation.

Example 8.6. In set theory, only a few intrinsic types are used for the high-level grouping of objects. These include
at least set and prop. Objects of these intrinsic types are called sets and propositions. Some set theories also use
an intrinsic type class. Moreover, types like set → prop can be allowed as the types of unary predicates on sets.
Extrinsic typing is used only for the type set: we have Eset = set and ∈set is the usual elementhood relation
between sets.

8.2 Abstract Data Types


8.2.1 Motivation
Recall the subject-centered representation of individuals described in Sect. 6.3. Here we introduce an individual
together with all assertions of which it is the subject as in
individual "FlorianRabe"
  is-a "instructor" "male"
  "teach" "WuV" "KRMT"
  "age" 40
  "office" "11.137"

It is often desirable to use types to force the presence of such assertions. We might wish to require that every
instructor teaches a list of things and has an office. Moreover, we can use types to specify the objects of the
respective assertions: we can specify that only courses are taught and that the office is a string. Rather than the
assertions with subject "FlorianRabe" just happening to be present, the type system would now force their
existence and the types of their objects. Forgetting to give such an assertion or giving it with the wrong object could
be detected statically (i.e., without applying the semantics) and flagged as a typing error.
This leads to the idea of subject-centered types. This could look as follows:
concept instructor
  teach course∗
  age : int
  office : string

individual "FlorianRabe" : "instructor"
  is-a "male"
  "teach" "WuV" "KRMT"
  "age" 40
  "office" "11.137"

Now the type "instructor" forces the presence of a list of taught courses (the ∗ is meant to indicate a list), an
integer for the age, and a string for the office.
We can now see that, in fact, every person should have an age, and not just every instructor. Because every
instructor is meant to be a person, we could try to capture this as well to avoid redundancy. Moreover, every male
is meant to be a person, too.
That leads to the idea of modular types. This could look as follows:
concept person
  age : int

concept male <: person

concept instructor <: person
  teach course∗
  office : string

individual "FlorianRabe" : "instructor" ⊓ "male"
  "teach" "WuV" "KRMT"
  "age" 40
  "office" "11.137"
Incidentally, that eliminates the need to independently declare relations and properties. Instead, we can treat their
occurrences inside the concept definitions as their declarations.
That has the added benefit that two relations/properties of the same name declared in different concepts can be
distinguished and can have different types.
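
A sketch in Scala (our example): traits can play the role of the concepts above, inheritance provides the subtyping, and the declared fields force the presence of the assertions.

// Sketch: the modular types above, rendered with Scala traits.
trait Person { def age: Int }
trait Male extends Person
trait Instructor extends Person {
  def teach: List[String] // the course list required by the type
  def office: String
}

// an individual of type "instructor" ⊓ "male"
object FlorianRabe extends Instructor with Male {
  val age    = 40
  val teach  = List("WuV", "KRMT")
  val office = "11.137"
}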

8.2.2 Examples
The general thrust of these ideas is to shift more and more information into an increasingly complex type system.
This is part of a trade-off: the more the type system can do,
• the more requirements can be expressed and violations thereof detected statically,
• the more complex the type system and its documentation and implementation become.
Abstract data types have proved to be a particularly interesting trade-off on this expressivity-simplicity spectrum
and are, in one way or another, part of many type systems. The following table gives an overview:

 aspect         | language | abstract data type
 ontologization | UML      | class
 concretization | SQL      | table schema
 computation    | Scala    | class, interface
 deduction      | various  | theory, specification, module, locale
 narration      | various  | emergent feature

8.2.3 Abstract vs. Concrete


The words abstract and concrete do not have standard definitions for types. I like the intuitions described below.
A type is called concrete if its values are
• given by their internal form,
• defined along with the type, typically built from already-known pieces.

Example 8.7. Simple products are concrete types.


They are introduced by (among other rules)
• A × B is a type if A and B are types
• values of type A × B are of the form (a, b) for a : A and b : B.

Example 8.8. Inductive data types as seen in Def. 4.7 are concrete types. Their values are formed by applying
constructors to other values.
I like calling them concrete data types.

A type is called abstract if its values are


• given by their externally visible properties,
• defined in any environment that understands the type definition.
This is the case for abstract data types.

Example 8.9 (Classes). A UML class is an abstract data type. Its values are the instances of implementing classes.
A UML class only defines what methods should be available. How they are implemented by specific values of the
type is left to the programming languages.

Thus, different programming languages could have different values for the same abstract data type. They certainly
look different, e.g., in Java and Scala implementations of the same UML class. But the languages might also be
fundamentally different in expressivity, e.g., a Turing-complete programming language might have strictly more
values for the same abstract data type than a non-Turing-complete one.
Moreover, which instances actually exist changes during the run time of the program. If we take this into account,
the values of the abstract data type are not even fixed within a programming language.

Example 8.10 (Schemas). An SQL table schema is an abstract data type. Its values are the rows.
The schema only defines what types the columns of a table have. Different database systems might theoretically
provide different ways to build rows for the table.
However, this does not happen in practice because SQL table columns are typed by base types, which have the
same values across database systems. This would be different if we allowed table columns to have function types.
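
As a sketch (our example), we can mimic such a schema in Scala: the case class fixes the column names and base types, and its instances play the role of rows.

// Sketch: a table schema as a case class, rows as its instances.
// The columns are typed by base types, as in SQL.
case class InstructorRow(id: Int, name: String, age: Int, office: String)

object Rows extends App {
  val row = InstructorRow(1, "FlorianRabe", 40, "11.137")
  println(row)
}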

Example 8.11 (Theories). A logical theory (e.g., Monoid) is an abstract data type. Its values are the models of
the theory (e.g., for Monoid: (N, +, 0) or (N, ∗, 1)).
The theory only defines what operations a model must provide (for Monoid: binary operation and neutral element)
and which axioms it must satisfy (for Monoid: associativity, neutrality). How we build the models is left open.
We usually build models in mathematical language and naively assume that this fixes the models once and for all. But
that is too naive: depending on which mathematical foundation we use (e.g., set theory with or without the axiom
of choice), we can build different models. Moreover, we can also build models in type theories (which underlie
many deduction systems such as Coq or Isabelle). We can even build them in programming languages, e.g., by
implementing theories as classes (typically moving the axioms into comments), as sketched below.
The choice of language substantially changes what the values of the abstract data type are.
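
For instance, here is a sketch of the theory Monoid as a Scala trait; the two objects below implement the models (N, +, 0) and (N, ∗, 1), with the axioms relegated to comments as just described.

// Sketch: the theory Monoid as a Scala trait; models are its instances.
trait Monoid[A] {
  def op(x: A, y: A): A // axiom (comment only): op must be associative
  def unit: A           // axiom (comment only): unit must be neutral for op
}

// the model (N, +, 0)
object AdditiveNat extends Monoid[BigInt] {
  def op(x: BigInt, y: BigInt): BigInt = x + y
  val unit: BigInt = 0
}

// the model (N, *, 1)
object MultiplicativeNat extends Monoid[BigInt] {
  def op(x: BigInt, y: BigInt): BigInt = x * y
  val unit: BigInt = 1
}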

8.2.4 Rigorous Definition


There are many subtle design choices in defining abstract data types. Therefore, they tend to look and behave
a little differently in every type system that features them. Here, we introduce a fairly general definition that
subsumes many practical languages.

Definition 8.12 (Abstract Data Type). Consider an arbitrary type system.


An abstract data type (ADT) is
• a flat type of the form
{c1 : T1 [= t1 ], . . . , cn : Tn [= tn ]}
where the ci are distinct names, the Ti are types, and the ti are optional and wherever given must have type
Ti , or
• a mixin type of the form A1 ∗ A2 for ADTs Ai .
We say that a type system has internal ADTs if all ADTs are types (and thus may in particular occur as the Ti
in a record type).

The intuition of a mixin A ∗ B is that we merge the fields of A and B. However, this union is dependent: if B is flat,
its fields may refer to fields introduced in A.
The most important special case of an ADT is the class:

Definition 8.13 (Class). A class definition defines an ADT abbreviation of the form

a = a1 ∗ . . . ∗ am ∗ {c1 : T1 , . . . , cn : Tn }

where the ai are names of previously defined ADTs.


We call the ai the superclasses or parent classes and say that a inherits from the ai . We call the ci the fields
or members of a.

In an OO-language, a class definition is more commonly written along the lines of



abstract class a extends a1 with ... with am {
  c1 : T1
  ...
  cn : Tn
}

The details can vary, and special care must be taken in programming languages where initialization may have side
effects.
Flat ADTs are the standard case, and all mixin ADTs can be simplified into flat ones. This can be seen as a
semantics in the sense that the language of flat and mixin ADTs is translated to the language of flat ADTs.

Definition 8.14 (Mixin Semantics). The flattening A♭ of an ADT A is defined as follows:

• if A is flat: A♭ = A
• if A is of the form A1 ∗ . . . ∗ An : A♭ arises by concatenating the fields of all A♭i, where duplicate field names
  are handled as follows:
  – if the same field (same name, types equal, definitions equal or both absent) occurs more than once,
    only the first occurrence is kept,
  – if the fields c : T1 [= t1 ] and c : T2 [= t2 ] occur for unequal types Ti , A is ill-formed,
  – if the fields c : T = t1 and c : T = t2 occur for unequal objects ti , A is ill-formed,
  – if the fields c : T = t and c : T occur, only the defined one is kept (∗).
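
The following Scala sketch (our encoding) implements this flattening for a toy representation of ADTs, including the merging rules for duplicate field names:

// Sketch: ADTs as flat field lists or mixins, with flattening.
case class Field(name: String, tp: String, df: Option[String])
sealed trait ADT
case class Flat(fields: List[Field]) extends ADT
case class Mixin(left: ADT, right: ADT) extends ADT

object Flattening {
  def flatten(a: ADT): Flat = a match {
    case f: Flat => f
    case Mixin(l, r) =>
      val all = flatten(l).fields ++ flatten(r).fields
      Flat(all.foldLeft(List.empty[Field]) { (acc, f) =>
        acc.find(_.name == f.name) match {
          case None                    => acc :+ f // new field: keep it
          case Some(g) if g.tp != f.tp => sys.error("ill-formed: types differ for " + f.name)
          case Some(g) if g.df.isDefined && f.df.isDefined && g.df != f.df =>
            sys.error("ill-formed: definitions differ for " + f.name)
          case Some(g) if g.df.isEmpty && f.df.isDefined =>
            // case (*): keep the defined occurrence, at the original position
            acc.map(h => if (h.name == f.name) f else h)
          case Some(_)                 => acc // duplicate: keep the first occurrence
        }
      })
  }
}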

Remark 8.15 (Dependency Between Fields). Our definition sweeps a very important but subtle detail under the
rug: in a flat ADT with a field c : T = t, may T and/or t refer to fields declared later? We sketch a few possible
answers.
In the simplest case, we forbid such forward references. Then ADTs are very well-behaved. But we have a
problem with the case (∗) in Def. 8.14: if c : T occurs before c : T = t, we cannot simply drop the former because
intermediate fields may refer to c. A straightforward solution would be to declare the ADT to be ill-formed. But
unfortunately, this case is very important in practice: it occurs whenever c : T is declared in an abstract class
and c : T = t in a concrete class implementing it.
A more common solution is to allow the fields to be mutually recursive. Consider a flat ADT with fields Γ, c : T [=
t], ∆ where Γ and ∆ are lists of fields. Let Γ′ and ∆′ arise by dropping all definitions. Then we require that
• T must be a well-formed type in context Γ′. Thus, the types may only refer to previous fields.
• t must have type T in context Γ′, c : T, ∆′. Thus, the definitions may be mutually recursive.
This makes the case (∗) work. But it comes at the price of recursion, which allows writing non-terminating fields
(a feature in a programming language, but potentially undesirable in other settings).
Even so, the mutual-recursion solution is problematic in the presence of dependent types. Here, dropping defini-
tions is not always allowed: T might be well-formed in context Γ, but Γ′ might not even be a well-formed context
at all. Because OO-languages are usually not dependently-typed, this is not an issue in most settings.

8.3 Database Schemas as Typed Ontologies


This is presented on the slides.

8.3.1 Exercise 3
The topic of Exercise 3 is to build a relational database schema for a univis-like system.
Chapter 9

Querying via a Semantics

9.1 Overview
Let us assume we have a semantics for our syntax. We again write l for the syntax, L for the semantics, and ⟦−⟧
for the translation function.
We can now use the semantics to answer questions asked in the syntax. Here we use the syntax to phrase a question
and the semantics to determine the answer.
We call this querying. Contrary to standard practice, we will use that word in a very broad sense that covers all
aspects. It is more common to use the word only for concretized querying, for which SQL has been developed and
which has shaped many intuitions about querying.
Usually, querying requires the syntax to designate some non-terminals as propositional. A non-terminal is proposi-
tional if the semantics can make its words true. Without a notion of propositions, it is impossible to define what
questions and answers even are.

Definition 9.1 (Propositions). A context-free syntax with propositions is a context-free syntax with some
designated non-terminal symbols.
A semantics with theorems is one that additionally defines some propositions to be theorems. We write ⊢ F
if F is a theorem.

That definition does not mean that any kind of logic is needed for querying. Many languages use highly restricted
notions of propositions that would not generally be considered a logic. For example, languages might use equalities
between objects, or even only equalities between certain objects, as the only propositions. The following table gives an
overview:

 aspect               | typical propositions                      | proposition operators
 ontology language    | assertions, concept equality/subsumption  | (none)
 programming language | equality for some types                   | boolean operators
 database language    | equality for base types                   | boolean operators
 logic                | equality for all types                    | boolean operators, quantifiers
 natural language     | sentences                                 | and, or, some, every, ...

Often the development of querying for a language leads to the discovery of omissions in the syntax: certain objects
that are helpful for asking questions were omitted from the syntax because they were not needed to describe the data.
Then sometimes the syntax is extended with non-terminals or productions that seem like dead code: they are not
needed for or not allowed in the official data. The following table gives some examples:

 aspect               | typical extensions
 ontology language    | conjunction of assertions
 programming language | quantifiers
 database language    | membership in a table
 logic                | (already tries to capture all possible propositions)
 natural language     | (already captures all possible propositions)


Example 9.2 (Propositions in BOL). The obvious choice of propositions for BOL are the formulas.
In Rem. 6.1, we mentioned that the BOL syntax from Fig. 6.2 had some redundant parts that were grayed out.
For writing ontologies, assertions are only needed in such a way that they behave like axioms, i.e., they are automatically
true. But for querying BOL, we also need them to behave like formulas so that we can use them as questions, i.e.,
we must allow them to be true or false.
Moreover, it is common to also allow conjunctions. Therefore, the BOL propositions are the conjunctions of
formulas.
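
As a sketch (our example, reusing the set-based Scala semantics from Example 8.1), answering a conjunction of is-a formulas then amounts to a sequence of membership tests:

// Sketch: answering BOL queries that are conjunctions of is-a formulas,
// using the set-based Scala semantics from Example 8.1.
case class IsA(individual: String, concept: String)

object BOLQuery extends App {
  val concepts: Map[String, Set[String]] = Map(
    "instructor" -> Set("FlorianRabe"),
    "male"       -> Set("FlorianRabe"),
    "course"     -> Set("WuV", "KRMT")
  )

  // a query is a conjunction of formulas; the answer is yes (true) or no (false)
  def answer(query: List[IsA]): Boolean =
    query.forall(f => concepts.getOrElse(f.concept, Set.empty).contains(f.individual))

  println(answer(List(IsA("FlorianRabe", "instructor"), IsA("FlorianRabe", "male")))) // true
  println(answer(List(IsA("WuV", "instructor"))))                                     // false
}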

In the sequel, we will use each of the four kinds of semantics to answer queries, as summarized in the following table:

 Section | builds on Section | aspect         | query                           | result
 9.2     | 7.2               | deduction      | proposition                     | yes/no
 9.3     | 7.3               | concretization | proposition with free variables | true ground instances
 9.4     | 7.4               | computation    | term                            | value
 9.5     | 7.5               | narration      | question                        | answer

Remark 9.3 (Meta-Level Questions). Finally, any semantics admits a meta-level where additional questions can be
asked. Examples are asking for the consistency of a theory or the equivalence of two theories/programs/queries. At
the next-higher meta-level, we can ask about the completeness of a semantics or the equivalence of two semantics
(of which completeness is a special case). These meta-questions can usually not be expressed in the syntax, and we
do not consider them a part of querying here. But it is worth mentioning that the need to use yet another language
(a meta-language) to ask these questions can be annoying, and some advancements in language design are about
trying to integrate them into the syntax. For example, reflection is the process of representing a language in itself
so that the language can talk about itself. That way meta-questions become regular questions.

9.2 Deductive Querying


9.2.1 Method
We assume that l is a syntax with propositions and that L is a logic (and thus in particular has propositions) whose
semantics has theorems.
It is not guaranteed that ⟦−⟧ translates l-propositions to L-propositions. If not, we assume there is some operation True
in L that we can use to lift the translations of l-propositions to L-propositions.
A deductive query consists of a proposition F . The answer to the query is yes or no.
Thus, the deductive semantics must determine which propositions are theorems, i.e., whether ⊢L True(⟦F⟧). This
is usually done in one of two ways.
Proof theory uses a calculus for L to derive true propositions. Thus, we can say that an l-proposition F is true
if the calculus derives True(⟦F⟧). Accordingly, if L has a negation operator ¬, we can say that F is false if the
calculus derives True(¬⟦F⟧).
Model theory uses a second deductive semantics, namely a translation from L to an even stronger deductive
language M , usually some form of set theory. Then we can say that an l-proposition is true if the composition of
the two translations yields a true M -proposition.
Either way, it is determined whether to answer an l-proposition with yes or no.

9.2.2 Challenges
Consistency The L-calculus might derive both F and ¬F . In that case L is inconsistent, and usually every
formula becomes derivable. We usually assume L to be consistent even though we do not always prove that.

Decidability Deductive semantics is usually undecidable, i.e., there is no algorithm that takes in F and always
returns yes or no in finite time.
Therefore, deductive querying is very difficult in general. One has to run heuristics (theorem provers) to see if a
proof of F or ¬F can be found.

A common compromise is to allow only a restricted set of propositions as queries for which decision procedures
exist. However, it can be tricky to find good restrictions, especially if the syntax allows for function symbols and
equality.
For example, SFOL is undecidable. But many fragments of SFOL are decidable, such as propositional logic and
various fragments in between.
When giving a deductive semantics into SFOL, it is therefore important to check whether the image of J−K falls
inside a decidable fragment. This is typically the case for ontology languages.

Completeness Deductive semantics is usually incomplete, i.e., there are unanswered questions. More precisely,
the L-calculus typically derives F for some propositions, ¬F for some, but neither for some others. The third kind
of proposition cannot be answered by the semantics.

Remark 9.4. The word “complete” is used for two different things in logic.
Firstly, it can be a relation between two semantics, typically a proof theory and a model theory. That is the dominant
meaning of the word as in, e.g., the completeness theorem for SFOL and Gödel’s incompleteness theorem.
Secondly, it can mean that a logic proves or disproves every proposition, i.e., there is no F such that neither F
nor ¬F is derivable. That is the sense we use above. This kind of completeness rarely holds, usually only in very
restricted circumstances.

Decidability and completeness are essentially the same problem. Specifically, if completeness holds, we already
obtain a decision procedure for the logic: to decide the truth of F , enumerate all proofs until a proof of F or ¬F
is found. Vice versa, if we have a sound decision procedure, running it on F will prove either F or ¬F .

Efficiency Independent of whether the semantics is complete/decidable, theorem proving is typically very expen-
sive.
Therefore, in addition to identifying decidable fragments of a logic, it is desirable to identify efficiently decidable
fragments. Typically, a semantics meant for efficient practical querying aims for polynomially decidable fragments.
This is the case for very simple ontology languages. But it can quickly become exponential if the language of
propositions becomes more expressive.

9.3 Concretized Querying


This was discussed on the slides.

9.4 Computational Querying


This was discussed on the slides.

9.5 Narrative Querying


This was discussed on the slides.
Chapter 10

Conclusion

