WuV Lecture Notes
2020
These notes were originally prepared for our CS course at University Erlangen-Nuremberg (FAU) in Summer 2020.
They are directed at 3rd semester CS undergraduates and master students but should be intelligible even for earlier
students and could be interesting also for PhD students and for students from adjacent majors. The course is
recommended both as a first course in the specialization area Artificial Intelligence and as a one-off overview of
knowledge representation.
The course was developed in Summer 2020 from scratch and materials were built along the way. It integrated current
directions and recent results in research on knowledge representation, pulling together materials in an entirely new
and original way.
Contents

1 Meta-Remarks
2 Fundamental Concepts
  2.1 Abbreviations
  2.2 Motivation
    2.2.1 Knowledge
    2.2.2 Representation and Processing
  2.3 Components of Knowledge
    2.3.1 Syntax and Semantics, Data and Knowledge
    2.3.2 Semantics as Syntax Transformation
    2.3.3 Heterogeneity of Semantics and Knowledge
  2.4 The Tetrapod Model of Knowledge
    2.4.1 Five Aspects of Knowledge
    2.4.2 Relations between the Aspects
5 Representing Data
  5.1 Overview
6 Ontologies
  6.1 General Principles
  6.2 A Basic Ontology Language
  6.3 Representing Ontologies as Triples
  6.4 Writing Ontologies
    6.4.1 The OWL Language
    6.4.2 The Protege Tool
    6.4.3 Exercise 1
10 Conclusion
Chapter 1
Meta-Remarks
State of these notes I constantly work on my lecture notes. Therefore, keep in mind that:
• I am developing these notes in parallel with the lecture — they can grow or change throughout the semester.
• These notes are neither a subset nor a superset of the material discussed in the lecture. On the one hand,
they may contain more details than mentioned in the lectures. On the other hand, important material such
as background, diagrams, and examples may be part of the lecture but not mentioned in these notes.
• Unless mentioned otherwise, all material in these notes is exam-relevant (in addition to all material discussed
in the lectures).
Collaboration on these notes I am writing these notes using LaTeX and storing them in a git repository on
GitHub at https://github.com/florian-rabe/Teaching. As an experiment in teaching, I am inviting all of you
to collaborate on these lecture notes with me. This would require familiarity with LaTeX as well as Git and GitHub
— that is not part of this lecture, but it is an essential skill for you. Ask in the lecture if you have difficulty figuring
it out on your own.
By forking and by submitting pull requests for this repository, you can suggest changes to these notes. For example,
you are encouraged to:
• Fix typos and other errors.
• Add examples and diagrams that I develop on the board during lectures.
• Add solutions for the homeworks if I did not provide any (of course, I will only integrate solutions after the
deadline).
• Add additional examples, exercises, or explanations that you came up with or found in other sources. If you use
material from other sources (e.g., by copying a diagram from some website), make sure that you have the
license to use it and that you acknowledge sources appropriately!
I will review and approve or reject the changes. If you make substantial contributions, I will list you as a contributor
(i.e., something you can put in your CV).
Any improvement you make will not only help your fellow students, it will also increase your own understanding of
the material. Make sure your git commits carry a user name that I can connect to you.
Chapter 2
Fundamental Concepts
2.1 Abbreviations
knowledge representation and processing    KRP    the general area of this course
knowledge representation language          KRL    a language used in KRP
knowledge representation tool              KRT    a tool implementing a KRL and processing algorithms for it
2.2 Motivation
2.2.1 Knowledge
Human knowledge pervades all sciences including computer science, mathematics, natural sciences and engineering.
That is not surprising: “science” is derived from the Latin word “scire” meaning “to know”. Similarly, philosophy,
from which all sciences derive, is named after the Greek words “philo” meaning loving and “sophia” meaning
wisdom, and the common ending “-logy” is derived from Greek “logos” meaning word (i.e., a representation of
knowledge).
In regards to knowledge, computer science is special in two ways: Firstly, many branches of computer science
need to understand KRP as a prerequisite for teaching computers to do knowledge-based tasks. In some sense,
KRP is the foundation and ultimate goal of all artificial intelligence.1 Secondly, modern information technology
enables all sciences to apply computer-based KRP in order to vastly expand on the domain-specific tasks that can
be automated. Currently all sciences are becoming more and more computerized, but most non-CS scientists (and
many computer scientists for that matter) lack a systematic education and understanding of IT-KRP. That often
leads to bad solutions when domain experts cannot see which KRP solutions are applicable or how to apply them.
Representation     Processing
Static             Dynamic
Situation          Change
Be                 Become
Data Structures    Algorithms
Set                Function
State              Transition
Space              Time
1 Indeed, a major problem with the currently very successful machine learning-based AI technology is that it remains unclear when
and how it does KRP. That can be dangerous because it leads to AI systems recommending decisions without being able to explain
why that decision should be trusted.
Again and again, we distinguish a static concept that describes/represents what a situation/state is and a
dynamic concept that describes how it changes. If that change is a computer doing something with or acting on
that representation, we speak of “processing”.
It is particularly illuminating to contrast KRP to the standard CS course on Data Structures and Algorithms (DA).2
Generally speaking, DA teaches the methods, and KRP teaches how to apply them. Data structures are a critical
prerequisite for representing knowledge. But data structures alone do not capture what the data means (i.e., the
knowledge) or if a particular representation makes any sense. Similarly, algorithms are the critical prerequisite
for processing knowledge. But while algorithms can be systematically analyzed for efficiency, it is much harder to
analyze if an algorithm processes knowledge correctly. The latter requires understanding what the input and output
data means.
Capturing knowledge in computers is much harder than developing data structures and algorithms. It is ultimately
the same challenge as figuring out if a computer system is working correctly — a problem that is well-known to be
undecidable in general and very difficult in each individual case.
Syntax Data
Semantics Knowledge
All four concepts are primitive, i.e., they cannot be defined in simpler terms. All sciences have few carefully-chosen
primitives on which everything builds. This is done most systematically in mathematics (where primitives include
set or function). While mathematical primitives as well as some primitives in physics or CS are specified formally,
the above four concepts can only be described informally, ultimately appealing to pre-existing human understanding.
Moreover, this description is not standardized — different courses may use very different descriptions even if they
ultimately try to capture the same elusive ideas.
Data (in the narrow sense of computer science) is any object that can be stored in a computer, typically combined
with the ability to input/output, transfer, and change the object. This includes bits, strings, numbers, files, etc.
Data by itself is useless because we would have no idea what to do with it. For example, the object O =
((49.5739143, 11.0264941), ”2020-04-21T16:15:00 CEST”) is useless data without additional information
about its syntax and semantics. Similarly, a file is useless data unless we know which file format it uses.
Syntax is a system of rules that describes which data is well-formed. For O above the syntax could be “a pair of
(a pair of two IEEE double precision floating point numbers) and a string encoding of a time stamp”. For a file,
the syntax is often indicated by the file name extension, e.g., the syntax of an html file is given in Section 12 of the
current HTML standard3 .
Syntax alone is useless unless we know the semantics, i.e., what the data means and thus how to correctly
interpret and process the data. For example, the syntax of O allows us to check that O is well-formed, i.e., indeed
contains two numbers and a timestamp string. That allows rejecting ill-formed data such as ((49.5739143, 11.0264941), ”foo”).
The HTML syntax allows us to check that a file conforms to the standard.
Semantics is a system of rules that determines the meaning of well-formed data. For example, ISO 8601 specifies
that timestamp strings refer to a particular date and time in a particular time zone. Further semantics for O might
be implicit in the algorithms that produce and consume it: such as “the first component of the pair contains two
numbers between 0 and 180 resp. 0 and 360 indicating latitude resp. longitude of a location on earth”. Semantics
might be multi-staged, and further semantics about O might be that O indicates the location and time of the first
lecture of this course. Similarly, Section 14 of the HTML standard specifies the semantics of well-formed HTML
files by describing how they are to be rendered in a web browser.
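To make this interplay concrete, here is a minimal Scala sketch (not part of the original notes; the names GeoTime and interpret are made up for illustration) that captures the syntax of the running example O as a data structure and one possible semantics as a function:

// Syntax: a data structure describing which data is well-formed.
case class GeoTime(coords: (Double, Double), timestamp: String)

val O = GeoTime((49.5739143, 11.0264941), "2020-04-21T16:15:00 CEST")

// Semantics: an algorithm that assigns meaning to well-formed data
// and signals an error on ill-formed data.
def interpret(d: GeoTime): String =
  if (d.timestamp.matches("""\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}.*"""))
    "location " + d.coords._1 + "/" + d.coords._2 + " at time " + d.timestamp
  else
    sys.error("ill-formed timestamp")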
Knowledge is the combination of some data with its syntax and semantics. That allows applying the semantics to
obtain the meaning of the data (if the data is syntactically well-formed, signaling an error otherwise). In computer systems,
2 The course is typically called “Algorithms and Data Structures”, but that is arguably awkward because algorithms can only exist
if there are data structures to work with. Compare my notes on that course in this repository, where I emphasize data structures much
more than is commonly done in that course.
3 https://html.spec.whatwg.org/multipage/
• data is represented using primitive data (ultimately the bits provided by the hardware) and encodings of more
complex data (bytes, arrays, strings, etc.) in terms of simpler ones,
• syntax is theoretically specified using grammars and practically implemented in programming languages using
data structures,
• semantics is represented using algorithms that process syntactically well-formed data,
• knowledge is elusive and often emerges from executing the semantics, e.g., rendering of an HTML file.
Thus, the role of syntax vs. semantics may depend on the context: just like one function’s output can be another
function’s input, one interpretation’s knowledge can be another one’s syntax. For example, we can first compile a
program into binary and then execute it to return its value.
Such hierarchies of evaluation levels are very common in computer systems. In fact, most state-of-the-art compilers
are subdivided into multiple phases each further interpreting the output of the previous one. Thus, if knowledge is
represented in computers, it is invariably data itself but relative to a different syntax.
Translating knowledge between two tools — i.e., mapping data in a way that preserves semantics — can be difficult to implement if both tools use entirely different paradigms
to specify semantics.
Ontologization focuses on developing and curating a coherent and comprehensive ontology of concepts. This
means identifying the central concepts in a domain and their relations. For example, a medical ontology would
define concepts for every symptom, disease, and medication and then define relations for which symptoms and
medications are related to which disease.
Ontologies typically abstract from the knowledge: they standardize identifiers for the concepts and spell out some
properties and relations but do not try to capture all details of the knowledge. Well-designed ontologies can capture
exactly the knowledge that different KRTs must share and can thus serve as interoperability layers between them.
While ontologization can use ontology languages such as OWL or RDF, the inherent complexity of formal objects in
computer science and mathematics usually requires going beyond general purpose ontology languages (similar to how
the programming languages underlying computer algebra systems usually go beyond general purpose programming
languages).
Concretization uses languages based on numbers, strings, lists, and records to obtain concrete representations of
datasets in order to store and query their properties efficiently. Because concrete objects are so simple and widely
used, it is possible and common to build concrete datasets on top of general purpose data representation languages
and tools such as JSON or SQL.
Computation uses specification and programming languages to represent algorithmic knowledge.
Deduction uses logics and theorem provers to obtain verifiable correctness.
Narration uses natural language to obtain texts that are easy to understand for humans. Because narrative
languages are not well-standardized (apart from general purpose languages such as free text or LATEX), it is common
to develop narrative libraries on top of ad-hoc languages that impose some formal structure on top of informal text,
such as a fixed tree structure whose leaves are free text or a particular set of LATEX macros that must be used.
Narrative libraries can be classified based on whether entries are derived from publications (e.g., one abstract per
paper in zbMATH) or mathematical concepts (e.g., one page per concept in nLab).
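As an illustration of such ad-hoc structure, the following is a minimal Scala sketch (an assumption for illustration, not part of the notes) of a narrative tree whose inner nodes impose structure and whose leaves are free text:

sealed abstract class Narrative
case class Node(title: String, children: List[Narrative]) extends Narrative
case class Leaf(text: String) extends Narrative

// a hypothetical nLab-style entry: structured nodes, free-text leaves
val entry = Node("Continuous function", List(
  Leaf("A function is continuous if small input changes cause only small output changes."),
  Node("Examples", List(Leaf("Every polynomial function is continuous.")))
))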
(Figure: the Tetrapod model — Ontologization in the center, connected to the four corners Deduction, Computation, Concretization, and Narration.)
Aspect            objects             characteristic advantage    joint advantage of the other aspects    application
deduction         formal proofs       correctness                 ease of use                             verification
computation       programs            efficiency                  well-definedness                        execution
concretization    concrete objects    tangibility                 abstraction                             storage/retrieval
narration         texts               flexibility                 formal semantics                        human understanding
Aspect pair    characteristic advantage
ded./comp.     rich meta-theory
narr./conc.    simple languages
ded./narr.     theorems and proofs
comp./conc.    normalization
ded./conc.     decidable well-definedness
comp./narr.    Turing completeness
For example, a deductive definition of a function is trivially well-defined, but a computational definition of a function may throw exceptions when running; but only
the latter can store and compute functions efficiently. Consequently, dedicated and mostly disjoint communities
have evolved that have produced large aspect-specific datasets.
Chapter 3
Overview of this Course

3.1 Structure
The subsequent parts of this course follow the Tetrapod model with one part per aspect. Each of these will describe
the concepts, languages, and tools of the respective aspect as well as their relation to other aspects.
The aspects of the Tetrapod are typically handled in individual courses, which describe highly specialized languages
and tools in depth. In contrast, the overall goal of this course is to see all of them as different approaches
to semantics and knowledge representation. The course will focus on universal principles and their commonalities
and differences as well as their advantages and disadvantages.
The subsequent chapters of this first part will be dedicated to aspect-independent material. These will not
necessarily be taught in the order in which they appear in these notes. Instead, some of them will be discussed in
connection to how they are relevant in individual aspects.
Chapter 4
Representing Syntax and Semantics
CFG                                     IDT
non-terminal                            type
production                              constructor
non-terminal on left of production      return type of constructor
non-terminals on right of production    argument types of constructor
terminals on right of production        notation of constructor
words derived from non-terminal N       expressions of type N
Definition 4.1 (Context-Free Grammar). Given a set Σ of characters (containing the terminal symbols), a
context-free grammar consists of
• a set N of names called non-terminal symbols
• a set of productions each consisting of
– an element of N , called the left-hand side
– a word over Σ ∪ N , called the right-hand side.
Example 4.2. Let Σ = {0, 1, +, ·, ≐, ≤}. We give a grammar for arithmetic expressions and formulas about them:

E ::= 0
    | 1
    | E + E
    | E · E

F ::= E ≐ E
    | E ≤ E
Here we use the BNF style of writing grammars, where the productions are grouped by their left-hand side and
written with ::= and | . We have N = {E, F }.
Definition 4.3 (Context-Free Grammar with Named Productions). Given a set Σ of characters (containing the
terminal symbols), a context-free grammar consists of
• a set N of names called non-terminal symbols
• a set of productions each consisting of
– a name
– an element of N , called the left-hand side
– a word over Σ ∪ N , called the right-hand side
Example 4.4. The grammar from above with names written to the right of each production
E ::= 0 zero
| 1 one
| E+E sum
| E·E product
F ::= E ≐ E    equality
    | E ≤ E    lessOrEqual
Definition 4.5 (Context-Free Grammar with Named Productions and Base Types). Given a set Σ of characters
(containing the terminal symbols) and a set T of names (containing the base types allowed in productions), a
context-free grammar consists of
• a set N of names called non-terminal symbols
• a set of productions each consisting of
– a name
– an element of N , called the left-hand side
– a word over Σ ∪ T ∪ N , called the right-hand side
The intuition behind base types is that we commonly like to delegate some primitive parts of the grammar to
be defined elsewhere. Typical examples are literals such as numbers 0, 1, 2, . . .: We could give regular expression
syntax for digit-strings. Instead, it is nicer to just assume we have a set of base types that we can use to insert an
infinite set of literals into the grammar.
Example 4.6. Let Nat be the type of natural numbers and let T = {Nat}. Then we can improve the grammar
from above as follows:

E ::= Nat      literal
    | E + E    sum
    | E · E    product

F ::= E ≐ E    equality
    | E ≤ E    lessOrEqual
Definition 4.7 (Inductive Data Type). Given a set of names T (containing the types known in the current
context), an inductive data type consists of
• a name a, called the type,
• a set of constructors each consisting of
– a name
– a list of elements of T ∪ {a}, called the argument types
Example 4.8. Let Nat be the type of natural numbers and T = {Nat}. We give an inductive type for arithmetic
expressions:
E = literal of Nat | sum of E ∗ E | product of E ∗ E
Here we use ML-style notation for inductive data types, which separates constructors by | and writes them as
name of argument-type-product.
Definition 4.9 (Mutually Inductive Data Types). Given a set T of names (containing the types known in the
current context), a family of mutually inductive data types consists of
• a set N of names, called the types,
• a set of constructors each consisting of
– a name
– an element of N , called the return type
– a list of elements of N ∪ T , called the argument types
Example 4.10. We extend the type definition from above by adding a second type for formulas. Thus, N = {E, F }.
Definition 4.11 (Mutually Inductive Data Types with Notations). Given a set Σ of characters (containing the
terminal symbols) and a set T of names (containing the types known in the current context), a family of mutually
inductive data types with notations consists of
• a set N of names, called the types,
• a set of constructors each consisting of
– a name
– an element of N , called the return type
– a list of elements of T ∪ N , called the argument types
– a word over the alphabet Σ ∪ T ∪ N containing the argument types in order and only elements from Σ
otherwise, called the notation of the constructor
The intuition behind notations is that it can get cumbersome to write all constructor applications as Name(arguments).
It is more convenient to attach a notation to them, such as writing sum(e1 , e2 ) as e1 + e2 .
Example 4.12. We extend the type definitions from above by adding notations to each constructor. We use the
set Σ = {+, ·, ≐, ≤} as terminals in the notations.
Here we write the constructors as name of argument-type-product # notation. It is easy to see that this has
introduced redundancy: we can infer the argument types from the notation. So we can just drop the argument
types:
E = literal # Nat | sum # E + E | product # E · E
F = equality # E ≐ E | lessOrEqual # E ≤ E
Theorem 4.13. Given a set Σ of characters and a set T of names, the following notions are equivalent:
• a family of mutually inductive data types in the context of types T with notations using characters from Σ,
• a context-free grammar with named productions, terminal symbols from Σ, and base types T .
In implementations in programming languages, we often drop the notations. Instead, those are handled, if needed,
by special parsing and serialization functions.
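For example, a separate serialization function can recover the notations from a plain inductive type. The following is a minimal sketch (not part of the notes), assuming the Scala case classes for E developed in Sect. 4.2 below:

// serialization: turn an expression back into a word of the grammar
def serialize(e: E): String = e match {
  case literal(v)    => v.toString
  case sum(l, r)     => serialize(l) + "+" + serialize(r)
  case product(l, r) => serialize(l) + "·" + serialize(r)
}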
However, in an implementation, it is often helpful to additionally give names to each argument of a production/-
constructor. That yields the following definition:
Definition 4.14 (Context-Free Syntax). Given a set Σ of characters and a set T of names, a context-free syntax
consists of
• a set N of names, called the non-terminals/types,
• a set of productions/constructors each consisting of
– a name
– an element of N , called the left-hand side/return type
– a sequence of objects, called the right-hand side/arguments which are one of the following
∗ an element of Σ
∗ a pair written (n : t) of a name n, called the argument name, and an element t ∈ T ∪ N called
the argument type.
Example 4.15. Using an ad-hoc language to write the constructors, our example from above as a context-free syntax
could look as follows:

E = literal # (value : Nat) | sum # (left : E) + (right : E) | product # (left : E) · (right : E)
F = equality # (left : E) ≐ (right : E) | lessOrEqual # (left : E) ≤ (right : E)
4.1.4 Contexts
We assume a context-free language l.
Remark 4.17. Sometimes the grammar itself has specific productions for contexts and variables. In that case, we
speak of meta-variable contexts and meta-variables to distinguish them from those of the language.
Definition 4.18 (Expressions in Context). Given a context Γ, a word E derived from non-terminal N that may
additionally use the productions of the context is called an expression of type N in context Γ.
We write this as Γ `l E : N .
We often want to substitute an expression w for a single variable x : N even though E may be defined in a larger context Γ. This
is often written E[x := w]. That is just an abbreviation for E[γ], where γ contains x := w as well as y := y for
every other variable y of Γ.
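As a sketch of how substitution looks in an implementation (an assumption for illustration: it uses the Scala case classes from Sect. 4.2 plus a hypothetical constructor variable for variables, which the arithmetic syntax itself does not have):

// hypothetical extension of E with variables
case class variable(name: String) extends E

// substitute w for the variable x in e
def substitute(e: E, x: String, w: E): E = e match {
  case variable(n)   => if (n == x) w else e
  case literal(_)    => e
  case sum(l, r)     => sum(substitute(l, x, w), substitute(r, x, w))
  case product(l, r) => product(substitute(l, x, w), substitute(r, x, w))
}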
4.2 Implementation
Context-free syntax can be implemented systematically in all programming languages. But, depending on the style
of the language, the implementations look drastically different. We give the two most important paradigms as examples.
Because ML has inductive data types as primitives, pattern-matching on our syntax comes for free. We will get
back to that when defining the semantics.
Example 4.23. We define our example syntax in a generic OO-language somewhat similar to Scala.1
In particular, we assume that the syntax is self-explanatory.
abstract class E {
  def toString: String
}
class literal extends E {
  field value: Nat
  constructor(value: Nat) {
    this.value = value
  }
  def toString = value.toString
}
class sum extends E {
  field left: E
  field right: E
  constructor(left: E, right: E) {
    this.left = left
    this.right = right
  }
  def toString = left.toString + "+" + right.toString
}
class product extends E {
  field left: E
  field right: E
  constructor(left: E, right: E) {
    this.left = left
    this.right = right
  }
  def toString = left.toString + "·" + right.toString
}
abstract class F {
  def toString: String
}
class equality extends F {
  field left: E
  field right: E
  constructor(left: E, right: E) {
    this.left = left
    this.right = right
  }
  def toString = left.toString + "≐" + right.toString
}
class lessOrEqual extends F {
  field left: E
  field right: E
  constructor(left: E, right: E) {
    this.left = left
    this.right = right
  }
  def toString = left.toString + "≤" + right.toString
}
Because OO-languages do not have inductive data types as primitives, pattern-matching on our syntax requires
awkward switch statements. We will get back to that when defining the semantics.
The Scala language combines ideas from functional and OO-programming. That makes its representation of context-
free syntax particularly elegant.
In Scala, the constructor arguments are listed right after the class name. These are automatically fields of the class,
and a default constructor always exists that defines those fields. That gets rid of a lot of boilerplate.
If we want to make those fields public (and we do because those are the projection functions), we add the keyword
val in front of them. But even that is too much boilerplate. So Scala defines a convenience modifier: if we put
case in front of the classes corresponding to constructors of our syntax, Scala puts in the val automatically. It also
generates a default implementation of toString, which we have to override if we want to implement notations, too.
Finally, Scala also generates pattern-matching functions so that we can pattern-match in the same way as in ML.
Then our example becomes (as usual, assuming a class Nat already exists):
abstract class E {
  def toString: String
}
case class literal(value: Nat) extends E {
  override def toString = value.toString
}
case class sum(left: E, right: E) extends E {
  override def toString = left.toString + "+" + right.toString
}
case class product(left: E, right: E) extends E {
  override def toString = left.toString + "·" + right.toString
}
abstract class F {
  def toString: String
}
case class equality(left: E, right: E) extends F {
  override def toString = left.toString + "≐" + right.toString
}
case class lessOrEqual(left: E, right: E) extends F {
  override def toString = left.toString + "≤" + right.toString
}
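A hypothetical usage sketch (not part of the notes; Nat(1) etc. assume a suitable constructor of the assumed class Nat): we can build expressions, use the generated toString, and pattern-match as in ML:

val e = sum(literal(Nat(1)), product(literal(Nat(0)), literal(Nat(1))))
println(e.toString) // prints 1+0·1 using the notations defined above

// pattern-matching works as in ML thanks to the case classes
def isSum(e: E): Boolean = e match {
  case sum(_, _) => true
  case _         => false
}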
Remark 4.25 (Terminology). “language system” is not a standard term. We usually just say “language”.
“Well-formed E-expression over Θ” can be a mouthful. Therefore, it is common to simply say that E is an
E-expression, or that E is a Θ-expression, and expect readers to fill in the details.
It is also common to give the non-terminals names, such as “term”, “type”, or “formula”. Then we simply say
“term” instead of “term-expression” and so on.
The vocabularies are typically lists of typically named declarations. They introduce the names that can be used to
form expressions. The expression kinds almost always include formulas.
Often declarations contain additional expressions, most importantly types or definitions. In general, all expressions
may occur in declarations, but many language systems do not use all of them.
Very different names are used for the vocabularies in different communities. The following table gives an overview:
In practice, it is most useful to think of a language system as a family of languages: one language (containing the
expressions) for every vocabulary.
Critically, the semantic language (which is itself a formal language and can thus have a semantics itself) must be a
language whose semantics we already know. Therefore, it is often important to give multiple equivalent semantics
— choosing a different semantics for different audiences, who might be familiar with different languages.
The role of the semantic prefix P is to define once and for all the L-material that we need in general to interpret
l-theories (in our case: ontologies). It occurs at the beginning of all interpretations of ontologies. In particular, it
is equal to the interpretation of the empty ontology.
for some semantic operation JEK. Compositionality is also called the substitution property or the homomorphism
property. See also Def. 7.5.
More rigorously, we define a compositional translation as follows:
Definition 4.27 (Compositional Semantics). Consider a semantics for syntax grammar l and interpretation
function J−K.
J−K is compositional if it is defined as follows:
• a family of functions J−KN , one for every non-terminal N of l
• for every expression E derived from N , we put JEK = JEKN
• each J−KN is defined by induction on the productions for N
• for each production N ::= ∗(N1 , . . . , Nr ) and all expressions ei derived from Ni , we have
  J∗(e1 , . . . , er )KN = J∗K(Je1 KN1 , . . . , Jer KNr )
Compositional Translations of Contexts We can extend every compositional translation to contexts, substi-
tutions, and expressions in contexts:
Definition 4.28. Given a translation J−K as above, for a non-terminal N , we define JN K as the non-terminal
from which the translations of N -expressions are derived.
Then we define:
Jx1 : N1 , . . . , xn : Nn K := x1 : JN1 K, . . . , xn : JNn K
Jx1 := w1 , . . . , xn := wn K := x1 := Jw1 K, . . . , xn := Jwn K
JxK := x
JE[γ]KN = JEK[JγK]
Formulated without substitutions, this means that for every syntax expression E(e1 , . . . , er ) derived from N , where
the ei are subexpressions derived from non-terminal Ni , we have JE(e1 , . . . , er )K = JEK(Je1 K, . . . , Jer K).
Simply put, a semantics is compositional iff it is defined by mutually inductive translation functions with only
compositional cases. The latter is very easy to check by inspecting the shape of the finitely many cases of the
definition. The former is a powerful property because it applies to any of the infinitely many expressions of the
syntax.
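For illustration, the following is a minimal sketch (an assumption, not part of the notes; it uses the Scala case classes from Sect. 4.2 and a hypothetical conversion Nat.toInt) of a compositional semantics for the arithmetic syntax — every case recurses only into the semantics of the direct subexpressions:

def interpE(e: E): Int = e match {
  case literal(v)    => v.toInt                  // base case for literals
  case sum(l, r)     => interpE(l) + interpE(r)  // JsumK only uses JlK and JrK
  case product(l, r) => interpE(l) * interpE(r)  // likewise for JproductK
}
def interpF(f: F): Boolean = f match {
  case equality(l, r)    => interpE(l) == interpE(r)
  case lessOrEqual(l, r) => interpE(l) <= interpE(r)
}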
but there is no way to define J+K in terms of zero and successor. Instead, we need subcases:

Jm + nK = JmK                  if n = 0
Jm + nK = succ(JmK)            if n = 1
Jm + nK = J(m + n1 ) + n2 K    if n = n1 + n2
Chapter 5
Representing Data

5.1 Overview
This chapter presents ongoing research on developing infrastructure for the semantic representation and
interchange of data across systems. The material is presented on the slides and in the lecture videos.
This includes Exercises 4 and 5.
Chapter 6
Ontologies
Ontology Languages An ontology is written in an ontology language. Common ontology languages are
• description logics such as ALC,
• the W3C ontology language OWL, which is the standard ontology language of the semantic web,
• the entity-relationship model, which focuses on modeling rather than formal syntax,
• modeling languages like UML, which is the main ontology language used in software engineering.
Ontology languages are not committed to a particular domain — in the Tetrapod model, they correspond to
programming languages and logics, which are similarly uncommitted. Instead, an ontology language is a formal
language that standardizes the syntax of how ontologies can be written as well as their semantics.
Ontologies The details of the syntax vary between ontology languages. But as a general rule, every ontology
declares
• individuals — concrete objects that exist in the real world, e.g., ”Florian Rabe” or ”WuV”
• concepts — abstract groups of individuals, e.g., ”instructor” or ”course”
• relations — binary relations between two individuals, e.g., ”teach”
• properties — binary relations between an individual and a concrete value (a number, a date, etc.), e.g.,
”creditValue”
• concept assertions — the statement that a particular individual is an instance of a particular concept
• relation assertions — the statement that a particular relation holds about two individuals
• property assertions — the statement that a particular individual has a particular value for a particular
property
• axioms — statements about relations between concepts, typically subconcept statements like
”instructor” ⊑ ”person”
All assertions can be understood and spoken as subject-predicate-object triples as follows:

Assertion             Subject            Predicate        Object
concept assertion     ”Florian Rabe”     is-a             ”instructor”
relation assertion    ”Florian Rabe”     ”teach”          ”WuV”
property assertion    ”WuV”              ”creditValue”    7.5
This uses a special relation is-a between individuals and concepts. Some languages group is-a with the other
binary relations between individuals for simplicity although it is technically a little different.
The possible values of properties must be fixed by the ontology language. Typically, it includes at least standard
types such as integers, floating point numbers, and strings. But arbitrary extensions are possible such as dates,
RGB-colors, lists, etc. In advanced languages, it is possible that the ontology even introduces its own basic types
and values.
Ontologies are often divided into two parts:
• The abstract part contains everything that holds in general, independently of the individuals: concepts,
relations, properties, and axioms. It describes the general rules of how the world works without committing to
a particular set of inhabitants of the world. This part is commonly called the TBox (T for terminological).
• The concrete part contains everything that depends on the choice of individuals: individuals and assertions.
It populates the world with inhabitants. This part is commonly called the ABox (A for assertional).
A separate division into two parts is the following:
• The signature part contains everything that introduces a named entity: individuals, concepts, relations,
and properties.
• The theory part contains everything that describes which statements about the named entities are true:
assertions and axioms.
Synonyms Because these principles pervade all formal languages, many competing synonyms are used in different
domains. Common synonyms are:
In particular, the individual-concept relation occurs everywhere and is known under many names:
Vocabularies: Ontologies
O ::= D∗

Declarations
D ::= individual ID        atomic individual
    | concept ID           atomic concept
    | relation ID          atomic relation
    | property ID : T      atomic property
    | I is-a C             concept assertion
    | I R I                relation assertion
    | I P V                property assertion
    | F                    other axioms

Formulas
F ::= C ≡ C                concept equality
    | C ⊑ C                concept subsumption
    | I is-a C             concept formula
    | I R I                relation formula
    | I P V                property formula

Individual expressions
I ::= ID                   atomic individuals

Concept expressions
C ::= ID                   atomic concepts
    | C ⊔ C                union of concepts
    | C ⊓ C                intersection of concepts
    | ∀R.C                 universal relativization
    | ∃R.C                 existential relativization
    | dom R                domain of a relation
    | rng R                range of a relation
    | dom P                domain of a property

Relation expressions
R ::= ID                   atomic relations
    | R ∪ R                union of relations
    | R ∩ R                intersection of relations
    | R ; R                composition of relations
    | R∗                   transitive closure of a relation
    | R⁻¹                  dual relation
    | ∆C                   identity relation of a concept

Property expressions
P ::= ID                   atomic properties

Identifiers
ID ::= alphanumeric string
We could study practical ontology languages like ALC or OWL now. But those feature a lot of other details that
can block the view onto the essential parts. Therefore, we first define a basic ontology language ourselves in order
to have full control over the details.
Definition 6.1 (Syntax of BOL). A BOL-ontology is given by the grammar in Fig. 6.1. It is well-formed if
• no identifier is declared twice,
• every property assertion assigns a value of the type required by the property declaration,
• every reference to an atomic individual/concept/relation/property is declared as such.
The above grammar exhibits some general structure that we find throughout formal KR languages. In particular,
an ontology consists of named declarations of four different kinds of entities as well as some assertions and
axioms about them. Each entity declaration clarifies which kind it is (in our case by starting with a keyword) and
introduces a new entity identifier. For each kind, there are complex expressions. These are anonymous and built
inductively; their base cases are references to the corresponding identifiers. Sometimes (in our case: individuals and
properties), the references are the only expressions of the kind. Sometimes (in our case: concepts and relations),
there can be many productions for complex expressions. The complex expressions are used to build axioms; in our
case, these are the three kinds of assertions and other formulas.
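This general structure can be sketched in Scala as follows (a sketch under the assumption that formulas are kept abstract; the names are made up for illustration):

sealed abstract class Decl
case class Individual(id: String) extends Decl
case class Concept(id: String) extends Decl
case class Relation(id: String) extends Decl
case class Property(id: String, tp: String) extends Decl
case class Axiom(formula: String) extends Decl // formulas kept abstract here

type Ontology = List[Decl]

// first well-formedness condition of Def. 6.1: no identifier is declared twice
def noDuplicates(o: Ontology): Boolean = {
  val ids = o.collect {
    case Individual(i)  => i
    case Concept(i)     => i
    case Relation(i)    => i
    case Property(i, _) => i
  }
  ids.distinct.length == ids.length
}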
Remark 6.2 (Formulas vs. Assertions). In Fig. 6.1, the three productions in gray are duplicated: they occur both
as assertions and as formulas.
We could remove the three productions for assertions and treat them as special cases of axioms. But we keep
duplication here because assertions are often treated differently from the other axioms. They are grouped with
the individuals in the ABox whereas the other axioms are seen as part of the TBox. Moreover, when used as
assertions, they may have to be interpreted differently than when used as formulas as we will see in Ch. 7.
Alternatively, we could remove the three productions in gray. But then we would lose the ability to talk about
formulas that are not true. That will become relevant in Ch. 9.
Example 6.3. We give a simple ontology that could be used to represent knowledge in the context of a university:
individual FlorianRabe
individual WuV
concept person
concept male
concept instructor
concept course
relation teach
property creditValue : float

FlorianRabe is-a instructor ⊓ male
WuV is-a course
FlorianRabe teach WuV
WuV creditValue 7.5

male ⊑ person
instructor ⊑ person
dom teach ⊑ instructor
rng teach ⊑ course
dom creditValue ≡ course
course ⊑ ∃teach⁻¹.instructor
The axioms are meant to state that males and instructors are persons, teaching is done by instructors to courses,
exactly the courses have credits, and (the last axiom) every course is taught by at least one instructor. Whether
they actually do mean that depends on the semantics.
The consequence closure (as defined by the semantics) should add the assertion FlorianRabe is-a person.
Alternatively, if we use the axioms for consistency checking, we should add that assertion from the beginning.
Otherwise, the axioms would not be true.
If we use axioms for the consequence closure, we can even omit the two concept assertions — they should be
inferred using the domain and range axioms for the relation.
The assertion FlorianRabe is-a instructor ⊓ male could also be split into two assertions
FlorianRabe is-a instructor and FlorianRabe is-a male. That will be important because some semantics might
have difficulties handling complex concept expressions; it can then be helpful to use a variant that does not need the ⊓ operator.
Inferring the Entity Declarations The entity declarations are not naturally triples. But we can usually infer
them from the assertions as follows: any identifier that occurs in a position where an entity of a certain kind is
expected is assumed to be declared as an entity for that kind.
For example, the individuals are what occurs as the subject of a concept, relation, or property assertion or as the
object of a relation assertion. It is conceivable that there are individuals that occur in none of these. But that is
unusual because they would be disconnected from everything in the ontology.
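A small sketch of this inference rule (an assumption for illustration, not from the notes), with assertions modeled as Scala case classes:

sealed abstract class Assertion
case class IsA(indiv: String, concept: String) extends Assertion
case class Rel(subj: String, rel: String, obj: String) extends Assertion
case class Prop(subj: String, prop: String, value: String) extends Assertion

// an identifier is inferred to be an individual if it occurs as the subject
// of any assertion or as the object of a relation assertion
def inferIndividuals(as: List[Assertion]): Set[String] = as.flatMap {
  case IsA(i, _)     => List(i)
  case Rel(s, _, o)  => List(s, o)
  case Prop(s, _, _) => List(s)
}.toSet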
If we give TBox and ABox together, this inference approach usually works well. But if we only give a TBox, this
would often not allow inferring all entities. The only place where they could occur in the TBox is in the axioms,
and it is quite possible to have concept, relation, and property declarations that are not used in the axioms. In
fact, it is not unusual not to have any axioms.
Special Predicates To turn declarations into triples, we can use reflection, i.e., the process of talking about our
language constructs as if they were data.
Reflection requires introducing some built-in entities that represent the features of the language. In the semantic
web area, this is performed using the following entities:
• ”rdfs:Resource”: a built-in concept of which all individuals are an instance and thus of which every concept
is a subconcept
• ”rdf:type”: a special predicate that relates an entity to its type:
– an individual to its concept (corresponding to is-a above)
– other entities to their special type (see below)
• ”rdfs:Class”: a special class to be used as the type of classes
• ”rdf:Property”: a special class to be used as the type of properties
• ”rdfs:subClassOf”: a special relation that relates a subconcept to a superconcept
• ”rdfs:domain”: a special relation that relates a relation to the concepts of its subjects
• ”rdfs:range”: a special relation that relates a relation/property to the concept/type of its objects
Here ”rdf” and ”rdfs” refer to the RDF (Resource Description Framework) and RDFS (RDF Schema) namespaces,
which correspond to W3C standards defining those special entities.
Thus, we can represent many and in particular the most important entity declarations as triples:
Assertion             Subject       Predicate            Object
individual            individual    ”rdf:type”           ”rdfs:Resource”
concept               concept       ”rdf:type”           ”rdfs:Class”
relation              relation      ”rdf:type”           ”rdf:Property”
property              property      ”rdf:type”           ”rdf:Property”
concept assertion     individual    ”rdf:type”           concept
relation assertion    individual    relation             individual
property assertion    individual    property             value

and for special forms of axioms:

c ⊑ d                 c             ”rdfs:subClassOf”    d
dom r ≡ c             r             ”rdfs:domain”        c
rng r ≡ c             r             ”rdfs:range”         c
This is subject to the restriction that only atomic concepts and relations can be handled. For example, only concept
assertions can be handled that make an individual an instance of an atomic concept. This is particularly severe for
axioms, where complex expressions occur most commonly in practice. Here, the special relations allow capturing
the most common axioms as triples.
Problems Reflection is subtle and can easily lead to inconsistencies. We can see this in how the approach of
RDF(S) special entities breaks the semantics via FOL.
For example, it treats classes both as concepts (when they occur as the object of a concept assertion) and as
individuals (when they occur as subject or object of a ”rdfs:subClassOf” relation assertion). Similarly, ”rdfs:Class”
is used both as an individual and as a class. In fact, the standard prescribes that ”rdfs:Class” is an instance of
itself.
In practice, this is handled pragmatically by using ontologies that make sense. A formal way to disentangle this is
to assume that there are two variants of ”rdfs:Class”, one as an individual and one as a class. The translation must
then translate ”rdfs:Class” differently depending on how it is used.
It would be better if RDFS were described in a way that is consistent under the implicitly intended FOL semantics.
But the more pragmatic approach has the advantage of being more flexible. For example, being able to treat every
class, relation, or property also as an individual makes it easy to annotate metadata to them. Metadata is a set of
properties such as ”rdfs:seeAlso” or ”owl:versionInfo”, whose subjects can be any entity.
Subject-Centered Representations When giving a set of triples, there are usually a lot of triples with the
same subject. For example, we could use a simple concrete syntax with one triple per line and whitespace separating
subject, predicate, and object:
"FlorianRabe" is-a "instructor"
"FlorianRabe" is-a "male"
"FlorianRabe" "teach" "WuV"
"FlorianRabe" "teach" "KRMT"
"FlorianRabe" "age" 40
"FlorianRabe" "office" "11.137"
It is more human-friendly to group these triples in such a way that the subject only has to be listed once. For
example, we could use a concrete syntax like this, where the subject occurs first and then predicate-object pairs
occur on indented lines:
"FlorianRabe"
  is-a "instructor"
  is-a "male"
  "teach" "WuV"
  "teach" "KRMT"
  "age" 40
  "office" "11.137"
If the same predicate occurs with multiple values, we can group those as well. For example, we could give the
objects for the same predicates as a list following the predicate:
"FlorianRabe"
  is-a "instructor" "male"
  "teach" "WuV" "KRMT"
  "age" 40
  "office" "11.137"
Concrete syntaxes based on the triple representation of ontologies will usually adopt some kind of structure like
this. The details may vary.
Concrete Syntax Several concrete syntaxes have been defined and are commonly used for OWL. The OWL2
primer2 systematically describes examples in five different concrete syntaxes.
APIs for OWL implement the abstract syntax along with good support for reading/writing ontologies in any of the
concrete syntaxes.
6.4.3 Exercise 1
The topic of Exercise 1 is to use Protege to write an OWL ontology for a university.
2 https://www.w3.org/TR/2012/REC-owl2-primer-20121211/
3 https://protege.stanford.edu/
Protege is a graphical editor for the abstract syntax of OWL. Familiarize yourself with the various concrete syntaxes
of OWL by writing an ontology that uses every feature once, downloading it in all available concrete syntaxes, and
comparing those.
The minimal goal of the exercise session is to get a Hello World example going, at which point the task transitions
into homework. There will be no homework submission, but you will use your ontology throughout the course.
You should make sure you understand and set up the process in a way that supports you when you revisit and
change your ontology many times throughout the semester.
Other than that, the task is deliberately unconstrained to mimic the typical situation at the beginning of a big
project, where it is unclear what the ultimate requirements will be.
Chapter 7
Semantics for BOL
7.1 Overview
In the sequel, we give four different semantics of BOL — using the four other aspects:
Definition 7.1. Fig. 7.1 gives the context-free grammar. The vocabulary symbol is Thy. The expression symbols
are Y , T , and F .
7.2.2 Semantics
Definition 7.2 (Deductive Semantics of BOL). The semantic prefix is the SFOL-theory containing
• a type ι (for individuals),
• additional types and constants corresponding to base types and values of BOL.
Every BOL-ontology O is interpreted as the SFOL-theory P, JOK, where JOK is defined in Fig. 7.2.
As foreshadowed above, we can observe some general principles: Every BOL-declaration is translated to an SFOL-
declaration for the same name, and ontologies are translated declaration-wise. For every kind of complex BOL-
expression, there is one inductive function mapping BOL-expressions to SFOL-expressions. The base cases of
Vocabularies: theories
Thy ::= D∗

Declarations
D ::= type ID : type          type declaration
    | fun ID : Y∗ → Y         function symbol declaration
    | pred ID : Y∗ → prop     predicate symbol declaration
    | axiom F                 axiom

Type expressions
Y ::= ID                      atomic type

Term expressions
T ::= ID(T∗)                  function symbol applied to arguments
    | ID                      term variable

Formula expressions
F ::= ID(T∗)                  predicate symbol applied to arguments
    | T ≐Y T                  equality of terms at a type
    | ⊤                       truth
    | ⊥                       falsity
    | F ∧ F                   conjunction
    | F ∨ F                   disjunction
    | F ⇒ F                   implication
    | ¬F                      negation
    | ∀ID : Y.F               universal quantification at a type
    | ∃ID : Y.F               existential quantification at a type

Identifiers
ID ::= alphanumeric string
references to declared BOL-identifiers are translated to themselves, i.e., to the identifiers of the same name declared
in the SFOL-theory. The other cases are compositional: every case for a complex BOL-expression recurses only
into the semantics of the direct subexpressions.
The consequence closure of SFOL, using the usual semantics of SFOL, induces the desired consequence closure for
BOL:
Example 7.4. We interpret the example ontology from Ex. 6.3. Excluding the semantic prefix, it results in
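The resulting theory is not included in this excerpt. Reconstructed from the general principles stated above, it plausibly looks as follows (a sketch in the SFOL syntax of Fig. 7.1; the exact form depends on the omitted interpretation rules of Fig. 7.2):

fun FlorianRabe : → ι
fun WuV : → ι
pred person : ι → prop
pred male : ι → prop
pred instructor : ι → prop
pred course : ι → prop
pred teach : ι ι → prop
pred creditValue : ι float → prop
axiom instructor(FlorianRabe) ∧ male(FlorianRabe)
axiom course(WuV)
axiom teach(FlorianRabe, WuV)
axiom creditValue(WuV, 7.5)
axiom ∀x : ι. male(x) ⇒ person(x)
axiom ∀x : ι. instructor(x) ⇒ person(x)
axiom ∀x : ι. (∃y : ι. teach(x, y)) ⇒ instructor(x)
axiom ∀y : ι. (∃x : ι. teach(x, y)) ⇒ course(y)
axiom ∀x : ι. (∃v : float. creditValue(x, v)) ⇔ course(x)
axiom ∀y : ι. course(y) ⇒ ∃x : ι. teach(x, y) ∧ instructor(x)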
Here we have an expression E(e1 , . . . , en ) with n = 2, and E is the ;-operator mapping (e1 , e2 ) ↦ e1 ; e2 , i.e., R1
and R2 are the direct subexpressions of R1 ; R2 . The semantics is a relatively complicated FOL-formula, but it
only depends on JR1 K and JR2 K — everything else is fixed. We have J;K = (p1 , p2 ) ↦ ∃m : ι. p1 (x, m) ∧ p2 (m, y),
i.e., the interpretation of the ;-operator is the function that maps two predicates p1 , p2 to the formula
∃m : ι. p1 (x, m) ∧ p2 (m, y). Then we have
JR1 ; R2 K = J;K(JR1 K, JR2 K).
Example 7.6 (Non-Compositional Translation via Custom Semantic Prefix). In Fig. 7.2, we omitted the case for
the transitive closure. That was because it is not possible to translate it compositionally into FOL. We can only
do it non-compositionally with a custom semantic prefix:
We define the FOL-interpretation of an ontology O by JOK = PO , JOK, where PO is a custom semantic prefix. PO
is different for every ontology O and is defined as follows:
1. We scan through O and collect all occurrences of R∗ for any (not necessarily atomic) relation R.
2. PO contains the following declarations for each R:
• A binary predicate symbol CR ⊆ ι × ι. Note that R may be a complex expression; so we have to
generate a fresh name CR here.
• The axiom ∀x : ι, y : ι. R(x, y) ⇒ CR (x, y), i.e., CR extends R.
• The axiom ∀x : ι, y : ι, z : ι. CR (x, y) ∧ CR (y, z) ⇒ CR (x, z), i.e., CR is transitive.
3. We add the case JR∗ K = CR (x, y) to the interpretation function.
Intuitively, every occurrence of the ∗ -operator is removed from the language and replaced with a fresh name that
is axiomatized to have the needed properties. All of these axioms are added to the semantic prefix.
Definition 7.7. Fig. 7.3 gives the context-free grammar. The vocabulary symbol is S. The expression symbols
are T , R, V , and F .
7.3.2 Semantics
Even though this is a very different knowledge aspect, the general principles of the semantics are the same: Every
BOL-declaration is translated to an SQL declaration, and ontologies are translated declaration-wise. For every kind
of complex expression, there is one inductive function mapping BOL-expressions to SQL-expressions.
In SQL, we can nicely see the difference between declarations and expressions: the former are translated to side
effect-ful statements, the latter to side effect-free queries.
Definition 7.8 (Concretized Semantics of BOL). The semantic prefix consists of the following SQL-statements
• a type ID of identifiers (if not already supported anyway by the underlying database)
• declarations of all base types and values of BOL (if not already supported anyway by the underlying database)
• CREATE TABLE individuals (id ID, name string), where the id field is unique and automatically generated
when inserting values
Every BOL-ontology O is interpreted as the sequence P, JOK of SQL statements, where JOK is defined in Fig. 7.4.
Remark 7.9 (Limitations). Our interpretation of BOL in SQL is restricted to assertions using only atomic expres-
sions. For example, in the case for I is-a C, we assume that I and C are names. Thus, we have already created
an individual for I and a table for C, and we can thus insert the former into the latter. The general case would
be more complicated but is much less important in practice. But other expressions very quickly become more
difficult.
The interpretation of formulas into SQL is less obvious because SQL is not a logic and therefore does not define a
consequence closure. Thus, we can only use axioms for consistency checks in SQL. But that requires first carrying
out an explicit consequence closure that adds all implied assertions to the database.
Example 7.10. We interpret the example ontology from Ex. 6.3. Excluding the semantic prefix, the entity decla-
rations and assertions result in the following
INSERT INTO individuals (name) VALUES ("FlorianRabe")
INSERT INTO individuals (name) VALUES ("WuV")
CREATE TABLE person (id ID)
CREATE TABLE male (id ID)
CREATE TABLE instructor (id ID)
CREATE TABLE course (id ID)
CREATE TABLE teach (subject ID, object ID)
CREATE TABLE creditValue (subject ID, object float)
INSERT INTO course VALUES (2)
INSERT INTO teach VALUES (1, 2)
INSERT INTO creditValue VALUES (2, 7.5)
Here we assume that inserting into the table individuals has automatically assigned the ids 1 and 2 to our two
individuals.
The concept assertion about FlorianRabe using ⊓ cannot be handled by this semantics. Therefore, we skip that
assertion. The two missing assertions
INSERT INTO instructor VALUES (1)
INSERT INTO male VALUES (1)
must then be provided by performing the consequence closure.
Moreover, the axioms result in the following consistency checks, i.e., queries that should be empty:
SELECT * FROM male \ SELECT * FROM person
SELECT * FROM instructor \ SELECT * FROM person
SELECT * FROM (SELECT DISTINCT subject FROM teach) \ SELECT * FROM instructor
SELECT * FROM (SELECT DISTINCT object FROM teach) \ SELECT * FROM course
(SELECT * FROM (SELECT DISTINCT subject FROM creditValue) \ SELECT * FROM course)
  UNION (SELECT * FROM course \ SELECT DISTINCT subject FROM creditValue)
SELECT * FROM course \
  (SELECT DISTINCT subject
   FROM (SELECT object AS subject, subject AS object FROM teach), instructor
   WHERE object = id)
Some of these checks will only succeed after performing the consequence closure. In particular, the table person
misses the entry 1 for the individual FlorianRabe because the assertion FlorianRabe is-a person is only present
as a consequence.
Definition 7.11. Fig. 7.5 gives the context-free grammar. The vocabulary symbol is P . The expression symbols
are Y , V , and F .
7.4.2 Semantics
Again, the general principles are the same: Every BOL-declaration is translated to a Scala-declaration, and ontolo-
gies are translated declaration-wise to Scala-programs. For every kind of complex expression, there is one inductive
function mapping BOL-expressions to Scala-objects.
Definition 7.12 (Computational Semantics of BOL). The semantic prefix consists of the following Scala statements
• classes for all BOL-base types and values for them (if not already present in Scala)
• classes for individuals and hash sets of objects:
import scala.collection.mutable.HashSet
val individuals = new HashSet[String]
Every BOL-ontology O is interpreted as the Scala program P, ⟦O⟧, where ⟦O⟧ is defined in Fig. 7.6.
Remark 7.13 (Scala Syntax). In Scala, val x = e evaluates e and stores the result in x. {d1; . . . ; dn} is evaluated by executing all di in order and returning the result of dn.
(A, B) is the product type A × B with pairing operator (x, y) and projection functions ._1 and ._2. x ⇒ F(x) is λx.F(x).
The class HashSet is part of the standard library and offers the functions += and -= to add/remove elements, contains to test elementhood, and forall and foreach to quantify/iterate over elements.
Types of variables are inferred if omitted.
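As a quick illustration of these operations (not part of the BOL semantics itself):

import scala.collection.mutable.HashSet

val s = new HashSet[String]
s += "WuV"                      // add an element
s.contains("WuV")               // elementhood test: true
s.forall(x ⇒ x.nonEmpty)        // quantify over elements: true
s.foreach(x ⇒ println(x))       // iterate over elements
s -= "WuV"                      // remove an element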
Remark 7.14 (Limitations). Our interpretation of BOL in Scala has similar limitations to the one in SQL: we restrict entities in assertions to be atomic, and we assume that all assertions implied by the consequence closure have already been obtained and added to the ontology.
Example 7.15. We interpret the example ontology from Ex. 6.3. Excluding the semantic prefix, the entity declarations and assertions result in the following:

individuals += "FlorianRabe"
individuals += "WuV"
val person = new HashSet[String]
val male = new HashSet[String]
val instructor = new HashSet[String]
val course = new HashSet[String]
val teach = new HashSet[(String, String)]
val creditValue = new HashSet[(String, Float)]
course += "WuV"
teach += (("FlorianRabe", "WuV"))
creditValue += (("WuV", 7.5))
The two missing assertions

instructor += "FlorianRabe"
male += "FlorianRabe"

must be obtained by consequence closure because we cannot handle the ⊓ assertion. Note that we could easily compute the hash set instructor.intersect(male) and add to it. But that would not add anything to the two constituent sets.
If we think of the axioms as consistency checks, we can translate them to assertions, i.e., Boolean expressions that must be true. We only give some examples:
{ val c1 = male; val c2 = person; c1.forall(x ⇒ c2.contains(x)) }
{
  val c1 = course;
  val c2 = {
    val c = instructor;
    val r = {
      val r = new HashSet[(String, String)];
      teach.foreach(x ⇒ r += ((x._2, x._1)));
      r
    }
    val e = new HashSet[String];
    r.foreach(x ⇒ if (c.contains(x._2)) e += x._1);
    e
  };
  c1.forall(x ⇒ c2.contains(x))
}
Definition 7.16 (Narrative Semantics of BOL). The semantic prefix consists of English statements explaining
• the base types of BOL (if they are not universally known),
• that we rely on a lexicon to correctly form plurals (indicated by -s) and verb forms (indicated by -s, -ing, -ed).
Every BOL-ontology O is interpreted as the English text P, ⟦O⟧, where ⟦O⟧ is defined in Fig. 7.7.
Natural language defines a consequence closure by appealing to consequence in natural language. That is well-defined as long as we express ourselves precisely enough.
Example 7.18. We interpret the example ontology from Ex. 6.3. Excluding the semantic prefix and the lexicon
lookup, it results in the following text:
FlorianRabe is a proper noun.
WuV is a proper noun.
person is a common noun.
male is a common noun.
instructor is a common noun.
course is a common noun.
teach is a transitive verb.
Remark 7.19 (Variants of English). We are relatively open as to what kind of English we want to use as the semantic language. The simplest choice would be plain English as found in a novel or newspaper article. But for many applications (e.g., formal ontologies in the STEM fields), we would rather use STEM English, i.e., English interspersed with formulas, diagrams, and epistemic cues like “Definition”, “Theorem”, and “Proof”. For this kind of English, LaTeX is a good target format. We can even use special LaTeX dialects like sTeX [Koh08] where we can capture more of the semantic properties.
Remark 7.20 (Better Language Generation). While the target languages in the other translations are formal languages engineered for regularity and simplicity (in terms of language primitives), natural languages have evolved in practical human communication. As a consequence, the translation in Def. 7.16 results in English that is clumsy at best and ungrammatical in general. We can think of the result as BOL-pidgin English.
Let us have a look at some of the problems that appear in both translations:
• We need a lexicon to obtain inflection information; the translation tries to remedy that by appending “s” in various places. This works in some cases but not in others.
• There are many linguistic devices that serve an important role in natural language but which we are not targeting. An example is plural objects for aggregation: say we have P is-a C and M is-a C; this would translate to “P is a ⟦C⟧, M is a ⟦C⟧” in BOL-pidgin, whereas in natural English we would aggregate this to “P and M are ⟦C⟧s”.
A way out is to utilize special systems for dealing with the surface structure of natural language. An example of this is the Grammatical Framework (GF, [Ran11]): it allows specifying a rich formal language of abstract syntax trees (ASTs) for natural language together with language-specific linearizations, which amount to recursive functions that translate ASTs to language-specific strings. GF comes with a large resource library that provides a comprehensive, language-independent AST specification and linearizations for over 35 languages. We will not pursue this here, but there is a special course “Logic-based Natural Language Semantics” at FAU in the winter semester that covers these and related topics. One of the major issues that needs to be addressed there and here is the notion of compositionality, which is central to all processing and semantics. We will address it next and come back to it time and again later.
7.6 Exercise 2
Implement the syntax and semantics of BOL.
You can choose the programming language to use. We will use Scala in our examples.
You can choose which semantics to implement. The ones that translate directly to Scala or to English are easier because they do not require implementing the syntax of the target language as well. The ones that translate to FOL or to SQL require an implementation of the syntax of the respective target language, which you have to implement as well or use a library for.
We recommend not focusing on implementing the syntax and semantics in their entirety. It is more instructive to
save time by choosing a sublanguage of BOL (by omitting some productions) and to use the time to implement a
second semantics.
Vocabularies: Schemas
S ::= D*
Declarations
D ::= TABLE ID {CT*}        table
   |  INSERT R INTO ID      row in a table
CT ::= ID : Y               column type
Formulas
F ::= V                     boolean value
   |  V = V                 equality of values
   |  R IN T                containment of rows in tables
   |  ...                   boolean operators
Identifiers
ID ::= alphanumeric string
Vocabularies: Programs
P ::= D*
Declarations
D ::= class ID[ID*] extends ID* {d*}    class definition
   |  object ID extends ID* {d*}        object definition
d ::= val ID : Y [= T]                  immutable field in a class/object, possibly abstract
   |  var ID : Y = T                    mutable field in a class, with initial value
Type expressions
Y ::= ID                                atomic type (class)
   |  ID                                built-in type (booleans, int, etc.)
   |  Y(Y*)                             type operator applied to type arguments
Term expressions
T ::= ID                                atomic value (class, value, variable)
   |  ID                                built-in value (boolean operators, etc.)
   |  T ≐Y T                            equality of terms at a type
   |  T : Y                             instance check
   |  T(T*)                             function applied to values
   |  new ID {d*}                       new instance of a class
   |  T.ID                              field access in an object
   |  T.ID = T                          assignment to a mutable field in an object
   |  {T*}                              sequencing (;-operator)
   |  d                                 local declaration
   |  ID = T                            assignment to a local variable
   |  if (T) T else T                   if-then-else
   |  while (T) T                       while-loop
Formulas
F ::= T                                 terms of boolean type
Identifiers
ID ::= alphanumeric string

Figure 7.5: Context-free syntax of Scala
Figure 7.7: Interpretation Function for BOL into English (intransitive VP version)
Chapter 8
Type Systems for Ontology Languages
                             intrinsic                               extrinsic
goes back to                 λ-calculus by Church                    λ-calculus by Curry
general idea                 objects carry their type with them      types are designated by the environment
typing is                    a function from objects to types        a relation between objects and types
objects have                 a unique type                           any number of types
types often interpreted as   disjoint sets                           unary predicates on a universal set
type inference for x         uniquely infer A from x                 try to find a minimal A such that x : A
type checking                compare inferred and expected type      prove x : A
subtyping A <: B             mimicked by casting from A to B         defined by: x : A implies x : B for all x
typing decidable             yes, unless too expressive              no, unless expressivity is restricted
typing errors are detected   usually statically (compile-time)       dynamically (run-time)
type of a name               introduced as part of its declaration   given by an additional axiom
example                      individual "WuV" : "course"             individual "WuV", "WuV" is-a "course"
advantages                   easy,                                   flexible,
                             unique type inference                   allows subtyping
examples                     SFOL, SQL;                              OWL, Scala, English;
                             most logics, functional PLs;            ontologies, OO, natural languages;
                             many type theories                      set theories
Example 8.1 (Extrinsically Typed Ontology Language). In BOL, the objects are the individuals, the types are the concepts, and is-a is the typing relation between them. The typing is extrinsic:
• Individuals and their concept assertions are introduced in separate declarations.
• An individual may be an instance of any number of concepts.
• There is no primary concept that could be returned as the inferred type of an individual.
• Concepts are subject to subtyping C ⊑ C′.
• Whether an individual is an instance of a concept must be checked by reasoning about the is-a relation.
Therefore, all semantics must interpret individuals as elements of a universal collection, and types as unary predicates on that collection. Specifically, we have:

semantics   universal collection   unary predicate        typing relation i is-a c
FOL         type ι                 predicate c ⊆ ι        c(i) true
SQL         table Individuals      table containing ids   id of i in table c
Scala       String                 hash set of strings    c.contains(i)
English     proper nouns           common nouns           “i is a c” is true
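For the Scala row of this table, here is a minimal sketch of extrinsic typing in action, reusing the conventions of Ex. 7.15:

import scala.collection.mutable.HashSet

// Individuals live in one universal collection; concepts are just sets.
val individuals = new HashSet[String]
val male = new HashSet[String]
val instructor = new HashSet[String]

individuals += "FlorianRabe"
male += "FlorianRabe"        // one is-a assertion
instructor += "FlorianRabe"  // another one: there is no unique type

// The extrinsic typing relation is simply membership.
def isA(i: String, c: HashSet[String]): Boolean = c.contains(i)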
We can also think of relations as objects. However, BOL cannot express relation types at all, and there is no intrinsic typing. Instead, the domain and range of a relation r are given extrinsically via axioms about dom r and rng r. As for individuals, that allows flexibility: the same relation may have multiple types.
Example 8.2 (Intrinsically Typed Ontology Language). We could define TOL, a typed ontology language that arises as a variant of BOL. The main differences would be:
• Individuals are declared with a concept that serves as their type: individual i : C.
• Concept assertions are dropped. They are now part of the individual declarations.
• Relations are declared with two concepts for their domain D and range R: relation r <: D × R.
• Properties are declared with a concept for their domain C: property p <: C × T.
TOL would make many ontologies more concise. For example, we could simply write

concept instructor
concept course
individual FlorianRabe : instructor
relation teach <: instructor × course
However, we would lose flexibility. If we want to add the concept "male", it would be difficult to make FlorianRabe have both types. We might be able to remedy that by allowing intersections and declaring individual FlorianRabe : instructor ⊓ male. But even then, we would have to commit to the type of each individual right away: we could not add different concept assertions for the same individual in different places, a common occurrence in building large ontologies.
Allowing ⊓ would also introduce subtyping. If we are careful in the design of TOL, that may still result in an elegant, scalable language. In particular, typing may remain decidable (depending on what other operations we allow). But if we go too far, it may end up so complex that it would have been easier to go with extrinsic typing.
That is why we use intrinsic typing only in two related places in BOL:
• The base types and values use an intrinsic type system (whose details we omitted).
• The range of properties is given intrinsically by a base type.
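For contrast, here is a minimal Scala sketch of intrinsic typing in the spirit of TOL; the names Concept and Individual are hypothetical:

// Hypothetical sketch: TOL-style intrinsic typing in Scala.
case class Concept(name: String)
case class Individual(name: String, tp: Concept)  // the type is carried by the object

val course = Concept("course")
val wuv = Individual("WuV", course)  // type committed at declaration time

// Type inference is trivial and unique:
def inferType(i: Individual): Concept = i.tp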
Remark 8.3 (Subtyping). Languages with subtyping usually have to use extrinsic type systems. Typical sources of subtyping are
• explicit subtyping as in N <: Z,
• comprehension/refinement as in {x : N | x ≠ 0},
• operations like union and intersection on types,
• inheritance between classes, in which case subclass = subtype,
• anonymous record types as in {x : N, y : Z} <: {x : N}.
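As a small Scala illustration of the inheritance case (the classes are hypothetical):

// Inheritance induces subtyping: Instructor <: Person.
class Person(val age: Int)
class Instructor(age: Int, val office: String) extends Person(age)

// Allowed because every Instructor is a Person:
val p: Person = new Instructor(40, "11.137")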
A purely extrinsic type system has two intrinsic types, namely O (for objects) and E_O (for types); ∈_O is the extrinsic typing relation between objects and types.
Example 8.5. We can think of BOL as a combined type system. The objects are all complex expressions. The intrinsic types are the non-terminals I, C, R, P, and F, which separate the objects into the five kinds of individuals, concepts, relations, properties, and formulas.
An extrinsic typing relation exists only for I: we have E_I = C, and ∈_I is the is-a relation.
Example 8.6. In set theory, only a few intrinsic types are used for the high-level grouping of objects. These include at least set and prop. Objects of these intrinsic types are called sets and propositions. Some set theories also use an intrinsic type class. Moreover, types like set → prop can be allowed as the types of unary predicates on sets.
Extrinsic typing is used only for the type set: we have E_set = set, and ∈_set is the usual elementhood relation between sets.
It is often desirable to use types to force the presence of such assertions. We might wish to require that every instructor teaches a list of things and has an office. Moreover, we can use types to specify the objects of the respective assertions: we can specify that only courses are taught and that the office is a string. Rather than the relation assertions with subject "FlorianRabe" just happening to be present, the type system would now force their existence and the type of the object. Forgetting to give such an assertion, or giving it with the wrong object, could be detected statically (i.e., without applying the semantics) and flagged as a typing error.
This leads to the idea of subject-centered types. This could look as follows:
concept instructor
  teach course*
  age : int
  office : string
individual "FlorianRabe" : "instructor"
  is-a "male"
  "teach" "WuV" "KRMT"
  "age" 40
  "office" "11.137"
Now the type "instructor" forces the presence of a list of taught courses (the * is meant to indicate a list), an integer for the age, and a string for the office.
We can now see that, in fact, every person should have an age, and not just every instructor. Because every instructor is meant to be a person, we could try to capture this as well to avoid redundancy. Moreover, every male is meant to be a person, too.
That leads to the idea of modular types. This could look as follows:
concept person
  age : int
concept male <: person
concept instructor <: person
  teach course*
  office : string
individual "FlorianRabe" : "instructor" ⊓ "male"
  "teach" "WuV" "KRMT"
  "age" 40
  "office" "11.137"
Incidentally, that eliminates the need to independently declare relations and properties. Instead, we can treat their
occurrences inside the concept definitions as their declarations.
That has the added benefit that two relations/properties of the same name declared in different concepts can be
distinguished and can have different types.
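Scala's traits give a good intuition for such modular types; in the following sketch, mixing in traits plays the role of ⊓ in the individual declaration:

// Sketch: modular, subject-centered types rendered as Scala traits.
trait Person { def age: Int }
trait Male extends Person
trait Instructor extends Person {
  def teach: List[String]  // plays the role of course*
  def office: String
}

// "FlorianRabe" : "instructor" ⊓ "male"
object FlorianRabe extends Instructor with Male {
  val age = 40
  val teach = List("WuV", "KRMT")
  val office = "11.137"
}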
8.2.2 Examples
The general thrust of these ideas is to shift more and more information into an increasingly complex type system. This is part of a trade-off: the more the type system can do,
• the more requirements can be expressed and violations thereof detected statically,
• the more complex the type system and its documentation and implementation become.
Abstract data types have proved to be a particularly interesting trade-off on this expressivity-simplicity spectrum and are, in one way or another, part of many type systems. The following table gives an overview:
Example 8.8. Inductive data types as seen in Def. 4.7 are concrete types: their values are formed by applying constructors to other values. I like calling them concrete data types.
Example 8.9 (Classes). A UML class is an abstract data type. Its values are the instances of implementing classes.
A UML class only defines what methods should be available. How they are implemented by specific values of the
type is left to the programming languages.
Thus, different programming languages could have different values for the same abstract data type. They certainly
look different, e.g., in Java and Scala implementations of the same UML class. But the languages might also be
fundamentally different in expressivity, e.g., a Turing-complete programming language might have strictly more
values for the same abstract data type than a non-Turing-complete one.
Moreover, which instances actually exist changes during the run time of the program. If we take this into account,
the values of the abstract data type are not even fixed within a programming language.
Example 8.10 (Schemas). An SQL table schema is an abstract data type. Its values are the rows.
The schema only defines what types the columns of a table have. Different database systems might theoretically
provide different ways to build rows for the table.
However, this does not happen in practice because SQL table columns are typed by base types, which have the
same values across database systems. This would be different if we allowed table columns to have function types.
Example 8.11 (Theories). A logical theory (e.g., Monoid) is an abstract data type. Its values are the models of the theory (e.g., for Monoid: (N, +, 0) or (N, ∗, 1)).
The theory only defines what operations a model must provide (for Monoid: a binary operation and a neutral element) and which axioms it must satisfy (for Monoid: associativity, neutrality). How we build the models is left open.
We usually build models in mathematical language and naively assume that this fixes the models once and for all. But that is too naive: depending on which mathematical foundation we use (e.g., set theory with or without the axiom of choice), we can build different models. Moreover, we can also build models in type theories (which underlie many deduction systems such as Coq or Isabelle). We can even build them in programming languages, e.g., by implementing theories as classes (typically moving the axioms into comments).
The choice of language substantially changes what the values of the abstract data type are.
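For instance, a sketch of the theory Monoid as a Scala class, with models as instances and the axioms recorded only in comments:

// The theory Monoid as an abstract data type; models are its instances.
abstract class Monoid[A] {
  def op(x: A, y: A): A  // axiom: op is associative
  def e: A               // axiom: e is neutral for op
}

// Two models, corresponding to (N, +, 0) and (N, ∗, 1):
object AddMonoid extends Monoid[Int] { def op(x: Int, y: Int) = x + y; val e = 0 }
object MulMonoid extends Monoid[Int] { def op(x: Int, y: Int) = x * y; val e = 1 }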
The intuition of a mixin A ∗ B is that we merge the fields of A and B. However, this union is dependent: if B is flat, its fields may refer to fields introduced in A.
The most important special case of an ADT are classes:
Definition 8.13 (Class). A class definition defines an ADT abbreviation of the form

a = a1 ∗ . . . ∗ am ∗ {c1 : T1, . . . , cn : Tn}

which corresponds to a Scala-style class declaration:

abstract class a extends a1 with ... with am {
  c1 : T1
  ...
  cn : Tn
}
The details can vary, and special care must be taken in programming languages where initialization may have side
effects.
Flat ADTs are the standard case, and all mixin ADTs can be simplified into flat ones. This can be seen as a semantics in the sense that the language of flat and mixin ADTs is translated to the language of flat ADTs.
Remark 8.15 (Dependency Between Fields). Our definition sweeps a very important but subtle detail under the rug: in a flat ADT with a field c : T = t, may T and/or t refer to fields declared later? We sketch a few possible answers.
In the simplest case, we forbid such forward references. Then ADTs are very well-behaved. But we have a problem with the case (∗) in Def. 8.14: if c : T occurs before c : T = t, we cannot simply drop the former because intermediate fields may refer to c. A straightforward solution would be to declare the ADT to be ill-formed. But unfortunately, this case is very important in practice: it occurs whenever c : T is declared in an abstract class and c : T = t in a concrete class implementing it.
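In Scala, the case (∗) is the everyday pattern of an abstract field implemented in a subclass, as the following sketch shows:

// The case (*): c : T in an abstract class, c : T = t in a concrete one.
abstract class A {
  val c: Int          // abstract field c : T
  def d: Int = c + 1  // an intermediate field may refer to c
}
class B extends A {
  val c: Int = 5      // the defining occurrence c : T = t
}

// (new B).d == 6; using def (or lazy val) for d sidesteps Scala's
// initialization-order pitfalls for eager vals.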
A more common solution is to allow the fields to be mutually recursive. Consider a flat ADT with fields Γ, c : T [= t], ∆ where Γ and ∆ are lists of fields. Let Γ′ and ∆′ arise by dropping all definitions. Then we require that
• T must be a well-formed type in context Γ′. Thus, the types may only refer to previous fields.
• t must have type T in context Γ′, c : T, ∆′. Thus, the definitions may be mutually recursive.
This makes the case (∗) work. But it comes at the price of recursion, which allows writing non-terminating fields (a feature in a programming language, but potentially undesirable in other settings).
Even so, the mutual-recursion solution is problematic in the presence of dependent types. Here, dropping definitions is not always allowed: T might be well-formed in context Γ, but Γ′ might not even be a well-formed context at all. Because OO-languages are usually not dependently-typed, this is not an issue in most settings.
8.3.1 Exercise 3
The topic of Exercise 3 is to build a relational database schema for a UnivIS-like system.
Chapter 9
Querying via a Semantics
9.1 Overview
Let us assume we have a semantics for our syntax. We again write l for the syntax, L for the semantics, and ⟦−⟧ for the translation function.
We can now use the semantics to answer questions asked in the syntax. Here we use the syntax to phrase a question
and the semantics to determine the answer.
We call this querying. Contrary to standard practice, we will use that word in a very broad sense that covers all aspects. It is more common to use the word only for concretized querying, for which SQL was developed and which has shaped many intuitions about querying.
Usually, querying requires the syntax to designate some non-terminals as propositional. A non-terminal is proposi-
tional if the semantics can make its words true. Without a notion of propositions, it is impossible to define what
questions and answers even are.
Definition 9.1 (Propositions). A context-free syntax with propositions is a context-free syntax with some non-terminal symbols designated as propositional.
A semantics with theorems is one that additionally defines some propositions to be theorems. We write ⊢ F if F is a theorem.
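In an implementation along the lines of Exercise 2, this definition could be rendered as follows; the Scala names are hypothetical:

// A designated propositional non-terminal, as an inductive data type:
sealed trait Prop
case class IsA(individual: String, concept: String) extends Prop
case class And(left: Prop, right: Prop) extends Prop

// A semantics with theorems singles out the true propositions (⊢ F):
trait SemanticsWithTheorems {
  def theorem(f: Prop): Boolean
}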
That definition does not mean that any kind of logic is needed for querying. Many languages use highly restricted notions of propositions that would not generally be considered a logic. For example, languages might use equalities between objects, or even only equalities between certain objects, as the only propositions. The following table gives an overview:
Often the development of querying for a language leads to the discovery of omissions in the syntax: certain objects
that are helpful to ask questions were omitted from the syntax because they were not needed to describe the data.
Then sometimes the syntax is extended with non-terminals or productions that seem like dead code: they are not
needed for or not allowed in the official data. The following table gives some examples:
Example 9.2 (Propositions in BOL). The obvious choice of propositions for BOL are the formulas.
In Rem. 6.1, we mentioned that the BOL syntax from Fig. 6.2 had some redundant parts that were grayed out. Assertions are needed for writing ontologies only insofar as they behave like axioms, i.e., they are automatically true. But for querying BOL, we also need them to behave like formulas so that we can use them as questions, i.e., we must allow them to be true or false.
Moreover, it is common to also allow conjunctions. Therefore, the BOL propositions are the conjunctions of formulas.
Remark 9.3 (Meta-Level Questions). Finally, any semantics admits a meta-level where additional questions can be
asked. Examples are asking for the consistency of a theory or the equivalence of two theories/programs/queries. At
the next-higher meta-level, we can ask about the completeness of a semantics or the equivalence of two semantics
(of which completeness is a special case). These meta-questions can usually not be expressed in the syntax, and we
do not consider them a part of querying here. But it is worth mentioning that the need to use yet another language
(a meta-language) to ask these questions can be annoying, and some advancements in language design are about
trying to integrate them into the syntax. For example, reflection is the process of representing a language in itself
so that the language can talk about itself. That way meta-questions become regular questions.
9.2.2 Challenges
Consistency The L-calculus might derive both F and ¬F. In that case L is inconsistent, and usually every formula becomes derivable. We usually assume L to be consistent even though we do not always prove that.
Decidability Deductive semantics is usually undecidable, i.e., there is no algorithm that takes in F and always
returns yes or no in finite time.
Therefore, deductive querying is very difficult in general. One has to run heuristics (theorem provers) to see if a
proof of F or ¬F can be found.
A common compromise is to allow only a restricted set of propositions as queries for which decision procedures
exist. However, it can be tricky to find good restrictions, especially if the syntax allows for function symbols and
equality.
For example, SFOL is undecidable. But many fragments of SFOL are decidable, such as propositional logic and
various fragments in between.
When giving a deductive semantics by translation into SFOL, it is therefore important to check whether the image of ⟦−⟧ falls inside a decidable fragment. This is typically the case for ontology languages.
Completeness Deductive semantics is usually incomplete, i.e., there are unanswered questions. More precisely, the L-calculus typically derives F for some propositions, ¬F for others, and neither for the rest. Propositions of the third kind cannot be answered by the semantics.
Remark 9.4. The word “complete” is used for two different things in logic.
Firstly, it can denote a relation between two semantics, typically a proof theory and a model theory. That is the dominant meaning of the word as in, e.g., the completeness theorem for SFOL and Gödel's incompleteness theorem.
Secondly, it can mean that a logic proves or disproves every proposition, i.e., there is no F such that neither F nor ¬F is derivable. That is the sense we use above. This kind of completeness rarely holds, usually only in very restricted circumstances.
Decidability and completeness are essentially the same problem. Specifically, if completeness holds (and proofs can be effectively enumerated), we already obtain a decision procedure for the logic: to decide the truth of F, enumerate all proofs until a proof of F or of ¬F is found. Vice versa, if we have a sound decision procedure, running it on F establishes either F or ¬F.
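The first direction can be sketched in Scala as follows, assuming hypothetical types Prop and Proof and a stream enumerating all proofs:

// Hypothetical setup: propositions with negation, checkable proofs.
sealed trait Prop
case class Atom(name: String) extends Prop
case class Not(p: Prop) extends Prop
trait Proof { def proves(f: Prop): Boolean }

// Completeness yields decidability: some proof of F or of ¬F exists,
// so this search terminates for every F.
def decide(f: Prop, allProofs: LazyList[Proof]): Boolean =
  allProofs.collectFirst {
    case p if p.proves(f)      ⇒ true
    case p if p.proves(Not(f)) ⇒ false
  }.get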
Efficiency Independent of whether the semantics is complete/decidable, theorem proving is typically very expen-
sive.
Therefore, in addition to identifying decidable fragments of a logic, it is desirable to identify efficiently decidable
fragments. Typically, a semantics meant for efficient practical querying aims for polynomially decidable fragments.
This is the case for very simple ontology languages. But it can quickly become exponential if the language of
propositions becomes more expressive.
Chapter 10
Conclusion
Bibliography
[CFKR20] J. Carette, W. Farmer, M. Kohlhase, and F. Rabe. Big Math and the One-Brain Barrier. The Mathematical Intelligencer, 2020. To appear.
[Koh08] M. Kohlhase. Using LaTeX as a Semantic Markup Format. Mathematics in Computer Science, 2(2):279–304, 2008.
[Ran11] A. Ranta. Grammatical Framework: Programming with Multilingual Grammars. CSLI Publications, 2011.