Fundamentals of Computer Networks Chapter 2 Summary


Chapter 2: Application Layer

Network applications are the raison d'être of a computer network. They include text
email, remote access to computers, file transfers, the World Wide Web (mid-90s), web
searching, e-commerce, Twitter/Facebook, Amazon, Netflix, YouTube, WoW...

2.1 Principles of Network Applications


At the core of network application development is writing programs that run on
different end systems and communicate with each other over the network. The programs
running on end systems might be different (client-server architecture) or identical
(peer-to-peer architecture). Importantly, we write programs that run on end
systems/hosts, not on network-core devices (routers/link-layer switches).

2.1.1 Network Application Architectures

From the application developer's perspective, the network architecture is fixed and
provides a specific set of services to applications. The application architecture, on the
other hand, is chosen by the application developer. In choosing the application
architecture, a developer will likely draw on one of the two predominant architectural
paradigms used in modern network applications:

Dr. Eman Sanad , Assistant Prof. IT Department, Faculty of computers and Artificial intelligence

Client-server architecture: there is an always-on host, called the server, which serves
requests from many other hosts, called clients [Web browser and Web server]. Clients
do not communicate directly with each other. The server has a fixed, well-known
address, called an IP address, that clients use to connect to it. Often, a single server
host is incapable of keeping up with all the requests from clients; for this reason, a data
center, housing a large number of hosts, is often used to create a powerful virtual server
(via proxying).

P2P architecture: there is minimal or no reliance on dedicated servers in data centers;
the application exploits direct communication between pairs of intermittently
connected hosts, called peers. They are end systems owned and controlled by users
[BitTorrent, Skype]. P2P applications provide self-scalability (the network load is
distributed). They are also cost-effective, since they don't require significant
infrastructure and server bandwidth.

P2P architectures face three challenges:

i. ISP friendliness (the asymmetric nature of residential ISPs)

ii. Security

iii. Incentives (convincing users to participate)

Some applications have hybrid architectures, such as for many instant messaging
applications: a server keeps track of the IP addresses of users, but user-to-user
messages are sent directly between users.

2.1.2 Processes Communicating

In the jargon of operating systems, it's not programs but processes that communicate. A
process can be thought of as a program that is running within an end system. Processes
on two different end systems communicate with each other by exchanging messages
across the computer network: a sending process creates and sends messages into the
network; a receiving process receives these messages and possibly responds by sending
messages back.

Client and Server Processes

A network application consists of pairs of processes that send messages to each other
over a network. For each pair of communicating processes we label:

 The process that initiates the communication as the client [web browser]

 The process that waits to be contacted to begin the session as the server [web
server]

These labels stand even for P2P applications, in the context of a given communication session.

The Interface between the Process and the Computer Network

A process sends messages into, and receives messages from, the network through a
software interface called a socket. A socket is the interface between the application
layer and the transport layer within a host, it is also referred to as the Application
Programming Interface (API) between the application and the network. The application
developer has control of everything on the application-layer side of the socket but has
little control of the transport-layer side of the socket.

The only control that the developer has over the transport layer is:

1. The choice of the transport protocol

2. Perhaps the ability to fix a few transport-layer parameters such as maximum buffer
and maximum segment sizes
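Both forms of control are visible in the Berkeley sockets API. A minimal Python sketch (the buffer size below is an arbitrary illustrative value, not one from the text):

```python
import socket

# 1. Choice of the transport protocol, made when the socket is created:
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # TCP
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # UDP

# 2. Fixing a transport-layer parameter: here, the receive buffer size.
tcp_sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)
buf = tcp_sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

tcp_sock.close()
udp_sock.close()
```

Note that the operating system may adjust the requested buffer size (Linux, for instance, reports a doubled value), so the value read back should be treated as a lower bound.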

Addressing Processes

In order for a process running on one host to send packets to a process running on
another host, the receiving process needs to have an address. To identify the receiving
processes, two pieces of information need to be specified:
1. The address of the host. In the Internet, the host is identified by its IP address, a 32-
bit quantity (128 bits in IPv6) that identifies the host uniquely.

2. An identifier that specifies the receiving process in the destination host: the
destination port number. Popular applications have been assigned specific port
numbers (web server -> 80)
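The (IP address, port number) pair is exactly how the sockets API addresses a process. A small sketch (binding to port 0 asks the OS to pick any free port; the loopback address is used for illustration):

```python
import socket

# A server process is addressed by the pair (host IP address, port number).
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
host, port = srv.getsockname()    # the address clients would connect to
srv.listen(1)
srv.close()
```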

2.1.3 Transport Services Available to Applications

What are the services that a transport-layer protocol can offer to applications invoking
it?

Reliable Data Transfer

For many applications, such as email, file transfer, web document transfers, and financial
applications, packet drops and data loss can have devastating consequences. If a
protocol provides guarantees that the data sent is delivered completely and correctly, it
is said to provide reliable data transfer. The sending process can just pass its data into
the socket and know with complete confidence that the data will arrive without errors
at the receiving process.

Throughput

In Chapter 1, we introduced the concept of available throughput, which, in the context
of a communication session between two processes along a network path, is the rate at
which the sending process can deliver bits to the receiving process. A transport-layer
protocol could provide guaranteed available throughput at some specific rate.
Applications that have throughput requirements are said to be bandwidth-sensitive
applications. While bandwidth-sensitive applications have specific throughput
requirements, elastic applications can make use of as much, or as little, throughput as
happens to be available. Electronic mail, file transfer, and Web transfers are all elastic
applications.

Timing

A transport-layer protocol can also provide timing guarantees. Example: guarantees that
every bit the sender pumps into the socket arrives at the receiver's socket no more than
100 msec later, interesting for real-time applications such as telephony, virtual
environments...

Security

Finally, a transport protocol can provide an application with one or more security
services. For example, in the sending host, a transport protocol can encrypt all data
transmitted by the sending process, and in the receiving host, the transport-layer
protocol can decrypt the data before delivering the data to the receiving process. Such a
service would provide confidentiality between the two processes, even if the data is
somehow observed between sending and receiving processes. A transport protocol can
also provide other security services in addition to confidentiality, including data integrity
and end-point authentication.

2.1.4 Transport Services Provided by the Internet

The Internet makes two transport protocols available to applications: TCP and UDP.

TCP Services

TCP includes a connection-oriented service and a reliable data transfer service:

 Connection-oriented service: client and server exchange transport-layer control
information before the application-level messages begin to flow. This so-called
handshaking procedure alerts the client and server, allowing them to prepare for
an onslaught of packets. Then a TCP connection is said to exist between the
sockets of the two processes. When the application finishes sending messages, it
must tear down the connection.

SECURING TCP

Neither TCP nor UDP provides encryption. Therefore, the Internet community has
developed an enhancement for TCP called Secure Sockets Layer (SSL), which not only
does everything that traditional TCP does but also provides critical process-to-process
security services, including encryption, data integrity, and end-point authentication. It is
not a third protocol but an enhancement of TCP, the enhancement being implemented
in the application layer on both the client and the server side of the application (highly
optimized libraries exist). SSL has its own socket API, similar to the traditional one.
The sending process passes cleartext data to the SSL socket, which encrypts it.

 Reliable data transfer service: the communicating processes can rely on TCP to
deliver all data sent without error and in the proper order.

TCP also includes a congestion-control mechanism, a service for the general welfare of
the Internet rather than for the direct benefit of the communicating processes. It
throttles a sending process when the network is congested between sender and
receiver.

UDP Services

UDP is a no-frills, lightweight transport protocol, providing minimal services. It is
connectionless: there is no handshaking. The data transfer is unreliable: there are no
guarantees that the message sent will ever reach the receiving process. Furthermore,
messages may arrive out of order. UDP does not provide a congestion-control
mechanism either.
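The lack of handshaking is visible in code: a UDP sender can transmit a datagram immediately, with no prior connection. A loopback sketch (delivery happens to be dependable on the local host, but UDP gives no such guarantee in general):

```python
import socket

# Receiver: bind to any free UDP port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
addr = receiver.getsockname()

# Sender: no handshake, no connection setup -- just send the datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", addr)

data, peer = receiver.recvfrom(2048)
sender.close()
receiver.close()
```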

Services Not Provided by Internet Transport Protocols

Today's Internet transport protocols provide neither timing nor throughput guarantees.
We therefore design applications to cope, to the greatest extent possible, with this lack
of guarantees.

2.1.5 Application-Layer Protocols

An application-layer protocol defines how an application's processes, running on
different end systems, pass messages to each other. It defines:

 The type of the messages exchanged (request/response)

 The syntax of the various message types

 The semantics of the fields (meaning of the information in fields)

 The rules for determining when and how a process sends messages and
responds to messages

2.2 The Web and HTTP


In the early 1990s, a major new application arrived on the scene: the World Wide Web
(Berners-Lee 1994), the first application that caught the general public's eye. The Web
operates on demand: users receive what they want, when they want it. It is enormously
easy for an individual to make information available over the web, hyperlinks and search
engines help us navigate through the ocean of web sites...

2.2.1 Overview of HTTP

The HyperText Transfer Protocol (HTTP), the Web's application-layer protocol, is at the
heart of the Web. It is implemented in two programs: a client program and a server
program. The two programs talk to each other by exchanging HTTP messages. A Web
page (or document) consists of objects. An object is simply a file (HTML file, jpeg
image...) that is addressable by a single URL. Most Web pages consist of a base HTML
file and several referenced objects. The HTML file references the other objects in the
page with the objects' URLs. Each URL has two components: the hostname of the server
that houses the object and the object's path name. Web Browsers implement the client
side of HTTP. HTTP uses TCP as its underlying transport protocol. The server sends
requested files to clients without storing any state information about the client: it is a
stateless protocol.
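The two URL components can be pulled apart with a standard parser. A sketch (the URL below is a made-up example):

```python
from urllib.parse import urlparse

url = "http://www.example.org/someDir/picture.gif"  # hypothetical URL
parts = urlparse(url)

hostname = parts.hostname   # the server that houses the object
path = parts.path           # the object's path name
```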

[Figure: a server running the Apache Web server exchanging HTTP requests and responses with a PC running the Firefox browser and an iPhone running the Safari browser]
2.2.2 Non-Persistent and Persistent Connections

In many Internet applications, the client and server communicate for an extended
period of time, depending on the application and on how the application is being used,
the series of requests may be back-to-back, periodically at regular intervals or
intermittently. When this happens over TCP, the developer must make an important
decision: should each request/response pair be sent over a separate TCP connection, or
should all of the requests and their corresponding responses be sent over the same TCP
connection? In the former approach, the application is said to use non-persistent
connections; in the latter, it is said to use persistent connections. HTTP/1.1 uses
persistent connections by default, but clients and servers can be configured to use
non-persistent connections instead.
To estimate the amount of time that elapses from when a client requests the base HTML file
until the entire file is received by the client, we define the round-trip time (RTT), which is
the time it takes for a small packet to travel from client to server and then back to the
client.

HTTP with Non-Persistent Connections

For the page and each object it contains, a TCP connection must be opened (handshake
request, handshake answer); we therefore observe an additional RTT, and for each object
we will have a request followed by the reply. This model can be expensive on the server
side: a new connection must be established for each requested object, and for each
connection a TCP buffer must be allocated along with some memory to store TCP variables.

HTTP with Persistent Connections

The server leaves the TCP connection open after sending a response, subsequent
requests and responses between the same client and server will be sent over the same
connection. In particular an entire web page (text + objects) can be sent over a single
persistent TCP connection, multiple web pages residing on the same server can be sent
from the server to the same client over a single persistent TCP connection. These
requests can be made back-to-back without waiting for replies to pending requests
(pipelining). When the server receives back-to-back requests, it sends the objects back-
to-back. If a connection isn't used for a pre-configured amount of time, it is closed.
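The difference can be captured with a common back-of-envelope model, sketched below (transmission times and parallel connections are ignored; the functions and numbers are illustrative, not from the text):

```python
def nonpersistent_time(rtt, n_objects):
    """Serial non-persistent HTTP: every object, including the base HTML
    file, costs one RTT for TCP setup plus one RTT for request/response."""
    return 2 * rtt * (1 + n_objects)

def persistent_pipelined_time(rtt, n_objects):
    """Persistent HTTP with pipelining: 2 RTTs for the base file (setup +
    request), then roughly one more RTT for all pipelined requests together."""
    return 2 * rtt + (rtt if n_objects else 0)

rtt = 0.1  # an assumed 100 ms round-trip time
slow = nonpersistent_time(rtt, 10)         # about 2.2 s for 10 objects
fast = persistent_pipelined_time(rtt, 10)  # about 0.3 s
```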

[Figure: timing diagrams comparing non-persistent HTTP and persistent HTTP]

2.2.3 HTTP Message Format
There are two types of HTTP messages: request messages and response messages.

HTTP Request Message

A sample HTTP request message (each line ends with a carriage return and a line feed; a carriage return/line feed on a line by itself indicates the end of the header lines):

GET /index.html HTTP/1.1\r\n
Host: www-net.cs.umass.edu\r\n
User-Agent: Firefox/3.6.10\r\n
Accept: text/html,application/xhtml+xml\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
\r\n

The first line is the request line (carrying a GET, POST, or HEAD command); the lines that follow are header lines.

 Ordinary ASCII text

 First line: request line

 Other lines: header lines

 the request line has three fields: a method field, a URL field, and an HTTP version field:

o method field possible values: GET, POST, HEAD, PUT, DELETE

The majority of HTTP requests use the GET method, used to request an object.

The entity body (empty with GET) is used by the POST method, for example when filling
out forms. The user is still requesting a Web page, but the specific contents of the page
depend on what the user entered into the form fields. When POST is used, the entity
body contains what the user entered into the form fields. Requests can also be made
with GET by including the entered data in the requested URL. The HEAD method is similar
to GET: when a server receives it, it responds with an HTTP message but leaves out the
requested object. It is often used for debugging. PUT is often used in conjunction with
Web publishing tools, to allow users to upload an object to a specific path on a Web
server. Finally, DELETE allows a user or application to delete an object on a Web server.
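Because HTTP messages are ordinary ASCII text, a request is easy to assemble by hand. A minimal sketch (the host and path are made up, and real clients send more headers):

```python
def build_request(method, path, host):
    """Assemble a minimal HTTP/1.1 request message as ASCII text."""
    return (f"{method} {path} HTTP/1.1\r\n"   # request line: method, URL, version
            f"Host: {host}\r\n"               # header lines
            f"Connection: close\r\n"
            f"\r\n")                          # blank line ends the headers

req = build_request("HEAD", "/index.html", "www.example.org")
```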

HTTP Response Message

A typical HTTP response message:

 Status line: protocol version, status code, corresponding status message

 six header lines, for example:

o Connection: the connection will be closed after sending the message

o Date: the date and time when the response was created (when the server retrieved the
object from the file system, inserted the object into the message, and sent the response
message)

o Server: the type of server software

o Last-Modified: the date and time the object was last modified, useful for object caching

o Content-Length: the number of bytes in the object

o Content-Type: the type of the object in the entity body

 entity body: contains the requested object itself (data)

Some common status codes:

 200 OK: request succeeded, information returned

 301 Moved Permanently: the object has moved, the new location is specified in
the header of the response

 400 Bad Request: generic error code, request not understood

 404 Not Found: The requested document doesn't exist on the server

 505 HTTP Version Not Supported: The requested HTTP protocol version is not
supported by the server

2.2.4 User-Server Interaction: Cookies


An HTTP server is stateless in order to simplify server design and improve performance.
A Web site can nevertheless identify users by using cookies.

Cookie technology has 4 components:

1. Cookie header in HTTP response message

2. Cookie header in HTTP request message

3. Cookie file on the user's end-system managed by the browser

4. Back-end database at the Website

When a user connects to a Web site that uses cookies:

1. The server creates a unique identification number and creates an entry in its back-end
database indexed by that identification number.

2. The server responds to the user's browser, including a Set-cookie: header carrying the
identification number.

3. The browser appends the hostname of the server and the identification number to its
cookie file.

4. Each time the browser requests a page from that site, it consults the cookie file,
extracts the identification number for the site, and includes a cookie header line carrying
the identification number.

The server can thus track the user's activity: it knows exactly which pages that
identification number has visited, in which order, and at what times. This is also why
cookies are controversial: a Web site can learn a lot about a user and sell this
information to a third party.

Therefore cookies can be used to create a user session layer on top of stateless HTTP.
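The four components can be caricatured in a few lines. A toy sketch (all names and numbers here are hypothetical, not a real server):

```python
import itertools

ids = itertools.count(1678)   # server-side generator of identification numbers
backend_db = {}               # component 4: back-end database at the Web site

def first_visit():
    """Server: mint an ID and return the Set-cookie response header (component 1)."""
    sid = str(next(ids))
    backend_db[sid] = []              # new entry indexed by the identification number
    return {"Set-cookie": sid}

def later_request(cookie, page):
    """Server: the request's cookie header (component 2) ties this
    page view to the stored session."""
    backend_db[cookie].append(page)

sid = first_visit()["Set-cookie"]   # the browser stores this in its cookie file (component 3)
later_request(sid, "/index.html")
later_request(sid, "/cart.html")
```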

2.2.5 Web Caching


A Web cache, also called proxy server is a network entity that satisfies HTTP requests on
behalf of an origin Web server. It has its own disk storage and keeps copies of recently
requested objects in this storage.

1. The browser establishes a TCP connection to the web cache, sending an HTTP request
for the object to the Web cache.

2. The web cache checks to see if it has a copy of the object stored locally. If yes, it will
return it within an HTTP response message to the browser.

3. If not, the Web cache opens a TCP connection to the origin server, which responds
with the requested object.

4. The Web cache receives the object, stores a copy in its storage, and sends a copy,
within an HTTP response message, to the browser over the existing TCP connection.

Therefore a cache is both a server and a client at the same time. Usually caches are
purchased and installed by ISPs.
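The four steps reduce to a lookup-then-fetch pattern. A toy sketch (the origin fetch is a stand-in function, not a real HTTP call):

```python
cache = {}   # copies of recently requested objects (the proxy's storage)

def fetch_from_origin(url):
    """Stand-in for opening a TCP connection to the origin server."""
    return f"object at {url}"

def proxy_get(url):
    """Steps 1-4 above: return (object, cache_hit_flag) for a request."""
    if url in cache:                  # step 2: local copy found
        return cache[url], True
    obj = fetch_from_origin(url)      # step 3: ask the origin server
    cache[url] = obj                  # step 4: store a copy...
    return obj, False                 # ...and forward it to the browser

first = proxy_get("http://www.example.org/a.gif")   # miss: goes to the origin
second = proxy_get("http://www.example.org/a.gif")  # hit: served from storage
```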

Caching Example

assumptions:

 avg object size: 100K bits
 avg request rate from browsers to origin servers: 15/sec
 avg data rate to browsers: 1.50 Mbps
 RTT from institutional router to any origin server: 2 sec
 access link rate: 1.54 Mbps

consequences:

 LAN utilization: 0.15%
 access link utilization = 99% (problem!)
 total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + usecs

Option 1: buy a faster access link (154 Mbps instead of 1.54 Mbps)

consequences:

 LAN utilization: 0.15% (unchanged)
 access link utilization drops from 99% to about 1%
 total delay = Internet delay + access delay + LAN delay = 2 sec + msecs + usecs

Cost: a faster access link is expensive!
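The arithmetic behind these utilization figures can be checked directly (a sketch using the numbers from the example above):

```python
object_size = 100_000        # average object size in bits
request_rate = 15            # requests per second
data_rate = object_size * request_rate   # 1,500,000 bits/sec = 1.50 Mbps

util_slow = data_rate / 1.54e6   # original 1.54 Mbps link: ~0.97, i.e. ~99% utilized
util_fast = data_rate / 154e6    # upgraded 154 Mbps link: ~0.0097, i.e. ~1% utilized
```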

The web cache can substantially reduce the response time for a client request and
substantially reduce traffic on an institution's access link to the Internet. Through the
use of Content Distribution Networks (CDNs) web caches are increasingly playing an
important role in the Internet. A CDN installs many geographically distributed caches
throughout the Internet, localizing much of the traffic.

2.2.6 The Conditional GET

Caches introduce a new problem: what if the copy of an object residing in the cache is
stale? The conditional GET is used to verify that an object is up to date. An HTTP request
message is a conditional GET if:

1. The request message uses the GET method

2. The request message includes an If-modified-since: header line.

A conditional GET message is sent from the cache to the server, which sends the object
back only if it has been modified since the specified date; otherwise, it responds with a
short 304 Not Modified message and an empty entity body.
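The server-side decision reduces to comparing two header dates. A sketch (the dates are illustrative; a real server would also build the full response message):

```python
from email.utils import parsedate_to_datetime

def conditional_get_status(if_modified_since, last_modified):
    """Compare the If-modified-since request header with the object's
    Last-Modified time and pick the response status line."""
    ims = parsedate_to_datetime(if_modified_since)
    lm = parsedate_to_datetime(last_modified)
    if lm <= ims:
        return "304 Not Modified"   # cache's copy is still fresh; no body sent
    return "200 OK"                 # object changed; send it again

status = conditional_get_status("Wed, 09 Sep 2015 09:23:24 GMT",
                                "Wed, 09 Sep 2015 09:23:24 GMT")
```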

2.2.7 HTTP/2
HTTP/2 [RFC 7540], standardized in 2015, was the first new version of HTTP since
HTTP/1.1, which was standardized in 1997. Since standardization, HTTP/2 has taken off,
with over 40% of the top 10 million websites supporting HTTP/2 in 2020 [W3Techs].
Most browsers—including Google Chrome, Internet Explorer, Safari, Opera, and
Firefox—also support HTTP/2.

The primary goals for HTTP/2 are to reduce perceived latency by enabling request and
response multiplexing over a single TCP connection, provide request prioritization and
server push, and provide efficient compression of HTTP header fields. HTTP/2 does not
change HTTP methods, status codes, URLs, or header fields. Instead, HTTP/2 changes
how the data is formatted and transported between the client and server.

To motivate the need for HTTP/2, recall that HTTP/1.1 uses persistent TCP connections,
allowing a Web page to be sent from server to client over a single TCP connection. By
having only one TCP connection per Web page, the number of sockets at the server is
reduced and each transported Web page gets a fair share of the network bandwidth (as
discussed below). But developers of Web browsers quickly discovered that sending all
the objects in a Web page over a single TCP connection has a Head of Line (HOL)
blocking problem.

[Figure: HTTP/1.1 client requesting one large object (e.g., a video file) and three smaller objects over a single TCP connection, illustrating HOL blocking]

To understand HOL blocking, consider a Web page that includes an HTML base page, a
large video clip near the top of Web page, and many small objects below the video.
Further suppose there is a low-to-medium speed bottleneck link (for example, a low-
speed wireless link) on the path between server and client. Using a single TCP
connection, the video clip will take a long time to pass through the bottleneck link, while
the small objects are delayed as they wait behind the video clip; that is, the video clip at
the head of the line blocks the small objects behind it.

HTTP/1.1 browsers typically work around this problem by opening multiple parallel TCP
connections, thereby having objects in the same web page sent in parallel to the
browser. This way, the small objects can arrive at and be rendered in the browser much
faster, thereby reducing user-perceived delay.

TCP congestion control, discussed in detail in Chapter 3, also provides browsers an
unintended incentive to use multiple parallel TCP connections rather than a single
persistent connection. Very roughly speaking, TCP congestion control aims to give each
TCP connection sharing a bottleneck link an equal share of the available bandwidth of
that link; so if there are n TCP connections operating over a bottleneck link, then each
connection approximately gets 1/nth of the bandwidth. By opening multiple parallel TCP
connections to transport a single Web page, the browser can “cheat” and grab a larger
portion of the link bandwidth. Many HTTP/1.1 browsers open up to six parallel TCP
connections not only to circumvent HOL blocking but also to obtain more bandwidth.

One of the primary goals of HTTP/2 is to get rid of (or at least reduce the number of)
parallel TCP connections for transporting a single Web page. This not only reduces the
number of sockets that need to be open and maintained at servers, but also allows TCP
congestion control to operate as intended. But with only one TCP connection to
transport a Web page, HTTP/2 requires carefully designed mechanisms to avoid HOL
blocking.

HTTP/2 Framing

The HTTP/2 solution for HOL blocking is to break each message into small frames, and
interleave the request and response messages on the same TCP connection. To
understand this, consider again the example of a Web page consisting of one large video
clip and, say, 8 smaller objects. The server will thus receive 9 concurrent requests from
any browser wanting to see this Web page, and it needs to send 9 competing HTTP
response messages back to the browser.

Suppose all frames are of fixed length, the video clip consists of 1000 frames, and each
of the smaller objects consists of two frames. With frame interleaving, after sending one
frame from the video clip, the first frames of each of the small objects are sent. Then
after sending the second frame of the video clip, the last frames of each of the small
objects are sent. Thus, all of the smaller objects are sent after sending a total of 18
frames. If interleaving were not used, the smaller objects would be sent only after
sending 1016 frames. Thus the HTTP/2 framing mechanism can significantly decrease
user-perceived delay. The ability to break down an HTTP message into independent
frames, interleave them, and then reassemble them on the other end is the single most
important enhancement of HTTP/2. The framing is done by the framing sub-layer of the
HTTP/2 protocol. When a server wants to send an HTTP response, the response is
processed by the framing sub-layer, where it is broken down into frames. The header
field of the response becomes one frame, and the body of the message is broken down
into one or more additional frames. The frames of the response are then interleaved by
the framing sub-layer in the server with the frames of other responses and sent over the
single persistent TCP connection. As the frames arrive at the client, they are first
reassembled into the original response messages at the framing sub-layer and then
processed by the browser as usual. Similarly, a client’s HTTP requests are broken into
frames and interleaved.
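The frame-interleaving arithmetic can be checked with a short simulation (a sketch: real HTTP/2 frames carry stream identifiers and the scheduler need not be strictly round-robin):

```python
from collections import deque

def interleave(streams):
    """Send one frame from each non-empty response stream per round."""
    queues = [deque(s) for s in streams]
    sent = []
    while any(queues):
        for q in queues:
            if q:
                sent.append(q.popleft())
    return sent

video = [("video", i) for i in range(1000)]                    # 1000-frame response
smalls = [[(f"obj{k}", i) for i in range(2)] for k in range(8)]  # 8 x 2-frame responses

sent = interleave([video] + smalls)
# 1-based position at which the last small-object frame goes on the wire:
last_small = max(i for i, (name, _) in enumerate(sent, 1) if name != "video")
```

With interleaving the small objects finish after 18 frames, against 1016 frames if the video were sent first.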

In addition to breaking down each HTTP message into independent frames, the framing
sublayer also binary encodes the frames. Binary protocols are more efficient to parse,
lead to slightly smaller frames, and are less error-prone.

Response Message Prioritization and Server Pushing

Message prioritization allows developers to customize the relative priority of requests
to better optimize application performance. As we just learned, the framing sub-layer
organizes messages into parallel streams of data destined to the same requestor. When
a client sends concurrent requests to a server, it can prioritize the responses it is
requesting by assigning a weight between 1 and 256 to each message. The higher
number indicates higher priority. Using these weights, the server can send first the
frames for the responses with the highest priority. In addition to this, the client also
states each message’s dependency on other messages by specifying the ID of the
message on which it depends.
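A simplified view of weighted prioritization (a sketch: a real HTTP/2 server interleaves frames and honors dependencies rather than sorting whole responses):

```python
def transmission_order(responses):
    """Serve the responses with the highest client-assigned weight (1-256) first."""
    return sorted(responses, key=lambda r: r["weight"], reverse=True)

responses = [                                # hypothetical object names and weights
    {"id": "style.css", "weight": 256},      # critical for rendering
    {"id": "ad.gif", "weight": 16},
    {"id": "main.js", "weight": 220},
]
order = [r["id"] for r in transmission_order(responses)]
```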

Another feature of HTTP/2 is the ability for a server to send multiple responses for a
single client request. That is, in addition to the response to the original request, the
server can push additional objects to the client, without the client having to request
each one. This is possible since the HTML base page indicates the objects that will be
needed to fully render the Web page. So instead of waiting for the HTTP requests for
these objects, the server can analyze the HTML page, identify the objects that are
needed, and send them to the client before receiving explicit requests for these objects.
Server push eliminates the extra latency due to waiting for the requests.

2.3 Electronic Mail in the Internet


As with ordinary postal mail, e-mail is an asynchronous communication medium—
people send and read messages when it is convenient for them, without having to
coordinate with other people’s schedules. In contrast with postal mail, electronic mail is
fast, easy to distribute, and inexpensive. Modern e-mail has many powerful features,
including messages with attachments, hyperlinks, HTML-formatted text, and embedded
photos. In this section, we examine the application-layer protocols that are at the heart
of Internet e-mail. But before we jump into an in-depth discussion of these protocols,
let’s take a high-level view of the Internet mail system and its key components.

Figure 2.14 presents a high-level view of the Internet mail system. We see from this
diagram that it has three major components: user agents, mail servers, and the Simple
Mail Transfer Protocol (SMTP). We now describe each of these components in the
context of a sender, Alice, sending an e-mail message to a recipient, Bob. User agents
allow users to read, reply to, forward, save, and compose messages. Examples of user
agents for e-mail include Microsoft Outlook, Apple Mail, Web-based Gmail, the Gmail
App running in a smartphone, and so on. When Alice is finished composing her message,
her user agent sends the message to her mail server, where the message is placed in the
mail server’s outgoing message queue. When Bob wants to read a message, his user
agent retrieves the message from his mailbox in his mail server. Mail servers form the
core of the e-mail infrastructure. Each recipient, such as Bob, has a mailbox located in
one of the mail servers. Bob’s mailbox manages and maintains the messages that have
been sent to him. A typical message starts its journey in the sender’s user agent, then
travels to the sender’s mail server, and then travels to the recipient’s mail server, where
it is deposited in the recipient’s mailbox. When Bob wants to access the messages in his
mailbox, the mail server containing his mailbox authenticates Bob (with his username
and password). Alice’s mail server must also deal with failures in Bob’s mail server. If
Alice’s server cannot deliver mail to Bob’s server, Alice’s server holds the message in a
Dr. Eman Sanad , Assistant Prof. IT Department, Faculty of computers and Artificial intelligence

24
message queue and attempts to transfer the message later. Reattempts are often done
every 30 minutes or so; if there is no success after several days, the server removes the
message and notifies the sender (Alice) with an e-mail message.

SMTP

 uses TCP to reliably transfer the e-mail message from the client (the mail server
initiating the connection) to the server, on port 25

 direct transfer: the sending server (acting as a client) transfers directly to the
receiving server

 three phases of transfer: SMTP handshaking (greeting), SMTP transfer of messages,
SMTP closure

 command/response interaction (like HTTP): commands are ASCII text; responses
consist of a status code and phrase

SMTP is the principal application-layer protocol for Internet electronic mail. It uses the
reliable data transfer service of TCP to transfer mail from the sender's mail server to the
recipient's mail server. As with most application-layer protocols, SMTP has two sides: a
client side, which executes on the sender's mail server, and a server side, which
executes on the recipient's mail server. Both the client and server sides of SMTP run on
every mail server. When a mail server sends mail to other mail servers, it acts as an
SMTP client; when a mail server receives mail from other mail servers, it acts as an
SMTP server.

An example transcript between an SMTP client (C) at crepes.fr and an SMTP server (S)
at hamburger.edu:

S: 220 hamburger.edu
C: HELO crepes.fr
S: 250 Hello crepes.fr, pleased to meet you
C: MAIL FROM: <alice@crepes.fr>
S: 250 alice@crepes.fr ... Sender ok
C: RCPT TO: <bob@hamburger.edu>
S: 250 bob@hamburger.edu ... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: Do you like ketchup?
C: How about pickles?
C: .
S: 250 Message accepted for delivery
C: QUIT
S: 221 hamburger.edu closing connection
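The same dialogue can be driven from a program. A minimal sketch using Python's
standard smtplib and email modules; the addresses and server name are the illustrative
ones from the transcript above, and no mail is actually sent unless send() is called
against a real SMTP server:

```python
from email.message import EmailMessage
import smtplib

# Compose Alice's message; the addresses are the illustrative ones
# from the transcript, not real mailboxes.
msg = EmailMessage()
msg["From"] = "alice@crepes.fr"
msg["To"] = "bob@hamburger.edu"
msg["Subject"] = "Condiments"
msg.set_content("Do you like ketchup?\nHow about pickles?")

def send(message, server="hamburger.edu", port=25):
    # smtplib carries out the HELO / MAIL FROM / RCPT TO / DATA / QUIT
    # exchange shown in the transcript on our behalf.
    with smtplib.SMTP(server, port) as conn:
        conn.send_message(message)
```

Calling send(msg) would open a TCP connection to port 25 of the server and play the
client side of the transcript.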

Today, there are two common ways for Bob to retrieve his e-mail from a mail server. If
Bob is using Web-based e-mail or a smartphone app (such as Gmail), then the user
agent will use HTTP to retrieve Bob’s e-mail. This case requires Bob’s mail server to have
an HTTP interface as well as an SMTP interface (to communicate with Alice’s mail
server). The alternative method, typically used with mail clients such as Microsoft
Outlook, is to use the Internet Mail Access Protocol (IMAP) defined in RFC 3501. Both
the HTTP and IMAP approaches allow Bob to manage folders, maintained in Bob’s mail
server. Bob can move messages into the folders he creates, delete messages, mark
messages as important, and so on.

2.5 DNS - The Internet's Directory Service


One identifier for a host is its hostname (e.g., cnn.com, www.yahoo.com). Hostnames are
mnemonic and therefore used by humans. Hosts are also identified by IP addresses.

2.5.1 Services provided by DNS

Routers use IP addresses. The Internet's domain name system (DNS) translates
hostnames to IP addresses. The DNS is:

1. A distributed database implemented in a hierarchy of DNS servers

2. An application-layer protocol that allows hosts to query the distributed database.

DNS servers are often UNIX machines running the Berkeley Internet Name Domain
(BIND) software.

DNS runs over UDP and uses port 53. It is often employed by other application-layer
protocols (HTTP, FTP, ...) to translate user-supplied hostnames to IP addresses.

How it works:

 The user machine runs the client side of the DNS application

 The browser extracts the hostname www.xxxxx.xxx from the URL and passes it
to the client side of the DNS application

 The DNS client sends a query containing the hostname to a DNS server

 The DNS client eventually receives a reply including the IP address for the
hostname

 The browser can then initiate a TCP connection.

DNS thus adds an additional delay to the applications that use it.
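From the application's point of view, the whole exchange above is a single library call.
A minimal sketch in Python (the hostname in the comment is illustrative, and a real
lookup requires network access):

```python
import socket

def resolve(hostname):
    # The host's stub resolver sends the DNS query (via the configured
    # local DNS server) and returns an IP address as a string.
    return socket.gethostbyname(hostname)

# e.g., resolve("www.example.com") returns that host's IP address;
# the browser would then open a TCP connection to it.
```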

DNS provides other services in addition to translating hostnames to IP addresses:

 host aliasing: a host with a complicated hostname can have one or more alias
names; the original name is said to be the canonical hostname.

 mail server aliasing: makes e-mail servers' hostnames more mnemonic. This
also allows an e-mail server and a Web server to have the same hostname.

 load distribution: replicated servers can share one hostname. In this case, a
set of IP addresses is associated with one canonical hostname. When a client
makes a DNS query for a name mapped to a set of addresses, the server responds
with the entire set, but rotates the ordering within each reply.

2.5.2 Overview of How DNS Works


From the perspective of the invoking application in the user's host, DNS is a black box
providing a simple, straightforward translation service. Having one single global DNS
server would be simple, but it is not realistic: it would be a single point of failure, it
would have to handle an impossible traffic volume, it would be geographically too
distant from some querying clients, and its maintenance would be impractical.
A Distributed, Hierarchical Database

The DNS uses a large number of servers, organized in a hierarchical fashion and
distributed around the world.

The three classes of DNS servers:

 Root DNS servers: On the Internet there are 13 root DNS servers, most hosted in
North America; each of these is in reality a network of replicated servers, for
both security and reliability purposes (total: 247)

 Top-level domain (TLD) servers: responsible for top-level domains such as com,
org, net, edu, and gov, and for all of the country top-level domains such as uk,
fr, and jp

 Authoritative DNS servers: every organization with publicly accessible hosts
must provide publicly accessible DNS records that map the names of those hosts
to IP addresses. An organization can choose to implement its own authoritative
DNS server or to pay to have the records stored in an authoritative DNS server
of some service provider.

Finally, there are local DNS servers, which are central to the DNS architecture. They
are hosted by ISPs. When a host connects to an ISP, the ISP provides the host with the
IP addresses of one or more of its local DNS servers. A query can travel up to the root
DNS servers and back down the hierarchy.

Queries can be recursive or iterative. In a recursive query, the host sends the request to
its local DNS server, which queries a higher-tier server on the host's behalf; the chain
continues until it reaches a server that can reply, and the reply follows the inverse path
back to the host. In an iterative query, the same machine sends every request and
receives every reply directly. Any DNS server can handle recursive queries, iterative
queries, or both.

DNS Caching

DNS extensively exploits caching in order to improve delay performance and to reduce
the number of DNS messages ricocheting around the Internet. In a query chain, when a
DNS server receives a DNS reply, it can cache the mapping in its local memory.

2.5.3 DNS Records and Messages


The DNS servers that implement the DNS distributed database store resource records
(RRs), including RRs that provide hostname-to-IP-address mappings. Each DNS reply
message carries one or more resource records. A resource record is a four-tuple with
the fields (Name, Value, Type, TTL), where TTL is the time to live of the resource
record (when the record should be removed from a cache). The meaning of Name and
Value depends on Type:

 type=A: Name is a hostname; Value is its IP address.

 type=NS: Name is a domain (e.g., foo.com); Value is the hostname of an
authoritative name server for this domain.

 type=CNAME: Name is an alias for some "canonical" (the real) name; Value is the
canonical name. For example, www.ibm.com is really
servereast.backup2.ibm.com.

 type=MX: Value is the name of the mail server associated with Name.

DNS Messages

The only types of DNS messages are DNS query and reply messages. They have the
same format:

 header section (first 12 bytes): a 16-bit number identifying the query, which is
copied into the reply message so that the client can match received replies with
sent queries; a 1-bit query/reply flag (0 = query, 1 = reply); a 1-bit
authoritative flag, set in a reply when the DNS server is authoritative for the
queried name; a 1-bit recursion-desired flag, set if the client wants the server
to perform recursion when it doesn't have a record; and a 1-bit
recursion-available flag, set in the reply if the DNS server supports recursion.

 question section: information about the query: name field containing the name
being queried, type field.

 answer section: resource records for the name originally queried: Type, Value,
TTL. Multiple RRs can be returned if the server has multiple IP addresses

 authority section: records for other authoritative servers.

 additional section: other helpful records: canonical hostnames...
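The header and question sections described above can be made concrete by building a
query message by hand. A sketch of a Type A query with the recursion-desired bit set
(the 16-bit query ID is an arbitrary illustrative value):

```python
import struct

def build_dns_query(hostname, query_id=0x1234):
    # Header (12 bytes): ID; flags with only the recursion-desired bit
    # set (0x0100); one question; zero answer/authority/additional RRs.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question section: the name as length-prefixed labels ending in a
    # zero byte, then QTYPE=1 (Type A) and QCLASS=1 (Internet).
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in hostname.split(".")
    )
    question = labels + b"\x00" + struct.pack(">HH", 1, 1)
    return header + question

query = build_dns_query("example.com")
# Sending `query` over UDP to port 53 of a DNS server elicits a reply
# whose first two bytes carry the same 16-bit ID.
```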


Inserting Records into the DNS Database

Suppose we have created a new company. We register the domain name newcompany.com
at a registrar: a commercial entity that verifies the uniqueness of the domain name,
enters it into the DNS database, and collects a small fee for these services. When we
register the domain, we need to provide the registrar with the IP addresses of our
primary and secondary authoritative DNS servers; the registrar makes sure that Type NS
and Type A records are entered into the TLD com servers for our two DNS servers.

Focus on security: DNS vulnerabilities

 DDoS bandwidth-flooding attack

 MITM: a man-in-the-middle answers queries with false replies, tricking the user
into connecting to another server.

 The DNS infrastructure can be used to launch a DDoS attack against a targeted
host

To date, there hasn't been an attack that has successfully impeded the DNS service;
DNS has demonstrated itself to be surprisingly robust against attacks. However, there
have been successful reflector attacks; these can be addressed by appropriate
configuration of DNS servers.

2.6 Peer-to-Peer Applications
2.6.1 File Distribution

In P2P file distribution, each peer can redistribute any portion of the file it has received
to other peers, thereby assisting the server in the distribution process. As of 2012, the
most popular P2P file distribution protocol was BitTorrent, developed by Bram Cohen.

Scalability of P2P architectures

Denote the upload rate of the server's access link by u_s, the upload rate of the ith
peer's access link by u_i, the download rate of the ith peer's access link by d_i, and the
size of the file to be distributed (in bits) by F. We compare the client-server and P2P
distribution times.

Client-Server

The server must transmit one copy of the file to each of the N peers, so it transmits NF
bits; this takes at least NF/u_s seconds. Denote d_min = min{d_i}; the peer with the
slowest download rate cannot obtain all F bits in less than F/d_min seconds. Therefore,
the time D_cs to distribute F to N clients using the client-server approach satisfies:

D_cs >= max{ NF/u_s , F/d_min }

which increases linearly in N.

P2P

When a peer receives some file data, it can use its own upload capacity to redistribute
the data to other peers. At the beginning of the distribution only the server has the
file; it must send every bit of the file at least once, which takes at least F/u_s seconds.

The peer with the lowest download rate cannot obtain all F bits of the file in less than
F/d_min seconds.

The total upload capacity of the system equals the sum of the upload rates of the server
and of all the peers, u_s + u_1 + ... + u_N. The system must deliver F bits to each of
the N peers, a total of NF bits, which cannot be done faster than
NF/(u_s + u_1 + ... + u_N) seconds. Therefore:

D_p2p >= max{ F/u_s , F/d_min , NF/(u_s + u_1 + ... + u_N) }
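These two lower bounds can be computed directly. A sketch as two functions (units are
arbitrary but must be consistent, e.g., bits and bits/second; the sample values in the
tests are illustrative, not from the text):

```python
def d_cs(F, N, u_s, d_min):
    # Client-server lower bound: the server must push N copies (NF/u_s)
    # and the slowest client still needs F/d_min seconds.
    return max(N * F / u_s, F / d_min)

def d_p2p(F, N, u_s, d_min, peer_uploads):
    # P2P lower bound: the server sends the file once, the slowest peer
    # downloads it, and NF bits must cross the total upload capacity.
    return max(F / u_s, F / d_min, N * F / (u_s + sum(peer_uploads)))
```

Note that d_cs grows linearly in N while, for fixed per-peer upload rates, d_p2p stays
bounded: each new peer adds demand but also adds upload capacity.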

BitTorrent

In BitTorrent the collection of all peers participating in the distribution of a particular file
is called a torrent. Peers in a torrent download equal-size chunks of the file from one
another with a typical chunk size of 256 KBytes. At the beginning a peer has no chunks,
it accumulates more and more chunks over time. While it downloads chunks it also
uploads chunks to other peers. Once a peer has acquired the entire file, it may leave the
torrent or remain in it and continue to upload chunks to other peers (becoming a
seeder). Any peer can leave the torrent at any time and later rejoin it at any time as well.

Each torrent has an infrastructure node called a tracker: when a peer joins a torrent, it
registers itself with the tracker and periodically informs it that it is still in the torrent.
The tracker keeps track of the peers participating in the torrent. A torrent can have up
to thousands of peers participating at any instant of time.

When a user joins the torrent, the tracker randomly selects a subset of peers from the
set of participating peers. The user establishes concurrent TCP connections with all of
these peers, called neighboring peers. The neighboring peers can change over time. The
user asks each of her neighboring peers for the list of chunks they have (one list per
neighbor).

The user starts downloading the chunks that have the fewest copies among her
neighbors (the rarest-first technique). In this manner, the rarest chunks get more
quickly redistributed, roughly equalizing the number of copies of each chunk in the
torrent.
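The rarest-first choice amounts to a few lines of code. A sketch (chunk IDs and
neighbor lists are illustrative):

```python
from collections import Counter

def rarest_first(neighbor_chunk_lists, have):
    # Count how many neighbors hold each chunk, then request the chunk
    # we lack that has the fewest copies among our neighbors.
    counts = Counter(c for lst in neighbor_chunk_lists for c in lst)
    missing = [c for c in counts if c not in have]
    return min(missing, key=lambda c: counts[c]) if missing else None

# Three neighbors: chunk 2 has three copies, chunks 1 and 3 one each;
# since we already have chunk 3, chunk 1 is requested next.
next_chunk = rarest_first([[1, 2], [2, 3], [2]], have={3})
```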

Every 10 seconds, the user measures the rate at which she receives bits and determines
the four peers that are sending to her at the highest rate. She then reciprocates by
sending chunks to these same four peers, which are said to be unchoked. Every 30
seconds, she also chooses one additional neighbor at random and sends it chunks; such a
peer is said to be optimistically unchoked.

2.6.2 Distributed Hash Tables (DHTs)


How can we implement a simple database in a P2P network? In a P2P system, each peer
holds only a small subset of the totality of the (key, value) pairs. Any peer can query
the distributed database with a particular key; the database will locate the peers that
have the corresponding pair and return the pair to the querying peer. Any peer can also
insert a new pair into the database. Such a distributed database is referred to as a
distributed hash table (DHT). In a P2P file-sharing application, a DHT can be used to
map each chunk to the IP addresses of the peers in possession of it.

Peer Churn
In a P2P system, a peer can come or go without warning. To keep the DHT overlay in
place in the presence of such peer churn, we require each peer to keep track of (know
the IP address of) its predecessor and successor, and to periodically verify that its two
successors are alive. If a peer abruptly leaves, its successor and predecessor need to
update their information: the predecessor replaces its first successor with its second
successor and asks it for the identifier and IP address of its own immediate successor.

What if a peer joins? If the newcomer knows only one peer, it asks that peer to
determine what its predecessor and successor will be. The message travels through the
DHT until it reaches the would-be predecessor, which sends the newcomer its own
predecessor and successor information. The newcomer can then join the DHT by making
its predecessor's old successor its own successor and by notifying its predecessor to
change its successor information.
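The core of a DHT is the rule assigning each key to a peer on the circular identifier
space. A minimal sketch (the 8-bit identifier space and integer peer IDs are
illustrative simplifications; real DHTs such as Chord use far larger spaces):

```python
import hashlib

def peer_for_key(key, peer_ids):
    # Hash the key into the same circular 8-bit identifier space as the
    # peers, then assign it to the first peer at or clockwise after it.
    h = hashlib.sha1(key.encode()).digest()[0]
    ring = sorted(peer_ids)
    for p in ring:
        if p >= h:
            return p
    return ring[0]  # wrap around the circle
```

Because only the keys between a departed peer and its predecessor move when churn
occurs, each join or leave disturbs a small fraction of the (key, value) pairs.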

2.6 Video Streaming and Content Distribution Networks
By many estimates, streaming video—including Netflix, YouTube and Amazon Prime—
accounted for about 80% of Internet traffic in 2020 [Cisco 2020]. In this section, we
provide an overview of how popular video streaming services are implemented in
today's Internet. We will see that they are implemented using application-level
protocols and servers that function in some ways like a cache.

2.6.1 Internet Video

In streaming stored video applications, the underlying medium is prerecorded video,
such as a movie, a television show, a prerecorded sporting event, or a prerecorded
user-generated video (such as those commonly seen on YouTube). These prerecorded
videos are placed on servers, and users send requests to the servers to view the videos
on demand. Many Internet companies today provide streaming video, including Netflix,
YouTube (Google), Amazon, and TikTok. But before launching into a discussion of video
streaming, we should first get a quick feel for the video medium itself. A video is a
sequence of images, typically displayed at a constant rate, for example, at 24 or
30 images per second.

An uncompressed, digitally encoded image consists of an array of pixels, with each pixel
encoded into a number of bits to represent luminance and color. An important
characteristic of video is that it can be compressed, thereby trading off video quality
with bit rate. Today’s off-the-shelf compression algorithms can compress a video to
essentially any bit rate desired. Of course, the higher the bit rate, the better the image
quality and the better the overall user viewing experience.

From a networking perspective, perhaps the most salient characteristic of video is its
high bit rate. Compressed Internet video typically ranges from 100 kbps for low-quality
video to over 4 Mbps for streaming high-definition movies; 4K streaming envisions a
bit rate of more than 10 Mbps. This can translate to a huge amount of traffic and
storage, particularly for high-end video.

2.6.2 HTTP Streaming and DASH

2.6.3 Content Distribution Networks
challenge: how to stream content (selected from millions of videos) to hundreds of
thousands of simultaneous users?

 option 1: single, large “mega-server”

• single point of failure

• point of network congestion

• long path to distant clients

• multiple copies of video sent over outgoing link

Quite simply: this solution doesn't scale.

 option 2: store/serve multiple copies of videos at multiple geographically


distributed sites (CDN)

• enter deep: push CDN servers deep into many access networks

• close to users

• used by Akamai, 1700 locations

• bring home: a smaller number (tens) of larger clusters in POPs near (but not
within) access networks

• used by Limelight

Case study: Netflix

2.7 Socket Programming: Creating Network Applications


socket: door between application process and end-to-end transport protocol

Two socket types for two transport services:

• UDP: unreliable datagram

• TCP: reliable, byte stream-oriented

Socket programming with UDP


UDP: no “connection” between client and server:

 no handshaking before sending data

 sender explicitly attaches IP destination address and port # to each packet

 receiver extracts sender IP address and port# from received packet

UDP: transmitted data may be lost or received out-of-order

Application viewpoint:

 UDP provides unreliable transfer of groups of bytes (“datagrams”) between


client and server processes

Client/server socket interaction: UDP
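The interaction can be sketched in a single process over the loopback interface; a real
client and server would be separate programs, as in the textbook's UDPClient/UDPServer
examples, and port 0 here simply lets the OS pick a free port for the illustration:

```python
from socket import socket, AF_INET, SOCK_DGRAM

def udp_round_trip():
    # Server side: create a UDP socket and bind it to a local address.
    server = socket(AF_INET, SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    addr = server.getsockname()

    # Client side: no handshake; the destination address and port are
    # explicitly attached to each datagram by sendto().
    client = socket(AF_INET, SOCK_DGRAM)
    client.sendto(b"hello", addr)

    # Server extracts the sender's address and port from the packet
    # and uses them to address the reply datagram.
    data, client_addr = server.recvfrom(2048)
    server.sendto(data.upper(), client_addr)

    reply, _ = client.recvfrom(2048)
    client.close()
    server.close()
    return reply
```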

Socket programming with TCP


Client must contact server

 server process must first be running

 server must have created socket (door) that welcomes client’s contact
Client contacts server by:

 Creating TCP socket, specifying IP address, port number of server process

 when client creates socket: client TCP establishes connection to server TCP

 when contacted by client, server TCP creates new socket for server process to
communicate with that particular client

 allows server to talk with multiple clients; source port numbers are used to
distinguish clients

Client/server socket interaction: TCP
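The TCP interaction can likewise be sketched in one process over loopback, with the
server side run in a thread (port 0 lets the OS choose a free port; a real deployment
uses separate client and server programs, as in the textbook's TCPClient/TCPServer):

```python
import threading
from socket import socket, AF_INET, SOCK_STREAM

def tcp_round_trip():
    # Welcoming socket: the "door" the server creates before any
    # client makes contact.
    welcome = socket(AF_INET, SOCK_STREAM)
    welcome.bind(("127.0.0.1", 0))
    welcome.listen(1)
    addr = welcome.getsockname()

    def serve():
        # accept() returns a NEW connection socket dedicated to this
        # particular client; the welcoming socket keeps listening.
        conn, _ = welcome.accept()
        conn.sendall(conn.recv(1024).upper())
        conn.close()

    t = threading.Thread(target=serve)
    t.start()

    # Client: connect() triggers TCP's handshake with the server.
    client = socket(AF_INET, SOCK_STREAM)
    client.connect(addr)
    client.sendall(b"hello")
    reply = client.recv(1024)
    client.close()
    t.join()
    welcome.close()
    return reply
```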

