Internet and Intranet
Internet
1.1 What is the Internet?
1.1.1 A nuts-and-bolts description
The Internet is a computer network that interconnects hundreds of millions of computing devices
throughout the world. Today many devices beyond traditional computers and workstations are being
connected to the network, so the term computer network may sound a bit dated.
All the devices connected to the Internet are called hosts or end systems. End systems are
connected together by a network of communication links and packet switches.
Different links can transmit data at different rates, with the transmission rate of a link measured in
bits/second.
When one end system has data to send to another end system, the sending end system segments the
data and adds header bytes to each segment. The resulting packages of information, called packets,
are then sent through the network to the destination end system, where they are reassembled into the
original data.
A packet switch takes a packet arriving on one of its incoming communication links and forwards
that packet on one of its outgoing communication links. The two most prominent types of packet
switches are routers and link-layer switches. The sequence of communication links and packet
switches traversed by a packet from the sending end system to the receiving end system is known as
a route or path.
End systems access the Internet through Internet Service Providers (ISPs), including residential
ISPs (cable or phone companies), corporate ISPs, university ISPs ... Each ISP is in itself a network
of packet switches and communication links. Lower-tier ISPs (which interconnect end systems) are
interconnected through national and international upper-tier ISPs. An upper-tier ISP consists of
high-speed routers interconnected with high-speed fiber-optic links. Each ISP network is managed
independently.
End systems, packet switches and other pieces of the Internet run protocols that control the sending
and receiving of information within the Internet.
Types of Delay
Processing Delay
The processing delay consists of the time required to examine the packet's header and determine
where to direct the packet. It may also include other factors, such as the time needed to check for
bit-level errors that occurred during transmission. Processing delays are typically on the order of
microseconds or less. After processing, the packet is sent to the queue preceding the link to the
next router.
Queuing Delay
At the queue, the packet experiences a queuing delay as it waits to be transmitted onto the link.
This delay depends on the number of earlier-arriving packets; if the queue is empty, the packet's
queuing delay is 0. It is typically on the order of microseconds to milliseconds.
Transmission Delay
If the length of the packet is L bits, and the transmission rate of the link is R bits/sec, then the
transmission delay is L/R. This is the amount of time required to push (transmit) all of the packet's
bits into the link. Typically on the order of microseconds to milliseconds.
Propagation Delay
The time required to propagate a bit from the beginning of the link to the next router is the
propagation delay. The bit propagates at the propagation speed of the link, which depends on the
physical medium of the link. The propagation delay is the distance between two routers divided by
the propagation speed of the link.
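As a minimal numeric sketch of the two formulas above (the packet size, link rate, distance, and propagation speed below are made-up figures, not from the text):

```python
# Hypothetical numbers: a 1,000-byte packet on a 1 Mbps link, with
# routers 100 km apart and a propagation speed of 2*10^8 m/s.
L = 1000 * 8      # packet length in bits
R = 1e6           # transmission rate of the link in bits/sec
d = 100e3         # distance between the two routers in meters
s = 2e8           # propagation speed of the link in m/s

d_trans = L / R   # time to push all of the packet's bits onto the link
d_prop = d / s    # time for one bit to travel from one router to the next

print(d_trans)    # 0.008 s
print(d_prop)     # 0.0005 s
```

Note that the transmission delay depends on the packet length and link rate but not on the distance, while the propagation delay depends only on the distance and the medium.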
Packet Loss
A queue preceding a link has finite capacity. If a packet finds a full queue, the router drops it and
the packet is lost. The fraction of lost packets increases as the traffic intensity increases.
Protocol Layering
To provide structure to the design of network protocols, the network designers organize protocols in
layers. Each protocol belongs to one of the layers. We are interested in the services that a layer
offers to the layer above, the so-called service model of a layer. When taken together, the protocols of the
various layers are called the protocol stack. The Internet protocol stack consists of five layers:
• Application
• Transport
• Network
• Link
• Physical
Application Layer
This is where network applications and their application-layer protocols reside. The Internet's
application layer includes many protocols: HTTP, SMTP, FTP, DNS. An application-layer protocol is
distributed over multiple end systems, with the application in one end system using the protocol to
exchange packets of information with the application in another end system. This packet of
information at the application layer is called a message.
Transport Layer
It transports application-layer messages between application endpoints. In the Internet there are two
transport protocols: TCP and UDP. TCP provides a connection-oriented service to its applications:
the service includes guaranteed delivery of application-layer messages to the destination and flow
control. TCP also breaks long messages into shorter segments and provides a congestion-control
mechanism, so that a source throttles its transmission rate when the network is congested.
HTTP and SMTP use TCP.
UDP provides a connectionless service to its applications: it is a no-frills service that provides no
guarantees, no reliability, no flow control and no congestion control. A transport-layer packet is
called a segment. Skype uses UDP (speed required).
Network Layer
It is responsible for moving network-layer packets known as datagrams from one host to another.
The Internet's network layer includes the IP Protocol. There is only one IP Protocol and all the
Internet components that have a network layer must run it. The Internet's network layer also
contains routing protocols that determine the routes that datagrams take between sources and
destinations. The Internet has many routing protocols. The network layer is often simply referred to
as the IP protocol, forgetting that it includes routing protocols too.
Link Layer
To move a packet from one node to the next, the network layer relies on the services of the link
layer. The services provided by the link layer depend on the specific link-layer protocol that is
employed over the link. Examples are Ethernet and WiFi. We will refer to link-layer packets as
frames.
Physical Layer
The job of the physical layer is to move the individual bits within the frame from one node to the
next. The protocols are link dependent and further depend on the actual transmission medium of the
link.
1.5.2 Encapsulation
Routers and link-layer switches are both packet switches, but they do not implement all of the
layers in the protocol stack: link-layer switches implement the Physical and Link layers, while
routers add the Network layer too.
From the application layer, the message passes to the transport layer, which appends additional
information to it (the header) that will be used by the receiver-side transport layer, forming a
segment. The network layer then adds its own header, forming a datagram, and passes it to the link
layer, which adds its own link-layer header, forming a frame. Thus, at each layer a packet has two
types of fields: header fields and a payload field, the payload typically being the packet from the
layer above.
The process of encapsulation can be more complex: for example a large message may be divided
into multiple transport-layer segments, which will be divided into multiple datagrams....
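The nesting of headers can be sketched as follows (the header strings are invented placeholders, not real header formats):

```python
# Each layer wraps the payload from the layer above with its own header.
message = b"GET /index.html"        # application-layer message
segment = b"TCP-HDR|" + message     # transport layer adds its header
datagram = b"IP-HDR|" + segment     # network layer adds its header
frame = b"ETH-HDR|" + datagram      # link layer adds its header
print(frame)
```

On the receiving side, each layer strips its own header and passes the remaining payload up the stack.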
DoS
Denial-of-Service attacks render a network, host, or other piece of infrastructure unusable by
legitimate users. Most of them fall into one of three categories:
• Vulnerability Attack: a few well-crafted messages are sent to a vulnerable application or
operating system running on the targeted host. The service might stop or the host might
crash.
• Bandwidth flooding: a deluge of packets is sent to the targeted host, so many packets that the
target's access link becomes clogged preventing legitimate packets from reaching the server
• Connection flooding: a large number of half-open or fully open TCP connections are
established at the targeted host, which can become so bogged down that it stops accepting
legitimate connections.
In a distributed DoS (DDoS) attack the attacker controls multiple sources and has each source blast
traffic at the target.
Sniffing
A passive receiver can record a copy of every packet that passes through the network. It is then
called a packet sniffer. Because packet sniffers are passive (they do not inject packets into the
channel), they are difficult to detect. Some of the best defenses against packet sniffing involve
cryptography.
Spoofing
The ability to inject packets into the Internet with a false source address is known as IP Spoofing
and is but one of many ways in which one user can masquerade as another user. To solve this
problem we will need end-point authentication.
The history of the Internet shaped its structure
The Internet was originally designed to be based on the model of a group of mutually trusting users
attached to a transparent network, a model in which there is no need for security. Many aspects of
the original Internet architecture deeply reflect this notion of mutual trust; for example, the
ability for one user to send a packet to any other user is the default rather than a
requested/granted capability. However, today's Internet certainly does not involve "mutually
trusting users": communication among mutually trusting users is the exception rather than the rule.
These labels stand even for P2P applications in the context of a communication session.
Addressing Processes
In order for a process running on one host to send packets to a process running on another host, the
receiving process needs to have an address. To identify the receiving processes, two pieces of
information need to be specified:
1. The address of the host. In the Internet, the host is identified by its IP Address, a 32-bit
quantity (128 bits in IPv6) that identifies the host uniquely.
2. An identifier that specifies the receiving process in the destination host: the destination port
number. Popular applications have been assigned specific port numbers (web server -> 80)
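A small sketch of the two pieces of information using Python's standard socket module (the IP address below is an arbitrary example, not from the text):

```python
import socket

# A process is addressed by (IP address, port number). IPv4 addresses
# are 32-bit quantities, usually written in dotted-decimal notation.
packed = socket.inet_aton("93.184.216.34")  # pack into raw bytes
print(len(packed) * 8)                      # 32 bits

# The well-known port 80 identifies the web-server process on that host;
# together the pair forms the destination of the packets.
server_address = ("93.184.216.34", 80)
```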
Throughput
A transport-layer protocol could provide guaranteed available throughput at some specific rate.
Applications that have throughput requirements are said to be bandwidth-sensitive applications.
Timing
A transport-layer protocol can also provide timing guarantees. Example: guarantees that every bit
the sender pumps into the socket arrives at the receiver's socket no more than 100 msec later,
interesting for real-time applications such as telephony, virtual environments...
Security
A transport-layer protocol can provide an application with one or more security services. It could
encrypt all data transmitted by the sending process and decrypt it in the receiving host.
TCP Services
TCP includes a connection-oriented service and a reliable data transfer service:
• Connection-oriented service: client and server exchange transport-layer control
information before the application-level messages begin to flow. This so-called handshaking
procedure alerts the client and server, allowing them to prepare for an onslaught of packets.
Then a TCP connection is said to exist between the sockets of the two processes. When the
application finishes sending messages, it must tear down the connection.
SECURING TCP
Neither TCP nor UDP provides encryption. Therefore the Internet community has developed an
enhancement for TCP called Secure Sockets Layer (SSL), which not only does everything that
traditional TCP does but also provides critical process-to-process security services including
encryption, data integrity and end-point authentication. It is not a third protocol, but an
enhancement of TCP, the enhancement being implemented in the application layer in both the
client and the server side of the application (highly optimized libraries exist). SSL has its own
socket API, similar to the traditional one. The sending process passes cleartext data to the SSL
socket, which encrypts it.
• Reliable data transfer service The communicating processes can rely on TCP to deliver all
data sent without error and in the proper order.
TCP also includes a congestion-control mechanism, a service for the general welfare of the
Internet rather than for the direct benefit of the communicating processes. It throttles a sending
process when the network is congested between sender and receiver.
UDP Services
UDP is a no-frills, lightweight transport protocol, providing minimal services. It is connectionless,
there's no handshaking. The data transfer is unreliable: there are no guarantees that the message
sent will ever reach the receiving process. Furthermore, messages may arrive out of order. UDP does
not provide a congestion-control mechanism either.
The majority of HTTP requests use the GET method, used to request an object.
The entity body (empty with GET) is used by the POST method, for example for filling out forms.
The user is still requesting a Web page but the specific contents of the page depend on what the user
entered into the form fields. When POST is used, the entity body contains what the user entered into
the form fields. Requests can also be made with GET including the inputted data in the requested
URL. The HEAD method is similar to GET: when a server receives it, it responds with an HTTP
message but leaves out the requested object. It is often used for debugging. PUT is often used in
conjunction with web publishing tools, to allow users to upload an object to a specific path on the
web servers. Finally, DELETE allows a user or application to delete an object on a web server.
A conditional GET message is sent from the cache to the server, which responds with the object only
if it has been modified.
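A conditional GET exchange might look like the following sketch (the host, object, and dates are invented for illustration); the server answers 304 Not Modified with an empty entity body when the cached copy is still fresh:

```http
GET /fruit/kiwi.gif HTTP/1.1
Host: www.example.edu
If-Modified-Since: Wed, 9 Sep 2015 09:23:24

HTTP/1.1 304 Not Modified
Date: Sat, 10 Oct 2015 15:39:29
Server: Apache/1.3.0 (Unix)

(empty entity body)
```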
DNS Caching
DNS extensively exploits DNS caching in order to improve the delay performance and to reduce the
number of DNS messages ricocheting around the Internet. In a query chain, when a DNS server
receives a DNS reply it can cache the mapping in its local memory.
To date, there hasn't been an attack that has successfully impeded the DNS service; DNS has
demonstrated itself to be surprisingly robust against attacks. However there have been successful
reflector attacks, these can be addressed by appropriate configuration of DNS servers.
P2P
When a peer receives some file data, it can use its own upload capacity to redistribute the data to
other peers.
• At the beginning of the distribution only the server has the file. It must send all the bits at
least once: $D \geq F/u_s$.
• The peer with the lowest download rate cannot obtain all F bits of the file in less than
$F/d_{min}$ seconds.
• The total upload capacity of the system is equal to the sum of the upload rates of the server and
of all the peers. The system must deliver F bits to each of the N peers, thus delivering a total of
NF bits, which can't be done faster than $NF/(u_s + \sum_{i=1}^N u_i)$.
We obtain: $$ D_{P2P} = \max \left\{ \frac{F}{u_s} , \frac{F}{d_{min}} , \frac{NF}{u_s + \sum_{i=1}^N u_i} \right\} $$
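The bound can be evaluated directly; the numbers below are made up for illustration:

```python
# Lower bound on P2P distribution time D_P2P, with invented rates.
F = 100.0        # file size in bits
u_s = 10.0       # server upload rate (bits/sec)
d = [5.0, 8.0]   # peer download rates
u = [5.0, 5.0]   # peer upload rates
N = len(d)

D_p2p = max(F / u_s,                    # server must send the file once
            F / min(d),                 # slowest peer must download F bits
            N * F / (u_s + sum(u)))     # total upload capacity bound
print(D_p2p)     # 20.0: here the slowest peer's download rate dominates
```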
BitTorrent
In BitTorrent the collection of all peers participating in the distribution of a particular file is called a
torrent. Peers in a torrent download equal-size chunks of the file from one another with a typical
chunk size of 256 KBytes. At the beginning a peer has no chunks, it accumulates more and more
chunks over time. While it downloads chunks it also uploads chunks to other peers. Once a peer has
acquired the entire file it may leave the torrent or remain in it and continue to upload chunks to
other peers (becoming a seeder). Any peer can leave the torrent at any time and later rejoin it.
Each torrent has an infrastructure node called a tracker: when a peer joins a torrent, it registers
itself with the tracker and periodically informs it that it is still in the torrent. The tracker
keeps track of the
peers participating in the torrent. A torrent can have up to thousands of peers participating at any
instant of time.
When a user joins the torrent, the tracker randomly selects a subset of peers from the set of
participating peers. The user establishes concurrent TCP connections with all of these peers, which
are called neighboring peers. The neighboring peers can change over time. The user asks each of its
neighboring peers for the list of chunks they have (one list per neighbor), then starts downloading
the chunks that have the fewest copies among the neighbors (rarest first technique). In this manner
the rarest chunks get more quickly redistributed, roughly equalizing the number of copies of each
chunk in the torrent.
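The rarest-first selection can be sketched as follows (the function name and chunk identifiers are invented for the example):

```python
from collections import Counter

def rarest_first(my_chunks, neighbor_lists):
    """Pick, among chunks we still miss, the one with the fewest copies
    across the neighbors' chunk lists (rarest first)."""
    counts = Counter(c for chunks in neighbor_lists for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    if not missing:
        return None                     # nothing new to download
    return min(missing, key=lambda c: counts[c])

my_chunks = {0, 1}
neighbors = [{1, 2, 3}, {2, 3}, {3}]
# Chunk 2 is held by two neighbors, chunk 3 by three, so 2 is rarest.
print(rarest_first(my_chunks, neighbors))   # 2
```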
Every 10 seconds the user measures the rate at which she receives bits and determines the four
peers that are sending to her at the highest rate. It then reciprocates by sending chunks to these same
four peers. The four peers are called unchocked. Every 30 seconds it also choses one additional
neighbor at sends it chunks. These peers are called optmistically unchocked.
2.6.2 Distributed Hash Tables (DHTs)
How to implement a simple database in a P2P network? In the P2P system each peer will only hold
a small subset of the totality of the (key, value) pairs. Any peer can query the distributed database
with a particular key; the database will locate the peers that have the corresponding pair and
return the pair to the querying peer. Any peer can also insert a new pair in the database. Such a
distributed
database is referred to as a distributed hash table (DHT). In a P2P file sharing application a DHT
can be used to store the chunk identifiers associated with the IP addresses of the peers that
possess them.
An approach: let's assign an identifier to each peer, where the identifier is an integer in the range
$[0, 2^n -1]$ for some fixed $n$. Such an identifier can be expressed by an $n$-bit representation. A
hash function is used to transform non-integer values into integer values. We suppose that this
function is available to all peers. How to assign keys to peers? We assign each (key, value) pair to
the peer whose identifier is closest to the key, defined as the closest successor of the key. To
avoid having each peer keep track of all other peers (a scalability issue) we use a
Circular DHT
If we organize peers into a circle, each peer only keeps track of its immediate successor and
predecessor (modulo $2^n$). This circular arrangement of peers is a special case of an overlay
network: the peers form an abstract logical network which resides above the "underlay" computer
network, the overlay links are not physical but virtual liaisons between pairs of peers. A single
overlay link typically uses many physical links and physical routers in the underlying network.
In the circle a peer asks "who is responsible for key k?" and sends the message clockwise around
the circle. Whenever a peer receives such a message, since it knows the identifiers of its
predecessor and successor, it can determine whether it is responsible for (i.e., closest to) the key
in question. If not, it passes the message to its successor. When the message reaches the peer
responsible for the key, that peer can send a message back to the querying peer indicating that it
is responsible for that key. Using this
system N/2 messages are sent on average (N = number of peers). Designing a DHT there is a
tradeoff between the number of neighbors for each peer and the number of DHT messages needed
to resolve a single query. (1 message if each peer keeps track of all other peers, N/2 messages if
each knows only 2 neighbors). To improve our circular DHT we could add shortcuts so that each
peer not only keeps track of its immediate successor and predecessor but also of relatively small
number of shortcut peers scattered about the circle. How many shortcut neighbors? Studies show
that DHTs can be designed so that both the number of neighbors per peer and the number of
messages per query are O(log N) (N being the number of peers).
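A toy circular DHT lookup, assuming the invented identifier ring below, shows the clockwise walk and why it costs O(N) messages without shortcuts:

```python
def responsible(peers, key):
    """The peer responsible for a key is its closest successor on the ring."""
    peers = sorted(peers)
    for p in peers:
        if p >= key:
            return p
    return peers[0]                     # wrap around the circle

def lookup_hops(peers, key, start):
    """Number of messages sent when the query walks clockwise from start."""
    peers = sorted(peers)
    target = responsible(peers, key)
    i, hops = peers.index(start), 0
    while peers[i] != target:
        i = (i + 1) % len(peers)        # forward to the immediate successor
        hops += 1
    return hops

peers = [1, 3, 4, 5, 8, 10, 12, 15]     # invented peer identifiers, n = 4
print(responsible(peers, 11))           # 12 (closest successor of key 11)
print(lookup_hops(peers, 11, 1))        # 6 messages: 1->3->4->5->8->10->12
```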
Peer Churn
In a P2P system, a peer can come or go without warning. To keep the DHT overlay in place in the
presence of such peer churn, we require each peer to keep track of (know the IP address of) its
predecessor and successor, and to periodically verify that its two successors are alive. If a peer
abruptly leaves, its successor and predecessor need to update their information. The predecessor
replaces its first successor with its second successor and asks it for the identifier and IP address
of its immediate successor.
What if a peer joins? If it only knows one peer in the DHT, it asks that peer who will be the
newcomer's predecessor and successor. The message travels around the circle until it reaches the
predecessor, which sends the newly arrived peer its predecessor and successor information. The
newcomer then joins the DHT by taking its predecessor's former successor as its own successor and
by notifying its predecessor to update its successor information.
The server may support many simultaneous TCP connection sockets, with each socket attached to a
process and each socket identified by its own four-tuple. When a TCP segment arrives at the host,
all four fields are used to demultiplex the segment to the appropriate socket.
Port Scanning
It can be used both by attackers and by system administrators to find vulnerabilities in the target
or to learn which network applications are running in the network. The most widely used port
scanner is nmap, free and open source. For TCP, it scans ports looking for ports accepting
connections; for UDP, it looks for UDP ports that respond to transmitted UDP segments. It then
returns a list of open, closed or unreachable ports. A host running nmap can attempt to scan any
target anywhere in the Internet.
Two basic approaches toward pipelined error recovery can be identified: Go-Back-N and Selective
Repeat.
• 32-bit sequence number and acknowledgement number fields, necessary for reliable data
transmission
• 16-bit receive window field, used for flow control; it indicates the number of bytes that a
receiver is willing to accept
• 4-bit header length field. The TCP header can be of variable length due to the TCP options
field (usually empty, so the usual header length is 20 bytes)
• options field: used to negotiate the MSS or as a window scaling factor for use in high-speed
networks.
• flag field: 6 bits:
1. ACK: indicates that the value carried in the acknowledgement field is valid, that is,
the segment contains an acknowledgement for a segment that has been successfully
received.
2.-4. RST, SYN, FIN: used for connection setup and teardown.
5. PSH: indicates that the receiver should pass the data to the upper layer immediately.
6. URG: indicates that there is data in the segment that the sending-side upper layer has
marked as urgent.
Sequence Numbers and Acknowledgment Numbers
TCP views data as an unstructured, but ordered, stream of bytes and TCP's use of sequence
numbers reflects this view: sequence numbers are over the stream of bytes and not over the series of
transmitted segments. The sequence number for a segment is the byte-stream number of the first
byte in the segment. Example: a 500,000-byte file with MSS = 1,000 bytes gives 500 segments. The
first is numbered 0, the second 1000, the third 2000, and so on.
The acknowledgement number that Host A puts in its segment is the sequence number of the next
byte Host A is expecting from Host B. TCP is said to provide cumulative acknowledgements: if
the sender receives ACK 536, it knows that all the bytes from 0 to 535 have been received correctly.
What does a host do when it receives out-of-order segments? The receiver buffers the out-of-order
bytes and waits for the missing bytes to fill in the gaps. Usually both sides of a TCP connection
choose an initial sequence number randomly, both for security and to minimize the possibility that
a segment still present in the network from an earlier, already terminated connection between two
hosts is mistaken for a valid segment in a later connection between the same two hosts.
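The sequence-numbering example from the text can be checked in a few lines:

```python
# A 500,000-byte file with an MSS of 1,000 bytes yields 500 segments,
# each numbered by the byte-stream number of its first byte.
file_size = 500_000
MSS = 1_000

seq_numbers = list(range(0, file_size, MSS))
print(len(seq_numbers))   # 500 segments
print(seq_numbers[:3])    # [0, 1000, 2000]
```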
Fast Retransmit
The problem with timeout-triggered retransmission is that the timeout period can be relatively long.
The sender can however often detect packet loss before the timeout event occurs by noting
duplicate ACKs. A duplicate ACK is an ACK that reacknowledges a segment for which the sender
has already received an earlier acknowledgement. When the TCP sender receives three duplicate
ACKs for the same data, it takes this as an indication that the segment following the segment that
has been ACKed three times has been lost. In that case, the TCP sender performs a fast retransmit:
it retransmits the missing segment before that segment's timer expires.
B side
1. B allocates a receive buffer to its connection, its size being `RcvBuffer`
2. B also keeps the variables: `LastByteRead` (number of last byte in the data
stream read by the application process) and `LastByteRcvd` (the number of the
last byte arrived from the network and placed in the receive buffer)
The receive window, i.e. the amount of spare room in the buffer:
`rwnd = RcvBuffer - [LastByteRcvd - LastByteRead]`. rwnd is dynamic.
A side
A keeps track of two variables:
1. `LastByteSent`
2. `LastByteAcked`
Throughout the connection's life, A must make sure that `LastByteSent - LastByteAcked <= rwnd`.
If B's buffer becomes full, it sends rwnd = 0. If B has nothing to send to A, then when the
application process empties B's buffer, TCP does not send a new segment with the new value of rwnd
to A (TCP sends to A only if it needs to send data or an ACK). Therefore A is never informed that
B's buffer has some free space: it is blocked and can transmit no more data. To solve this problem,
TCP requires A to continue to send segments with one data byte when B's receive window is 0; these
segments will be acknowledged by B. Eventually the buffer will begin to empty and the
acknowledgements will contain a non-zero rwnd value.
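Using the variables above (with invented byte counts), the receive-window computation looks like:

```python
# B's side: spare room in the receive buffer, advertised to A as rwnd.
RcvBuffer = 10_000      # size of B's receive buffer (bytes)
LastByteRcvd = 6_000    # last byte arrived from the network
LastByteRead = 2_000    # last byte read by the application process

rwnd = RcvBuffer - (LastByteRcvd - LastByteRead)
print(rwnd)             # 6000 bytes of spare room

# A's side: unacknowledged data in flight must not exceed rwnd, so the
# highest byte A may send is bounded by:
LastByteAcked = 5_000
LastByteSent_max = LastByteAcked + rwnd
print(LastByteSent_max) # 11000
```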
1. The client-side TCP sends a special TCP segment to server-side TCP. This segment doesn't
contain any application-layer data but the flag bit SYN is set to 1. The segment is
referred to as a SYN segment. The client also randomly chooses an initial sequence
number (client_isn) and puts this number in the sequence number field of the
initial TCP SYN segment (randomizing client_isn helps avoid certain security attacks).
2. The TCP SYN segment arrives at the server-side, it is extracted from the datagram. The
server allocates the TCP buffers and variables to the connection and sends a connection-
granted segment to the client. This segment also contains no application-layer data. The
SYN flag is set to 1, the ACK field in the header is set to client_isn+1. The server
chooses its own initial sequence number server_isn and puts this value in the
sequence number field of the TCP segment header. This segment is referred to as
SYNACK segment.
3. Upon receiving the SYNACK segment, the client also allocates buffers and variables to the
connection. The client then sends the server yet another segment which acknowledges
the SYNACK (server_isn+1 is set in the acknowledgement field of the TCP segment
header).
After this setup, all the segments will have the SYN bit set to 0 in their headers.
Tearing down a TCP connection
1 - Slow Start
When a TCP connection begins, cwnd is usually initialized to a small value of 1 MSS and only one
segment is sent. Each acknowledged packet causes cwnd to be increased by 1 MSS, so the sender next
sends two segments (because the window is increased by one MSS for each ACK). The number of
segments therefore doubles at each RTT, and so the sending rate also doubles every RTT. Thus the
TCP send rate starts slow but grows exponentially during the slow start phase.
When does the growth end?
• Timeout: cwnd is set to 1 MSS and slow start is started anew. The slow start threshold
variable is also set: ssthresh = cwnd / 2 (half the value of cwnd when congestion was
detected)
• When cwnd >= ssthresh, slow start stops -> congestion avoidance state
• Three duplicate ACKs: fast retransmit and fast recovery state
2 - Congestion Avoidance
TCP supposes congestion is present; how should it adapt? Instead of doubling cwnd every RTT, cwnd
is increased by just a single MSS every RTT. When should this linear increase stop?
• Timeout: cwnd is set to 1 MSS, and ssthresh = cwnd (at the time of the loss) / 2
• Three duplicate ACKs: cwnd = (cwnd / 2) + 3 MSS and ssthresh = cwnd (when the 3 ACKs
are received) / 2 -> fast recovery state
3 - Fast Recovery
cwnd is increased by 1 MSS for every duplicate ACK received for the missing segment that caused
TCP to enter this state. When the ACK arrives for the missing segment, TCP enters Congestion
Avoidance after reducing cwnd. If a timeout occurs, cwnd is set to 1 MSS and ssthresh is set to
half the value of cwnd when the loss event occurred. Fast recovery is recommended but not required
in TCP; in fact only the newer version of TCP, TCP Reno, incorporates fast recovery.
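The transitions described above can be sketched as small event handlers (units are MSS; this is a toy sketch, not a complete TCP Reno implementation):

```python
def on_timeout(cwnd):
    """Timeout: remember half of cwnd, then restart slow start at 1 MSS."""
    ssthresh = cwnd / 2
    return 1, ssthresh

def on_triple_dup_ack(cwnd):
    """Three duplicate ACKs: halve cwnd, add 3 MSS, enter fast recovery."""
    ssthresh = cwnd / 2
    return cwnd / 2 + 3, ssthresh

def on_new_ack(cwnd, ssthresh):
    """Grow cwnd: exponentially in slow start, ~1 MSS per RTT afterwards."""
    if cwnd < ssthresh:
        return cwnd + 1          # slow start
    return cwnd + 1 / cwnd       # congestion avoidance

cwnd, ssthresh = 8, 64
cwnd, ssthresh = on_timeout(cwnd)
print(cwnd, ssthresh)            # cwnd back to 1 MSS, ssthresh = 4.0
```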
Connection Setup
In some computer networks there is a third really important network-layer function: connection
setup, a path-wide process that sets up connection state in routers.
Input ports then move packets to the switching fabric (possibly queuing them if the fabric is busy).
4.3.2 Switching
Can be performed in different ways:
Some fields of the IPv4 datagram:
• Version number: 4 bits specifying the IP protocol version of the datagram (IPv4 or IPv6 )
• Header length: the length of the packet is variable therefore this field tells where the header
ends and the data begins. Usually datagrams contain no option so that the typical IP
datagram has 20-byte header
• Type of service (TOS): allows different types of datagrams to be distinguished from each
other. (eg real time vs non real time)
• Datagram length: 16 bits specifying the total length, that is header + data measured in bytes.
16 bits -> max datagram length = 65535 bytes, but datagrams are rarely larger than 1500 bytes.
• Identifier, flags, fragmentation offset: used for IP fragmentation. (NB: IPv6 doesn't allow
fragmentation at routers)
• Time-to-live (TTL): used to prevent datagrams from circulating forever. It is decreased by one
each time the datagram is processed by a router. When TTL = 0, the datagram is dropped
• Protocol: only used when the datagram reaches its final destination; it specifies the
transport-layer protocol to which the data of the datagram should be passed. E.g. 6 -> TCP,
17 -> UDP
• Header checksum: helps the router to detect bit errors in a received IP datagram.
Computation: each pair of bytes in the header is treated as a 16-bit number, and these
numbers are summed using 1s complement arithmetic. The 1s complement of this sum is then put
in the checksum field. A router computes the checksum for each received datagram; if the
computed value doesn't equal the one carried in the field, the router has detected an error,
and usually the datagram is discarded. Since the checksum is recomputed at each router, it
may change (the TTL, for instance, changes at every hop).
• Source and destination IP addresses
• Options: rarely used, dropped by IPv6
• Data (payload): usually contains the transport layer segment but can also contain ICMP
messages
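The 1s-complement computation described for the header checksum field above can be sketched in Python; the header words below are a commonly used illustrative IPv4 header (with the checksum field zeroed before summing), not an example from the text:

```python
def ip_checksum(words):
    """1s-complement sum of 16-bit words, then complement the result."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # wrap the carry around
    return ~total & 0xFFFF                        # 1s complement of the sum

header = [0x4500, 0x0073, 0x0000, 0x4000,
          0x4011, 0x0000,                  # checksum field set to zero
          0xC0A8, 0x0001, 0xC0A8, 0x00C7]
print(hex(ip_checksum(header)))            # 0xb861
```

The receiver repeats the sum including the transmitted checksum; a result of all 1s means no detected error.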
IP Datagram Fragmentation
The maximum amount of data that a link layer can carry is called the Maximum Transmission
Unit (MTU). As each datagram is encapsulated in a link layer frame, the MTU imposes a hard limit
on the length of the datagram. Each of the links along the route can use different link-layer
protocols and therefore can have different MTUs. We therefore have to break the IP datagram into
smaller datagrams, each of which will travel in a separate link-layer frame. Each of these smaller
datagrams is referred to as a fragment. A fragment must be reassembled before it can be passed to
the transport layer. To reduce the workload on routers, the designers of IPv4 decided that
reassembling should only be done at the destination end system.
In IPv4, to comply with fragmentation, the header contains the fields:
• Identifier: identifies the original datagram (same for all its fragments)
• flags: in particular there is one flag set to 0 if the fragment is the last or to 1 if there are more
to come
• fragmentation offset: an integer x, the data in the fragment should be inserted beginning at
byte x * 8
If one fragment contains errors or is lost, all the others are dropped and TCP will have the
sender retransmit all the data. Fragmentation complicates the network and the end systems and
can be used in lethal DoS attacks such as the Jolt2 attack.
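A sketch of how a sender splits a payload according to these fields (a 20-byte header is assumed; every fragment except the last must carry a multiple of 8 data bytes, since offsets count 8-byte units):

```python
def fragment(payload: bytes, identification: int, mtu: int) -> list:
    """Split an IP payload into fragments that fit the link MTU,
    filling in the three fragmentation-related header fields."""
    max_data = (mtu - 20) // 8 * 8          # data bytes per fragment
    fragments, offset = [], 0
    while offset < len(payload):
        chunk = payload[offset:offset + max_data]
        fragments.append({
            "identification": identification,  # same for all fragments
            "more_fragments": 1 if offset + len(chunk) < len(payload) else 0,
            "offset": offset // 8,             # expressed in 8-byte units
            "data": chunk,
        })
        offset += len(chunk)
    return fragments

# A 3980-byte payload over an MTU of 1500 bytes -> 1480 + 1480 + 1020
frags = fragment(bytes(3980), identification=777, mtu=1500)
```

The destination recognizes fragments of the same datagram by the shared identification value, orders them by offset, and knows it has the last one when the more-fragments flag is 0.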
From the outside the router looks like a single device with a single IP address. It hides the details of
the internal network from the outside world. Internal addresses can be assigned using DHCP.
Problems with NAT:
• Port numbers should be used for addressing processes, not hosts
• Routers shouldn't have access to the transport layer (ports)
• NAT violates end-to-end argument (any host should be able to contact any other host)
• NAT interferes with P2P applications (peers hidden by NAT), therefore the need of
connection reversal for NAT traversal
UPnP
NAT traversal is increasingly provided by Universal Plug and Play (UPnP). It requires both the
host and the NAT to be compatible. The host requests a NAT mapping: (private IP address, private
port number) -> (public IP address, public port number). If the NAT accepts and creates the
mapping, then outsiders can create connections to (public IP address, public port number).
4.4.4 IPv6
Developed because of IPv4 address space exhaustion
Datagram format
• the size of the source and destination addresses is increased from 32 to 128 bits: every grain
of sand on the planet could be addressable. Unicast and multicast addresses are joined by the
anycast address, which allows a datagram to be delivered to any one of a group of hosts.
• A number of IPv4 fields have been dropped or made optional, resulting in a 40-byte fixed-
length header which allows faster datagram processing.
• Flow label: 20 bits; its exact definition is still not clear
• Version: 4 bits identifying the version (4 or 6); for IPv6 the field is set to 0110
• Traffic class: 8 bit similar to TOS
• Payload length: 16 bit unsigned integer indicating number of bytes following the 40-byte
datagram header
• Next header: transport layer protocol
• Hop limit: decremented by one by each router forwarding the datagram, when 0, the
datagram is discarded
Fragmentation and reassembly cannot be done by intermediate routers, only by source and
destination. If a router cannot transmit a datagram because it is too big, it drops it and sends
back an ICMP "Packet Too Big" error message. This greatly reduces the workload in the network.
As the transport layer and the link layer already perform check-summing, this functionality has
been removed from the network layer for faster datagram processing.
An option field is no longer part of the header, instead it is one of the possible next headers pointed
to from the header. A new version of ICMP has been defined for IPv6 which includes messages
adapted to IPv6 ("packet too big") and replaces IGMP (Internet Group Management Protocol), used
to manage a host's joining and leaving of multicast groups.
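As a sketch, the 40-byte fixed header described above can be unpacked with Python's struct module (the sample header below is hand-built for illustration, not a captured packet):

```python
import struct

def parse_ipv6_header(data: bytes) -> dict:
    """Unpack the 40-byte fixed IPv6 header described above."""
    if len(data) < 40:
        raise ValueError("IPv6 header is 40 bytes long")
    # First 4 bytes hold version (4 bits), traffic class (8), flow label (20)
    first_word, payload_length, next_header, hop_limit = struct.unpack(
        "!IHBB", data[:8])
    return {
        "version": first_word >> 28,              # 0b0110 = 6 for IPv6
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,               # e.g. 6 -> TCP, 17 -> UDP
        "hop_limit": hop_limit,
        "source": data[8:24],                     # 128-bit addresses
        "destination": data[24:40],
    }

# Hand-built sample: version 6, next header 6 (TCP), hop limit 64
sample = struct.pack("!IHBB", 6 << 28, 0, 6, 64) + bytes(16) + bytes(16)
fields = parse_ipv6_header(sample)
```

The fixed length is what enables the faster processing mentioned above: a router reads fields at constant offsets instead of parsing variable-length options.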
The least cost between x and y, d(x,y), can be determined using the Bellman-Ford equation:
d(x,y) = min_v {c(x,v) + d(v,y)}
where the minimum is taken over all neighbors v of x and c(x,v) is the cost of the link to v.
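A direct implementation of the equation over a small hypothetical topology (node names and link costs are made up for illustration):

```python
def least_costs(graph: dict, y: str) -> dict:
    """Compute d(x, y) for every node x by iterating the Bellman-Ford
    equation: d(x, y) = min over neighbors v of { c(x, v) + d(v, y) }."""
    d = {x: float("inf") for x in graph}
    d[y] = 0
    for _ in range(len(graph) - 1):      # |N| - 1 relaxation rounds suffice
        for x in graph:
            for v, cost in graph[x].items():
                d[x] = min(d[x], cost + d[v])
    return d

# Hypothetical 4-node topology: graph[x][v] is the link cost c(x, v)
graph = {
    "u": {"v": 2, "x": 1},
    "v": {"u": 2, "x": 3, "y": 1},
    "x": {"u": 1, "v": 3, "y": 4},
    "y": {"v": 1, "x": 4},
}
d = least_costs(graph, "y")   # least cost from each node to y
```

Here d("u") is 3 via the path u -> v -> y, cheaper than the direct-neighbor route u -> x -> y of cost 5.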
... to be continued
4.5.3 Hierarchical Routing
In practice it is not possible to have a network of interconnected routers all running the same
routing algorithm, for two reasons:
• Scale: if the number of routers is large, running LS or DV algorithms over the whole network
becomes prohibitive in memory, processing, storage, and timing costs.
• Administrative autonomy: an organization should be able to organize its network as it
wishes, while still being able to connect its network to the outside world.
Therefore routers are organized into autonomous systems (ASs), each of which is under the
same administrative control. Routers in the same AS run the same routing algorithm and have
information about each other. The routing algorithm running within an AS is called an intra-
autonomous system routing protocol. In an AS, one or more routers will have the task of being
responsible for forwarding packets outside the AS; these routers are called gateway routers. To
obtain reachability information from neighboring ASs and to propagate that reachability
information to all routers internal to its AS, gateway routers use inter-AS routing protocols.
Two communicating ASs must run the same inter-AS routing protocol.
When a router needs to forward a packet outside its AS and there are multiple gateway routers, the
router has to make a choice. One often employed practice is to use hot-potato routing: the AS gets
rid of the packet as quickly as possible (as inexpensively as possible), the router sends the packet to
the gateway router that has the smallest router-to-gateway cost among all gateways with a path to
the destination. An AS can decide what (internal) destinations to advertise to neighboring ASs:
this is a policy decision.
Basics
BGP is a very complex protocol. Routers exchange information over semipermanent TCP connections
using port 179. There is typically one such BGP TCP connection for each link directly connecting
two routers in two different ASs but there are also semipermanent TCP connections between routers
in the same AS. For each connection, the two routers at the end of it are called BGP peers and the
connection is called a BGP session. A session spanning two ASs is an external BGP (eBGP)
session, while a BGP session between routers within the same AS is called an internal BGP (iBGP) session.
Destinations are not hosts, but CIDRized prefixes, each representing a subnet or collection of
subnets.
Possible attacks:
• eavesdropping: sniffing and recording messages flowing in a channel
• modification, insertion, deletion of messages or message content
Block Ciphers
Today there are two broad classes of symmetric encryption techniques: stream ciphers and block
ciphers (used in PGP, SSL, IPsec). In a block cipher, the message to be encrypted is processed
in blocks of k bits and each block is encrypted independently. To encode a block, the cipher
uses a one-to-one mapping to map the k-bit block of cleartext to a k-bit block of ciphertext.
To resist brute-force attacks, block ciphers usually employ large blocks (e.g., k = 64), but
longer blocks imply longer tables to store the mappings. Block ciphers therefore typically use
functions that simulate randomly permuted tables.
EX: a 64-bit input is split into 8 8-bit chunks, each of which is processed by an 8-bit to 8-bit
table, each chunk having its own table. The encrypted chunks are reassembled into a 64-bit
message which is fed again to the input. After n such cycles, the function provides a 64-bit
block of ciphertext. The key for this block cipher would be the eight permutation tables,
assuming that the scramble function is publicly known. Popular block ciphers: DES (Data
Encryption Standard), 3DES, AES (Advanced Encryption Standard). These use functions instead of
predetermined tables. Each of them uses a string of bits for a key (64-bit blocks with a 56-bit
key in DES, 128-bit blocks and 128/192/256-bit keys in AES).
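The permutation-table scheme above can be sketched as a toy cipher. For illustration only, NOT secure: the key is the set of eight shuffled tables, and the scramble step is a simple chunk rotation invented here to stand in for a real mixing function:

```python
import random

def make_key(seed: int) -> list:
    """A toy key: eight independently shuffled 8-bit permutation tables,
    one per chunk (illustration only -- NOT a secure cipher)."""
    rng = random.Random(seed)
    tables = []
    for _ in range(8):
        table = list(range(256))
        rng.shuffle(table)
        tables.append(table)
    return tables

def encrypt_block(block: bytes, tables: list, rounds: int = 4) -> bytes:
    """Encrypt one 64-bit block: substitute each 8-bit chunk through its
    own table, then rotate the chunks (the publicly known scramble step)."""
    chunks = list(block)
    for _ in range(rounds):
        chunks = [tables[i][c] for i, c in enumerate(chunks)]
        chunks = chunks[1:] + chunks[:1]
    return bytes(chunks)

def decrypt_block(block: bytes, tables: list, rounds: int = 4) -> bytes:
    """Undo the rounds in reverse order with inverted tables."""
    inverse = [[0] * 256 for _ in range(8)]
    for i, table in enumerate(tables):
        for x, y in enumerate(table):
            inverse[i][y] = x
    chunks = list(block)
    for _ in range(rounds):
        chunks = chunks[-1:] + chunks[:-1]          # undo the rotation
        chunks = [inverse[i][c] for i, c in enumerate(chunks)]
    return bytes(chunks)

tables = make_key(seed=42)
ciphertext = encrypt_block(b"ABCDEFGH", tables)
```

Because every step is a one-to-one mapping, decryption simply applies the inverse tables and rotation in reverse order, matching the one-to-one property discussed above.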
Cipher-Block Chaining
For long messages we need to avoid that two or more identical blocks of cleartext produce
identical blocks of ciphertext (as they would under the block encryption just described).
(I DON'T FINISH THIS PART, IT GOES TOO DEEP INTO ENCRYPTION TECHNIQUES WHICH IS NOT WHAT WE
ARE INTERESTED IN)
Version 1.0
Alice simply sends a message to Bob saying "I'm Alice"
Version 2.0
Alice and Bob always communicate using the same addresses, so Bob can simply check that the
message has Alice's source IP. However it is fairly easy to spoof an IP address: crafting a
special datagram is feasible using a custom kernel, e.g. Linux.
Version 3.0
Alice and Bob could share a secret password, a secret between the authenticator and the person
being authenticated. Alice: "I'm Alice, password". However the password can be eavesdropped,
sniffed (read and stored).
Version 3.1
We could encrypt the password using a shared symmetric cryptographic key. However this protocol
is subject to playback attacks: an eavesdropper could sniff the encrypted secret and, without
having to decrypt it, could replay it later to impersonate Alice.
Version 4.0
To avoid playback attacks we can use the same principle behind TCP's three-way handshake. A
nonce is a number that a protocol will use only once in a lifetime. The procedure is then:
1. Alice sends: I am Alice
2. Bob chooses a nonce and sends it to Alice
3. Alice encrypts it using Alice and Bob's symmetric secret key and sends the encrypted nonce.
4. Bob decrypts the received nonce and checks for equality with the one he generated.
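The four steps can be sketched as follows. Since the Python standard library has no symmetric block cipher, a keyed HMAC of the nonce stands in for the encryption step; the key value is a hypothetical placeholder:

```python
import hashlib
import hmac
import secrets

SHARED_KEY = b"alice-and-bob-shared-secret"    # hypothetical pre-shared key

def make_nonce() -> bytes:
    """Step 2: Bob picks a number that will be used only once."""
    return secrets.token_bytes(16)

def prove_identity(nonce: bytes, key: bytes) -> bytes:
    """Step 3: Alice transforms the nonce with the shared key. An HMAC
    stands in here for symmetric encryption of the nonce."""
    return hmac.new(key, nonce, hashlib.sha256).digest()

def verify(nonce: bytes, proof: bytes, key: bytes) -> bool:
    """Step 4: Bob recomputes the transform and compares. A replayed
    proof fails because each session uses a fresh nonce."""
    expected = hmac.new(key, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(proof, expected)
```

An eavesdropper who records one exchange gains nothing: replaying the old proof against a new nonce fails verification, which is exactly what defeats the playback attack.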
8.5.2 PGP
Pretty Good Privacy (PGP) is an e-mail encryption scheme that has become the de facto standard.
It uses the same design shown above, giving the option of signing, encrypting or both. When PGP is
installed, it creates a public key pair for the user, the public key can be posted online while the
private key is protected by a password which has to be entered every time the user accesses the
private key. A PGP message appears after the MIME header. PGP also provides a mechanism for
public key certification. PGP public keys are certified by Web of Trust: Alice can certify any
key/username pair when she believes the pair really belong together and, in addition, PGP permits
Alice to say that she trusts another user to vouch for the authenticity of more keys. Some PGP users
sign each other's key by holding key-signing parties.
SSL Record
The real SSL record:
• Type: handshake message, data message, connection teardown message
• Length: used to extract the records out of the TCP byte stream
Connection Closure
TCP FIN segments can be crafted by an attacker (truncation attack), therefore they cannot be
used. The type field of SSL records is used for this purpose: even though it is sent in the
clear, it is authenticated at the receiver using the record's MAC.
An IPsec entity often maintains state information for many SAs (all outside clients) using its
Security Association Database (SAD) which is a data structure in the entity's OS kernel.
3: Application Gateways
Application Gateways look beyond the IP/TCP/UDP headers and make policy decisions based on
application data. An Application Gateway is an application-specific server through which all
application data must pass. Multiple application gateways can run on the same host, but each
gateway is a separate server with its own processes.
Slotted ALOHA
All frames consist of L bits, time is divided into slots of size L/R seconds, and nodes start
to transmit frames only at the beginning of slots. Moreover, nodes are synchronized so that
each node knows when a slot begins. If two or more frames collide in a slot, then all the nodes
detect the collision event before the slot ends.
If p is a probability then the operation of slotted ALOHA in each node is simple:
• each node waits until the beginning of the next slot and transmits the entire frame in the slot
• If no collision occurs, the frame is considered delivered
• If a collision occurs, it is detected before the end of the slot. The node then retransmits
its frame in each subsequent slot with probability p (the retransmission probability) until the
frame is transmitted without a collision.
Slotted ALOHA allows a single active node to transmit at the full rate R, is highly
decentralized, and is extremely simple. The computed maximum efficiency (successfully used
slots / total slots in the long run) of slotted ALOHA is 1/e ≈ 37%, thus the effective
transmission rate is about 0.37R bps.
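The 37% figure can be checked numerically: with N active nodes each transmitting in a slot with probability p, a slot is successful when exactly one node transmits, which is maximized at p = 1/N and tends to 1/e as N grows:

```python
def success_probability(n: int, p: float) -> float:
    """Probability that a slot is successful: exactly one of the n active
    nodes transmits (n * p) while the other n - 1 stay silent."""
    return n * p * (1 - p) ** (n - 1)

def max_efficiency(n: int) -> float:
    """success_probability is maximized by choosing p = 1/n; as n grows
    the maximum tends to 1/e, i.e. about 0.37."""
    return success_probability(n, 1.0 / n)
```

For two nodes the best achievable is 0.5; with many nodes the value settles just above 0.3679, matching the 1/e limit quoted above.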
Pure ALOHA
In pure (unslotted) ALOHA, nodes do not synchronize their transmissions to slot boundaries. A
node immediately transmits a frame in its entirety into the channel. In case of collision, the
node immediately retransmits the frame with probability p; otherwise it waits for a frame
transmission time, after which it transmits the frame with probability p or waits for another
frame time with probability 1-p. The maximum efficiency is only 1/(2e), but the protocol is
fully decentralized.
CSMA/CD Efficiency
CSMA/CD efficiency is the long-run fraction of time during which frames are being transmitted
on the channel without collisions. Approximately, Efficiency = 1/(1 + 5*d_prop/d_trans). As the
propagation delay d_prop approaches 0, the efficiency approaches 1; likewise, as the
transmission time d_trans becomes very large, the efficiency approaches 1.
5.4.2 Ethernet
It has pretty much taken over the wired LAN market. Since its invention in the 70's, it has grown
and become faster. At the beginning the original Ethernet LAN used a coaxial bus to interconnect
the nodes, creating a broadcast LAN. By the late 90s, most companies and universities had
replaced their LANs with Ethernet installations using a hub-based star topology: hosts and routers are directly
connected to a hub with twisted-pair copper wire. A hub is a physical layer device that acts on
individual bits rather than frames. When a hub receives a bit, it simply recreates it boosting its
energy strength and transmits the bit onto all the other interfaces (it's still a broadcast LAN). In the
early 2000s, the star topology evolved: the hub was replaced with a switch, allowing a collision-less
LAN.
Ethernet Technologies
There are many variants and flavors of Ethernet which have been standardized over the years by the
IEEE. They vary in speed: 10 Megabit, 100 Megabit, 1000 Megabit, 10 Gigabit... They can also
vary in the type of traffic they can transport....
Self-Learning
The switch table is built automatically, dynamically, and autonomously without any intervention
from a network administrator: switches are self-learning.
1. The switch table is initially empty
2. For each incoming frame, the switch stores in its table
1. the MAC address in the frame's source address field
2. the interface from which the frame arrived
3. the current time
3. The switch deletes an address from the table if no frames are received with that address as
the source after some period (the aging time), so as to eliminate unused entries from the table
Thus switches are plug-and-play devices: they require no human intervention. Switches are also
full-duplex, meaning any interface can send and receive at the same time.
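The three steps above can be sketched as a small class (interface numbers and the 300-second default aging time are illustrative values, not from any particular switch):

```python
class LearningSwitch:
    """Sketch of a self-learning switch table. Entries map a source MAC
    address to (interface, time learned) and expire after the aging time."""

    def __init__(self, aging_time: float = 300.0):
        self.table = {}                 # step 1: the table starts empty
        self.aging_time = aging_time

    def receive(self, src_mac: str, dst_mac: str, interface: int, now: float):
        """Process an incoming frame; return the outgoing interface,
        or None to signal that the frame must be flooded."""
        # step 2: record the source MAC, arrival interface, and current time
        self.table[src_mac] = (interface, now)
        # step 3: drop entries older than the aging time
        self.table = {mac: (ifc, t) for mac, (ifc, t) in self.table.items()
                      if now - t < self.aging_time}
        entry = self.table.get(dst_mac)
        return entry[0] if entry is not None else None
```

Frames to unknown destinations are flooded out of every interface except the arrival one; as soon as the destination sends anything itself, its source address is learned and flooding for it stops.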