17EC64 - Module4 - Network Layer Protocol and Unicast Routing

This document discusses network layer protocols and unicast routing. It describes the network layer in the TCP/IP protocol suite, focusing on the current version IPv4. IPv4 is responsible for packetizing, forwarding, and delivering packets at the network layer. The document outlines the format of an IPv4 datagram, including the header fields which contain information for routing and delivery, and the payload which contains data from other protocols using IP. It explains how fields like TTL, protocol, and checksum are used for routing, demultiplexing, and error checking of datagrams.

Uploaded by

Shashidhar kr
Copyright
© All Rights Reserved

COMPUTER COMMUNICATION NETWORKS (17EC64) | MODULE 4: NETWORK LAYER PROTOCOLS AND UNICAST ROUTING

MODULE 4

NETWORK LAYER PROTOCOL AND UNICAST ROUTING

Network Layer Protocols

In this chapter, we show how the network layer is implemented in the TCP/IP protocol suite.
The protocols in the network layer have gone through a few versions; in this chapter, we
concentrate on the current version, version 4 (IPv4).
Communication at the network layer is host-to-host (computer-to-computer); a computer somewhere in
the world needs to communicate with another computer somewhere else in the world through the
Internet.
The packet transmitted by the sending computer may pass through several LANs or WANs before
reaching the destination computer. A global addressing scheme called logical addressing is required for
this communication. The term IP address refers to the logical address in the network layer of the TCP/IP
protocol suite.
Communication at the network layer in the Internet is connectionless. If reliability is important,
IPv4 must be paired with a reliable transport-layer protocol such as TCP.

Position of IPv4 and other network protocols in TCP/IP protocol suite

The network layer in version 4 can be thought of as one main protocol and three auxiliary ones as shown
in Figure 4.1.
• The main protocol, Internet Protocol version 4 (IPv4), is responsible for packetizing,
forwarding, and delivery of a packet at the network layer.
• The Internet Control Message Protocol version 4 (ICMPv4) helps IPv4 to handle some errors
that may occur in the network-layer delivery.
• The Internet Group Management Protocol (IGMP) is used to help IPv4 in multicasting.
• The Address Resolution Protocol (ARP) is used to glue the network and data-link layers in
mapping network-layer addresses to link-layer addresses.

Figure 4.1: Position of IP and other network-layer protocols in TCP/IP protocol suite

Datagram Format
The Internet Protocol version 4 (IPv4) is the delivery mechanism used by the TCP/IP protocols. Packets
used by the IP are called datagrams. A datagram is a variable-length packet consisting of two parts: header
and payload (data). The header is 20 to 60 bytes in length and contains information essential to routing and
delivery. It is customary in TCP/IP to show the header in 4-byte sections.

MIT MYSORE | DEPT. OF ELECTRONICS & COMMUNICATION ENGG. 125



Figure 4.2: IPv4 Datagram

A brief description of each field is in order:

1. Version Number: The 4-bit version number (VER) field defines the version of the IPv4 protocol,
which, obviously, has the value of 4.

2. Header Length: The 4-bit header length (HLEN) field defines the total length of the datagram
header in 4-byte words. The IPv4 datagram has a variable-length header. When a device receives a
datagram, it needs to know when the header stops and the data, which is encapsulated in the packet,
starts. The header length in bytes is divided by 4 and the value is inserted in the field. The receiver
needs to multiply the value of this field by 4 to find the header length in bytes.
3. Service Type: In the original design of the IP header shown in Fig 4.3, this field was referred to
as type of service (TOS), which defined how the datagram should be handled. In the late 1990s,
IETF redefined the field to provide differentiated services (DiffServ).

Figure 4.3: Service Type


Note: The precedence subfield was part of version 4, but never used

4. Total Length: This 16-bit field defines the total length (header plus data) of the IP datagram in
bytes. A 16-bit number can define a total length of up to 65,535 (when all bits are 1s). However,
the size of the datagram is normally much less than this. This field helps the receiving device to know
when the packet has completely arrived. To find the length of the data coming from the upper
layer, subtract the header length from the total length. The header length can be found by
multiplying the value in the HLEN field by 4.
Note: The total length field defines the total length of the datagram including the header.


5. Identification, Flags, and Fragmentation Offset: These three fields are related to the
fragmentation of the IP datagram when the size of the datagram is larger than the underlying
network can carry.
6. Time-to-live: Due to some malfunctioning of routing protocols (discussed later) a datagram may
be circulating in the Internet, visiting some networks over and over without reaching the
destination. This may create extra traffic in the Internet. The time-to-live (TTL) field is used to
control the maximum number of hops (routers) visited by the datagram. When a source host sends
the datagram, it stores a number in this field. This value is approximately two times the maximum
number of routers between any two hosts. Each router that processes the datagram decrements
this number by one. If this value, after being decremented, is zero, the router discards the datagram.
7. Protocol: In TCP/IP, the data section of a packet, called the payload, carries the whole packet
from another protocol. A datagram, for example, can carry a packet belonging to any transport-
layer protocol such as UDP or TCP. A datagram can also carry a packet from other protocols that
directly use the service of the IP, such as some routing protocols or some auxiliary protocols. The
Internet authority has given any protocol that uses the service of IP a unique 8-bit number which
is inserted in the protocol field. When the payload is encapsulated in a datagram at the source IP,
the corresponding protocol number is inserted in this field; when the datagram arrives at the
destination, the value of this field helps to define to which protocol the payload should be delivered.
In other words, this field provides multiplexing at the source and demultiplexing at the destination.

Table Protocol values

Figure 4.4: Multiplexing and demultiplexing using the value of the protocol field

8. Header checksum: IP is not a reliable protocol; it does not check whether the payload carried by
a datagram is corrupted during the transmission. IP puts the burden of error checking of the
payload on the protocol that owns the payload, such as UDP or TCP.
The datagram header, however, is added by IP, and its error-checking is the responsibility of IP.
Errors in the IP header can be a disaster. For example, if the destination IP address is corrupted,
the packet can be delivered to the wrong host. If the protocol field is corrupted, the payload may
be delivered to the wrong protocol. If the fields related to the fragmentation are corrupted, the
datagram cannot be reassembled correctly at the destination, and so on. For these reasons, IP adds
a header checksum field to check the header, but not the payload. We need to remember that, since
the values of some fields, such as TTL and those related to fragmentation and options, may change
from router to router, the checksum needs to be recalculated at each router. Checksum in the
Internet normally uses a 16-bit field, which is the complement of the sum of other fields calculated
using 1s complement arithmetic.

9. Source and Destination Addresses: These 32-bit source and destination address fields define the
IP address of the source and destination respectively. The source host should know its IP address.
The destination IP address is either known by the protocol that uses the service of IP or is provided
by the DNS. Note that the value of these fields must remain unchanged during the time the IP
datagram travels from the source host to the destination host.


10. Options: A datagram header can have up to 40 bytes of options. Options can be used for network
testing and debugging. Although options are not a required part of the IP header, option processing
is required of the IP software. This means that all implementations must be able to handle options
if they are present in the header. The existence of options in a header creates some burden on the
datagram handling; some options can be changed by routers, which forces each router to recalculate
the header checksum. There are one-byte and multi-byte options.
11. Payload: Payload, or data, is the main reason for creating a datagram. Payload is the packet coming
from other protocols that use the service of IP. Comparing a datagram to a postal package, payload
is the content of the package; the header is only the information written on the package.
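The time-to-live rule described in field 6 can be sketched as a small simulation. This is only an illustration, not router code: the function name `routers_that_forward` and the TTL and path-length values are assumptions chosen for the example.

```python
def routers_that_forward(initial_ttl: int, routers_on_path: int) -> int:
    """Count how many routers forward the datagram.

    Each router decrements TTL by one and discards the datagram
    if the decremented value is zero.
    """
    ttl = initial_ttl
    forwarded = 0
    for _ in range(routers_on_path):
        ttl -= 1                  # every router that processes the datagram decrements TTL
        if ttl == 0:
            return forwarded      # this router discards the datagram
        forwarded += 1
    return forwarded              # all routers forwarded; the datagram can reach the destination


print(routers_that_forward(initial_ttl=64, routers_on_path=10))  # all 10 routers forward
print(routers_that_forward(initial_ttl=3, routers_on_path=10))   # third router discards
```

With an initial TTL of 3, the first two routers forward the datagram and the third one discards it.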
Example 1
An IPv4 packet has arrived with the first 8 bits as shown: 01000010. The receiver discards the packet. Why?
There is an error in this packet. The 4 leftmost bits (0100) show the version, which is correct. The next 4
bits (0010) show an invalid header length (2 × 4 = 8). The minimum number of bytes in the header must
be 20. The packet has been corrupted in transmission.

Example 2
In an IPv4 packet, the value of HLEN is 1000 in binary. How many bytes of options are being carried by this
packet?
The HLEN value is 8, which means the total number of bytes in the header is 8 × 4, or 32 bytes. The first
20 bytes are the base header, the next 12 bytes are the options.
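The arithmetic in Examples 1 and 2 can be checked with a short Python sketch. The helper names `decode_first_byte` and `options_length` are hypothetical; the 20 to 60 byte limits come from the header description above.

```python
def decode_first_byte(first_byte: int) -> tuple:
    """Split the first header byte into (version, header length in bytes)."""
    version = first_byte >> 4         # upper 4 bits: VER
    hlen_words = first_byte & 0x0F    # lower 4 bits: HLEN, counted in 4-byte words
    return version, hlen_words * 4


def options_length(header_bytes: int) -> int:
    """Return the number of option bytes; reject impossible header lengths."""
    if not 20 <= header_bytes <= 60:
        raise ValueError(f"invalid header length: {header_bytes} bytes")
    return header_bytes - 20          # the first 20 bytes are the base header


print(decode_first_byte(0b01000010))  # Example 1: (4, 8), and 8 bytes is below the 20-byte minimum
print(options_length(0b1000 * 4))     # Example 2: 12
```

Calling `options_length(8)` with the header length from Example 1 raises `ValueError`, which mirrors the receiver discarding the packet.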

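Example 3
In an IPv4 packet, the value of HLEN is 5 and the value of the total length field is 40. How many bytes of
data are being carried by this packet?
The HLEN value is 5, which means the total number of bytes in the header is 5 × 4, or 20 bytes (no
options). The total length is 40 bytes, which means the packet is carrying 20 bytes of data (40 − 20).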

Example 4
An IPv4 packet has arrived with the first few hexadecimal digits as shown. 0x45000028000100000102 . . .
How many hops can this packet travel before being dropped? The data belong to what upper-layer
protocol?
To find the time-to-live field, we skip 8 bytes. The time-to-live field is the ninth byte, which is 01. This means
the packet can travel only one hop. The protocol field is the next byte (02), which means that the upper-
layer protocol is IGMP.
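The same fields can be pulled out of the hexadecimal digits of Example 4 with Python's `struct` module. This is a minimal sketch: the variable names are chosen for illustration, and the small protocol table lists only a few well-known values.

```python
import struct

prefix = bytes.fromhex("45000028000100000102")  # first 10 header bytes from Example 4

# "!" selects network (big-endian) byte order; B is a 1-byte field, H a 2-byte field.
ver_hlen, service, total_length, ident, flags_offset, ttl, protocol = struct.unpack(
    "!BBHHHBB", prefix
)

protocol_names = {1: "ICMP", 2: "IGMP", 6: "TCP", 17: "UDP"}  # a few well-known values

print("version:", ver_hlen >> 4)                  # 4
print("header bytes:", (ver_hlen & 0x0F) * 4)     # 20
print("total length:", total_length)              # 40
print("TTL:", ttl)                                # 1
print("protocol:", protocol_names.get(protocol))  # IGMP
```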

Example 5
An IPv4 packet has arrived with the header decimal digits as shown below. Calculate the checksum for this
header
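The checksum procedure described for the header checksum field (sum the 16-bit words in 1s complement arithmetic, then complement the result) can be sketched as follows. The header values in the figure for this example are not reproduced in this text, so the 20-byte header below is a hypothetical one made up for illustration, with the checksum field set to zero before the calculation.

```python
def ipv4_checksum(header: bytes) -> int:
    """1s complement of the 1s complement sum of all 16-bit words in the header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]   # combine two bytes into one 16-bit word
    while total >> 16:                               # wrap any carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical header (not the one from the missing figure).
header = bytes([
    0x45, 0x00, 0x00, 0x1C,   # VER=4, HLEN=5, service type 0, total length 28
    0x00, 0x01, 0x00, 0x00,   # identification 1, flags and offset 0
    0x04, 0x11, 0x00, 0x00,   # TTL 4, protocol 17 (UDP), checksum field zeroed
    10, 12, 14, 5,            # source address 10.12.14.5
    12, 6, 7, 9,              # destination address 12.6.7.9
])

checksum = ipv4_checksum(header)
print(hex(checksum))  # 0x8bb1

# A receiver repeats the calculation over the header including the checksum; the result is 0.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(ipv4_checksum(filled))  # 0
```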


Fragmentation
A datagram can travel through different networks. Each router decapsulates the IP datagram from the
frame it receives, processes it, and then encapsulates it in another frame. The format and size of the
received (or sent) frame depend on the protocol used by the physical network through which the frame has
just travelled (or is going to travel). For example, if a router connects a LAN to a WAN, it receives a
frame in the LAN format and sends a frame in the WAN format.

Figure 4.5: Encapsulation of a small datagram in an Ethernet frame


Maximum transfer unit (MTU)
Each link-layer protocol has its own frame format. One of the features of each format is the maximum size
of the payload that can be encapsulated. In other words, when a datagram is encapsulated in a frame, the
total size of the datagram must not exceed this maximum size, which is defined by the restrictions imposed
by the hardware and software used in the network.

Figure 4.6: Maximum transfer unit (MTU)


In order to make the IP protocol independent of the physical network, the designers decided to make the
maximum length of the IP datagram equal to 65,535 bytes. This makes transmission more efficient if one
day we use a link-layer protocol with an MTU of this size. However, for other physical networks, we must
divide the datagram to make it possible for it to pass through these networks. This is called fragmentation.
When a datagram is fragmented, each fragment has its own header with most of the fields repeated, but
some have been changed.
A fragmented datagram may itself be fragmented if it encounters a network with an even smaller MTU. In
other words, a datagram may be fragmented several times before it reaches the final destination. A datagram
can be fragmented by the source host or any router in the path. The reassembly of the datagram, however,
is done only by the destination host, because each fragment becomes an independent datagram.
Table 4.1 MTUs for some networks


Fields Related to Fragmentation


Three fields in an IP datagram are related to fragmentation: identification, flags, and
fragmentation offset.
When a datagram is fragmented, the value in the identification field is copied into all fragments. In other
words, all fragments have the same identification number, which is also the same as the original datagram.
The identification number helps the destination in reassembling the datagram. It knows that all fragments
having the same identification value should be assembled into one datagram.
Flags used in fragmentation
When the payload of the IP datagram is fragmented, most parts of the header, with the exception of some
options, must be copied into all fragments. The host or router that fragments a datagram must change the
values of three fields: flags, fragmentation offset, and total length. The value of the checksum must be
recalculated regardless of fragmentation.

Figure 4.7 Flags in fragmentation


The leftmost bit is reserved (not used). The second bit (D bit) is called the do not fragment bit. If its value
is 1, the machine must not fragment the datagram. If it cannot pass the datagram through any available
physical network, it discards the datagram and sends an ICMP error message to the source host (discussed
later). If its value is 0, the datagram can be fragmented if necessary. The third bit (M bit) is called the more
fragments bit. If its value is 1, it means the datagram is not the last fragment; there are more fragments after
this one. If its value is 0, it means this is the last or only fragment.
Fragmentation offset
The 13-bit fragmentation offset field shows the relative position of this fragment with respect to the whole
datagram. It is the offset of the data in the original datagram measured in units of 8 bytes.
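The 3 flag bits and the 13-bit offset share one 16-bit word in the header, so they can be separated with bit masks. The sketch below assumes the bit layout described above (reserved bit, D bit, M bit, then the offset); the function name and the sample value `0x20AF` are made up for illustration.

```python
def decode_flags_offset(field: int) -> dict:
    """Split the 16-bit flags/offset word into its parts."""
    return {
        "dont_fragment": bool(field & 0x4000),    # second bit from the left (D)
        "more_fragments": bool(field & 0x2000),   # third bit from the left (M)
        "offset_bytes": (field & 0x1FFF) * 8,     # 13-bit offset, counted in 8-byte units
    }


# 0x20AF: M bit set, offset field 0x0AF = 175, so the data starts at byte 175 * 8 = 1400.
print(decode_flags_offset(0x20AF))
```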
Fragmentation example
Figure 4.8 shows a datagram with a data size of 4000 bytes fragmented into three fragments. The bytes in
the original datagram are numbered 0 to 3999. The first fragment carries bytes 0 to 1399. The offset for
this fragment is 0/8 = 0. The second fragment carries bytes 1400 to 2799; the offset value for this fragment
is 1400/8 = 175. Finally, the third fragment carries bytes 2800 to 3999. The offset value for this fragment
is 2800/8 = 350.
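The byte ranges and offsets in this example can be reproduced with a short sketch. The function `fragment` and its parameters are hypothetical; it takes the data size and the largest amount of data one fragment may carry, and it rounds that limit down to a multiple of 8 so that every offset is a whole number.

```python
def fragment(data_len: int, max_data: int) -> list:
    """Return (first_byte, last_byte, offset_field, more_bit) for each fragment."""
    max_data -= max_data % 8          # all fragments except the last carry a multiple of 8 bytes
    if max_data < 8:
        raise ValueError("a fragment must be able to carry at least 8 bytes of data")
    fragments = []
    start = 0
    while start < data_len:
        end = min(start + max_data, data_len)
        more = 1 if end < data_len else 0       # M bit is 0 only on the last fragment
        fragments.append((start, end - 1, start // 8, more))
        start = end
    return fragments


for frag in fragment(data_len=4000, max_data=1400):
    print(frag)
# (0, 1399, 0, 1)
# (1400, 2799, 175, 1)
# (2800, 3999, 350, 0)
```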

Figure 4.8 A fragmentation example


Detailed fragmentation example


The value of the offset is measured in units of 8 bytes. This is done because the offset field is only
13 bits long and cannot represent a byte number greater than 8191. This forces hosts or routers
that fragment datagrams to choose the size of each fragment so that the first byte number is divisible by 8.
Figure 4.9 shows an expanded view of the fragments in the previous figure. The original packet starts at the
client; the fragments are reassembled at the server. The value of the identification field is the same in all
fragments, as is the value of the flags field, with the more-fragments bit set for all fragments except the
last. Also, the value of the offset field for each fragment is shown. Note that even if the fragments arrive
out of order at the destination, they can be correctly reassembled.
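Reassembly from out-of-order fragments can be sketched by sorting on the offset field. This is a simplified illustration, not how any particular IP implementation does it: the tuple layout and the function name `reassemble` are assumptions, and all fragments are taken to share one identification value.

```python
def reassemble(fragments: list) -> bytes:
    """fragments: list of (offset_field, more_bit, data) with the same identification."""
    if not fragments:
        raise ValueError("no fragments received")
    ordered = sorted(fragments, key=lambda f: f[0])   # order by offset, not by arrival
    data = b""
    for offset, more, chunk in ordered:
        if offset * 8 != len(data):                    # offset must match the bytes collected so far
            raise ValueError(f"missing data before offset {offset}")
        data += chunk
    if ordered[-1][1] != 0:                            # the final fragment must have M = 0
        raise ValueError("last fragment not received yet")
    return data


original = bytes(i % 256 for i in range(4000))
arrivals = [                                           # fragments arrive out of order
    (350, 0, original[2800:]),
    (0, 1, original[:1400]),
    (175, 1, original[1400:2800]),
]
print(reassemble(arrivals) == original)  # True
```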

Figure 4.9 Detailed fragmentation example - Expanded view


Example
A packet has arrived with an M bit value of 0. Is this the first fragment, the last fragment, or a middle
fragment? Do we know if the packet was fragmented?
Solution : If the M bit is 0, it means that there are no more fragments; the fragment is the last one. However,
we cannot say if the original packet was fragmented or not. A non-fragmented packet is considered the last
fragment.


Example
A packet has arrived with an M bit value of 1. Is this the first fragment, the last fragment, or a middle
fragment? Do we know if the packet was fragmented?
Solution : If the M bit is 1, it means that there is at least one more fragment. This fragment can be the first
one or a middle one, but not the last one. We don’t know if it is the first one or a middle one; we need
more information (the value of the fragmentation offset).
Example
A packet has arrived with an M bit value of 1 and a fragmentation offset value of 0. Is this the first
fragment, the last fragment, or a middle fragment?
Solution : Because the M bit is 1, it is either the first fragment or a middle one. Because the offset value is 0,
it is the first fragment.
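The reasoning in these M bit examples can be collected into one decision function. The function name is hypothetical. Note one addition beyond the examples: when the offset is also known, an M bit of 0 with an offset of 0 identifies a datagram that was not fragmented at all, while an M bit of 0 with a nonzero offset identifies the last fragment.

```python
def fragment_position(more_bit: int, offset: int) -> str:
    """Classify a received datagram from its M bit and fragmentation offset."""
    if more_bit == 1:
        # More fragments follow: offset 0 means first, anything else means middle.
        return "first fragment" if offset == 0 else "middle fragment"
    # No more fragments: either the last piece or a datagram that was never fragmented.
    return "not fragmented" if offset == 0 else "last fragment"


print(fragment_position(1, 0))     # first fragment
print(fragment_position(1, 175))   # middle fragment
print(fragment_position(0, 350))   # last fragment
print(fragment_position(0, 0))     # not fragmented
```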
Example
A packet has arrived in which the offset value is 100. What is the number of the first byte? Do we know
the number of the last byte?
Solution : To find the number of the first byte, we multiply the offset value by 8. This means that the first
byte number is 800. We cannot determine the number of the last byte unless we know the length of the
data.
Example
A packet has arrived in which the offset value is 100, the value of HLEN is 5, and the value of the total
length field is 100. What are the numbers of the first byte and the last byte?
Solution : The first byte number is 100 × 8 = 800. The total length is 100 bytes, and the header length is 20
bytes (5 × 4), which means that there are 80 bytes of data in this datagram. If the first byte number is 800,
the last byte number must be 879.
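The calculation in the last example generalizes to any fragment. A minimal sketch, with a hypothetical function name:

```python
def data_byte_range(offset_field: int, hlen: int, total_length: int) -> tuple:
    """Return (first_byte, last_byte) of the data carried by a fragment."""
    first = offset_field * 8              # offset is counted in 8-byte units
    data_len = total_length - hlen * 4    # total length minus header length in bytes
    return first, first + data_len - 1


print(data_byte_range(offset_field=100, hlen=5, total_length=100))  # (800, 879)
```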
Options
The header of the IPv4 datagram is made of two parts: a fixed part and a variable part. The fixed
part is 20 bytes long and was discussed in the previous section. The variable part comprises the options,
which can be a maximum of 40 bytes (in multiples of 4 bytes) to preserve the boundary of the header. Options,
as the name implies, are not required for a datagram. They can be used for network testing and debugging.
Taxonomy of options in IPv4

Figure 4.10: Classification of Options

I Single-Byte Options: There are two single-byte options.
a) No Operation: A no-operation option is a 1-byte option used as a filler between options.
b) End of Option: An end-of-option option is a 1-byte option used for padding at the end of the option
field. It, however, can only be used as the last option.

II Multiple-Byte Options: There are four multiple-byte options.


a) Record Route: A record route option is used to record the Internet routers that handle the
datagram. It can list up to nine router addresses. It can be used for debugging and management
purposes.
b) Strict Source Route: A strict source route option is used by the source to predetermine a route for
the datagram as it travels through the Internet. Here, the sender can choose a route with a specific
type of service, such as minimum delay or maximum throughput. Alternatively, it may choose a
route that is safer or more reliable for the sender’s purpose. For example, a sender can choose a
route so that its datagram does not travel through a competitor’s network. If a datagram specifies a
strict source route, all the routers defined in the option must be visited by the datagram.
c) Loose Source Route: A loose source route option is similar to the strict source route, but it is less
rigid. Each router in the list must be visited, but the datagram can visit other routers as well.
d) Timestamp: A timestamp option is used to record the time of datagram processing by a router.
The time is expressed in milliseconds from midnight, Universal time or Greenwich mean time.
Knowing the time a datagram is processed can help users and managers track the behavior of the
routers in the Internet.
Security of IPv4 Datagrams
The IPv4 protocol, as well as the whole Internet, was started when the Internet users trusted each other. No
security was provided for the IPv4 protocol. Today, however, the situation is different; the Internet is not
secure anymore. There are three security issues that are particularly applicable to the IP protocol: packet
sniffing, packet modification, and IP spoofing.
Packet Sniffing: An intruder may intercept an IP packet and make a copy of it. Packet sniffing is a passive
attack, in which the attacker does not change the contents of the packet. This type of attack is very difficult
to detect because the sender and the receiver may never know that the packet has been copied. Although
packet sniffing cannot be stopped, encryption of the packet can make the attacker’s effort useless. The
attacker may still sniff the packet, but the content is not readable.
Packet Modification: The second type of attack is to modify the packet. The attacker intercepts the packet,
changes its contents, and sends the new packet to the receiver. The receiver believes that the packet is
coming from the original sender. This type of attack can be detected using a data integrity mechanism. The
receiver, before opening and using the contents of the message, can use this mechanism to make sure that the
packet has not been changed during the transmission.
IP Spoofing: An attacker can masquerade as somebody else and create an IP packet that carries the source
address of another computer. An attacker can send an IP packet to a bank pretending that it is coming
from one of the customers.
IPSec: The IP packets today can be protected from the previously mentioned attacks using a protocol called
IPSec (IP Security).
ICMPv4
The IPv4 has no error-reporting or error-correcting mechanism. The IP protocol also lacks a mechanism for
host and management queries. A host sometimes needs to determine if a router or another host is alive.
And sometimes a network manager needs information from another host or router.
The Internet Control Message Protocol version 4 (ICMPv4) has been designed to compensate for the above
two deficiencies. It is a companion to the IP protocol. ICMP itself is a network-layer protocol. However,
its messages are not passed directly to the data-link layer as would be expected. Instead, the messages are
first encapsulated inside IP datagrams before going to the lower layer. When an IP datagram encapsulates
an ICMP message, the value of the protocol field in the IP datagram is set to 1 to indicate that the IP
payload is an ICMP message.
MESSAGES
ICMP messages are divided into two broad categories: error-reporting messages and query messages. The
error-reporting messages report problems that a router or a host (destination) may encounter when it
processes an IP packet. The query messages, which occur in pairs, help a host or a network manager get
specific information from a router or another host. For example, nodes can discover their neighbors.
An ICMP message has an 8-byte header and a variable-size data section. Although the general format of
the header is different for each message type, the first 4 bytes are common to all. As Figure 4.11 shows, the first
field, ICMP type, defines the type of the message. The code field specifies the reason for the particular
message type. The last common field is the checksum field (to be discussed later in the chapter). The rest
of the header is specific for each message type.
The data section in error messages carries information for finding the original packet that had the error. In
query messages, the data section carries extra information based on the type of query.
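To make the header layout concrete, the sketch below builds an echo-request query message with Python's `struct` module. It assumes the usual echo layout, in which the 4 message-specific header bytes hold a 16-bit identifier and a 16-bit sequence number, and it computes the checksum over the whole ICMP message (header and data). The function names and the identifier and payload values are made up for illustration.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit 1s complement checksum over the given bytes."""
    if len(data) % 2:
        data += b"\x00"                                  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                                   # wrap carries into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: type 8, code 0, checksum, identifier, sequence, data."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)   # checksum field is 0 for now
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


msg = build_echo_request(identifier=0x1234, sequence=0, payload=b"ping")
msg_type, code, checksum = struct.unpack("!BBH", msg[:4])  # the 4 bytes common to all messages
print(msg_type, code, hex(checksum), len(msg))
print(internet_checksum(msg))  # 0 when the message is intact
```

Sending such a message on a real network would additionally need a raw socket and usually administrator privileges, which this sketch leaves out.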

Figure 4.11 - General format of ICMP messages


Error Reporting Messages
Since IP is an unreliable protocol, one of the main responsibilities of ICMP is to report some errors that may
occur during the processing of the IP datagram. ICMP does not correct errors, it simply reports them. Error
correction is left to the higher-level protocols. Error messages are always sent to the original source because
the only information available in the datagram about the route is the source and destination IP addresses.
ICMP uses the source IP address to send the error message to the source (originator) of the datagram. To
make the error-reporting process simple, ICMP follows some rules in reporting messages.
1. No error message will be generated for a datagram having a multicast address or special address
(such as this host or loopback).
2. No ICMP error message will be generated in response to a datagram carrying an ICMP error
message.


3. No ICMP error message will be generated for a fragmented datagram that is not the first
fragment.
Note that all error messages contain a data section that includes the IP header of the original datagram plus
the first 8 bytes of data in that datagram. The original datagram header is added to give the original source,
which receives the error message, information about the datagram itself. The 8 bytes of data are included
because the first 8 bytes provide information about the port numbers (UDP and TCP) and sequence
number (TCP). This information is needed so the source can inform the protocols (TCP or UDP) about
the error.
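The data section of an error message can be sketched as a slice of the offending datagram: the IP header (whose length comes from HLEN) plus the first 8 bytes of its data. The function name and the sample datagram below are hypothetical; the sample carries a UDP header so that the port numbers are visible in the quoted bytes.

```python
import struct


def icmp_error_data(original_datagram: bytes) -> bytes:
    """Return the original IP header plus the first 8 bytes of the original data."""
    hlen_bytes = (original_datagram[0] & 0x0F) * 4   # HLEN is the low 4 bits of the first byte
    return original_datagram[: hlen_bytes + 8]


# Hypothetical datagram: 20-byte header (only the first byte filled in) plus a UDP packet.
ip_header = bytes([0x45]) + bytes(19)
udp_packet = struct.pack("!HHHH", 5000, 53, 12, 0) + b"data"   # source port, dest port, length, checksum
quoted = icmp_error_data(ip_header + udp_packet)

print(len(quoted))                              # 28 bytes: 20 of header + 8 of data
src_port, dst_port = struct.unpack("!HH", quoted[20:24])
print(src_port, dst_port)                       # 5000 53
```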
The following are important points about ICMP error messages:

❑ No ICMP error message will be generated in response to a datagram carrying an ICMP error message.

❑ No ICMP error message will be generated for a fragmented datagram that is not the first fragment.

❑ No ICMP error message will be generated for a datagram having a multicast address.

❑ No ICMP error message will be generated for a datagram having a special address such as 127.0.0.0 or
0.0.0.0.
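The four points above amount to a filter applied before an error message is generated. The sketch below uses Python's standard `ipaddress` module. The function name and parameters are hypothetical, and checking both the source and the destination address for special values is an assumption of this sketch, since the text does not say which address is meant.

```python
import ipaddress


def may_send_icmp_error(src: str, dst: str, fragment_offset: int,
                        carries_icmp_error: bool) -> bool:
    """Return True if none of the listed rules forbids an ICMP error message."""
    if carries_icmp_error:
        return False                    # never answer an ICMP error with another ICMP error
    if fragment_offset != 0:
        return False                    # only the first fragment may trigger an error message
    for text in (src, dst):             # assumption: both addresses are checked
        addr = ipaddress.ip_address(text)
        if addr.is_multicast or addr.is_loopback or addr.is_unspecified:
            return False                # multicast or special address (127.x.x.x, 0.0.0.0)
    return True


print(may_send_icmp_error("10.0.0.1", "10.0.0.2", 0, False))    # True
print(may_send_icmp_error("10.0.0.1", "224.0.0.1", 0, False))   # False
print(may_send_icmp_error("10.0.0.1", "10.0.0.2", 175, False))  # False
```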

Error Message Types

• Destination Unreachable - The most widely used error message is the destination unreachable
(type 3). This message uses different codes (0 to 15) to define the type of error message and the
reason why a datagram has not reached its final destination.
• Source Quench - Another error message is called the source quench (type 4) message, which
informs the sender that the network has encountered congestion and the datagram has been
dropped; the source needs to slow down sending more datagrams. In other words, ICMP adds a
kind of congestion control mechanism to the IP protocol by using this type of message.
• Redirection Message - The redirection message (type 5) is used when the source uses a wrong
router to send out its message. The router redirects the message to the appropriate router, but
informs the source that it needs to change its default router in the future. The IP address of the
default router is sent in the message.
• Parameter Problem - A parameter problem message (type 12) can be sent when either there is a
problem in the header of a datagram (code 0) or some options are missing or cannot be interpreted
(code 1).
Query Messages
Query messages are used to probe or test the liveness of hosts or routers in the Internet, find the
one-way or the round-trip time for an IP datagram between two devices, or even find out whether the
clocks in two devices are synchronized. Query messages in ICMP can be used independently without
relation to an IP datagram. But a query message needs to be encapsulated in a datagram, as a carrier.
Naturally, query messages come in pairs: request and reply. The echo request (type 8) and the echo
reply (type 0) pair of messages are used by a host or a router to test the liveness of another host or
router.
Deprecated Messages
Three pairs of messages are declared obsolete by IETF:

1. Information request and reply messages are not used today because their duties are done by the Address
Resolution Protocol (ARP).
2. Address mask request and reply messages are not used today because their duties are done by the
Dynamic Host Configuration Protocol (DHCP).

3. Router solicitation and advertisement messages are not used today because their duties are done by the
Dynamic Host Configuration Protocol (DHCP).
Debugging Tools
There are several tools that can be used in the Internet for debugging. We can determine the viability of a
host or router. We can trace the route of a packet. We introduce two tools that use ICMP for debugging: ping
and traceroute.
Ping - We can use the ping program to find if a host is alive and responding. We use ping here to see how
it uses ICMP packets. The source host sends ICMP echo-request messages; the destination, if alive, responds
with ICMP echo-reply messages. The ping program sets the identifier field in the echo-request and
echo-reply message and starts the sequence number from 0; this number is incremented by 1 each time a new
message is sent. Note that ping can calculate the round-trip time. It inserts the sending time in the data
section of the message. When the reply arrives, it subtracts the departure time from the arrival time to get
the round-trip time (RTT).
Example: The following shows how we send a ping message to the auniversity.edu site. We set the identifier field in the echo
request and reply message and start the sequence number from 0; this number is incremented by one each time a new message is
sent. Note that ping can calculate the round-trip time. It inserts the sending time in the data section of the message. When the
reply arrives, it subtracts the departure time from the arrival time to get the round-trip time (rtt).
$ ping auniversity.edu
PING auniversity.edu (152.181.8.3) 56 (84) bytes of data.
64 bytes from auniversity.edu (152.181.8.3): icmp_seq=0 ttl=62 time=1.91 ms
64 bytes from auniversity.edu (152.181.8.3): icmp_seq=1 ttl=62 time=2.04 ms
64 bytes from auniversity.edu (152.181.8.3): icmp_seq=2 ttl=62 time=1.90 ms
64 bytes from auniversity.edu (152.181.8.3): icmp_seq=3 ttl=62 time=1.97 ms
64 bytes from auniversity.edu (152.181.8.3): icmp_seq=4 ttl=62 time=1.93 ms
64 bytes from auniversity.edu (152.181.8.3): icmp_seq=5 ttl=62 time=2.00 ms
--- auniversity.edu statistics -- 6 packets transmitted, 6 received, 0% packet loss rtt min/avg/max =
1.90/1.95/2.04 ms

Traceroute or Tracert - The traceroute program in UNIX or tracert in Windows can be used to trace the
path of a packet from a source to the destination. It can find the IP addresses of all the routers that are
visited along the path. The program is usually set to check for a maximum of 30 hops (routers) to be
visited. The number of hops in the Internet is normally less than this.

Traceroute - The traceroute program is different from the ping program. The ping program gets help from
two query messages; the traceroute program gets help from two error-reporting messages: time-exceeded
and destination-unreachable. Traceroute is an application-layer program, but only the client program is
needed, because, as we can see, the client program never reaches the application layer in the destination host.
In other words, there is no traceroute server program. The traceroute application program is encapsulated
in a UDP user datagram, but traceroute intentionally uses a port number that is not available at the
destination. If there are n routers in the path, the traceroute program sends (n + 1) messages. The first n
messages are discarded by the n routers, one by each router; the last message is discarded by the destination
host. The traceroute client program uses the (n + 1) ICMP error-reporting messages received to find the
path between the routers.
The first traceroute message is sent with time-to-live (TTL) value set to 1; the message is discarded at the
first router and a time-exceeded ICMP error message is sent, from which the traceroute program can find
the IP address of the first router (the source IP address of the error message) and the router name (in the
data section of the message). The second traceroute message is sent with TTL set to 2, which can find the
IP address and the name of the second router. Similarly, the third message can find the information about
router 3. The fourth message, however, reaches the destination host.
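The TTL mechanism described above can be simulated in a few lines. This is a sketch under stated assumptions: the router names and the `trace` function are hypothetical, and no real packets are sent; the code only models which device would answer each probe.

```python
def trace(path, max_hops=30):
    """Simulate traceroute over a path: router names in order, then the destination host.

    Returns one (responder, ICMP error type) pair per probe sent.
    """
    *routers, destination = path
    discovered = []
    for ttl in range(1, max_hops + 1):
        remaining = ttl
        reply = None
        for router in routers:
            remaining -= 1  # each router decrements the TTL by one
            if remaining == 0:
                # TTL expired here: the router discards the probe and reports back
                reply = (router, "time-exceeded")
                break
        if reply is None:
            # The probe reached the destination, where the chosen UDP port is unused
            reply = (destination, "destination-unreachable")
        discovered.append(reply)
        if reply[1] == "destination-unreachable":
            break
    return discovered


# Three routers and one destination host: n = 3, so n + 1 = 4 probes are needed.
hops = trace(["R1", "R2", "R3", "host"])
```

With three routers in the path, the simulation returns four replies: three time-exceeded messages, one from each router, and a final destination-unreachable message from the host.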

Figure 4.12: Use of ICMPv4 in traceroute


The traceroute program also sets a timer to find the round-trip time for each router and the destination.
Most traceroute programs send three messages to each device, with the same TTL value, to be able to find
a better estimate for the round-trip time. The following shows an example of a traceroute program, which
uses three probes for each device and gets three RTTs.
$ traceroute printers.com
traceroute to printers.com (13.1.69.93), 30 hops max, 38-byte packets
1 route.front.edu (153.18.31.254) 0.622 ms 0.891 ms 0.875 ms
2 ceneric.net (137.164.32.140) 3.069 ms 2.875 ms 2.930 ms
3 satire.net (132.16.132.20) 3.071 ms 2.876 ms 2.929 ms
4 alpha.printers.com (13.1.69.93) 5.922 ms 5.048 ms 4.922 ms
Tracert - The tracert program in Windows behaves differently. The tracert messages are encapsulated
directly in IP datagrams. Like traceroute, tracert sends probes with increasing TTL values, but its probes are
ICMP echo-request messages. When the last echo request reaches the destination host, an echo-reply message is issued.

ICMP Checksum
In ICMP the checksum is calculated over the entire message (header and data).
Example: Figure 4.13 below shows an example of checksum calculation for a simple echo-request message. We randomly
chose the identifier to be 1 and the sequence number to be 9. The message is divided into 16-bit (2-byte) words. The words are
added and the sum is complemented. Now the sender can put this value in the checksum field.
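The calculation can be reproduced with a short sketch. The identifier (1) and sequence number (9) come from the example above; the 4-byte data section `b"TEST"` and the function names are assumptions made for illustration, since the figure itself is not reproduced here.

```python
import struct


def internet_checksum(message: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented."""
    if len(message) % 2:
        message += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", message):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wrap any carry back in
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, data: bytes) -> bytes:
    """Type 8 (echo request), code 0; checksum covers header and data."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + data)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + data


# Assumed data section for illustration only.
message = build_echo_request(identifier=1, sequence=9, data=b"TEST")
```

A receiver can verify the message by running the same function over the whole message, checksum field included; a result of zero means no error was detected.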

Figure 4.13 – Calculation of ICMP checksum


MOBILE IP
As mobile and personal computers such as notebooks become increasingly popular, we need to think about
mobile IP, an extension of the IP protocol that allows mobile computers to be connected to the Internet at
any location where the connection is possible.
Addressing: The main problem that must be solved in providing mobile communication using the
IP protocol is addressing.
Stationary Hosts - The original IP addressing was based on the assumption that a host is stationary,
attached to one specific network. A router uses an IP address to route an IP datagram. An IP address has
two parts: a prefix and a suffix. The prefix associates a host with a network. For example, the IP address
10.3.4.24/8 defines a host attached to the network 10.0.0.0/8.
The IP addresses are designed to work with stationary hosts because part of the address defines the network
to which the host is attached.
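The prefix example above can be checked with Python's standard `ipaddress` module. This is only an illustration of how the prefix ties a host to its network; the helper name `network_of` is hypothetical.

```python
import ipaddress


def network_of(address_with_prefix: str) -> ipaddress.IPv4Network:
    """Return the network that the prefix of an address associates the host with."""
    return ipaddress.ip_interface(address_with_prefix).network


home_network = network_of("10.3.4.24/8")
host_address = ipaddress.ip_address("10.3.4.24")

# The host address belongs to the network defined by its own prefix.
attached = host_address in home_network
```

If the host moved to a network with a different prefix while keeping this address, routers would still forward its packets toward 10.0.0.0/8, which is the problem mobile IP has to solve.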
Mobile Hosts - When a host moves from one network to another, the IP addressing structure needs to be
modified. Several solutions have been proposed.

1. Changing the Address - One simple solution is to let the mobile host change its address as it
goes to the new network. The host can use DHCP to obtain a new address to associate it with the
new network. This approach has several drawbacks. First, the configuration files would need to be
changed. Second, each time the computer moves from one network to another, it must be
rebooted. Third, the DNS tables need to be revised so that every other host in the Internet is aware
of the change. Fourth, if the host roams from one network to another during a transmission, the
data exchange will be interrupted. This is because the ports and IP addresses of the client and the
server must remain constant for the duration of the connection.
2. Two Addresses - The approach that is more feasible is the use of two addresses. The host has its
original address, called the home address, and a temporary address, called the care-of
address. The home address is permanent; it associates the host with its home network, the network
that is the permanent home of the host. The care-of address is temporary. When a host moves from
one network to another, the care-of address changes; it is associated with the foreign network, the
network to which the host moves. When a mobile host visits a foreign network, it receives its
care-of address during the agent discovery and registration phase.

Figure 4.14: Home address and care-of address


Agents
To make the change of address transparent to the rest of the Internet requires a home agent and a foreign
agent. Figure 4.14 shows the position of a home agent relative to the home network and a foreign agent
relative to the foreign network. We have shown the home and the foreign agents as routers, but we need to
emphasize that their specific function as an agent is performed in the application layer. In other words, they
are both routers and hosts.
Home Agent - The home agent is usually a router attached to the home network of the mobile host. The
home agent acts on behalf of the mobile host when a remote host sends a packet to the mobile host. The
home agent receives the packet and sends it to the foreign agent.
Foreign Agent - The foreign agent is usually a router attached to the foreign network. The foreign agent
receives and delivers packets sent by the home agent to the mobile host. The mobile host can also act as a
foreign agent. In other words, the mobile host and the foreign agent can be the same. However, to do this,
a mobile host must be able to receive a care-of address by itself, which can be done through the use of
DHCP. In addition, the mobile host needs the necessary software to allow it to communicate with the home
agent and to have two addresses: its home address and its care-of address. This dual addressing must be
transparent to the application programs. When the mobile host acts as a foreign agent, the care-of address
is called a collocated care-of address.
The advantage of using a collocated care-of address is that the mobile host can move to any network
without worrying about the availability of a foreign agent. The disadvantage is that the mobile host needs
extra software to act as its own foreign agent.
Three Phases - To communicate with a remote host, a mobile host goes through three phases: agent
discovery, registration, and data transfer, as shown in Figure 4.15.

Figure 4.15 : Remote host and mobile host communication


The first phase, agent discovery, involves the mobile host, the foreign agent, and the home agent. The
second phase, registration, also involves the mobile host and the two agents. Finally, in the third phase, the
remote host is also involved. We discuss each phase separately.
Agent Discovery - The first phase in mobile communication, agent discovery, consists of two subphases.
A mobile host must discover (learn the address of) a home agent before it leaves its home network. A
mobile host must also discover a foreign agent after it has moved to a foreign network. This discovery
consists of learning the care-of address as well as the foreign agent’s address. The discovery involves two
types of messages: advertisement and solicitation.
Agent Advertisement - When a router advertises its presence on a network using an ICMP router
advertisement, it can append an agent advertisement to the packet if it acts as an agent. Mobile IP does not
use a new packet type for agent advertisement; it uses the router advertisement packet of ICMP, and appends
an agent advertisement message.

Figure 4.16 : Agent advertisement

The field descriptions are as follows:

❑ Type. The 8-bit type field is set to 16.

❑ Length. The 8-bit length field defines the total length of the extension message (not the length of the
ICMP advertisement message).

❑ Sequence number. The 16-bit sequence number field holds the message number. The recipient can
use the sequence number to determine if a message is lost.

❑ Lifetime. The lifetime field defines the number of seconds that the agent will accept requests. If the
value is a string of 1s, the lifetime is infinite.

❑ Code. The code field is an 8-bit flag in which each bit is set (1) or unset (0). The meanings of the bits
are shown in Table 4.2.
Table 4.2: Code bits

❑ Care-of Addresses. This field contains a list of addresses available for use as care-of addresses. The
mobile host can choose one of these addresses. The selection of this care-of address is announced in the
registration request. Note that this field is used only by a foreign agent.
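The fields listed above can be unpacked with a short sketch. The byte layout used here is an assumption: the fields in the order described, a 16-bit lifetime, one reserved byte after the code field, and then 4-byte care-of addresses. The function name and the sample addresses are hypothetical.

```python
import ipaddress
import struct


def parse_agent_advertisement(extension: bytes) -> dict:
    """Parse an agent advertisement extension (assumed layout, see lead-in)."""
    msg_type, length, sequence, lifetime, code, _reserved = struct.unpack(
        "!BBHHBB", extension[:8]
    )
    # Everything after the 8 fixed bytes is treated as a list of IPv4 addresses.
    care_of = [
        str(ipaddress.IPv4Address(extension[i:i + 4]))
        for i in range(8, len(extension) - 3, 4)
    ]
    return {
        "type": msg_type,
        "length": length,
        "sequence": sequence,
        "lifetime": lifetime,
        "lifetime_infinite": lifetime == 0xFFFF,  # a string of 1s means infinite
        "code": code,
        "care_of_addresses": care_of,
    }


# Hypothetical advertisement offering two care-of addresses.
sample = struct.pack("!BBHHBB", 16, 14, 7, 0xFFFF, 0, 0)
sample += ipaddress.IPv4Address("14.13.16.9").packed
sample += ipaddress.IPv4Address("14.13.16.10").packed
parsed = parse_agent_advertisement(sample)
```

The code bits are passed through as a raw byte because their individual meanings are given in Table 4.2 rather than in the text.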
Agent Solicitation
When a mobile host has moved to a new network and has not received agent advertisements, it can initiate
an agent solicitation. It can use the ICMP solicitation message to inform an agent that it needs assistance.
Mobile IP does not use a new packet type for agent solicitation; it uses the router solicitation packet of ICMP.
Registration - The second phase in mobile communication is registration. After a mobile host has moved
to a foreign network and discovered the foreign agent, it must register. There are four aspects of registration:

1. The mobile host must register itself with the foreign agent.

2. The mobile host must register itself with its home agent. This is normally done by the foreign agent on
behalf of the mobile host.

3. The mobile host must renew registration if it has expired.


4. The mobile host must cancel its registration (deregistration) when it returns home.
Request and Reply - To register with the foreign agent and the home agent, the mobile host uses a
registration request and a registration reply as shown in Figure 4.17.
Registration Request - A registration request is sent from the mobile host to the foreign agent to register
its care-of address and also to announce its home address and home agent address. The foreign agent, after
receiving and registering the request, relays the message to the home agent.

Figure 4.17 : Registration request format


The field descriptions are as follows:

❑ Type. The 8-bit type field defines the type of message. For a request message the value of this field is 1.

❑ Flag. The 8-bit flag field defines forwarding information. The value of each bit can be set or unset. The
meaning of each bit is given in Table 4.3.
Table 4.3 : Registration request flag field bits

❑ Lifetime. This field defines the number of seconds the registration is valid. If the field is a string of 0s,
the request message is asking for deregistration. If the field is a string of 1s, the lifetime is infinite.

❑ Home address. This field contains the permanent (first) address of the mobile host.

❑ Home agent address. This field contains the address of the home agent.

❑ Care-of address. This field is the temporary (second) address of the mobile host.

❑ Identification. This field contains a 64-bit number that is inserted into the request by the mobile host
and repeated in the reply message. It matches a request with a reply.

❑ Extensions. Variable length extensions are used for authentication. They allow a home agent to
authenticate the mobile host.
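The fixed part of the registration request can be sketched with `struct`. The field order follows the list above; the 16-bit width of the lifetime field, the function names, and all addresses are assumptions made for this example, and the variable-length extensions are omitted.

```python
import ipaddress
import struct

# type (8), flag (8), lifetime (16), home address (32),
# home agent address (32), care-of address (32), identification (64)
REQUEST_FORMAT = "!BBH4s4s4sQ"


def build_registration_request(flags, lifetime, home, home_agent, care_of, identification):
    """Pack the fixed fields of a registration request (type field is 1)."""
    return struct.pack(
        REQUEST_FORMAT,
        1,
        flags,
        lifetime,
        ipaddress.IPv4Address(home).packed,
        ipaddress.IPv4Address(home_agent).packed,
        ipaddress.IPv4Address(care_of).packed,
        identification,
    )


def is_deregistration(request: bytes) -> bool:
    """A lifetime of all 0s asks for deregistration."""
    _type, _flags, lifetime = struct.unpack("!BBH", request[:4])
    return lifetime == 0


# Hypothetical addresses for illustration only.
request = build_registration_request(0, 300, "131.5.24.8", "131.5.0.1", "14.13.16.9", 12345)
cancel = build_registration_request(0, 0, "131.5.24.8", "131.5.0.1", "14.13.16.9", 12346)
```

The identification value is echoed in the reply, so a mobile host can keep the last value it sent and discard any reply that does not match it.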
Registration Reply - A registration reply is sent from the home agent to the foreign agent and then relayed
to the mobile host. The reply confirms or denies the registration request. Figure 4.18 shows the
format of the registration reply. The fields are similar to those of the registration request with the following
exceptions. The value of the type field is 3. The code field replaces the flag field and shows the result of the
registration request (acceptance or denial). The care-of address field is not needed.

Figure 4.18: Registration reply format


Encapsulation - Registration messages are encapsulated in a UDP user datagram. An agent uses the well-
known port 434; a mobile host uses an ephemeral port.
Data Transfer - After agent discovery and registration, a mobile host can communicate with a remote host.
Figure 4.19 shows the idea.

Figure 4.19 : Data transfer


From Remote Host to Home Agent - When a remote host wants to send a packet to the mobile host, it
uses its address as the source address and the home address of the mobile host as the destination address.
In other words, the remote host sends a packet as though the mobile host is at its home network. The packet,
however, is intercepted by the home agent, which pretends it is the mobile host. This is done using the
proxy ARP technique. Path 1 of Figure 4.19 shows this step.
From Home Agent to Foreign Agent - After receiving the packet, the home agent sends the packet to
the foreign agent, using the tunneling concept. The home agent encapsulates the whole IP packet inside
another IP packet using its address as the source and the foreign agent’s address as the destination. Path 2 of
Figure 4.19 shows this step.
From Foreign Agent to Mobile Host - When the foreign agent receives the packet, it removes the outer header
to recover the original packet. However, since the destination address is the home address of the mobile host, the foreign agent
consults a registry table to find the care-of address of the mobile host. (Otherwise, the packet would just
be sent back to the home network.) The packet is then sent to the care-of address. Path 3 of Figure 4.19
shows this step.
From Mobile Host to Remote Host - When a mobile host wants to send a packet to a remote host (for
example, a response to the packet it has received), it sends as it does normally. The mobile host prepares a
packet with its home address as the source, and the address of the remote host as the destination. Although
the packet comes from the foreign network, it has the home address of the mobile host. Path 4 of Figure
4.19 shows this step.
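The four paths above can be modeled with a small sketch. Everything here is illustrative: the `Packet` class, the function names, and the addresses are hypothetical, and a real implementation would build actual IP headers rather than Python objects.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: Union[bytes, "Packet"]


def home_agent_tunnel(packet: Packet, home_agent: str, foreign_agent: str) -> Packet:
    """Path 2: encapsulate the whole packet inside another IP packet."""
    return Packet(src=home_agent, dst=foreign_agent, payload=packet)


def foreign_agent_deliver(tunneled: Packet, registry: dict):
    """Path 3: recover the original packet and look up the care-of address."""
    inner = tunneled.payload
    care_of = registry[inner.dst]  # keyed by the mobile host's home address
    return care_of, inner


# Hypothetical addresses for illustration only.
REMOTE, HOME_ADDR, CARE_OF = "200.4.7.14", "131.5.24.8", "14.13.16.9"
HOME_AGENT, FOREIGN_AGENT = "131.5.0.1", "14.13.16.1"

original = Packet(REMOTE, HOME_ADDR, b"hello")                      # path 1
tunneled = home_agent_tunnel(original, HOME_AGENT, FOREIGN_AGENT)   # path 2
care_of, delivered = foreign_agent_deliver(tunneled, {HOME_ADDR: CARE_OF})  # path 3
response = Packet(HOME_ADDR, REMOTE, b"ok")                         # path 4
```

Note that the inner packet is never modified: the remote host's address and the home address stay in place from end to end, and only the outer header carries the agents' addresses.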
Transparency - In this data transfer process, the remote host is unaware of any movement by the mobile
host. The remote host sends packets using the home address of the mobile host as the destination address;
it receives packets that have the home address of the mobile host as the source address. The movement is
totally transparent. The rest of the Internet is not aware of the movement of the mobile host.
Inefficiency in Mobile IP
Communication involving mobile IP can be inefficient. The inefficiency can be severe or moderate. The
severe case is called double crossing or 2X. The moderate case is called triangle routing or dog-leg routing.
Double Crossing - Double crossing occurs when a remote host communicates with a mobile host that has
moved to the same network (or site) as the remote host (see Figure 4.20). When the mobile host sends a
packet to the remote host, there is no inefficiency; the communication is local. However, when the remote
host sends a packet to the mobile host, the packet crosses the Internet twice. Since a computer usually
communicates with other local computers (principle of locality), the inefficiency from double crossing is
significant.
Figure 4.20: Double crossing

Triangle Routing - Triangle routing, the less severe case, occurs when the remote host communicates with
a mobile host that is not attached to the same network (or site) as the remote host. When the mobile host
sends a packet to the remote host, there is no inefficiency.
However, when the remote host sends a packet to the mobile host, the packet goes from the remote host to
the home agent and then to the mobile host. The packet travels the two sides of a triangle, instead of just
one side.

Figure 4.21: Triangle routing


Solution - One solution to inefficiency is for the remote host to bind the care-of address to the home
address of a mobile host. For example, when a home agent receives the first packet for a mobile host, it
forwards the packet to the foreign agent; it could also send an update binding packet to the remote host so
that future packets to this host could be sent to the care-of address. The remote host can keep this
information in a cache. The problem with this strategy is that the cache entry becomes outdated once the
mobile host moves. In this case the home agent needs to send a warning packet to the remote host to
inform it of the change.

UNICAST ROUTING

Unicast routing in the Internet, with a large number of routers and a huge number of hosts,
can be done only by using hierarchical routing: routing in several steps using different
routing algorithms.
In this section, we first discuss the general concept of unicast routing in an internet. After
the routing concepts and algorithms are understood, we show how we can apply them to
the Internet.
General Idea

In unicast routing, a packet is routed, hop by hop, from its source to its destination with the help of
forwarding tables. The source host needs no forwarding table because it delivers its packet to the
default router in its local network. The destination host needs no forwarding table either because it
receives the packet from its default router in its local network. This means that only the routers that glue
together the networks in the internet need forwarding tables. So, routing a packet from its source to its
destination means routing the packet from a source router to a destination router.

Figure 4.22: An internet and its graphical representation


An Internet as a Graph
To find the best route, an internet can be modeled as a graph. A graph in computer science is a set of nodes
and edges (lines) that connect the nodes. To model an internet as a graph, we can think of each router as a
node and each network between a pair of routers as an edge. An internet is, in fact, modeled as a
weighted graph, in which each edge is associated with a cost. If a weighted graph is used to represent a
geographical area, the nodes can be cities and the edges can be roads connecting the cities; the weights, in
this case, are distances between cities. In routing, however, the cost of an edge has a different
interpretation in different routing protocols, which we discuss in a later section. For the moment, we
assume that there is a cost associated with each edge. If there is no edge between the nodes, the cost is
infinity. Figure 4.22 shows how an internet can be modeled as a graph.
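A weighted graph of this kind can be written down directly as a table of edge costs. The costs below are an assumption: Figure 4.22 is not reproduced here, so the values were chosen to agree with the route costs quoted elsewhere in this module. The `cost` helper is hypothetical.

```python
import math

# Assumed edge costs for the seven-router internet (nodes A to G).
EDGES = {
    ("A", "B"): 2, ("A", "D"): 3, ("B", "C"): 5,
    ("B", "E"): 4, ("C", "F"): 4, ("C", "G"): 3,
    ("D", "E"): 5, ("E", "F"): 2, ("F", "G"): 1,
}


def cost(u: str, v: str) -> float:
    """Cost of the edge between two nodes; infinity if there is no edge."""
    if u == v:
        return 0
    return EDGES.get((u, v), EDGES.get((v, u), math.inf))


direct_ab = cost("A", "B")   # an existing edge
direct_ag = cost("A", "G")   # no edge between A and G
```

Treating a missing edge as infinite cost keeps later arithmetic simple: any route that would use a non-existent edge automatically loses every comparison.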
Least-Cost Routing
When an internet is modeled as a weighted graph, one of the ways to interpret the best route from the
source router to the destination router is to find the least cost between the two. In other words, the
source router chooses a route to the destination router in such a way that the total cost for the route is
the least cost among all possible routes. In Figure 4.22, the best route between A and E is A-B-E, with the
cost of 6. This means that each router needs to find the least-cost route between itself and all the other
routers to be able to route a packet using this criterion.

Least-Cost Trees
If there are N routers in an internet, there are (N − 1) least-cost paths from each router to any other
router. This means we need N × (N − 1) least-cost paths for the whole internet. If we have only 10
routers in an internet, we need 90 least-cost paths. A better way to see all of these paths is to combine them
in a least-cost tree. A least-cost tree is a tree with the source router as the root that spans the whole graph
(visits all other nodes) and in which the path between the root and any other node is the shortest. In this
way, we can have only one shortest-path tree for each node; we have N least-cost trees for the whole
internet.
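As an illustration, the least-cost tree rooted at one router can be computed from the weighted graph. The sketch below uses Dijkstra's algorithm purely as a convenient way to obtain the tree; the text has not yet introduced a specific algorithm at this point. The edge costs are assumed values consistent with the route costs quoted in this module, and the function names are hypothetical.

```python
import heapq
import math

# Assumed edge costs (the figure is not reproduced here).
EDGES = {
    ("A", "B"): 2, ("A", "D"): 3, ("B", "C"): 5,
    ("B", "E"): 4, ("C", "F"): 4, ("C", "G"): 3,
    ("D", "E"): 5, ("E", "F"): 2, ("F", "G"): 1,
}
GRAPH = {}
for (u, v), c in EDGES.items():
    GRAPH.setdefault(u, {})[v] = c
    GRAPH.setdefault(v, {})[u] = c


def least_cost_tree(graph, root):
    """Return (least cost to each node, parent of each node in the tree)."""
    dist = {node: math.inf for node in graph}
    dist[root] = 0
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, c in graph[u].items():
            if d + c < dist[v]:
                dist[v] = d + c
                parent[v] = u
                heapq.heappush(heap, (d + c, v))
    return dist, parent


def route(parent, node):
    """Walk from a node back to the root and return the path root-first."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


dist_a, parent_a = least_cost_tree(GRAPH, "A")
number_of_paths = len(GRAPH) * (len(GRAPH) - 1)  # N x (N - 1) for the whole internet
```

Storing only a parent pointer per node is enough to represent the whole tree, which is why one tree per router replaces N − 1 separately stored paths.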

Figure 4.23: Least-cost trees for nodes in the internet of Figure 4.22
The least-cost trees for a weighted graph can have several properties if they are created using
consistent criteria.
1. The least-cost route from X to Y in X’s tree is the inverse of the least-cost route from Y to X in
Y’s tree; the cost in both directions is the same. For example, in Figure 4.23, the route from A to
F in A’s tree is (A → B → E → F), and the route from F to A in F’s tree is (F → E → B → A),
which is the inverse of the first route. The cost is 8 in each case.
2. Instead of travelling from X to Z using X’s tree, we can travel from X to Y using X’s tree and
continue from Y to Z using Y’s tree. For example, in Figure 4.23, we can go from A to G in A’s
tree using the route (A → B → E → F → G). We can also go from A to E in A’s tree (A → B → E)
and then continue in E’s tree using the route (E → F → G). The combination of the two routes in
the second case is the same route as in the first case. The cost in the first case is 9; the cost in the
second case is also 9 (6 + 3).
ROUTING ALGORITHMS
Several routing algorithms have been designed in the past. The differences between these methods are in
the way they interpret the least cost and the way they create the least-cost tree for each node. In this
section, we discuss the common algorithms; later we show how a routing protocol in the Internet
implements one of these algorithms.
Distance-Vector Routing
The distance-vector (DV) routing uses the goal we discussed in the introduction, to find the best route. In
distance-vector routing, the first thing each node creates is its own least-cost tree with the
rudimentary information it has about its immediate neighbors. The incomplete trees are exchanged
between immediate neighbors to make the trees more and more complete and to represent the whole
internet. In distance-vector routing, a router continuously tells all of its neighbors what it knows about the
whole internet (although the knowledge can be incomplete).
Bellman-Ford Equation
The heart of distance-vector routing is the famous Bellman-Ford equation. This equation is used to find
the least cost (shortest distance) between a source node, x, and a destination node, y, through some
intermediary nodes (a, b, c, . . .) when the costs between the source and the intermediary nodes and the least
costs between the intermediary nodes and the destination are given. The following shows the general
case, in which $D_{ij}$ is the shortest distance and $c_{ij}$ is the cost between nodes i and j:

$D_{xy} = \min\{(c_{xa} + D_{ay}),\ (c_{xb} + D_{by}),\ (c_{xc} + D_{cy}),\ \ldots\}$

In distance-vector routing, normally we want to update an existing least cost with a least cost through an
intermediary node, such as z, if the latter is shorter. In this case, the equation becomes simpler, as shown
below:

$D_{xy} = \min\{D_{xy},\ (c_{xz} + D_{zy})\}$

Figure 4.24 shows the idea graphically for both cases.

Figure 4.24 Graphical idea behind Bellman-Ford equation

We can say that the Bellman-Ford equation enables us to build a new least-cost path from previously
established least-cost paths. In Figure 4.24, we can think of (a→y), (b→y), and (c→y) as previously
established least-cost paths and (x→y) as the new least-cost path
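Both forms of the Bellman-Ford equation translate almost directly into code. This is a minimal sketch; the function names and the sample costs are illustrative assumptions, not values taken from the figure.

```python
import math


def bellman_ford_general(cost_to, least_from):
    """General case: minimum over intermediaries n of (c_xn + D_ny).

    cost_to[n] is the cost from x to intermediary n;
    least_from[n] is the least cost from n to the destination y.
    """
    return min((cost_to[n] + least_from[n] for n in cost_to), default=math.inf)


def bellman_ford_update(d_xy, c_xz, d_zy):
    """Update case: keep the existing least cost unless the path via z is shorter."""
    return min(d_xy, c_xz + d_zy)


# Illustrative numbers: three intermediaries a, b, c between x and y.
d_general = bellman_ford_general({"a": 2, "b": 3, "c": 1}, {"a": 5, "b": 1, "c": 9})
d_updated = bellman_ford_update(math.inf, 2, 3)   # no previous route, so the new one wins
d_kept = bellman_ford_update(4, 2, 3)             # the existing route is already shorter
```

The update form is the one a router applies repeatedly: each new piece of information from a neighbor either improves an entry or leaves it unchanged.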
Distance Vectors
The concept of a distance vector is the rationale for the name distance-vector routing. A least-cost tree
is a combination of least-cost paths from the root of the tree to all destinations. These paths are
graphically glued together to form the tree. Distance-vector routing unglues these paths and creates a
distance vector, a one-dimensional array to represent the tree. Figure 4.25 shows the tree for node A in the
internet in Figure 4.22 and the corresponding distance vector. Note that the name of the distance vector
defines the root, the indexes define the destinations, and the value of each cell defines the least cost
from the root to the destination. A distance vector does not give the path to the destinations as the
least-cost tree does; it gives only the least costs to the destinations.

Figure 4.25: The distance vector corresponding to a tree

Each node in an internet, when it is booted, creates a very rudimentary distance vector with the
minimum information the node can obtain from its neighborhood. The node sends some greeting
messages out of its interfaces and discovers the identity of the immediate neighbors and the distance
between itself and each neighbor. It then makes a simple distance vector by inserting the discovered
distances in the corresponding cells and leaves the value of other cells as infinity.
Figure 4.26 shows all distance vectors for our internet. However, we need to mention that these vectors are
made asynchronously, when the corresponding node has been booted; the existence of all of them in a
figure does not mean synchronous creation of them.
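The construction of these first vectors can be sketched as follows. The edge costs are assumed values (the figure is not reproduced here) chosen to agree with the costs quoted in this module, and `initial_vector` is a hypothetical helper name.

```python
import math

NODES = "ABCDEFG"
# Assumed edge costs for the example internet.
EDGES = {
    ("A", "B"): 2, ("A", "D"): 3, ("B", "C"): 5,
    ("B", "E"): 4, ("C", "F"): 4, ("C", "G"): 3,
    ("D", "E"): 5, ("E", "F"): 2, ("F", "G"): 1,
}


def initial_vector(node: str) -> dict:
    """Distance vector a node can build from its immediate neighborhood only."""
    vector = {dest: math.inf for dest in NODES}  # unknown destinations start at infinity
    vector[node] = 0
    for (u, v), c in EDGES.items():
        if u == node:
            vector[v] = c
        elif v == node:
            vector[u] = c
    return vector


vector_a = initial_vector("A")
```

At this stage node A knows only about B and D; every other cell is still infinity even though a route exists.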

Figure 4.26: The first distance vector for an internet

These rudimentary vectors cannot help the internet to effectively forward a packet. For example, node A
thinks that it is not connected to node G because the corresponding cell shows the least cost of
infinity. To improve these vectors, the nodes in the internet need to help each other by exchanging
information. After each node has created its vector, it sends a copy of the vector to all its immediate
neighbors. After a node receives a distance vector from a neighbor, it updates its distance vector using the
Bellman-Ford equation (second case). However, we need to understand that we need to update, not only
one least cost, but N of them in which N is the number of the nodes in the internet. If we are using a
program, we can do this using a loop; if we are showing the concept on paper, we can show the whole
vector instead of the N separate equations. We show the whole vector instead of seven equations for
each update in Figure 4.27.

Figure 4.27: Updating distance vectors

In the first event, node A has sent its vector to node B. Node B updates its vector using the cost cBA =
2. In the second event, node E has sent its vector to node B. Node B updates its vector using the cost cBE
= 4. After the first event, node B has one improvement in its vector: its least cost to node D has changed
from infinity to 5 (via node A). After the second event, node B has one more improvement in its vector;
its least cost to node F has changed from infinity to 6 (via node E). We hope that we have convinced the
reader that exchanging vectors eventually stabilizes the system and allows all nodes to find the ultimate least
cost between themselves and any other node. We need to remember that after a node updates its vector, it
immediately sends the updated vector to all neighbors. Even if its neighbors have received the previous
vector, the updated one may help more.
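The two events can be replayed in code. The initial vectors below are assumptions derived from assumed edge costs (the figures are not reproduced here); they were chosen so that the results match the improvements described above. `update_vector` is a hypothetical name.

```python
import math

INF = math.inf


def update_vector(own: dict, received: dict, link_cost: float) -> dict:
    """Apply the Bellman-Ford update to every cell of the vector."""
    return {dest: min(own[dest], link_cost + received[dest]) for dest in own}


# Assumed first distance vectors of nodes B, A, and E.
vector_b = {"A": 2, "B": 0, "C": 5, "D": INF, "E": 4, "F": INF, "G": INF}
vector_a = {"A": 0, "B": 2, "C": INF, "D": 3, "E": INF, "F": INF, "G": INF}
vector_e = {"A": INF, "B": 4, "C": INF, "D": 5, "E": 0, "F": 2, "G": INF}

after_first_event = update_vector(vector_b, vector_a, link_cost=2)            # vector from A
after_second_event = update_vector(after_first_event, vector_e, link_cost=4)  # vector from E
```

Each event changes exactly one cell here: the cost to D after hearing from A, and the cost to F after hearing from E. The cost to G stays at infinity until a neighbor that knows a route to G sends its vector.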
Distance-Vector Routing Algorithm:
Now we can give a simplified pseudocode for the distance-vector routing algorithm, as shown in Figure
4.28. The algorithm is run by each node independently and asynchronously.
Lines 4 to 11 initialize the vector for the node. Lines 14 to 23 show how the vector can be updated after
receiving a vector from the immediate neighbor. The for loop in lines 17 to 20 allows all entries (cells) in
the vector to be updated after receiving a new vector. Note that the node sends its vector in line 12, after
being initialized, and in line 22, after it is updated.
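For readers who want to experiment, the following sketch simulates the algorithm for all nodes at once. It is not a transcription of the pseudocode in Figure 4.28 and its line numbers do not correspond to that figure. It also makes two simplifying assumptions: the nodes exchange vectors in synchronous rounds rather than asynchronously, and the edge costs are assumed values consistent with the costs quoted in this module.

```python
import math

# Assumed edge costs for the example internet.
EDGES = {
    ("A", "B"): 2, ("A", "D"): 3, ("B", "C"): 5,
    ("B", "E"): 4, ("C", "F"): 4, ("C", "G"): 3,
    ("D", "E"): 5, ("E", "F"): 2, ("F", "G"): 1,
}


def distance_vector_routing(nodes, edges):
    """Run initialization and repeated updates until no vector changes."""
    neighbors = {n: {} for n in nodes}
    for (u, v), c in edges.items():
        neighbors[u][v] = c
        neighbors[v][u] = c

    # Initialization: 0 to itself, link cost to neighbors, infinity elsewhere.
    vectors = {n: {d: math.inf for d in nodes} for n in nodes}
    for n in nodes:
        vectors[n][n] = 0
        for m, c in neighbors[n].items():
            vectors[n][m] = c

    changed = True
    while changed:
        changed = False
        # Copy of every vector as "sent" at the start of this round.
        sent = {n: dict(v) for n, v in vectors.items()}
        for n in nodes:
            for m, c in neighbors[n].items():
                for dest in nodes:
                    candidate = c + sent[m][dest]
                    if candidate < vectors[n][dest]:
                        vectors[n][dest] = candidate
                        changed = True
    return vectors


final_vectors = distance_vector_routing("ABCDEFG", EDGES)
```

Once a full round passes with no change, every node holds the least costs to all destinations, which matches the costs of the least-cost trees.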

Figure 4.28: Distance-Vector Routing Algorithm for a Node


Figure 4.29: Distance-Vector Routing Algorithm for a Node (continued)


Count to Infinity - A problem with distance-vector routing is that any decrease in cost (good news)
propagates quickly, but any increase in cost (bad news) will propagate slowly. For a routing protocol to
work properly, if a link is broken (cost becomes infinity), every other router should be aware of it
immediately, but in distance-vector routing, this takes some time. The problem is referred to as count to
infinity. It sometimes takes several updates before the cost for a broken link is recorded as infinity by all
routers.
Two-Node Loop One example of count to infinity is the two-node loop problem. To understand the
problem, let us look at the scenario depicted in Figure 4.30. The figure shows a system with three nodes. We
have shown only the portions of the forwarding table needed for our discussion. At the beginning, both
nodes A and B know how to reach node X. But suddenly, the link between A and X fails. Node A changes
its table. If A can send its table to B immediately, everything is fine. However, the system becomes
unstable if B sends its forwarding table to A before receiving A’s forwarding table. Node A receives the
update and, assuming that B has found a way to reach X, immediately updates its forwarding table.
Now A sends its new update to B. Now B thinks that something has been changed around A and updates
its forwarding table. The cost of reaching X increases gradually until it reaches infinity. At this moment,
both A and B know that X cannot be reached. However, during this time the system is not stable. Node
A thinks that the route to X is via B; node B thinks that the route to X is via A. If A receives a packet
destined for X, the packet goes to B and then comes back to A. Similarly, if B receives a packet destined
for X, it goes to A and comes back to B. Packets bounce between A and B, creating a two-node loop problem.
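The gradual increase of the cost can be traced with a short simulation. This is a sketch under stated assumptions: the link costs (A-B cost 1, B's old cost to X equal to 2) are hypothetical, and infinity is represented by 16, the value RIP uses.

```python
INFINITY = 16  # assumption: RIP-style value standing in for "infinity"


def two_node_loop(cost_ab=1, b_cost_to_x=2):
    """Trace the costs to X held by A and B after link A-X fails
    and B's stale vector reaches A before A's update reaches B."""
    a_cost = INFINITY       # A has just detected the failure
    b_cost = b_cost_to_x    # B still believes its old route via A
    history = [(a_cost, b_cost)]
    while a_cost < INFINITY or b_cost < INFINITY:
        # B's vector reaches A: A now routes to X via B.
        a_cost = min(INFINITY, cost_ab + b_cost)
        # A's vector reaches B: B's next hop is A, so B accepts the new cost.
        b_cost = min(INFINITY, cost_ab + a_cost)
        history.append((a_cost, b_cost))
    return history


history = two_node_loop()
for a_cost, b_cost in history:
    print(f"A -> X: {a_cost:2d}   B -> X: {b_cost:2d}")
```

With these assumed costs, several exchanges are needed before both nodes record infinity, and during every intermediate round A and B point at each other as the next hop for X.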
Split Horizon One solution to instability is called split horizon. In this strategy, instead of flooding the
table through each interface, each node sends only part of its table through each interface. If,
according to its table, node B thinks that the optimum route to reach X is via A, it does not need to
advertise this piece of information to A; the information has come from A (A already knows). In our
scenario, node B eliminates the last line of its forwarding table before it sends it to A. In this case, node
A keeps the value of infinity as the distance to X. Later, when node A sends its forwarding table to B, node
B also corrects its forwarding table. The system becomes stable after the first update: both node A and
node B know that X is not reachable.

Figure 4.30: Two-node instability


Poison Reverse Using the split-horizon strategy has one drawback. Normally, the corresponding
protocol uses a timer, and if there is no news about a route, the node deletes the route from its table.
When node B in the previous scenario eliminates the route to X from its advertisement to A, node A
cannot guess whether this is due to the split-horizon strategy (the source of information was A) or
because B has not received any news about X recently. In the poison reverse strategy, B can still
advertise the value for X, but if the source of information is A, it can replace the distance with infinity as
a warning: “Do not use this value; what I know about this route comes from you.”
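The difference between the two strategies shows up in what a node puts into the vector it sends to one particular neighbor. The following is a minimal sketch; the table layout `{destination: (cost, next_hop)}`, the function name, and the extra destination Y are assumptions for illustration.

```python
INFINITY = 16  # assumption: RIP-style value standing in for "infinity"


def advertisement(table, to_neighbor, mode="split_horizon"):
    """Build the vector sent to `to_neighbor`.

    table maps destination -> (cost, next_hop); next_hop is None
    for directly connected destinations.
    """
    vector = {}
    for dest, (cost, next_hop) in table.items():
        if next_hop == to_neighbor:
            if mode == "split_horizon":
                continue            # do not tell the neighbor what it told us
            if mode == "poison_reverse":
                cost = INFINITY     # advertise, but as unreachable
        vector[dest] = cost
    return vector


# Node B reaches X via A (cost 2) and is directly connected to Y (cost 1).
b_table = {"X": (2, "A"), "Y": (1, None)}

print(advertisement(b_table, "A", mode="split_horizon"))
print(advertisement(b_table, "A", mode="poison_reverse"))
```

With split horizon the entry for X is simply missing from what A receives, whereas with poison reverse it is present with cost infinity, so A can tell the two situations apart.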
Three-Node Instability The two-node instability can be avoided using split horizon combined with
poison reverse. However, if the instability is between three nodes, stability cannot be guaranteed.
Link-State Routing
This method uses the term link-state to define the characteristic of a link (an edge) that represents a
network in the internet. In this algorithm the cost associated with an edge defines the state of the link.
Links with lower costs are preferred to links with higher costs; if the cost of a link is infinity, it means
that the link does not exist or has been broken.
Link-State Database (LSDB) - To create a least-cost tree with this method, each node needs to have a
complete map of the network, which means it needs to know the state of each link. The collection of
states for all links is called the link-state database (LSDB). There is only one LSDB for the whole
internet; each node needs to have a duplicate of it to be able to create the least-cost tree. Figure 4.31 shows
an example of an LSDB for the graph in Figure 4.1. The LSDB can be represented as a two-dimensional
array (matrix) in which the value of each cell defines the cost of the corresponding link.

Figure 4.31: Example of a link-state database


How can each node create this LSDB?

By flooding: each node sends greeting messages to all its immediate neighbors (through each interface) to collect
two pieces of information for each neighboring node, which together form the LS packet (LSP) that it then sends out of each interface:
1. The identity of the node.

2. The cost of the link.

When a node receives an LSP, it compares the LSP with the copy it may already have. The node checks the
sequence numbers of both LSPs. If the newly arrived LSP is older, it is discarded. If it is newer, the node
discards the old copy, keeps the new one, and sends a copy of it out of each interface except the one from which the packet arrived. This
guarantees that flooding stops somewhere in the network (where a node has only one interface).
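The flooding rule can be sketched as a small simulation. This is an illustration only: the four-node topology, the function name, and the way transmissions are counted are assumptions.

```python
from collections import deque


def flood(links, origin, seq):
    """Flood one LSP created by `origin` with sequence number `seq`.

    links maps node -> set of neighbors. Returns the newest sequence
    number each node holds for this LSP and the number of transmissions.
    """
    newest = {origin: seq}
    queue = deque((origin, n) for n in sorted(links[origin]))
    transmissions = 0
    while queue:
        sender, receiver = queue.popleft()
        transmissions += 1
        if newest.get(receiver, -1) >= seq:
            continue  # old or duplicate copy: discard it
        newest[receiver] = seq
        # Forward out of every interface except the one it arrived on.
        for n in sorted(links[receiver]):
            if n != sender:
                queue.append((receiver, n))
    return newest, transmissions


# Hypothetical topology: a triangle A-B-C with a leaf D attached to C.
links = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
newest, transmissions = flood(links, "A", seq=1)
print(newest, transmissions)
```

Every node ends up with the LSP, and the flood terminates because duplicate copies are discarded and node D, which has only one interface, has nowhere to forward the packet.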

Each node creates the comprehensive LSDB. This LSDB is the same for each node and shows the whole
map of the internet. In other words, a node can make the whole map if it needs to, using this LSDB.

Figure 4.32: LSPs created and sent out by each node to build LSDB
Formation of Least-Cost Trees - To create a least-cost tree for itself, using the shared LSDB, each node
needs to run the famous Dijkstra Algorithm. This iterative algorithm uses the following steps:
1. The node chooses itself as the root of the tree, creating a tree with a single node, and sets the total cost

of each node based on the information in the LSDB.


2. The node selects one node, among all nodes not in the tree, which is closest to the root, and adds this
to the tree. After this node is added to the tree, the cost of all other nodes not in the tree needs to be
updated because the paths may have been changed.
3. The node repeats step 2 until all nodes are added to the tree.

We need to convince ourselves that the above three steps finally create the least-cost tree. Figure 4.33 shows
a simplified version of Dijkstra’s algorithm.

Figure 4.33: Dijkstra’s Algorithm


Lines 4 to 13 implement step 1 in the algorithm. Lines 16 to 23 implement step 2 in the algorithm. Step
2 is repeated until all nodes are added to the tree. Figure 4.34 shows the formation of the least-cost tree
for the graph in Figure 4.10 using Dijkstra’s algorithm.


Figure 4.34: Least-cost tree
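The three steps of the least-cost tree formation can be sketched in Python, with the LSDB held as a two-dimensional table. This is a minimal sketch, not the pseudocode of Figure 4.33: the helper names and the four-node graph are assumptions for illustration.

```python
import math

INF = math.inf  # cost of a link that does not exist


def build_lsdb(nodes, edges):
    """Build a symmetric LSDB matrix (dict of dicts) from (a, b, cost) edges."""
    lsdb = {a: {b: (0 if a == b else INF) for b in nodes} for a in nodes}
    for a, b, cost in edges:
        lsdb[a][b] = lsdb[b][a] = cost
    return lsdb


def least_cost_tree(lsdb, root):
    """Return (cost of each node from root, parent of each node in the tree)."""
    nodes = list(lsdb)
    # Step 1: the tree holds only the root; costs come from the root's LSDB row.
    tree = {root}
    dist = {n: lsdb[root][n] for n in nodes}
    parent = {n: root for n in nodes if n != root and dist[n] < INF}
    # Steps 2 and 3: repeatedly add the closest node that is not in the tree.
    while len(tree) < len(nodes):
        candidates = [n for n in nodes if n not in tree]
        w = min(candidates, key=lambda n: dist[n])
        if dist[w] == INF:
            break  # the remaining nodes are unreachable
        tree.add(w)
        # Update the cost of the nodes still outside the tree.
        for x in candidates:
            if x != w and dist[w] + lsdb[w][x] < dist[x]:
                dist[x] = dist[w] + lsdb[w][x]
                parent[x] = w
    return dist, parent


# Hypothetical graph, not the one in the figures.
lsdb = build_lsdb("ABCD", [("A", "B", 2), ("A", "C", 5), ("B", "C", 1),
                           ("B", "D", 7), ("C", "D", 3)])
dist, parent = least_cost_tree(lsdb, "A")
print(dist, parent)
```

The `parent` mapping is what defines the tree: following parents from any node leads back to the root along the least-cost path.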


Path-Vector Routing
Both link-state and distance-vector routing are based on the least-cost goal. However, there are
instances where this goal is not the priority. For example, assume that there are some routers in the
internet that a sender wants to prevent its packets from going through. For example, a router may
belong to an organization that does not provide enough security or it may belong to a commercial rival
of the sender which might inspect the packets for obtaining information. Least-cost routing does not
prevent a packet from passing through an area when that area is in the least-cost path. In other words,
the least-cost goal, applied by LS or DV routing, does not allow a sender to apply specific policies to
the route a packet may take. Aside from safety and security, there are occasions in which the goal of
routing is merely reachability: to allow the packet to reach its destination more efficiently without
assigning costs to the route.

To respond to these demands, a third routing algorithm, called path-vector (PV) routing, has been
devised. Path-vector routing does not have the drawbacks of LS or DV routing as described above
because it is not based on least-cost routing. The best route is determined by the source using the policy
it imposes on the route. In other words, the source can control the path. Path-vector routing is
not actually used within an internet; it is mostly designed to route a packet between ISPs.

Spanning Trees
In path-vector routing, the path from a source to all destinations is also determined by the best


spanning tree. The best spanning tree, however, is not the least-cost tree; it is the tree determined by
the source when it imposes its own policy. If there is more than one route to a destination, the source can
choose the route that meets its policy best. A source may apply several policies at the same time. Figure 4.35
shows a small internet with only five nodes. Each source has created its own spanning tree that meets its
policy. The policy imposed by all sources is to use the minimum number of nodes to reach a destination.
The spanning tree selected by A and E is such that the communication does not pass through D as a
middle node. Similarly, the spanning tree selected by B is such that the communication does not
pass through C as a middle node.

Figure 4.35: Spanning trees in path-vector routing


Creation of Spanning Trees
Path-vector routing, like distance-vector routing, is an asynchronous and distributed routing
algorithm. The spanning trees are made, gradually and asynchronously, by each node. When a node is
booted, it creates a path vector based on the information it can obtain about its immediate
neighbors. A node sends greeting messages to its immediate neighbors to collect these pieces of
information. Figure 4.36 shows all of these path vectors for our internet in Figure 4.35. Not all of these
tables are created simultaneously; each is created when its node is booted. The figure also shows
how these path vectors are sent to immediate neighbors after they have been created (arrows). Each node,
after the creation of the initial path vector, sends it to all its immediate neighbors. Each node, when it
receives a path vector from a neighbor, updates its path vector using an equation similar to the Bellman-Ford
equation, but applying its own policy instead of looking for the least cost.
Path(x, y) = best {Path(x, y), [x + Path(v, y)]} for all v’s in the internet.
In this equation, the operator (+) means to add x to the beginning of the path. We also need to be
cautious to avoid adding a node to an empty path because an empty path means one that does not exist.

Figure 4.36: Path vectors made at booting time



Path-vector routing also imposes one more condition on this equation: If Path(v, y) includes x, that
path is discarded to avoid a loop in the path. In other words, x does not want to visit itself when it
selects a path to y. Figure 4.37 shows the path vector of node C after two events. In the first event, node
C receives a copy of B’s vector, which improves its vector: now it knows how to reach node A. In the
second event, node C receives a copy of D’s vector, which does not change its vector. The vector for node
C after the first event is stabilized and serves as its forwarding table.

Figure 4.37: Updating path vectors


Path-Vector Algorithm
Based on the initialization process and the equation used in updating each forwarding table after
receiving path vectors from neighbors, we can write a simplified version of the path vector algorithm as
shown in Figure 4.38.

Figure 4.38: Path-vector algorithm for a node



Lines 4 to 12 show the initialization for the node. Lines 17 to 24 show how the node updates its vector
after receiving a vector from the neighbor. The update process is repeated forever. We can see the
similarities between this algorithm and the DV algorithm.
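The path-vector update can be sketched in a few lines of Python. This is a minimal sketch, not the algorithm of Figure 4.38: the policy (prefer the path with fewer nodes), the function names, and the vectors used for nodes B and C are assumptions for illustration.

```python
def best(old, new):
    """Hypothetical policy: prefer the path with fewer nodes.
    An empty list means that no path exists."""
    if not old:
        return new
    if not new:
        return old
    return new if len(new) < len(old) else old


def update_path_vector(me, my_paths, neighbor_paths):
    """Apply Path(x, y) = best{Path(x, y), x + Path(v, y)} for one neighbor v."""
    for dest, path in neighbor_paths.items():
        if not path or me in path:
            # Never extend an empty path, and discard paths that
            # already contain this node (they would form a loop).
            continue
        my_paths[dest] = best(my_paths.get(dest, []), [me] + path)
    return my_paths


# Hypothetical initial vectors: C knows only its neighbors B and D.
c_paths = {"C": ["C"], "B": ["C", "B"], "D": ["C", "D"]}
b_paths = {"A": ["B", "A"], "B": ["B"], "C": ["B", "C"]}

update_path_vector("C", c_paths, b_paths)
print(c_paths)
```

After receiving B's vector, node C in this sketch gains a path to A through B, while its existing paths are left unchanged, which mirrors the first event described for node C above.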
UNICAST ROUTING PROTOCOLS
A protocol is more than an algorithm. A protocol needs to define its domain of operation, the
messages exchanged, communication between routers, and interaction with protocols in other
domains.
We discuss three common protocols used in the Internet:

1. Routing Information Protocol (RIP), based on the distance-vector algorithm.

2. Open Shortest Path First (OSPF), based on the link-state algorithm.

3. Border Gateway Protocol (BGP), based on the path-vector algorithm.

Internet Structure
Today, the Internet has changed from a tree-like structure, with a single backbone, to a multi-backbone
structure run by different private corporations. The Internet has a structure similar to what is
shown in Figure 4.39.

Figure 4.39: Internet structure


There are several backbones run by private communication companies that provide global
connectivity. These backbones are connected by some peering points that allow connectivity between

backbones. At a lower level, there are some provider networks that use the backbones for global
connectivity but provide services to Internet customers. Finally, there are some customer networks that
use the services provided by the provider networks. Any of these three entities (backbone, provider
network, or customer network) can be called an Internet Service Provider or ISP. They provide services, but
at different levels.
Hierarchical Routing
The Internet today is made of a huge number of networks and routers that connect them. Routing in the
Internet cannot be done using a single protocol for two reasons: a scalability problem and an
administrative issue. The scalability problem means that the size of the forwarding tables becomes huge,
searching for a destination in a forwarding table becomes time-consuming, and updating creates a huge
amount of traffic. The administrative issue is related to the Internet structure described in Figure 4.39.

As the figure shows, each ISP is run by an administrative authority. The administrator needs to have
control in its system. The organization must be able to use as many subnets and routers as it needs, may
desire that the routers be from a particular manufacturer, may wish to run a specific routing algorithm
to meet the needs of the organization, and may want to impose some policy on the traffic passing through
its ISP. Hierarchical routing means considering each ISP as an autonomous system (AS). Each AS can
run a routing protocol that meets its needs, but the global Internet runs a global protocol to glue all ASs
together.
The routing protocol run in each AS is referred to as intra-AS routing protocol, intradomain
routing protocol, or interior gateway protocol (IGP); the global routing protocol is referred
to as inter-AS routing protocol, interdomain routing protocol, or exterior gateway protocol
(EGP). There may be several intradomain routing protocols, and each AS is free to choose one, but it
should be clear that there should be only one interdomain protocol that handles routing between these
entities. Presently, the two common intradomain routing protocols are RIP and OSPF; the only
interdomain routing protocol is BGP. The situation may change when we move to IPv6.
Autonomous Systems
Each ISP is an autonomous system when it comes to managing networks and routers under its control.
There are small, medium-size, and large ASs, and each AS is given an autonomous system number (ASN) by the
ICANN. Each ASN was originally a 16-bit unsigned integer that uniquely defines an AS (32-bit ASNs are now
also in use). The autonomous systems, however, are not categorized according to their size; they are categorized according to the way
they are connected to other ASs. We have stub ASs, multihomed ASs, and transient ASs. The type affects the
operation of the interdomain routing protocol in relation to that AS.
❑ Stub AS. A stub AS has only one connection to another AS. The data traffic can be either initiated or
terminated in a stub AS; the data cannot pass through it. A good example of a stub AS is the
customer network, which is either the source or the sink of data.
❑ Multihomed AS. A multihomed AS can have more than one connection to other ASs, but it does
not allow data traffic to pass through it. A good example of such an AS is some of the customer ASs that
may use the services of more than one provider network, but their policy does not allow data to be
passed through them.


❑ Transient AS. A transient AS is connected to more than one other AS and also allows the traffic to
pass through. The provider networks and the backbone are good examples of transient ASs.
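The three categories depend on only two properties of an AS: how many other ASs it is connected to, and whether its policy lets traffic pass through. A small sketch of that decision follows; the function name and parameters are hypothetical.

```python
def classify_as(num_connections, allows_transit):
    """Classify an AS by its connectivity and transit policy."""
    if num_connections < 1:
        raise ValueError("an AS needs at least one connection to another AS")
    if num_connections == 1:
        return "stub"          # traffic can only start or end here
    return "transient" if allows_transit else "multihomed"


print(classify_as(1, False))   # a customer network with one provider
print(classify_as(2, False))   # a customer network with two providers
print(classify_as(3, True))    # a provider network or backbone
```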
Routing Information Protocol (RIP)
The Routing Information Protocol (RIP) is one of the most widely used intradomain routing protocols
based on the distance-vector routing algorithm. RIP was started as part of the Xerox Network System
(XNS), but it was the Berkeley Software Distribution (BSD) version of UNIX that helped make the use of
RIP widespread.
Hop Count
A router in this protocol basically implements the distance-vector routing algorithm described earlier.
However, the algorithm has been modified as described below.

1. Since a router in an AS needs to know how to forward a packet to different networks (subnets) in
an AS, RIP routers advertise the cost of reaching different networks instead of reaching other
nodes in a theoretical graph. In other words, the cost is defined between a router and the
network in which the destination host is located.
2. To make the implementation of the cost simpler (independent from performance factors of
the routers and links, such as delay, bandwidth, and so on), the cost is defined as the number of
hops, which means the number of networks (subnets) a packet needs to travel through from
the source router to the final destination host. Note that the network in which the source
host is connected is not counted in this calculation because the source host does not use a
forwarding table; the packet is delivered to the default router. Figure 4.40 shows the concept
of hop count advertised by three routers from a source host to a destination host. In RIP, the
maximum cost of a path can be 15, which means 16 is considered as infinity (no connection).
For this reason, RIP can be used only in autonomous systems in which the diameter
of the AS is not more than 15 hops.

Figure 4.40: Hop counts in RIP


Forwarding Tables
The routers in an autonomous system need to keep forwarding tables to forward packets to their
destination networks. A forwarding table in RIP is a three-column table in which the first
column is the address of the destination network, the second column is the address of the
next router to which the packet should be forwarded, and the third column is the cost (the
number of hops) to reach the destination network. Figure 4.41 shows the three forwarding
tables for the routers in Figure 4.40. Note that the first and the third columns together convey the
same information as does a distance vector, but the cost shows the number of hops to the destination
networks.

Figure 4.41: Forwarding tables


Although a forwarding table in RIP defines only the next router in the second column, it gives the
information about the whole least-cost tree. For example, R1 defines that the next router for the path
to N4 is R2; R2 defines that the next router to N4 is R3; R3 defines that there is no next router for this
path. The tree is then R1 → R2 → R3 → N4.
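The way the whole path emerges from per-router next-hop entries can be sketched as follows. Only the N4 entries mentioned above are modeled; the table layout and the function name are assumptions for illustration.

```python
def trace_route(tables, start, dest_net):
    """Follow next-router entries from `start` until a router reports
    that the destination network is directly connected (next router None)."""
    path = [start]
    router = start
    while True:
        next_router = tables[router][dest_net]
        if next_router is None:
            break
        if next_router in path:
            raise RuntimeError("forwarding loop detected")
        path.append(next_router)
        router = next_router
    path.append(dest_net)
    return path


# Next-router column for destination N4, as described in the text.
tables = {
    "R1": {"N4": "R2"},
    "R2": {"N4": "R3"},
    "R3": {"N4": None},   # N4 is directly connected to R3
}
print(" -> ".join(trace_route(tables, "R1", "N4")))
```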
RIP is implemented as a process that uses the service of UDP on the well-known port number
520. In BSD, RIP is a daemon process (a process running in the background), named “routed”
(abbreviation for route daemon and pronounced route-dee). This means that, although RIP is a routing
protocol to help IP route its datagrams through the AS, the RIP messages are encapsulated inside UDP
user datagrams, which in turn are encapsulated inside IP datagrams. In other words, RIP runs at the
application layer, but creates forwarding tables for IP at the network layer.
RIP Messages
Two RIP processes, a client and a server, like any other processes, need to exchange messages. RIP-2 defines
the format of the message, as shown in Figure 4.42. Part of the message, called entry, can be repeated as
needed in a message. Each entry carries the information related to one line in the forwarding table
of the router that sends the message.
RIP has two types of messages: request and response. A request message is sent by a router that has just
come up or by a router that has some timed-out entries. A request message can ask about specific entries or
all entries. A response (or update) message can be either solicited or unsolicited. A solicited response
message is sent only in answer to a request message. It contains information about the destination
specified in the corresponding request message. An unsolicited response message, on the other hand, is sent
periodically, every 30 seconds, or when there is a change in the forwarding table.

Figure 4.42: RIP message format
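To make the message layout concrete, the sketch below packs and parses a RIP-2 response with Python's struct module. The field layout is an assumption taken from the RIP version 2 specification (a 4-byte header of command, version, and reserved field, followed by 20-byte entries of address family, route tag, network address, subnet mask, next-hop address, and distance); the function names and addresses are hypothetical.

```python
import socket
import struct

HEADER = struct.Struct("!BBH")        # command, version, reserved
ENTRY = struct.Struct("!HH4s4s4sI")   # family, tag, network, mask, next hop, distance

RESPONSE = 2    # command value for a response message (1 is a request)
FAMILY_IP = 2   # address family value used for IP


def build_response(routes):
    """Pack (network, mask, next_hop, hops) tuples into one RIP-2 response."""
    msg = HEADER.pack(RESPONSE, 2, 0)
    for network, mask, next_hop, hops in routes:
        msg += ENTRY.pack(FAMILY_IP, 0,
                          socket.inet_aton(network),
                          socket.inet_aton(mask),
                          socket.inet_aton(next_hop),
                          hops)
    return msg


def parse_message(data):
    """Unpack a RIP-2 message into (command, version, list of entries)."""
    command, version, _reserved = HEADER.unpack_from(data, 0)
    entries = []
    for offset in range(HEADER.size, len(data), ENTRY.size):
        _family, _tag, net, mask, hop, hops = ENTRY.unpack_from(data, offset)
        entries.append((socket.inet_ntoa(net), socket.inet_ntoa(mask),
                        socket.inet_ntoa(hop), hops))
    return command, version, entries


routes = [("10.1.0.0", "255.255.0.0", "10.0.0.2", 2),
          ("10.2.0.0", "255.255.0.0", "10.0.0.3", 16)]
message = build_response(routes)
print(len(message), parse_message(message))
```

Each entry corresponds to one line of the sender's forwarding table, which is why the entry part of the message repeats as many times as needed.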


RIP Algorithm
RIP implements the same algorithm as the distance-vector routing algorithm. Some changes need to be
made to the algorithm to enable a router to update its forwarding table:


❑ Instead of sending only distance vectors, a router needs to send the whole contents of its
forwarding table in a response message.

❑ The receiver adds one hop to each cost and changes the next router field to the address of the
sending router. We call each route in the modified forwarding table the received route and each route in the
old forwarding table the old route. The receiving router selects the old routes as the new ones except
in the following three cases:
1. If the received route does not exist in the old forwarding table, it should be added to the table.

2. If the cost of the received route is lower than the cost of the old one, the received route should be
selected as the new one.

3. If the cost of the received route is higher than the cost of the old one, but the value of the next router is the same in both routes, the received route should be selected as the new one. This is
the case where the route was actually advertised by the same router in the past, but now the situation
has been changed. For example, suppose a neighbor has previously advertised a route to a destination with
cost 3, but now there is no path between this neighbor and that destination. The neighbor advertises
this destination with cost value infinity (16 in RIP). The receiving router must not ignore this value even
though its old route has a lower cost to the same destination.

The new forwarding table needs to be sorted according to the destination route (mostly using the longest
prefix first).
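The three cases can be sketched as a single update function. This is an illustration under stated assumptions: the table layout `{destination: (next_router, cost)}`, the router and network names, and the alphabetical sort (standing in for longest-prefix ordering) are not from the original text.

```python
INFINITY = 16  # RIP's representation of "no connection"


def rip_update(old_table, received, sender):
    """Merge a response from `sender` into a forwarding table.

    Both tables map destination -> (next_router, cost).
    """
    new_table = dict(old_table)
    for dest, (_next, cost) in received.items():
        cost = min(cost + 1, INFINITY)        # the receiver adds one hop
        old = old_table.get(dest)
        if (old is None                        # case 1: route not known before
                or cost < old[1]               # case 2: received route is cheaper
                or old[0] == sender):          # case 3: same next router, so trust its news
            new_table[dest] = (sender, cost)
    # Alphabetical order is a stand-in for longest-prefix ordering.
    return dict(sorted(new_table.items()))


old = {"N1": ("R2", 3), "N2": ("R3", 2), "N3": ("R2", 4)}
from_r2 = {"N1": ("-", 15), "N2": ("-", 5), "N3": ("-", 1), "N4": ("-", 1)}
new = rip_update(old, from_r2, "R2")
print(new)
```

In this example N1 illustrates the third case: the route was learned from R2, so when R2 now reports a cost that amounts to infinity, the router accepts it even though its old entry was cheaper.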

Example 4.2 Figure 4.43 shows a more realistic example of the operation of RIP in an autonomous
system. First, the figure shows all forwarding tables after all routers have been booted. Then there are
changes in some tables when some update messages have been exchanged. Finally, there are the stabilized
forwarding tables when there is no more change.


Figure 4.43: Example of an autonomous system using RIP


RIP uses three timers to support its operation.
1. The periodic timer controls the advertising of regular update messages. Each router has one
periodic timer that is randomly set to a number between 25 and 35 seconds (to prevent all
routers sending their messages at the same time and creating excess traffic). The timer counts down;
when zero is reached, the update message is sent, and the timer is randomly set once again.
2. The expiration timer governs the validity of a route. When a router receives update
information for a route, the expiration timer is set to 180 seconds for that particular route.

Every time a new update for the route is received, the timer is reset. If there is a problem on an
internet and no update is received within the allotted 180 seconds, the route is considered expired
and the hop count of the route is set to 16, which means the destination is unreachable.
Every route has its own expiration timer.
3. The garbage collection timer is used to purge a route from the forwarding table. When the
information about a route becomes invalid, the router does not immediately purge that route
from its table. Instead, it continues to advertise the route with a metric value of 16. At the
same time, a garbage collection timer is set to 120 seconds for that route. When the count
reaches zero, the route is purged from the table. This timer allows neighbours to become
aware of the invalidity of a route prior to purging.
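The interplay of the three timers can be sketched as follows. This is a simplified model, not router code: it derives the state of a route from the time elapsed since its last update, and the function names are hypothetical.

```python
import random

EXPIRATION = 180   # seconds a route stays valid without an update
GARBAGE = 120      # further seconds an expired route is still advertised
INFINITY = 16


def next_periodic_interval(rng=random):
    """Periodic timer: a random value between 25 and 35 seconds."""
    return rng.uniform(25, 35)


def route_state(seconds_since_update, hop_count):
    """Return (state, advertised hop count) for one route."""
    if seconds_since_update < EXPIRATION:
        return ("valid", hop_count)
    if seconds_since_update < EXPIRATION + GARBAGE:
        return ("expired", INFINITY)   # still advertised, but as unreachable
    return ("purged", None)            # removed from the forwarding table


for t in (30, 180, 299, 300):
    print(t, route_state(t, 3))
```

A route that stops receiving updates is therefore advertised as unreachable for 120 seconds before it disappears, which gives neighbours time to learn that it is invalid.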
Performance
Before ending this section, let us briefly discuss the performance of RIP:

❑ Update Messages - The update messages in RIP have a very simple format and are sent
only to neighbours; they are local. They do not normally create excess traffic because the routers try to
avoid sending them at the same time.
❑ Convergence of Forwarding Tables - RIP uses the distance-vector algorithm, which
can converge slowly if the domain is large, but, since RIP allows only 15 hops in a domain (16 is
considered as infinity), there is normally no problem in convergence. The only problems that may
slow down convergence are count-to-infinity and loops created in the domain; use of poison-
reverse and split-horizon strategies added to the RIP extension may alleviate the situation.
❑ Robustness - Distance-vector routing is based on the concept that each router sends
what it knows about the whole domain to its neighbours. This means that the calculation of the
forwarding table depends on information received from immediate neighbours, which in turn
receive their information from their own neighbours. If there is a failure or corruption in one
router, the problem will be propagated to all routers and the forwarding in each router will be
affected.
Open Shortest Path First
Open Shortest Path First (OSPF) is also an intradomain routing protocol like RIP, but it is based on
the link-state routing algorithm. OSPF is an open protocol, which means that the
specification is a public document.

In OSPF, like RIP, the cost of reaching a destination from the host is calculated from the source
router to the destination network. Each link (network) can be assigned a weight based on the
throughput, round-trip time, reliability, and so on. An administration can also decide to use the
hop count as the cost. An interesting point about the cost in OSPF is that different types of service
(TOSs) can have different weights as the cost. Figure 4.44 shows the idea of the cost from a
router to the destination host network. We can compare the figure with Figure 4.40 for the
RIP.


Figure 4.44: Metric in OSPF


Forwarding Tables
Each OSPF router can create a forwarding table after finding the shortest-path tree between itself and the
destination using Dijkstra’s algorithm. Figure 4.45 shows the forwarding tables for the simple AS in Figure
4.44. Comparing the forwarding tables for the OSPF and RIP in the same AS, we find that the only
difference is the cost values. The reason for this consistency is that both protocols use the shortest-
path trees to define the best route from a source to a destination.
Areas
Compared with RIP, which is normally used in small ASs, OSPF was designed to be able to handle
routing in a small or large autonomous system. However, the formation of shortest-path trees in OSPF
requires that all routers flood the whole AS with their LSPs to create the global LSDB. Although this
may not create a problem in a small AS, it may create a huge volume of traffic in a large AS. To
prevent this, the AS needs to be divided into small sections called areas. Each area acts as a small
independent domain for flooding LSPs. In other words, OSPF uses another level of hierarchy in routing: the
first level is the autonomous system, the second is the area.

Figure 4.45: Forwarding tables in OSPF


However, each router in an area needs to know the information about the link states not only in its area
but also in other areas. For this reason, one of the areas in the AS is designated as the backbone area,
responsible for gluing the areas together. The routers in the backbone area are responsible for


passing the information collected by each area to all other areas. In this way, a router in an area can receive
all LSPs generated in other areas. For the purpose of communication, each area has an area identification.
The area identification of the backbone is zero. Figure 4.46 shows an autonomous system and its
areas.

Figure 4.46: Areas in an autonomous system


Link-State Advertisement
OSPF is based on the link-state routing algorithm, which requires that a router advertise the state of each
link to all neighbors for the formation of the LSDB. When we discussed the link-state algorithm, we used
graph theory and assumed that each router is a node and each network between two routers is an
edge. The situation is different in the real world, in which we need to advertise the existence of
different entities as nodes, the different types of links that connect each node to its neighbors, and
the different types of cost associated with each link. This means we need different types of
advertisements, each capable of advertising different situations. We can have five types of link-state
advertisements: router link, network link, summary link to network, summary link to AS border
router, and external link. Figure 4.47 shows these five advertisements and their uses.


Figure 4.47: Five different LSPs


❑ Router link. A router link advertises the existence of a router as a node. In addition to giving the
address of the announcing router, this type of advertisement can define one or more types of links
that connect the advertising router to other entities. A transient link announces a link to a transient
network, a network that is connected to the rest of the networks by one or more routers. This type of
advertisement should define the address of the transient network and the cost of the link. A stub link
advertises a link to a stub network, a network that is not a through network. Again, the advertisement
should define the address of the network and the cost. A point-to-point link should define the address of
the router at the end of the point-to-point line and the cost to get there.
❑ Network link. A network link advertises the network as a node. However, since a network cannot
do announcements itself (it is a passive entity), one of the routers is assigned as the designated router and
does the advertising. In addition to the address of the designated router, this type of LSP announces
the IP address of all routers (including the designated router as a router and not as speaker of the network),
but no cost is advertised because each router announces the cost to the network when it sends a router
link advertisement.
❑ Summary link to network. This is done by an area border router; it advertises the summary of links
collected by the backbone to an area or the summary of links collected by the area to the backbone. This
type of information exchange is needed to glue the areas together.
❑ Summary link to AS. This is done by an AS router that advertises the summary links from other ASs
to the backbone area of the current AS, information which later can be disseminated to the areas so that
they will know about the networks in other ASs. The need for this type of information exchange is
better understood in inter-AS routing (BGP).
❑ External link. This is also done by an AS router to announce the existence of a single network outside the
AS to the backbone area to be disseminated into the areas.
OSPF Implementation
OSPF is implemented as a program in the network layer, using the service of the IP for propagation. An
IP datagram that carries a message from OSPF sets the value of the protocol field to 89. This means that,
although OSPF is a routing protocol to help IP to route its datagrams inside an AS, the OSPF messages
are encapsulated inside datagrams. OSPF has gone through two versions: version 1 and version 2. Most
implementations use version 2.
OSPF Messages
OSPF is a very complex protocol; it uses five different types of messages. In Figure 4.48, the format of
the OSPF common header (which is used in all messages) and the link-state general header (which is used in
some messages) are given. Then, the outlines of five message types used in OSPF are given. The hello
message (type 1) is used by a router to introduce itself to the neighbours and announce all neighbours
that it already knows. The database description message (type 2) is normally sent in response to
the hello message to allow a newly joined router to acquire the full LSDB. The link-state request
message (type 3) is sent by a router that needs information about a specific LS. The link-state update
message (type 4) is the main OSPF message used for building the LSDB. This message, in fact, has five
different versions (router link, network link, summary link to network, summary link to AS border
router, and external link). The link-state acknowledgment message (type 5) is used to create
reliability in OSPF; each router that receives a link-state update message needs to acknowledge it.
Authentication - As Figure 4.48 shows, the OSPF common header has the provision for
authentication of the message sender. This prevents a malicious entity from sending OSPF messages
to a router and causing the router to become part of the routing system to which it actually does not
belong.
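As a rough illustration of the common header described above, the following sketch packs and unpacks a 24-byte OSPF version 2 header. The field layout is taken from the OSPF version 2 specification (RFC 2328) rather than from the figure, which is not reproduced here. The function names and sample addresses are hypothetical, and the checksum is left at zero instead of being computed.

```python
import socket
import struct

# OSPFv2 common header: version, type, packet length, router ID, area ID,
# checksum, authentication type, 8 bytes of authentication data (24 bytes).
OSPF_HEADER = struct.Struct("!BBH4s4sHH8s")

MESSAGE_TYPES = {
    1: "hello",
    2: "database description",
    3: "link-state request",
    4: "link-state update",
    5: "link-state acknowledgment",
}

def pack_header(msg_type, router_id, area_id, body_len=0, auth_type=0, auth=b""):
    """Build a common header; the checksum is left as 0 in this sketch."""
    return OSPF_HEADER.pack(
        2,                                # version 2
        msg_type,
        OSPF_HEADER.size + body_len,      # length covers header plus body
        socket.inet_aton(router_id),
        socket.inet_aton(area_id),
        0,                                # checksum placeholder
        auth_type,
        auth.ljust(8, b"\x00"),           # pad authentication data to 8 bytes
    )

def unpack_header(data):
    """Decode the first 24 bytes of an OSPF message into a dictionary."""
    version, msg_type, length, rid, aid, _cksum, auth_type, _auth = \
        OSPF_HEADER.unpack(data[:OSPF_HEADER.size])
    return {
        "version": version,
        "type": MESSAGE_TYPES.get(msg_type, "unknown"),
        "length": length,
        "router_id": socket.inet_ntoa(rid),
        "area_id": socket.inet_ntoa(aid),
        "auth_type": auth_type,
    }

header = pack_header(1, "10.0.0.1", "0.0.0.0")
print(unpack_header(header))
```

Because every message type starts with the same header, a receiver can read the type field first and only then decide how to interpret the rest of the message.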
OSPF Algorithm
OSPF implements the link-state routing algorithm we discussed in the previous section. However, some
changes and augmentations need to be added to the algorithm:
❑ After each router has created the shortest-path tree, the algorithm needs to use it to create the
corresponding forwarding table.
❑ The algorithm needs to be augmented to handle sending and receiving all five types of messages.
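The first augmentation, deriving forwarding information from the shortest-path tree, can be sketched as follows. This is a minimal illustration and not the way any particular OSPF implementation does it: the LSDB is assumed to be a plain dictionary of link costs, and the router names and costs are made up.

```python
import heapq

def forwarding_table(lsdb, source):
    """Run Dijkstra on the LSDB and return {destination: (next_hop, cost)}."""
    dist = {source: 0}
    first_hop = {}                      # destination -> first router on the path
    heap = [(0, source, None)]          # (cost so far, node, first hop used)
    visited = set()
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if node in visited:
            continue                    # stale entry with a higher cost
        visited.add(node)
        if hop is not None:
            first_hop[node] = hop
        for neighbor, link_cost in lsdb.get(node, {}).items():
            new_cost = cost + link_cost
            if neighbor not in visited and new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                # When leaving the source, the first hop is the neighbor itself.
                heapq.heappush(heap, (new_cost, neighbor, hop or neighbor))
    return {dest: (first_hop[dest], dist[dest]) for dest in first_hop}

# Hypothetical four-router area; each entry lists a router's link costs.
lsdb = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
table = forwarding_table(lsdb, "A")
for dest in sorted(table):
    print(dest, table[dest])
```

The forwarding table keeps only the first hop and the total cost for each destination; the rest of the tree is not needed to forward a packet.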

Performance
The performance of OSPF includes:
❑ Update Messages. The link-state messages in OSPF have a somewhat complex format. They also are
flooded to the whole area. If the area is large, these messages may create heavy traffic and use a lot of
bandwidth.
❑ Convergence of Forwarding Tables. When the flooding of LSPs is completed, each router can create
its own shortest-path tree and forwarding table; convergence is fairly quick. However, each router
needs to run Dijkstra’s algorithm, which may take some time.
❑ Robustness. The OSPF protocol is more robust than RIP because, after receiving the completed
LSDB, each router is independent and does not depend on other routers in the area. Corruption or
failure in one router does not affect other routers as seriously as in RIP.


Figure 4.48: OSPF message formats


Border Gateway Protocol Version 4 (BGP4)
The Border Gateway Protocol version 4 (BGP4) is the only interdomain routing protocol used in the
Internet today. BGP4 is based on the path-vector algorithm we described before, but it is tailored to
provide information about the reachability of networks in the Internet.

Figure 4.49 A sample internet with four ASs

Figure 4.49 shows an example of an internet with four autonomous systems. AS2, AS3, and AS4 are stub
autonomous systems; AS1 is a transient one. In our example, data exchange between AS2, AS3, and AS4
should pass through AS1. Each autonomous system in Figure 4.49 uses one of the two common
intradomain protocols, RIP or OSPF. Each router in each AS knows how to reach a network that is in its
own AS, but it does not know how to reach a network in another AS. To enable each router to route a
packet to any network in the internet, a variation of BGP4, called external BGP (eBGP), is installed on
each border router (the one at the edge of each AS which is connected to a router at another AS). We then
install the second variation of BGP, called internal BGP (iBGP), on all routers. This means that the border
routers will be running three routing protocols (intradomain, eBGP, and iBGP), but other routers are
running two protocols (intradomain and iBGP). In this section, we introduce the basics of BGP and its
relationship with intradomain routing protocols (RIP or OSPF).
Operation of External BGP (eBGP)
BGP is a kind of point-to-point protocol. When the software is installed on two routers, they try to
create a TCP connection using the well-known port 179. Here, a pair of client and server processes
continuously communicate with each other to exchange messages. The two routers that run the BGP
processes are called BGP peers or BGP speakers. The eBGP variation of BGP allows two physically
connected border routers in two different ASs to form pairs of eBGP speakers and exchange messages. The
routers that are eligible in our example in Figure 4.49 form three pairs: R1-R5, R2-R6, and R4-R9. The
connection between these pairs is established over three physical WANs (N5, N6, and N7).
However, there is a need for a logical TCP connection to be created over the physical connection to
make the exchange of information possible. Each logical connection in BGP parlance is referred to as a
session. This means that we need three sessions in our example, as shown in Figure 4.50.

Figure 4.50 eBGP operation


Figure 4.50 also shows the simplified update messages sent by routers involved in the eBGP
sessions. The circled number defines the sending router in each case. For example, message number 1 is sent
by router R1 and tells router R5 that N1, N2, N3, and N4 can be reached through router R1 (R1 gets this
information from the corresponding intradomain forwarding table). Router R5 can now add these pieces
of information at the end of its forwarding table. When R5 receives any packet destined for these four
networks, it can use its forwarding table and find that the next router is R1.
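The way R5 absorbs message 1 can be sketched as below. The dictionary layout of the update and the function name are assumptions made for illustration, not the real BGP update format. The loop check reflects the path-vector rule that a router ignores a path already containing its own AS.

```python
def apply_update(path_table, local_as, update):
    """Merge one simplified update into a path table, dropping looping paths."""
    if local_as in update["path"]:
        return False                    # our AS already appears: ignore the route
    for network in update["networks"]:
        path_table[network] = {"next": update["next"], "path": update["path"]}
    return True

# Message 1 from the text: R1 tells R5 that N1 to N4 are reachable through R1.
message_1 = {"networks": ["N1", "N2", "N3", "N4"], "next": "R1", "path": ["AS1"]}

r5_table = {}
apply_update(r5_table, "AS2", message_1)
print(r5_table["N3"])

# A path that already contains AS2 would be rejected by R5.
looped = {"networks": ["N8"], "next": "R1", "path": ["AS1", "AS2"]}
print(apply_update(r5_table, "AS2", looped))
```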
The messages exchanged during three eBGP sessions help some routers know how to route packets to
some networks in the internet, but the reachability information is not complete. There are two
problems that need to be addressed:
1. Some border routers do not know how to route a packet destined for nonneighbor ASs. For
example, R5 does not know how to route packets destined for networks in AS3 and AS4. Routers R6 and
R9 are in the same situation as R5: R6 does not know about networks in AS2 and AS4; R9 does not know
about networks in AS2 and AS3.


2. None of the nonborder routers know how to route a packet destined for any networks in other ASs.

To address the above two problems, all pairs of routers (border or nonborder) will have to run the second
variation of the BGP protocol, iBGP.
Operation of Internal BGP (iBGP)
The iBGP protocol is similar to the eBGP protocol in that it uses the service of TCP on the well-known
port 179, but it creates a session between any possible pair of routers inside an autonomous system.
However, some points should be made clear. First, if an AS has only one router, there cannot be an iBGP
session. For example, we cannot create an iBGP session inside AS2 or AS4 in our internet. Second, if there are
n routers in an autonomous system, there should be [n × (n − 1) / 2] iBGP sessions in that autonomous
system (a fully connected mesh) to prevent loops in the system. In other words, each router needs to
advertise its own reachability to the peer in the session instead of flooding what it receives from another
peer in another session. Figure 4.51 shows the combination of eBGP and iBGP sessions in our internet.
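The full-mesh requirement is easy to check with a short sketch. The router lists below follow the example internet in the text (R1 to R4 in AS1, R6 to R8 in AS3, a single router in AS2); the function name is hypothetical.

```python
from itertools import combinations

def ibgp_sessions(routers):
    """Return every unordered router pair, i.e. a full mesh of sessions."""
    return list(combinations(routers, 2))

as1 = ["R1", "R2", "R3", "R4"]
sessions = ibgp_sessions(as1)
n = len(as1)
print(len(sessions), "sessions, formula gives", n * (n - 1) // 2)
print(sessions)

# An AS with a single router, such as AS2, has no iBGP session at all.
print(ibgp_sessions(["R5"]))
```

The session count grows quadratically with the number of routers, which is the price paid for letting each router advertise only its own reachability to each peer.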

Figure 4.51 Combination of eBGP and iBGP sessions in our internet


Note that the physical networks inside ASs are not shown because a session is made on an overlay
network (TCP connection), possibly spanning more than one physical network as determined by the
route dictated by intradomain routing protocol. Also note that in this stage only four messages are
exchanged. The first message (numbered 1) is sent by R1 announcing that networks N8 and N9 are
reachable through the path AS1-AS2, but the next router is R1. This message is sent, through separate
sessions, to R2, R3, and R4. Routers R2, R4, and R6 do the same thing but send different messages to
different destinations. The interesting point is that, at this stage, R3, R7, and R8 create sessions with their
peers, but they actually have no message to send. The updating process does not stop here. For example,
after R1 receives the update message from R2, it combines the reachability information about AS3
with the reachability information it already knows about AS1 and sends a new update message to R5.
Now R5 knows how to reach networks in AS1 and AS3. The process continues when R1 receives the
update message from R4. The point is that we need to make certain that at a point in time there are no
changes in the previous updates and that all information is propagated through all ASs.


Figure 4.52: Finalized BGP path tables


At this time, each router combines the information received from eBGP and iBGP and creates what we
may call a path table after applying the criteria for finding the best path. To demonstrate, we show the
path tables in Figure 4.52 for the routers in Figure 4.51. For example, router R1 now knows that any
packet destined for networks N8 or N9 should go through AS1 and AS2 and the next router to deliver
the packet to is router R5. Similarly, router R4 knows that any packet destined for networks N10, N11, or
N12 should go through AS1 and AS3 and the next router to deliver this packet to is router R1, and so on.

Injection of Information into Intradomain Routing


The role of an interdomain routing protocol such as BGP is to help the routers inside the AS to augment their
routing information. In other words, the path tables collected and organized by BGP are not used for
routing packets; they are injected into intradomain forwarding tables (RIP or OSPF) for routing
packets. This can be done in several ways depending on the type of AS. In the case of a stub AS, the only
area border router adds a default entry at the end of its forwarding table and defines the next router to
be the speaker router at the end of the eBGP connection. In Figure 4.49, R5 in AS2 defines R1 as the
default router for all networks other than N8 and N9. The situation is the same for router R9 in AS4 with
the default router to be R4. In AS3, R6 sets its default router to be R2, but R7 and R8 set their default
router to be R6. These settings are in accordance with the path tables described in Figure 4.52 for these
routers. In other words, the path tables are injected into intradomain forwarding tables by adding only one
default entry. In the case of a transient AS, the situation is more complicated. R1 in AS1 needs to inject
the whole contents of the path table for R1 in Figure 4.52 into its intradomain forwarding table. The
situation is the same for R2, R3, and R4. One issue to be resolved is the cost value. We know that RIP
and OSPF use different metrics. One solution, which is very common, is to set the cost to the foreign
networks at the same cost value as to reach the first AS in the path. For example, the cost for R5 to reach
all networks in other ASs is the cost to reach N5. The cost for R1 to reach networks N10 to N12 is the cost
to reach N6, and so on.

The cost is taken from the intradomain forwarding tables (RIP or OSPF). Figure 4.53 shows the
interdomain forwarding tables. For simplicity, we assume that all ASs are using RIP as the intradomain
routing protocol. The shaded areas are the augmentation injected by the BGP protocol; the default
destinations are indicated as zero.
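For the stub-AS case, the injection amounts to adding one default entry, which can be sketched as follows. The table layout, the hop counts, and the function names are assumptions made for illustration; only the idea of a single default entry pointing at the eBGP peer, with the cost taken from the entry for the connecting network, comes from the text.

```python
DEFAULT = 0     # the text marks default destinations as zero

def inject_default(forwarding_table, next_router, cost):
    """Return a copy of a stub-AS table with one default entry added."""
    table = dict(forwarding_table)
    table[DEFAULT] = (next_router, cost)
    return table

def lookup(table, network):
    """Use the specific entry if present, otherwise fall back to the default."""
    return table.get(network, table.get(DEFAULT))

# Hypothetical RIP table for R5 in AS2: (next router, hop count); None = direct.
r5 = {"N8": (None, 1), "N9": (None, 1), "N5": (None, 1)}

# The cost to every foreign network is set to the cost of reaching N5.
r5 = inject_default(r5, "R1", cost=r5["N5"][1])

print(lookup(r5, "N8"))     # local network, specific entry
print(lookup(r5, "N10"))    # foreign network, default entry via R1
```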

Figure 4.53 Forwarding tables after injection from BGP


Address Aggregation
The intradomain forwarding tables obtained with the help of the BGP4 protocols may become huge in
the case of the global Internet because many destination networks may be included in a forwarding table.
Fortunately, BGP4 uses the prefixes as destination identifiers and allows the aggregation of these
prefixes. For example, prefixes 14.18.20.0/26, 14.18.20.64/26, 14.18.20.128/26, and 14.18.20.192/26
can be combined into 14.18.20.0/24 if all four subnets can be reached through one path. Even if one or
two of the aggregated prefixes need a separate path, the longest prefix principle allows us to do so.
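The aggregation in this example can be verified with Python's standard ipaddress module. The small longest-prefix lookup that follows is a hypothetical helper, and the path labels are placeholders; it only illustrates how a more specific /26 entry can coexist with the aggregate.

```python
import ipaddress

subnets = [
    ipaddress.ip_network("14.18.20.0/26"),
    ipaddress.ip_network("14.18.20.64/26"),
    ipaddress.ip_network("14.18.20.128/26"),
    ipaddress.ip_network("14.18.20.192/26"),
]

# Four adjacent /26 blocks collapse into a single /24.
aggregate = list(ipaddress.collapse_addresses(subnets))
print(aggregate)

def longest_prefix_match(table, address):
    """Return the entry whose prefix matches the address with the longest mask."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]

# One subnet keeps a separate path next to the aggregate.
table = {
    ipaddress.ip_network("14.18.20.0/24"): "path A",
    ipaddress.ip_network("14.18.20.64/26"): "path B",
}
print(longest_prefix_match(table, "14.18.20.70"))
print(longest_prefix_match(table, "14.18.20.200"))
```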

Path Attributes
In both intradomain routing protocols (RIP or OSPF), a destination is normally associated with two
pieces of information: next hop and cost. The first one shows the address of the next router to deliver the
packet; the second defines the cost to the final destination. Interdomain routing is more involved and
naturally needs more information about how to reach the final destination. In BGP these pieces are
called path attributes. BGP allows a destination to be associated with up to seven path attributes. Path
attributes are divided into two broad categories: well-known and optional. A well-known attribute
must be recognized by all routers; an optional attribute need not be. A well-known attribute can be
mandatory, which means that it must be present in any BGP update message, or discretionary, which means
it does not have to be. An optional attribute can be either transitive, which means it can pass to the next
AS, or intransitive, which means it cannot. All attributes are inserted after the corresponding
destination prefix in an update message (discussed later). The format for an attribute is shown in Figure
4.54.

Figure 4.54: Format of path attribute


The first byte in each attribute defines the four attribute flags (as shown in the figure). The next byte
defines the type of attributes assigned by ICANN (only seven types have been assigned, as explained next).
The attribute value length defines the length of the attribute value field (not the length of the whole
attributes section). The following gives a brief description of each attribute.
❑ ORIGIN (type 1). This is a well-known mandatory attribute, which defines the source of the routing
information. This attribute can be defined by one of three values: 0, 1, and 2. Value 0 (IGP) means that the
information about the path has been taken from an intradomain protocol (RIP or OSPF). Value 1 (EGP) means
that the information comes from an exterior gateway protocol. Value 2 (INCOMPLETE) means that it comes
from some other, unknown source.
❑ AS-PATH (type 2). This is a well-known mandatory attribute, which defines the list of
autonomous systems through which the destination can be reached. We have used this attribute in our
examples. The AS-PATH attribute, as we discussed in path-vector routing in the last section, helps prevent
a loop. Whenever an update message arrives at a router that lists the current AS in the path, the router drops
that path. The AS-PATH can also be used in route selection.
❑ NEXT-HOP (type 3). This is a well-known mandatory attribute, which defines the next router
to which the data packet should be forwarded. This attribute helps to inject path information collected
through the operations of eBGP and iBGP into the intradomain routing protocols such as RIP or OSPF.

❑ MULT-EXIT-DISC (type 4). The multiple-exit discriminator is an optional intransitive attribute,
which discriminates among multiple exit paths to a destination. The value of this attribute is normally
defined by the metric in the corresponding intradomain protocol (an attribute value of 4-byte
unsigned integer). For example, if a router has multiple paths to the destination with different values
related to these attributes, the one with the lowest value is selected. Note that this attribute is
intransitive, which means that it is not propagated from one AS to another.
❑ LOCAL-PREF (type 5). The local preference attribute is a well-known discretionary attribute. It is
normally set by the administrator, based on the organization policy. The routes the administrator
prefers are given a higher local preference value (an attribute value of 4-byte unsigned integer). For
example, in an internet with five ASs, the administrator of AS1 can set the local preference value of 400
to the path AS1 → AS2 → AS5, the value of 300 to AS1 → AS3 → AS5, and the value of 50 to AS1
→ AS4 → AS5. This means that the administrator prefers the first path to the second one and prefers the
second one to the third one. This may be a case where AS2 is the most secure and AS4 is the least secure
AS for the administration of AS1. The last route should be selected if the other two are not available.
❑ ATOMIC-AGGREGATE (type 6). This is a well-known discretionary attribute, which
defines the destination prefix as not aggregate; it only defines a single destination network. This attribute
has no value field, which means the value of the length field is zero.
❑ AGGREGATOR (type 7). This is an optional transitive attribute, which emphasizes that the
destination prefix is an aggregate. The attribute value gives the number of the last AS that did the
aggregation followed by the IP address of the router that did so.
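The attribute format (flags byte, type byte, value length, value) can be sketched as below. The flag bit positions are taken from the BGP-4 specification (RFC 4271), not from the figure, and the function names are hypothetical. The extended-length flag switches the length field from one byte to two.

```python
import struct

# Attribute flag bits (high-order bits of the first byte), per RFC 4271.
FLAG_OPTIONAL = 0x80
FLAG_TRANSITIVE = 0x40
FLAG_PARTIAL = 0x20
FLAG_EXTENDED_LENGTH = 0x10

def encode_attribute(flags, type_code, value):
    """Encode one path attribute as flags, type, value length, value."""
    if len(value) > 255:
        flags |= FLAG_EXTENDED_LENGTH            # two-byte length field
        return struct.pack("!BBH", flags, type_code, len(value)) + value
    flags &= ~FLAG_EXTENDED_LENGTH               # one-byte length field
    return struct.pack("!BBB", flags, type_code, len(value)) + value

def decode_attribute(data):
    """Return (flags, type_code, value) for the attribute at the start of data."""
    flags, type_code = struct.unpack_from("!BB", data, 0)
    if flags & FLAG_EXTENDED_LENGTH:
        (length,) = struct.unpack_from("!H", data, 2)
        start = 4
    else:
        length = data[2]
        start = 3
    return flags, type_code, data[start:start + length]

# LOCAL-PREF (type 5): well-known, so not optional; 4-byte unsigned value.
local_pref = encode_attribute(FLAG_TRANSITIVE, 5, struct.pack("!I", 400))
# ATOMIC-AGGREGATE (type 6): no value field, so the length is zero.
atomic = encode_attribute(FLAG_TRANSITIVE, 6, b"")

print(local_pref.hex(), atomic.hex())
print(decode_attribute(local_pref))
```

Note that the length byte covers only the value field, which is why ATOMIC-AGGREGATE encodes to just three bytes.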
Route Selection
In the case where multiple routes are received to a destination, BGP needs to select one among them. The
route selection process in BGP is not as easy as the one in an intradomain routing protocol, which is based on
the shortest-path tree. A route in BGP has some attributes attached to it and it may come from an eBGP
session or an iBGP session. Figure 4.55 shows the flow diagram as used by common implementations. The
router extracts the routes which meet the criteria in each step. If only one route is extracted, it is selected and
the process stops; otherwise, the process continues with the next step. Note that the first choice is related to
the LOCAL-PREF attribute, which reflects the policy imposed by the administration on the route.
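The step-by-step filtering can be sketched as follows. Only the first step (highest LOCAL-PREF) is stated in the text; the later steps used here (shortest AS-PATH, then lowest MULT-EXIT-DISC) are an assumption based on the attribute descriptions above, and the flow diagram may contain further tie-breakers. The Route class and the next-hop names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Route:
    next_hop: str
    as_path: list
    local_pref: int
    med: int

def select_route(routes):
    """Narrow the candidates step by step; stop as soon as one route is left."""
    steps = [
        lambda r: -r.local_pref,        # step 1: highest LOCAL-PREF
        lambda r: len(r.as_path),       # step 2 (assumed): shortest AS-PATH
        lambda r: r.med,                # step 3 (assumed): lowest MULT-EXIT-DISC
    ]
    candidates = list(routes)
    if not candidates:
        return None
    for key in steps:
        best = min(key(r) for r in candidates)
        candidates = [r for r in candidates if key(r) == best]
        if len(candidates) == 1:
            break
    return candidates[0]                # remaining ties: take the first one

# The LOCAL-PREF example from the text, as seen by a router in AS1.
routes = [
    Route("Ra", ["AS2", "AS5"], local_pref=400, med=0),
    Route("Rb", ["AS3", "AS5"], local_pref=300, med=0),
    Route("Rc", ["AS4", "AS5"], local_pref=50, med=0),
]
print(select_route(routes).as_path)
```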

Figure 4.55: Flow diagram for route selection


Messages
BGP uses four types of messages for communication between the BGP speakers across the ASs and inside
an AS: open, update, keepalive, and notification (see Figure 4.56). All BGP packets share the same
common header.
❑ Open Message. To create a neighborhood relationship, a router running BGP opens a TCP
connection with a neighbor and sends an open message.
❑ Update Message. The update message is the heart of the BGP protocol. It is used by a router to
withdraw destinations that have been advertised previously, to announce a route to a new
destination, or both. Note that BGP can withdraw several destinations that were advertised before, but
it can only advertise one new destination (or multiple destinations with the same path attributes) in a
single update message.
❑ Keepalive Message. The BGP peers that are running exchange keepalive messages regularly (before
their hold time expires) to tell each other that they are alive.
❑ Notification. A notification message is sent by a router whenever an error condition is detected or a
router wants to close the session.
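The common header shared by the four message types can be sketched as below. The layout (16-byte marker, 2-byte length, 1-byte type) and the type codes are taken from the BGP-4 specification (RFC 4271), since the figure is not reproduced here; the function names are hypothetical. A keepalive message consists of the header alone.

```python
import struct

# Common header: 16-byte marker (all ones), 2-byte total length, 1-byte type.
HEADER = struct.Struct("!16sHB")
MARKER = b"\xff" * 16
TYPES = {1: "open", 2: "update", 3: "notification", 4: "keepalive"}

def build_message(msg_type, body=b""):
    """Prefix a message body with the common header."""
    return HEADER.pack(MARKER, HEADER.size + len(body), msg_type) + body

def parse_header(data):
    """Return (total length, message type name) from the common header."""
    _marker, length, msg_type = HEADER.unpack(data[:HEADER.size])
    return length, TYPES.get(msg_type, "unknown")

keepalive = build_message(4)
print(len(keepalive), parse_header(keepalive))
```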

Figure 4.56: BGP messages


Performance
BGP performance can be compared with RIP. BGP speakers exchange a lot of messages to create
forwarding tables, but BGP is free from loops and count-to-infinity. The same weakness we mentioned for
RIP about propagation of failure and corruption also exists in BGP.
