Computer Networks - Internet Protocol
Networks
ECE 5713
Internetworking (IP)
2
Internetworking
Outline
Best Effort Service Model
Global Addressing Scheme
3
What we understand …
• Concepts of networking and network programming
– Elements of networks: nodes and links
– Building a packet abstraction on a link
• Transmission, and units of communication data
– How to detect transmission errors in a frame after
encoding and framing it
– How to simulate a reliable channel (sliding window)
– How to arbitrate access to shared media in any network
• Design issues of direct link networks
– Functionality of network adaptors
4
We also understand …
• How switches may provide indirect connectivity
– Different ways to move through a network (forwarding)
– Bridge approach to extending LAN concept
– Example of a real virtual circuit network (ATM)
– How switches are built and contention within switches
5
Internetworking
• Routing – moving forward with IP
– Building forwarding information
• Dealing with global internets - scale
– Virtual geography and addresses
– Hierarchical routing
– Name translation and lookup: translating
between global and local (physical) names
– Multicast traffic
• Future internetworking: IPv6
8
Basics of Internetworking
• What is an internetwork?
– Illusion of a single (direct link) network
– Built on a set of distributed, heterogeneous networks
– Abstraction typically supported by software
• Properties
– Supports heterogeneity: independent of architecture,
operating system, network type and topology
– Scales to global connectivity
• The Internet:
– Specific global internetwork that grew out of ARPANET
9
Internet Protocol (IP)
• Network protocol for the Internet
• Operates on all hosts and routers (routers connect
distinct networks into the Internet)
FTP HTTP NV TFTP
TCP UDP
IP
13
Internetwork
• Concatenation of networks
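[Figure: four networks (Network 1 to Network 4: two Ethernets, an FDDI ring, and a point-to-point link) joined by routers R1, R2, and R3, with hosts H1 to H8 attached]
• Protocol stack
[Figure: protocol layers on the path from H1 to H8: TCP runs only on H1 and H8; IP runs on H1, R1, R2, R3, and H8]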
14
Global Addresses
• Properties
– 32 bit long hierarchical addresses: network + host
– Globally unique, maps to interfaces rather than hosts
• Exception is service request splitting, as is done with
large web servers, for example
• Traditional 3-class model; out of 4 billion addresses
– 1/2 are class A
– 1/4 are class B
– 1/8 are class C
• Often written using dot notation
– 10.3.2.4, 128.96.33.81
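The class of an address follows from its leading bits, and dot notation is just the four bytes written in decimal. A minimal sketch, assuming the usual leading-bit patterns (0 for class A, 10 for class B, 110 for class C); the function names are illustrative only:

```python
def from_dotted(text: str) -> int:
    """Parse dotted-decimal notation into a 32-bit integer."""
    a, b, c, d = (int(part) for part in text.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(addr: int) -> str:
    """Render a 32-bit integer address in dotted-decimal notation."""
    return ".".join(str((addr >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def address_class(addr: int) -> str:
    """Classify by leading bits: 0 -> A, 10 -> B, 110 -> C."""
    if addr >> 31 == 0b0:
        return "A"
    if addr >> 30 == 0b10:
        return "B"
    if addr >> 29 == 0b110:
        return "C"
    return "other"


print(address_class(from_dotted("10.3.2.4")))      # A
print(address_class(from_dotted("128.96.33.81")))  # B
```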
15
IP Addresses
Class A: 0 | Network: 7 bits (126 nets) | Host: 24 bits (16 million hosts)
[Figure: IP datagram header layout; only the DestinationAddr field label survives]
• Fragmentation support
– 16-bit packet ID (identifies packet fragments)
– 3-bit flags; one bit (More Fragments) is cleared only on the last fragment
– 13-bit fragment offset into packet (in 8-byte words)
• 8-bit “time-to-live” (TTL); a hop count until
forced destruction of packet
18
Datagram Format
• 16-bit IP checksum on header
• 32-bit source IP address
• 32-bit destination IP address
• Options and padding (variable length)
– Source-based routing (typically disabled)
– Record route
[Figure: IP datagram header layout, 32 bits per row (bit positions 0, 4, 8, 16, 19, 31), ending with DestinationAddr]
19
Fragmentation and Reassembly
• Different physical layers provide different limits
on frame length (maximum transmission unit, MTU)
• Source host cannot know minimum value along
dynamic route (no signaling)
• Avoid restricting datagram size to a small value
• Solution: when necessary, split IP packet into
fragments before sending over physical link
• Questions
– Where should reassembly occur ?
– What happens when a fragment is damaged/lost ?
20
Fragmentation and Reassembly
• Strategy used with IP
– Fragments are self-contained IP datagrams
– Avoid fragmentation at source host
• Transport layer passes small enough packets to fit
into MTU of local network
• MTU on ATM based on CS-PDU (not cells)
– Re-fragmentation along the path is possible
– Reassemble at destination to minimize re-fragmentation
– Drop packet if one or more fragments lost (based on
timeout); do not recover from lost fragments
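The fragment offset counts 8-byte units, so every fragment except the last must carry a multiple of 8 data bytes. A minimal sketch of the splitting arithmetic, assuming a 20-byte header without options; the function name and the tuple layout are illustrative, not part of any real IP stack:

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return one (offset_in_8_byte_units, data_len, more_fragments) per fragment."""
    # Largest data size that fits in the MTU and is a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    sent = 0
    while sent < payload_len:
        size = min(max_data, payload_len - sent)
        more = sent + size < payload_len   # cleared only on the last fragment
        fragments.append((sent // 8, size, more))
        sent += size
    return fragments


# 1400 data bytes crossing a link with a 532-byte MTU.
print(fragment(1400, 532))
# [(0, 512, True), (64, 512, True), (128, 376, False)]
```

Each tuple would become a self-contained datagram sharing the original 16-bit packet ID.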
21
Fragmentation and Reassembly
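Example
[Figure: a datagram travels H1, R1, R2, R3, H8; each fragment header shows the start of header, Ident = x, a flag bit (1), an Offset (0 for the first fragment), and the rest of the header]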
22
Datagram Forwarding
• Hosts and routers maintain forwarding tables
• Forwarding table maps network number to next hop
– List of (network/host, next hop) pairs
• Table often ends with default router
– Recall hierarchical routing notion of handing unknown
addresses up to the next level
• Very simple (and static) table on hosts
• Complex (and dynamic) table on routers
• Example (R2)
Network Number    Next Hop
1                 R3
2                 R1
3                 interface 1
4                 interface 0
23
Datagram Forwarding
Network #     Netmask        Next hop / port
18.0.0.0      255.0.0.0      1
128.32.0.0    255.255.0.0    2
0.0.0.0       0.0.0.0        3
25
Scaling Problems for the Internet
• Inefficient address allocation (by class system)
• Too many networks for routing
• Can trade off between these two
• Questions
– What network(s) should you allocate to a company
with 1000 machines ?
– What about a company with 200 machines ?
– What about a company with 2 machines that plans to
grow rapidly ?
26
Scaling Problems for the Internet
• Pressure primarily on class B networks
– Most companies plan to grow beyond 255 machines
– Renumbering is a hassle and can interrupt service
– Only around 16,000 class B networks available (14 bits)
• Class B networks aren’t very efficient
– Few organizations have O(10,000) machines
– More likely a network uses O(1,000) of 65,000 addresses
• Scaling problems with alternatives
– Multiple table entries if class C networks used instead
– Protocols do not scale beyond O(10,000) networks
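The allocation questions above come down to simple arithmetic. A rough sketch, assuming usable host counts of 2^8 - 2, 2^16 - 2, and 2^24 - 2 per class (the all-zeros and all-ones host parts are excluded); the helper names are hypothetical:

```python
import math

# Usable hosts per classful network (assumption: two host values are reserved).
USABLE_HOSTS = {"C": 2**8 - 2, "B": 2**16 - 2, "A": 2**24 - 2}


def smallest_class(machines):
    """Return (class, fraction of that network's addresses actually used)."""
    for name in ("C", "B", "A"):
        if machines <= USABLE_HOSTS[name]:
            return name, machines / USABLE_HOSTS[name]
    raise ValueError("too many machines for a single classful network")


def class_c_networks_needed(machines):
    """Alternative allocation: cover the site with several class C networks."""
    return math.ceil(machines / USABLE_HOSTS["C"])


for machines in (1000, 200, 2):
    name, used = smallest_class(machines)
    print(machines, name, f"{used:.2%}", class_c_networks_needed(machines))
```

For 1000 machines the choice is one class B network that is almost empty, or four class C networks that each need their own routing table entry.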
27
IP Address Hierarchy
• Begin with class-based system
class A: 0      network (7 bits)   host (24 bits)
class B: 1 0    network (14 bits)  host (16 bits)
class C: 1 1 0  network (21 bits)  host (8 bits)
• Subnetting within an organization
– Network can be broken into smaller networks
– Recognized only within the organization
– Implemented by packet-switching
– Smaller networks called subnets
28
Subnetting
• Another level to address/routing hierarchy: subnet
• Subnet masks define variable partition of host part
• Subnets visible only within site (close to each
other)
IP address:            Class | Network | Host
Subnetted IP address:  Class | Network | Subnet | Host
Subnet mask:           11 1111111111111111 11111111 00000000
Non-contiguous mask:   11 1111111111111111 1111 0000 1111 0000
29
Subnetting Example
• Subnet number 128.96.34.0, subnet mask 255.255.255.128
– Attached: H1, R1 (addresses shown: 128.96.34.15, 128.96.34.1)
• Subnet number 128.96.34.128, subnet mask 255.255.255.128
– Attached: H2, R1, R2 (addresses shown: 128.96.34.130, 128.96.34.139, 128.96.34.129)
• Subnet number 128.96.33.0, subnet mask 255.255.255.0
– Attached: H3, R2 (addresses shown: 128.96.33.14, 128.96.33.1)
• All hosts have address & mask = subnet address
Forwarding table at router R1
Subnet Number    Subnet Mask        Next Hop
128.96.34.0      255.255.255.128    interface 0
128.96.34.128    255.255.255.128    interface 1
128.96.33.0      255.255.255.0      R2
30
Forwarding Algorithm
D = destination IP address
for each entry (SubnetNum, SubnetMask, NextHop)
    D1 = SubnetMask & D
    if D1 = SubnetNum
        if NextHop is an interface
            deliver datagram directly to D
        else
            deliver datagram to NextHop
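The forwarding loop can be sketched in Python. The table below is the forwarding table shown for router R1 in the subnetting example; the function names and the `default` parameter are illustrative assumptions:

```python
import ipaddress

# (SubnetNum, SubnetMask, NextHop) entries for router R1.
TABLE = [
    ("128.96.34.0", "255.255.255.128", "interface 0"),
    ("128.96.34.128", "255.255.255.128", "interface 1"),
    ("128.96.33.0", "255.255.255.0", "R2"),
]


def to_int(dotted: str) -> int:
    """Convert dotted-decimal text to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def next_hop(dest: str, table=TABLE, default=None):
    """Return the NextHop of the first entry whose masked destination matches."""
    d = to_int(dest)
    for subnet_num, subnet_mask, hop in table:
        if (d & to_int(subnet_mask)) == to_int(subnet_num):
            return hop
    return default   # no entry matched; a real table would end with a default router


print(next_hop("128.96.34.15"))   # interface 0
print(next_hop("128.96.34.139"))  # interface 1
print(next_hop("128.96.33.14"))   # R2
```

A next hop that names an interface means the destination is on a directly attached subnet; otherwise the datagram is handed to the named router.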
32
How to Make Routing Scale
• Flat versus Hierarchical Addresses
• Inefficient use of Hierarchical Address Space
– class C with 2 hosts (2/255 = 0.78% efficient)
– class B with 256 hosts (256/65535 = 0.39% efficient)
• Still Too Many Networks
– routing tables do not scale
– route propagation protocols do not scale
33
Supernetting/CIDR
• CIDR: Classless Inter-Domain Routing
• Compromise in address utilization vs scalability
• Eliminate class notion; generalize subnet notion
• Assign block of contiguous network numbers to
nearby networks
– Restrict block sizes to powers of 2
– Use a bit mask (CIDR mask) to identify block size
• All routers must understand CIDR addressing
– longest match in the table
34
CIDR
• Specify network with (network#, mask bits)
– Equivalent to (network#, # of hosts)
• Block of 8 class C networks may be treated as one
• Organizations can still use subnetting internally !
• Routing table entries look like:
subnet #           mask length    next hop
131.126.141.0      24             Interface 0
131.126.142.0      25             Interface 1
131.126.142.128    25             R1
131.126.0.0        16             R2
default            0              R3
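With CIDR, several entries can match one destination, and the entry with the longest mask wins. A minimal sketch using Python's standard `ipaddress` module on the table above; a linear scan is used for clarity, which is an assumption of this sketch and not how production routers store routes:

```python
import ipaddress

# (prefix, next hop) pairs from the routing table above; "default" is 0.0.0.0/0.
ROUTES = [
    ("131.126.141.0/24", "Interface 0"),
    ("131.126.142.0/25", "Interface 1"),
    ("131.126.142.128/25", "R1"),
    ("131.126.0.0/16", "R2"),
    ("0.0.0.0/0", "R3"),
]


def lookup(dest: str):
    """Return the next hop of the matching route with the longest prefix."""
    addr = ipaddress.ip_address(dest)
    best_len, best_hop = -1, None
    for prefix, hop in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_hop = net.prefixlen, hop
    return best_hop


print(lookup("131.126.142.200"))  # R1 (matches /25, /16 and /0; /25 wins)
print(lookup("131.126.7.7"))      # R2
print(lookup("8.8.8.8"))          # R3
```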
35
CIDR Growth
• CIDR/supernetting allows hierarchical
development
• Assign block of addresses to regional provider
(e.g., 128.0.0.0/9 to BARRNET)
• Regional provider subdivides addresses
• Can hand out to subregional providers (e.g.,
128.32.0.0/16 to Berkeley)
• Who in turn hand out to smaller organizations
(e.g., 128.32.32.0/21 to Berkeley CS Dept)
• etc.
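The nesting of these blocks can be checked mechanically. A small sketch with the standard `ipaddress` module (`subnet_of` needs Python 3.7 or later); the variable names are illustrative:

```python
import ipaddress

provider = ipaddress.ip_network("128.0.0.0/9")       # regional provider block
campus = ipaddress.ip_network("128.32.0.0/16")       # handed to the campus
department = ipaddress.ip_network("128.32.32.0/21")  # handed to one department

print(campus.subnet_of(provider))    # True
print(department.subnet_of(campus))  # True
print(department.num_addresses)      # 2048, i.e. 2 ** (32 - 21)
```

Routers outside the provider need only the single /9 entry; the finer prefixes matter only inside it.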
36
Address Translation
• IP route can cross many physical networks
• Delivers to destination’s physical network
• Hosts listen for packets marked with physical
interface names
– Each (next) hop along route
– Destination host
• Translate IP addresses to physical addresses
– How ?
37
Address Translation Approaches
• Hardcoded
– Physical address encoded in IP address
– Not possible for many networks (e.g., Ethernet)
• Fixed table
– Centrally maintained
– Distributed to all hosts
• Automatically generated table
– Use Address Resolution Protocol (ARP) to build
– Time out entries periodically
38
Address Resolution Protocol (ARP)
• Table of IP to physical address bindings
• To send a packet, check table for physical address
• If IP address not in table
– Broadcast a query (ARP request)
– Wait for response
• When query seen by target host
– Refresh (reset timeout) on existing table entry
– Creates table entry for requester if necessary
– Responds with translation (its physical address)
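The table behavior can be sketched as a small cache with expiring entries. This is a simulation only, with an assumed timeout of 600 seconds and time passed in explicitly; the class and function names are hypothetical and no packets are sent:

```python
class ArpCache:
    """IP-to-physical address bindings that expire unless refreshed."""

    def __init__(self, timeout=600.0):
        self.timeout = timeout   # seconds (assumed value)
        self.entries = {}        # ip -> (physical address, expiry time)

    def learn(self, ip, mac, now):
        """Create a binding, or refresh (reset the timeout of) an existing one."""
        self.entries[ip] = (mac, now + self.timeout)

    def lookup(self, ip, now):
        """Return the binding, or None if missing or expired."""
        entry = self.entries.get(ip)
        if entry is None:
            return None          # caller would broadcast an ARP request
        mac, expiry = entry
        if expiry <= now:
            del self.entries[ip]
            return None
        return mac               # a lookup does not refresh the entry


def handle_request(cache, my_mac, sender_ip, sender_mac, now):
    """Target-host side: record or refresh the requester, then answer."""
    cache.learn(sender_ip, sender_mac, now)
    return my_mac


cache = ArpCache()
cache.learn("128.96.34.1", "08:00:2b:00:00:01", now=0)
print(cache.lookup("128.96.34.1", now=599))  # 08:00:2b:00:00:01
print(cache.lookup("128.96.34.1", now=600))  # None
```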
39
ARP Details
• Table entries are discarded if not refreshed; time
out O(10) minutes
• Do not refresh table entries upon reference
• ARP packet format
– HardwareType: type of physical network (e.g. ethernet)
– ProtocolType: type of higher layer protocol (e.g., IP)
– HLEN & PLEN: length of physical and protocol
addresses
– Operation: request or response
– Source/Target-Physical/Protocol addresses
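The fields listed above can be packed with Python's `struct` module. A sketch for the common Ethernet/IPv4 case, assuming the standard values HardwareType 1 (Ethernet), ProtocolType 0x0800 (IP), HLEN 6, PLEN 4, and Operation 1 for a request; the helper name and the sample addresses are illustrative:

```python
import socket
import struct

# HardwareType, ProtocolType, HLEN, PLEN, Operation, then source and target
# hardware/protocol addresses, all in network byte order.
ARP_FORMAT = "!HHBBH6s4s6s4s"


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Pack an ARP request; the target hardware address is unknown, so zeros."""
    return struct.pack(
        ARP_FORMAT,
        1,                     # HardwareType: Ethernet
        0x0800,                # ProtocolType: IP
        6,                     # HLEN: bytes in a hardware address
        4,                     # PLEN: bytes in a protocol address
        1,                     # Operation: request
        src_mac,
        socket.inet_aton(src_ip),
        b"\x00" * 6,
        socket.inet_aton(target_ip),
    )


packet = build_arp_request(bytes.fromhex("08002be40b01"), "128.96.34.15", "128.96.34.1")
print(len(packet))  # 28
```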
40
ARP Packet Format
[Figure: ARP packet layout, 32 bits per row (bit positions 0, 8, 16, 31). Rows surviving from the figure:]
SourceHardwareAddr (bytes 0 – 3)
TargetHardwareAddr (bytes 2 – 5)
TargetProtocolAddr (bytes 0 – 3)
41
ARP in ATM
• LAN Emulation can be used to broadcast ARP
messages
– quite inefficient in large, wide area ATM network
• ATMARP is a solution which uses ARP server
– logical IP subnet (LIS) concept is used
– large ATM network is subdivided into small subnets
– each subnet has a different network number
– all nodes on the same subnet communicate directly
42
ARP in ATM
• In LIS model, large number of hosts and routers
can be connected to a big ATM network
• Each LIS has an ARP server and each node in the
LIS has ATM address of the server
• Hosts on different subnets communicate via router
[Figure: one ATM network holding two logical IP subnets, LIS 10 and LIS 12, with hosts H1 and H2 and router R; addresses shown: 10.0.0.1, 10.0.0.2, 12.0.0.3, 12.0.0.5]
43
Internet Control Message Protocol
(ICMP)
• IP companion protocol (not necessary)
• Handles error and control messages
FTP HTTP NV TFTP
TCP UDP
IP ICMP
46
Dynamic Configuration
• Plug new host into network
– How much information must be known ?
– What new information must be assigned ?
– How can process be automated ?
• Some answers
– Host needs an IP address (must know it)
– Host must also
• Send packets out of physical (direct) network
• Thus needs physical address of router
47
Configuration Protocols (old!)
• Reverse Address Resolution Protocol (RARP)
– Translate physical address to IP address
– Used to boot diskless machines
– Machine broadcasts request at boot
– RARP server tells it its IP address
• Boot Protocol (BOOTP)
– Use UDP packets for same purpose as RARP
– Allows boot requests to traverse routers
– IP address of BOOTP server must be known
– Also returns file server IP, subnet mask, and default
router for host
48
Dynamic Host Configuration
Protocol- DHCP
• DHCP server is required to provide configuration
information to each host
– Each host retrieves this information on bootup
• DHCP server can be configured manually, or it
may allocate addresses on-demand
– Addresses are “leased” for some period of time
• Hosts are not preconfigured with the DHCP server's address;
each host performs DHCP server discovery instead
– A broadcast discovery message is sent by the host and a
unicast reply is sent by the server
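On-demand allocation with leases can be sketched as a small address pool. This is a toy model under stated assumptions: clients are keyed by physical address, time is passed in explicitly, and the class name is hypothetical; it does not implement the DHCP message exchange:

```python
class LeasePool:
    """Hands out addresses for a fixed lease time and reclaims expired ones."""

    def __init__(self, addresses, lease_seconds):
        self.free = list(addresses)
        self.lease_seconds = lease_seconds
        self.leases = {}   # physical address -> (ip, expiry time)

    def _expire(self, now):
        """Return addresses whose lease has run out to the free list."""
        for mac, (ip, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[mac]
                self.free.append(ip)

    def request(self, mac, now):
        """Grant or renew a lease; return None when the pool is exhausted."""
        self._expire(now)
        if mac in self.leases:
            ip = self.leases[mac][0]      # renewal keeps the same address
        elif self.free:
            ip = self.free.pop(0)
        else:
            return None
        self.leases[mac] = (ip, now + self.lease_seconds)
        return ip


pool = LeasePool(["10.0.0.10", "10.0.0.11"], lease_seconds=60)
print(pool.request("aa", now=0))     # 10.0.0.10
print(pool.request("bb", now=1))     # 10.0.0.11
print(pool.request("cc", now=2))     # None (pool exhausted)
print(pool.request("cc", now=60.5))  # 10.0.0.10 (lease of "aa" expired)
```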
49
DHCP Operation
• New machine sends request to DHCP server for
assignment and information
• Server receives
– Directly if new machine given server’s IP address
– Through broadcast if on same physical network
– Via DHCP relay nodes that forward requests onto the
server’s physical network
• Server assigns IP address and provides other info
• Can be made secure (present signed request or
just a “valid” physical address)
50
DHCP Server
• A DHCP server is not required in each network
– A DHCP relay agent is used to relay the DHCP request
to the server
[Figure: a host broadcasts its request on the local network; the DHCP relay unicasts it across other networks to the DHCP server]
51
Virtual Private Networks - VPN
• Controlled connectivity
– Restrict forwarding to authorized hosts
• Controlled capacity
– Change router drop and priority policies
– Provide guarantees on bandwidth, delay, etc.
• Similar to LANE, but possibly across
heterogeneous IP networks
52
Virtual Private Networks - VPN
• Used where controlled connectivity is required
• Two sites of a company may be connected by a
leased line to make a real private network
• Virtual network replaces leased lines with shared
network, making logical point-to-point connection
• Unwanted connectivity is prevented on this logical
link using IP tunnel
53
Virtual Private Networks - VPN
[Figure: (a) sites connected by dedicated physical links; (b) sites connected by virtual circuits over shared physical links. Node labels: A, B, C, K, L, M]
54
IP Tunneling
• Allows gradual extension
– e.g., multicast
– Develop multicast-capable switches and routers
– Install on 5-10 university campuses
– Routers between universities do not support
multicast…too bad!
• Solution: use a tunnel, a point-to-point link
between nodes in an Internet
55
IP Tunneling
[Figure: tunnel between University1 and University2 across an intermediate organization; IP multicast packets are carried inside ordinary IPv4 packets]
56
IP Tunnel in VPNs
• Virtual point-to-point link between a pair of nodes
separated by many networks
[Figure: IP tunnel between two nodes; labels surviving from the figure: address 10.0.0.1, IP payload]
57
IP Tunneling for Multicast
• Set up a tunnel between each pair of universities
• Multicast packets
– Received by tunnel entry node
– Encapsulated (another IP header added for tunnel exit)
– Travel through the Internet (the tunnel)
– Received by tunnel exit node
– Unwrapped and delivered to another multicast-capable
university campus
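The wrap and unwrap steps can be sketched with a toy packet type. The `Packet` class, the function names, and all addresses below are hypothetical; a real tunnel prepends an actual IP header in front of the original datagram:

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Packet:
    src: str
    dst: str
    payload: Union[bytes, "Packet"]


def encapsulate(inner: Packet, tunnel_entry: str, tunnel_exit: str) -> Packet:
    """Tunnel entry: add an outer header addressed to the tunnel exit."""
    return Packet(src=tunnel_entry, dst=tunnel_exit, payload=inner)


def decapsulate(outer: Packet) -> Packet:
    """Tunnel exit: strip the outer header and recover the original packet."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not a tunneled packet")
    return outer.payload


multicast = Packet(src="10.1.1.1", dst="224.1.2.3", payload=b"data")
outer = encapsulate(multicast, tunnel_entry="192.0.2.1", tunnel_exit="198.51.100.1")
print(outer.dst)                        # 198.51.100.1 (what intermediate routers see)
print(decapsulate(outer) == multicast)  # True
```

Routers inside the tunnel forward on the outer destination only, so they never need to understand the multicast address.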
58
Disadvantages of Tunneling
• Increases packet size
• Adds processing delays (and requires processing
power)
• Management overhead at tunnel entries and exits
• Of course…
– A working solution (1) is attractive compared to none (0)
– So lots of research extensions use tunneling …
59
Routing
Algorithms
Scalability
Routing
[Figure: map of places around Islamabad and Rawalpindi: Pir Wadhai, Rawal Dam, Faizabad, Airport]
• A stranger appears and asks “Airport ?”
• Which way do you point ?
61
What is Routing ?
• Definition: task of constructing and maintaining
forwarding information (in hosts or in switches)
• Goals for routing
– Capture notion of “best” routes
– Propagate changes effectively
– Require limited information exchange
– Admit efficient implementation
• Important notion: graph representation of network
62
Forwarding vs Routing
• Forwarding: to select an output port based on
destination address and routing table
• Routing: process by which routing table is built
• Forwarding table: enough information to accomplish
the forwarding function; optimized for forwarding
• Routing table: built by routing algorithms as the source of the
forwarding table; optimized for topology changes
• Routing Table:
Network # (10) - Next hop (171.69.245.10)
• Forwarding Table:
Network # (10) - interface (if0) - MAC (8:0:2b:e4:b:1:2)
63
Routing Overview
• Hierarchical routing infrastructure defines routing
domains
– Where all routers are under same administrative control
• Network as a Graph
– Nodes are routers
– Edges are links
[Figure: example graph with weighted edges; node labels surviving from the figure: A, B, E, F]
65
Problem with Ideal Routing
• Unbounded amount of information
• Queuing delay can change rapidly
• Graph connectivity changes, too
• Solution: Dynamic algorithm
– Periodically recalculate routes
– Distributed algorithm
• No single point-of-failure
• Reduced computation per node
– Abstract “distance” metric
• Combines many factors
• Heuristic
66
Routing Outline
• Algorithms
– Static shortest path algorithms
• Bellman-Ford—all pairs shortest paths to destination
• Dijkstra’s algorithm—single source shortest path
– Distributed, dynamic routing algorithms
• Distance vector routing (based on Bellman-Ford)
• Link state routing (Dijkstra’s algorithm at each
node)
• Metrics (from ARPANET, with informative names)
– Original
– New
– Revised
67
Bellman-Ford Algorithm
• Static, centralized algorithm, (local iterations/dest)
• Requires: directed graph with edge weights (costs)
• Calculates: shortest paths for all directed pairs
• Check use of each node as successor in all paths
• For every node N
– for each directed pair (B,C)
• is the path B → N → … → C better than B → … → C ?
• is cost B → N → dest smaller than previously known ?
• For N nodes
– Uses an NxN matrix of (distance, successor) values
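The loop described above (try every node N as an intermediate hop for every directed pair) can be sketched directly. Note that this all-pairs, triple-loop formulation is the one usually presented under the name Floyd-Warshall; the sketch follows the description on this slide, and the dictionary layout and sample graph are assumptions:

```python
INF = float("inf")


def all_pairs_shortest_paths(nodes, edges):
    """edges maps a directed pair (u, v) to its cost.

    Returns (dist, succ): dist[b, c] is the best known cost from b to c,
    succ[b, c] is the first hop on that path (None if unreachable).
    """
    dist = {(u, v): (0 if u == v else edges.get((u, v), INF))
            for u in nodes for v in nodes}
    succ = {(u, v): (v if (u, v) in edges else None)
            for u in nodes for v in nodes}
    for n in nodes:                  # candidate intermediate node N
        for b in nodes:
            for c in nodes:
                via_n = dist[b, n] + dist[n, c]
                if via_n < dist[b, c]:
                    dist[b, c] = via_n
                    succ[b, c] = succ[b, n]   # head toward N first
    return dist, succ


nodes = ["A", "B", "C"]
edges = {("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5}
dist, succ = all_pairs_shortest_paths(nodes, edges)
print(dist["A", "C"], succ["A", "C"])  # 3 B
```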
68
Bellman-Ford Algorithm
[Figure: Bellman-Ford worked example on a graph with nodes source, A, B, C, E, and destination; each node's (distance, successor) label starts at infinity and is updated on every iteration]
• After n iterations, nodes at distance n hops along the shortest
path have correct information
69
Dijkstra’s Algorithm
• Static, centralized algorithm, build tree from source
• Requires directed graph with edge weights (distances)
• Calculates: shortest paths from one node to all others
• Greedily grow set S of known minimum paths
• From node N
– Start with S = {N} and one-hop paths from N
– Loop n-1 times
• add closest outside node M to S
• for each node P not in S
– is the path N → … → M → P better than N → … → P ?
70
Dijkstra’s Algorithm
[Figure: Dijkstra's algorithm worked example on a weighted graph, shown step by step]
71
Distance Vector Routing
• Distributed, dynamic version of Bellman-Ford
• Each node maintains distance vector: set of triples
– (Destination, Cost, NextHop)
– Edge weights starting at a node assumed known by that
node
• Exchange updates of distance vector (Destination,
Cost) with directly connected neighbors (known as
advertising the routes)
– periodically (on the order of several seconds to minutes)
– whenever vector changes (called triggered update)
72
Distance Vector Routing
• Update local table if receive a “better” route
– Newly advertised route has smaller cost (shorter)
– Came from next-hop (successor advertising new one)
• Refresh existing routes; delete if they time out
• Local failure detection
– Control message not ACK’d
– Time out on periodic route update
• Used in original ARPANET (until 1979)
• Early Internet: Routing Information Protocol (RIP)
• Early versions of DECnet and Novell IPX
73
Distance Vector Routing Example
Information in routing table of each node:
Iteration 1
At      Distance to reach node
node    A  B  C  D  E  F  G
A       0  1  1  x  1  1  x
B       1  0  1  x  x  x  x
C       1  1  0  1  x  x  x
D       x  x  1  0  x  x  1
E       1  x  x  x  0  x  x
F       1  x  x  x  x  0  1
G       x  x  x  1  x  1  0
[Figure: example network with nodes A to G]
74
Distance Vector Routing Example
Information in routing table of each node:
Iteration 2
At      Distance to reach node
node    A  B  C  D  E  F  G
A       0  1  1  2  1  1  2
B       1  0  1  2  2  2  x
C       1  1  0  1  2  2  2
D       2  2  1  0  x  2  1
E       1  2  2  x  0  2  x
F       1  2  2  2  2  0  1
G       2  x  2  1  x  1  0
[Figure: example network with nodes A to G]
75
Distance Vector Routing Example
Information in routing table of each node:
Iteration 3
At      Distance to reach node
node    A  B  C  D  E  F  G
A       0  1  1  2  1  1  2
B       1  0  1  2  2  2  3
C       1  1  0  1  2  2  2
D       2  2  1  0  3  2  1
E       1  2  2  3  0  2  3
F       1  2  2  2  2  0  1
G       2  3  2  1  3  1  0
[Figure: example network with nodes A to G]
76
Distance Vector Routing Table
Routing table at node B (the costs match row B of the example)
Destination    Cost    NextHop
A              1       A
C              1       C
D              2       C
E              2       A
F              2       A
G              3       A
[Figure: example network with nodes A to G]
77
Distance Vector Routing: Link Failure
• F detects that link to G has failed
• F sets distance to G to infinity and sends update to A
• A sets distance to G to infinity since it uses F to reach G
• A receives periodic update from C with 2-hop path to G
• A sets distance to G to 3 and sends update to F
• F decides it can reach G in 4 hops via A
[Figure: example network with nodes A to G]
78
Count to Infinity Problem
• Link from A to E fails
• A advertises distance of infinity to E, but
B and C advertise a distance of 2 to E !
79
Count to Infinity Problem
• Node X notices that its link to Y is
broken
• Other nodes believe that the route
through X is still good
• Mutual deception !!!
80
Heuristic Attempts at Solution
• Limit infinity to network diameter +1
– Small limit allows fast completion of counting to infinity
– Limits network size and growth!!!
• Do not advertise routes to successors (Split Horizon)
– If route to A goes to B, do not inform B about route to A
– Solve mutual deception problem
• Split horizon with poison reverse
– Advertise negative routes to node, learned from that node
– Poisoned the routes sent by the neighbors
• Waiting before advertising a route delays convergence
81
Split Horizon
• Avoid counting to infinity by solving “mutual
deception” problem
• When sending an update to node X, do not include
destinations that you would route through X
– If X thinks route is not through you, no effect
– If X thinks route is through you, X will timeout route
• Loops of more than 2 nodes defeat split horizon !!!
[Figure: nodes A, B, C, D with route entries written as destination:cost:next hop (C:2:B, C:1:C, C:∞:-)]
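Split horizon only changes what goes into the update sent to each neighbor. A minimal sketch, assuming a table of dest -> (cost, next_hop) entries and taking poison reverse to mean advertising an infinite cost back to the neighbor the route was learned from; the function name and the sample table are illustrative:

```python
INF = float("inf")


def advertisement(table, neighbor, poison_reverse=False):
    """Build the (dest -> cost) update sent to one neighbor.

    Plain split horizon omits routes whose next hop is that neighbor;
    with poison reverse they are included with infinite cost instead.
    """
    advert = {}
    for dest, (cost, next_hop) in table.items():
        if next_hop == neighbor:
            if poison_reverse:
                advert[dest] = INF
            continue
        advert[dest] = cost
    return advert


# A node that reaches C through B and D directly.
table = {"C": (2, "B"), "D": (1, "D")}
print(advertisement(table, "B"))                       # {'D': 1}
print(advertisement(table, "B", poison_reverse=True))  # {'C': inf, 'D': 1}
```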
82
Split Horizon with Poison Reverse
83
Distance Vector Routing Problems
84
Link State Routing
• Distributed, dynamic form of Dijkstra’s algorithm
• Strategy
– Send to all nodes (not just neighbors) information about
directly connected nodes (not entire route table) in LSP
• Basic data structure: Link State Packet (LSP)
– ID of the node that created the LSP
– Cost of link to each directly connected neighbor: vector
of (distance, successor) values
– Sequence number (SEQNO)
– Time-to-live (TTL) for this packet
85
Link State Routing
• Each node maintains a list of (ideally all) LSP’s
– Runs Dijkstra’s algorithm on list
– May discover its neighbors by “Hello” messages
• Information acquisition via reliable flooding
– Create new LSP periodically; send to 1-hop neighbors
• Increment SEQNO (start SEQNO at 0 on reboot)
– Store most recent (higher SEQNO) LSP from each node
– Forward new LSP to all nodes but the one that sent it
• Decrement TTL of each LSP; discard when TTL=0
– Try to minimize routing traffic “overhead”
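The store-and-forward decision made on each received LSP can be sketched as follows. This is a simplified model: acknowledgments, retransmission, and aging of stored LSPs are left out, and the class and parameter names are hypothetical:

```python
class LinkStateDatabase:
    """Keeps the most recent LSP per originating node."""

    def __init__(self, neighbors):
        self.neighbors = list(neighbors)
        self.stored = {}   # origin -> (seqno, ttl, links)

    def receive(self, origin, seqno, ttl, links, sender):
        """Process one LSP; return the neighbors it should be forwarded to."""
        ttl -= 1                           # decrement TTL on every hop
        if ttl <= 0:
            return []                      # expired: discard
        current = self.stored.get(origin)
        if current is not None and seqno <= current[0]:
            return []                      # old or duplicate: do not re-flood
        self.stored[origin] = (seqno, ttl, links)
        # Forward to every neighbor except the one it came from.
        return [n for n in self.neighbors if n != sender]


db = LinkStateDatabase(neighbors=["A", "C", "D"])
print(db.receive("X", seqno=1, ttl=8, links={"A": 1}, sender="A"))  # ['C', 'D']
print(db.receive("X", seqno=1, ttl=8, links={"A": 1}, sender="C"))  # []
```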
86
Reliable Flooding
[Figure: reliable flooding of an LSP through a network with nodes X, A, B, C, and D, shown in four stages (a) to (d)]
87
Link State Routing
• TTL fixes sequence number problems
– Wraparound, bit errors, host crashes
• Used in
– ARPANET: bad heuristics brought down network in 1981
– Internet: Open Shortest Path First (OSPF)
– Intermediate System-Intermediate System (IS-IS)
• Designed for DECnet
• Adopted by ISO for the Connectionless Network Protocol
(CLNP)
• Used in NSFNET backbone (and others) and some digital cellular
systems
• Minor variant in Novell NetWare
88
Route Calculation: Dijkstra’s
Shortest Path Algorithm
• Let
– N denotes set of nodes in the graph
– l (i, j) denotes non-negative cost (weight) for edge (i, j)
– s denotes this node
– M denotes the set of nodes incorporated so far
– C(n) denotes cost of the path from s to node n
M = {s}
for each n in N - {s}
    C(n) = l(s, n)    // calculate cost to each node
while (M != N)
    M = M union {w} such that C(w) is the minimum for
        all w in (N - M)
    for each n in (N - M)
        C(n) = MIN(C(n), C(w) + l(w, n))
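The pseudocode above can be sketched in runnable form. This version uses a priority queue (`heapq`) to pick the minimum-cost node, which is an implementation choice of this sketch and not something the slide specifies; the sample graph is hypothetical:

```python
import heapq


def dijkstra(graph, s):
    """graph maps node -> {neighbor: non-negative link cost}.

    Returns C, the cost of the shortest path from s to each reachable node.
    """
    cost = {s: 0}
    done = set()                  # the set M of nodes incorporated so far
    heap = [(0, s)]
    while heap:
        c, w = heapq.heappop(heap)
        if w in done:
            continue              # stale queue entry
        done.add(w)               # w has the minimum C(w) among N - M
        for n, link in graph[w].items():
            if n not in done and c + link < cost.get(n, float("inf")):
                cost[n] = c + link            # C(n) = MIN(C(n), C(w) + l(w, n))
                heapq.heappush(heap, (cost[n], n))
    return cost


graph = {
    "s": {"a": 1, "b": 4},
    "a": {"s": 1, "b": 2, "c": 6},
    "b": {"s": 4, "a": 2, "c": 3},
    "c": {"a": 6, "b": 3},
}
print(dijkstra(graph, "s"))  # {'s': 0, 'a': 1, 'b': 3, 'c': 6}
```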
89
Link State Routing
• At each router, perform a forward search
algorithm
• Router maintains two lists
– Tentative
– Confirmed
• Each list contains triplets
– <destination, cost, nexthop>
90
Link State Algorithm
1. Initialize confirmed with entry for self (cost = 0)
2. For newly added node (next), select its LSP
3. For each neighbor of next, calculate cost to reach
neighbor as the sum of cost from self to next and from
next to neighbor
   a. If neighbor is currently in neither confirmed nor tentative, add
      <neighbor, cost, nexthop> to tentative, where nexthop is the
      direction to reach next
   b. If neighbor is currently in tentative and cost is less than current
      cost for neighbor, then replace current entry with <neighbor,
      cost, nexthop>, where nexthop is the direction to reach next
4. If tentative is empty, stop. Otherwise pick entry from
tentative with the lowest cost, move it to confirmed and
return to step 2.
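The steps above can be sketched with two dictionaries for the confirmed and tentative lists. The link costs below (D-C 2, D-B 11, C-B 3, C-A 10, B-A 5) are an assumption chosen so that a run from node D passes through confirmed entries (C,2,C), (B,5,C), and (A,10,C); the function name and data layout are illustrative:

```python
def forward_search(lsps, self_id):
    """lsps maps node -> {neighbor: link cost}, one LSP per node.

    Returns confirmed: dest -> (cost, nexthop).
    """
    confirmed = {self_id: (0, None)}          # step 1
    tentative = {}
    nxt = self_id
    while True:
        base_cost, base_hop = confirmed[nxt]  # step 2: LSP of the newest node
        for neighbor, link_cost in lsps[nxt].items():   # step 3
            if neighbor in confirmed:
                continue
            cost = base_cost + link_cost
            # Direction to reach nxt; for our own neighbors it is the neighbor.
            hop = neighbor if nxt == self_id else base_hop
            if neighbor not in tentative or cost < tentative[neighbor][0]:
                tentative[neighbor] = (cost, hop)
        if not tentative:                     # step 4
            return confirmed
        nxt = min(tentative, key=lambda dest: tentative[dest][0])
        confirmed[nxt] = tentative.pop(nxt)


lsps = {
    "A": {"B": 5, "C": 10},
    "B": {"A": 5, "C": 3, "D": 11},
    "C": {"A": 10, "B": 3, "D": 2},
    "D": {"B": 11, "C": 2},
}
print(forward_search(lsps, "D"))
# {'D': (0, None), 'C': (2, 'C'), 'B': (5, 'C'), 'A': (10, 'C')}
```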
91
Route Calculation
At node D
Step  Confirmed list                          Tentative list
1.    (D,0,-)
2.    (D,0,-)                                 (C,2,C), (B,11,B)
3.    (D,0,-), (C,2,C)                        (B,11,B)
4.    (D,0,-), (C,2,C)                        (B,5,C), (A,12,C)
5.    (D,0,-), (C,2,C), (B,5,C)               (A,12,C)
6.    (D,0,-), (C,2,C), (B,5,C)               (A,10,C)
7.    (D,0,-), (C,2,C), (B,5,C), (A,10,C)
[Figure: four-node network A, B, C, D with link costs 2, 3, 5, 10, and 11]
92
OSPF Routing Protocol
• Authentication of routing messages
– Encrypted communication between routers
• Additional hierarchy
– Domains are split into areas
– Routers need not know how to reach every network in
a domain
– Routers only need to know how to get to the right area
• Load balancing
– Allows traffic to be distributed over multiple routes
93
Link State Routing - Scalability
• Stabilizes quickly, does not generate much
traffic, responds to changes or node failures
• One LSP per router in Internet (one vector element
for every link): stored information is large!
• Solution: hierarchical routing
– Split hosts into domains
– Routing occurs in two ways
• Within domains (scales as size of domain)
• Between domains (scales as number of domains)
– Add more levels as necessary
– Organize names (IP addresses) by routing domains
94
Hierarchical Routing Drawbacks
• Slightly inaccurate information
• Cannot distribute heavy loads
[Figure: the hierarchical path compared with the shortest path]
95
Comparison of Routing Approaches
• Distance Vector
– Communicate information to neighbors only
– Exchange information about entire network
• Link State
– Communicate information to entire network
– Exchange information about neighbors only
96
Link State Metrics
• Capture a general notion of distance
• A heuristic combination of (among other factors)
– Distance, Bandwidth
– Average traffic
– Queue length
– Measured delay
• A few to discuss
– Original ARPANET
– New ARPANET
– Revised ARPANET
97
Original ARPANET Metric
• Uniform 56 kbps lines
– Bandwidth equal on every line (hence irrelevant)
– Latency relatively unimportant
• Use queue length as distance (number of packets
waiting to use a link)
• Problems
– Uniform bandwidth assumption became invalid
– Latency comparable to 1kB transmission delay on
1.544 Mbps link
98
New ARPANET Metric
• Captured queuing delay, bandwidth, and latency
• Queue delay
– Timestamp packet arrival time (AT)
– Also stamp departure time (DT)
– Only calculate when link level ACK received
– Average DT - AT over packets and time
– If timeout, reset DT to departure time for
retransmission
• Used fixed (per-link) measurements
– Transmission time (bandwidth), Latency
• Add three terms to find “distance”
– Link cost = average delay over some time period
99
New ARPANET Metric
• Worked well
– Under light load
– Static factors dominated cost
• Oscillated under heavy load
– Heavily loaded link advertises high price
– All traffic moves off
– Link next advertises low price
– All traffic returns
– Repeat cycle
• Range of values too large
100
Revised ARPANET Metric
• Measure link utilization
• Feed measurement through function to restrict
dynamic range
• Specific function chosen carefully based on
bandwidth and latency
• Aspects of class of functions
– Cost is constant at low to moderate utilizations
– Link cost no more than 3 times idle link cost
– Maximum cost no more than 7 times minimum
– High-bandwidth, high-latency link (e.g., satellite) better
than low-bandwidth, low-latency link
101