Chapter 03: Network Layer and IP (Reference)
Network layer
Two Key Network-Layer Functions
[Figure: the routing algorithm fills the local forwarding table; the value in the arriving packet’s header (e.g., 0111) selects one of the output links 1, 2, 3]
Datagram networks
[Figure: two end systems, each with application, transport, network, data link, and physical layers; 1. Send data, 2. Receive data]
Forwarding table
4 billion possible entries
[Table residue: destination address ranges mapped to link interfaces; last row: otherwise, interface 3]
Longest prefix matching
Examples
Physical layer:
  bit-level reception
Data link layer:
  e.g., Ethernet
  see chapter 5
Decentralized switching:
  given datagram dest., lookup output port using forwarding table in input port memory
  goal: complete input port processing at ‘line speed’
  queuing: if datagrams arrive faster than forwarding rate into switch fabric
Three types of switching fabrics
Switching Via Memory
System Bus
Switching Via a Bus
Buffering required when datagrams arrive from fabric faster than the
transmission rate
Scheduling discipline chooses among queued datagrams for transmission
Output port queueing
The Internet Network layer
physical layer
Chapter 4: Network Layer
4.1 Introduction
4.2 What’s inside a router
4.3 IP: Internet Protocol
  Datagram format
  IPv4 addressing
  ICMP
  IPv6
4.4 Routing algorithms
  Link state
  Distance Vector
  Hierarchical routing
4.5 Routing in the Internet
  RIP
  OSPF
  BGP
IP datagram format
[Figure: IP datagram layout, 32 bits wide]
Header fields:
  ver: IP protocol version number
  head. len: header length (bytes)
  type of service: “type” of data
  length: total datagram length (bytes)
  16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
  time to live: max number remaining hops (decremented at each router)
  upper layer: upper layer protocol to deliver payload to
  header checksum
  32 bit source IP address
  32 bit destination IP address
  Options (if any): e.g. timestamp, record route taken, specify list of routers to visit
  data (variable length, typically a TCP or UDP segment)
how much overhead with TCP?
  20 bytes of TCP
  20 bytes of IP
  = 40 bytes + app layer overhead
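The field layout above can be checked with a short decoding sketch. It assumes a 20-byte header without options; the function name parse_ipv4_header and the sample values (identifier 0x1234, TTL 64, checksum left at 0) are illustrative and do not come from the slides.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,                   # upper 4 bits
        "header_len_bytes": (ver_ihl & 0x0F) * 4,  # lower 4 bits count 32-bit words
        "type_of_service": tos,
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,                 # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,    # low 13 bits, in 8-byte units
        "time_to_live": ttl,
        "upper_layer_protocol": proto,             # 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hypothetical header: 20 bytes of IP carrying a 20-byte TCP header and no data.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, 0, 64, 6, 0,
                     socket.inet_aton("223.1.1.1"), socket.inet_aton("223.1.2.9"))
header = parse_ipv4_header(sample)
```

The total length of 40 in the sample corresponds to the 20 + 20 bytes of overhead counted above.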
IP Fragmentation & Reassembly
network links have MTU (max. transfer size): largest possible link-level frame.
  different link types, different MTUs
large IP datagram divided (“fragmented”) within net
  one datagram becomes several datagrams
  “reassembled” only at final destination
  IP header bits used to identify, order related fragments
[Figure: fragmentation, in: one large datagram, out: 3 smaller datagrams; reassembly at the final destination]
IP Fragmentation and Reassembly
Example
  4000 byte datagram
  MTU = 1500 bytes
One large datagram becomes several smaller datagrams:

  datagram      length   ID   fragflag   offset
  original      4000     x    0          0
  fragment 1    1500     x    1          0
  fragment 2    1500     x    1          185
  fragment 3    1040     x    0          370

1480 bytes in data field of each full fragment
offset = 1480/8 = 185
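The arithmetic of this example can be reproduced with a small sketch. It assumes a 20-byte header with no options on every fragment; the function name fragment is illustrative.

```python
def fragment(total_length: int, mtu: int, header_len: int = 20) -> list:
    """Split one datagram into fragments that fit the given MTU.

    Returns one dict per fragment with the header values used on the slide:
    length (bytes, header included), fragflag (1 = more fragments follow),
    and offset (in units of 8 bytes).
    """
    payload = total_length - header_len
    # Data per fragment must be a multiple of 8 bytes (except in the last one).
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload:
        data = min(max_data, payload - sent)
        more_follow = sent + data < payload
        fragments.append({
            "length": data + header_len,
            "fragflag": int(more_follow),
            "offset": sent // 8,
        })
        sent += data
    return fragments

frags = fragment(4000, 1500)
for f in frags:
    print(f)
```

For the 4000-byte datagram this yields three fragments with lengths 1500, 1500, 1040 and offsets 0, 185, 370, matching the table.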
Chapter 4: Network Layer
4.1 Introduction
4.2 Virtual circuit and datagram networks
4.3 What’s inside a router
4.4 IP: Internet Protocol
  Datagram format
  IPv4 addressing
  ICMP
  IPv6
4.5 Routing algorithms
  Link state
  Distance Vector
  Hierarchical routing
4.6 Routing in the Internet
  RIP
  OSPF
  BGP
4.7 Broadcast and multicast routing
IP Addressing: introduction
IP address: 32-bit identifier for host, router interface
interface: connection between host/router and physical link
  routers typically have multiple interfaces
  host typically has one interface
  IP addresses associated with each interface

223.1.1.1 = 11011111 00000001 00000001 00000001
               223       1        1        1

[Figure: one router joining three groups of interfaces: 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4; 223.1.2.1, 223.1.2.2, 223.1.2.9; 223.1.3.1, 223.1.3.2, 223.1.3.27]
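The dotted-decimal to binary conversion shown for 223.1.1.1 can be sketched as follows; the helper names to_binary and to_int are illustrative.

```python
def to_binary(dotted: str) -> str:
    """Render a dotted-decimal IPv4 address as four 8-bit groups."""
    return " ".join(f"{int(octet):08b}" for octet in dotted.split("."))

def to_int(dotted: str) -> int:
    """Pack the four octets into the single 32-bit identifier."""
    value = 0
    for octet in dotted.split("."):
        value = (value << 8) | int(octet)
    return value

print(to_binary("223.1.1.1"))
```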
Subnets
IP address:
  subnet part (high order bits)
  host part (low order bits)
What’s a subnet?
  device interfaces with same subnet part of IP address
  can physically reach each other without intervening router
[Figure: network consisting of 3 subnets, with interfaces 223.1.1.1 to 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]
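Subnets
Recipe
  To determine the subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet.
[Figure: the three subnets 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24]
How many?
[Figure: network with interfaces 223.1.1.1, 223.1.1.3, 223.1.1.4, 223.1.2.6, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1, 223.1.9.2]

  subnet part (23 bits)        host part (9 bits)
  11001000 00010111 0001000    0 00000000
  200.23.16.0/23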
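The subnet part / host part split can be explored with Python's standard ipaddress module. This is a minimal sketch; the helper same_subnet and the test addresses 200.23.17.42 and 200.23.18.1 are illustrative choices, not taken from the slides.

```python
import ipaddress

# The /23 example: 23 bits of subnet part, 9 bits of host part.
net = ipaddress.ip_network("200.23.16.0/23")
host_bits = 32 - net.prefixlen

def same_subnet(a: str, b: str, prefix_len: int) -> bool:
    """True if both interfaces share the same subnet part of the address."""
    net_a = ipaddress.ip_interface(f"{a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{b}/{prefix_len}").network
    return net_a == net_b

print(net.netmask, net.num_addresses, host_bits)
print(ipaddress.ip_address("200.23.17.42") in net)   # inside the /23
print(ipaddress.ip_address("200.23.18.1") in net)    # outside the /23
print(same_subnet("223.1.1.1", "223.1.1.4", 24))
print(same_subnet("223.1.1.1", "223.1.2.9", 24))
```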
IP addresses: how to get one?
[Figure: hosts A, B, E on the three-subnet network with a DHCP server attached; an arriving DHCP client needs an address in this network]
DHCP client-server scenario
DHCP server: 223.1.2.5; arriving client (time runs downward)
DHCP discover
  src: 0.0.0.0, 68
  dest: 255.255.255.255, 67
  yiaddr: 0.0.0.0
  transaction ID: 654
DHCP offer
  src: 223.1.2.5, 67
  dest: 255.255.255.255, 68
  yiaddr: 223.1.2.4
  transaction ID: 654
  Lifetime: 3600 secs
DHCP request
  src: 0.0.0.0, 68
  dest: 255.255.255.255, 67
  yiaddr: 223.1.2.4
  transaction ID: 655
  Lifetime: 3600 secs
DHCP ACK
  src: 223.1.2.5, 67
  dest: 255.255.255.255, 68
  yiaddr: 223.1.2.4
  transaction ID: 655
  Lifetime: 3600 secs
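The four-message exchange can be mimicked with a toy model. This is only a sketch of the message flow on the slide, not of the real DHCP wire format; the class names DhcpMessage and ToyDhcpServer and the single-address pool are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

BROADCAST = "255.255.255.255"

@dataclass
class DhcpMessage:
    kind: str                 # "discover", "offer", "request", or "ack"
    src: tuple                # (IP address, UDP port)
    dest: tuple
    yiaddr: str               # "your" address: the address being handed out
    transaction_id: int
    lifetime: Optional[int] = None

class ToyDhcpServer:
    """Hypothetical server that hands out addresses from a small pool."""

    def __init__(self, server_ip, free_addresses, lifetime=3600):
        self.server_ip = server_ip
        self.free = list(free_addresses)
        self.lifetime = lifetime
        self.leases = {}      # yiaddr -> lifetime in seconds

    def _reply(self, kind, yiaddr, transaction_id):
        return DhcpMessage(kind, (self.server_ip, 67), (BROADCAST, 68),
                           yiaddr, transaction_id, self.lifetime)

    def handle(self, msg):
        if msg.kind == "discover" and self.free:
            # Offer the first free address, echoing the client's transaction ID.
            return self._reply("offer", self.free[0], msg.transaction_id)
        if msg.kind == "request" and msg.yiaddr in self.free:
            self.free.remove(msg.yiaddr)
            self.leases[msg.yiaddr] = self.lifetime
            return self._reply("ack", msg.yiaddr, msg.transaction_id)
        return None

server = ToyDhcpServer("223.1.2.5", ["223.1.2.4"])
discover = DhcpMessage("discover", ("0.0.0.0", 68), (BROADCAST, 67), "0.0.0.0", 654)
offer = server.handle(discover)
request = DhcpMessage("request", ("0.0.0.0", 68), (BROADCAST, 67),
                      offer.yiaddr, 655, offer.lifetime)
ack = server.handle(request)
```

The client keeps using source address 0.0.0.0 and the broadcast destination until the ACK arrives, because it does not own an address before that point.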
IP addresses: how to get one?
Q: How does network get subnet part of IP addr?
A: gets allocated portion of its provider ISP’s address space

  ISP's block       11001000 00010111 00010000 00000000   200.23.16.0/20
  Organization 0    200.23.16.0/23
  Organization 1    200.23.18.0/23
  Organization 2    200.23.20.0/23
  ...
  Organization 7    200.23.30.0/23

[Figure: Fly-By-Night-ISP serves Organizations 0 through 7 and advertises to the Internet: “Send me anything with addresses beginning 200.23.16.0/20”. ISPs-R-Us advertises: “Send me anything with addresses beginning 199.31.0.0/16”]
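The allocation of the ISP's /20 block into eight /23 blocks can be reproduced with the standard ipaddress module. A minimal sketch; the variable names are illustrative.

```python
import ipaddress

isp_block = ipaddress.ip_network("200.23.16.0/20")

# Carving a /20 into /23 blocks uses 3 extra prefix bits: 2**3 = 8 organizations.
organizations = list(isp_block.subnets(new_prefix=23))

for index, block in enumerate(organizations):
    print(f"Organization {index}: {block}")
```

Every organization block lies inside 200.23.16.0/20, which is why the ISP can advertise the single aggregated prefix.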
Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1
[Figure: Organization 1 (200.23.18.0/23) is now attached to ISPs-R-Us. Fly-By-Night-ISP, serving Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23), advertises: “Send me anything with addresses beginning 200.23.16.0/20”. ISPs-R-Us advertises: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”]
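The more specific route wins because routers forward on the longest matching prefix. The sketch below assumes a simple linear scan over a list of (prefix, next hop) pairs; real routers use specialised lookup structures, and the function name and test addresses are illustrative.

```python
import ipaddress

def longest_prefix_match(table, address):
    """Return the next hop of the most specific prefix containing address."""
    addr = ipaddress.ip_address(address)
    best = None   # (network, next_hop) of the longest match so far
    for prefix, next_hop in table:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

# Advertisements as seen from the Internet in the figure above.
table = [
    ("200.23.16.0/20", "Fly-By-Night-ISP"),
    ("199.31.0.0/16", "ISPs-R-Us"),
    ("200.23.18.0/23", "ISPs-R-Us"),
]

print(longest_prefix_match(table, "200.23.18.7"))   # Organization 1
print(longest_prefix_match(table, "200.23.20.7"))   # Organization 2
```

An address in Organization 1 matches both the /20 and the /23, and the /23 entry is chosen.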
IP addressing: the last word...
[Figure residue: addresses 10.0.0.2, 10.0.0.3, 10.0.0.4 and 138.76.29.7]
[Figure: ARP maps a 32-bit Internet address to a 48-bit Ethernet address; RARP maps in the opposite direction]
RARP: Reverse Address Resolution
Protocol
RARP = Reverse ARP.
RARP is the opposite of ARP.
ARP is used when the IP address is
known but the physical address is not
known.
RARP is used when the physical address
is known but the IP address is not known.
RARP is often used in conjunction with
the BOOTP protocol (boot PROM) to boot
diskless workstations.
ICMP - Internet Control Message Protocol
no fragmentation allowed
IPv6 Header (Cont)
Priority: identify priority among datagrams in flow
Flow Label: identify datagrams in same “flow.”
(concept of “flow” not well defined).
Next header: identify upper layer protocol for data
Other Changes from IPv4
Checksum: removed entirely to reduce
processing time at each hop
Options: allowed, but outside of header,
indicated by “Next Header” field
ICMPv6: new version of ICMP
additional message types, e.g. “Packet Too
Big”
multicast group management functions
Transition From IPv4 To IPv6
Not all routers can be upgraded simultaneously
no “flag days”
How will the network operate with mixed
IPv4 and IPv6 routers?
Tunneling: IPv6 carried as payload in IPv4
datagram among IPv4 routers
Tunneling
Logical view: A and B (IPv6) connected through a tunnel to E and F (IPv6)
Physical view: A, B (IPv6), C, D (IPv4), E, F (IPv6)
  A-to-B: IPv6
  B-to-C: IPv6 inside IPv4
  D-to-E: IPv6 inside IPv4
  E-to-F: IPv6
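Tunneling is plain encapsulation: the whole IPv6 datagram becomes the payload of an IPv4 datagram between the tunnel endpoints. The sketch below is a toy model under that assumption; the class and function names are illustrative, and router names stand in for real addresses.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class IPv6Datagram:
    src: str
    dst: str
    data: bytes

@dataclass
class IPv4Datagram:
    src: str
    dst: str
    payload: Union[bytes, IPv6Datagram]

def enter_tunnel(inner: IPv6Datagram, tunnel_src: str, tunnel_dst: str) -> IPv4Datagram:
    """At B: wrap the IPv6 datagram, addressing the IPv4 datagram to the far end E."""
    return IPv4Datagram(src=tunnel_src, dst=tunnel_dst, payload=inner)

def leave_tunnel(outer: IPv4Datagram) -> IPv6Datagram:
    """At E: strip the IPv4 header and continue with the original IPv6 datagram."""
    return outer.payload

original = IPv6Datagram(src="A", dst="F", data=b"data")
in_tunnel = enter_tunnel(original, tunnel_src="B", tunnel_dst="E")   # carried B-to-C, C-to-D, D-to-E
delivered = leave_tunnel(in_tunnel)                                   # forwarded E-to-F as IPv6
```

The IPv4 routers C and D only look at the outer header (B to E); the inner source and destination (A to F) are untouched.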
Interplay between routing, forwarding
[Figure: the routing algorithm fills the local forwarding table; the value in the arriving packet’s header (e.g., 0111) selects one of the output links 1, 2, 3]
Graph abstraction
[Figure: six-node graph with weighted links among routers u, v, w, x, y, z]
Graph: G = (N,E)
N = set of routers = { u, v, w, x, y, z }
E = set of links = { (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
Dijkstra’s algorithm
net topology, link costs known to all nodes
  accomplished via “link state broadcast”
Notation:
  c(x,y): link cost from node x to y; = ∞ if not direct neighbors
[Figure: the six-node graph with weighted links among u, v, w, x, y, z]
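Dijkstra’s algorithm: example (2)
Resulting shortest-path tree from u:
[Figure: shortest-path tree rooted at u, reaching v, w, x, y, z]
Algorithm complexity:
  n(n+1)/2 comparisons: $O(n^2)$
  more efficient implementations: $O(n \log n)$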
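A heap-based sketch of Dijkstra's algorithm is shown below. The link set is the E listed earlier; the link costs are an assumption, since the figure carrying them did not survive extraction, so the resulting distances hold only for these assumed costs. The function name dijkstra is illustrative.

```python
import heapq

def dijkstra(links, source):
    """Least-cost paths from source over undirected links {(a, b): cost}.

    Returns (dist, prev): least cost to each node and its predecessor
    on the least-cost path, which together describe the shortest-path tree.
    """
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost

    dist = {source: 0}
    prev = {}
    done = set()                 # nodes whose least cost is final
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue             # stale heap entry
        done.add(node)
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if neighbor not in dist or candidate < dist[neighbor]:
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, prev

# Link costs below are assumed for illustration.
links = {
    ("u", "v"): 2, ("u", "x"): 1, ("v", "x"): 2,
    ("v", "w"): 3, ("x", "w"): 3, ("x", "y"): 1,
    ("w", "y"): 1, ("w", "z"): 5, ("y", "z"): 2,
}
dist, prev = dijkstra(links, "u")
print(dist)
print(prev)
```

With these costs, w and z are both reached through y, so the tree from u has branches u-v, u-x, x-y, y-w, y-z.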
Oscillations possible:
e.g., link cost = amount of carried traffic
[Figure: nodes A, B, C, D with traffic-dependent link costs (0, e, 1, 1+e, 2+e), shown initially and after each of three routing recomputations]
Distance Vector Algorithm
Bellman-Ford Equation (dynamic programming)
Define
  $d_x(y)$ := cost of least-cost path from x to y
Then
  $d_x(y) = \min_v \{\, c(x,v) + d_v(y) \,\}$
where the min is taken over all neighbors v of x
[Figure: three-node topology with c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]

$D_x(y) = \min\{c(x,y) + D_y(y),\ c(x,z) + D_z(y)\} = \min\{2+0,\ 7+1\} = 2$
$D_x(z) = \min\{c(x,y) + D_y(z),\ c(x,z) + D_z(z)\} = \min\{2+1,\ 7+0\} = 3$

Each table lists rows "from" and columns "cost to x y z", at three successive times (time runs left to right).

node x table
  from    initially    after 1st exchange    after 2nd exchange
  x       0 2 7        0 2 3                 0 2 3
  y       ∞ ∞ ∞        2 0 1                 2 0 1
  z       ∞ ∞ ∞        7 1 0                 3 1 0

node y table
  from    initially    after 1st exchange    after 2nd exchange
  x       ∞ ∞ ∞        0 2 7                 0 2 3
  y       2 0 1        2 0 1                 2 0 1
  z       ∞ ∞ ∞        7 1 0                 3 1 0

node z table
  from    initially    after 1st exchange    after 2nd exchange
  x       ∞ ∞ ∞        0 2 7                 0 2 3
  y       ∞ ∞ ∞        2 0 1                 2 0 1
  z       7 1 0        3 1 0                 3 1 0
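The table evolution above can be reproduced with a synchronous sketch of the distance-vector iteration: in every round each node applies the Bellman-Ford equation to the vectors its neighbors held in the previous round. The sketch tracks only each node's own vector (the row a node computes itself), assumes a connected topology, and uses illustrative names.

```python
INF = float("inf")

def distance_vector(costs, nodes):
    """Run synchronous DV rounds until no vector changes.

    costs maps undirected links (a, b) to their cost.
    Returns the list of rounds; each round maps node -> {dest: cost}.
    """
    def c(a, b):
        if a == b:
            return 0
        return costs.get((a, b), costs.get((b, a), INF))

    # Initialization: every node knows only its direct link costs.
    dv = {x: {y: c(x, y) for y in nodes} for x in nodes}
    rounds = [dv]
    while True:
        new = {
            x: {
                y: 0 if x == y else min(
                    c(x, v) + dv[v][y]
                    for v in nodes if v != x and c(x, v) < INF
                )
                for y in nodes
            }
            for x in nodes
        }
        if new == dv:            # converged: nobody would send an update
            return rounds
        rounds.append(new)
        dv = new

rounds = distance_vector({("x", "y"): 2, ("y", "z"): 1, ("x", "z"): 7}, ["x", "y", "z"])
for step, vectors in enumerate(rounds):
    print(step, vectors)
```

For this topology one exchange is enough for every node's own vector to settle: x and z each learn the cost-3 path through y.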
Distance Vector: link cost changes
distance vector: if DV changes, notify neighbors
“good news travels fast”
At time t0, y detects the link-cost change, updates its DV, and informs its neighbors.
At time t1, z receives the update from y and updates its table. It computes a new least cost to x and sends its neighbors its DV.
At time t2, y receives z’s update and updates its distance table. y’s least costs do not change and hence y does not send any message to z.
Distance Vector: link cost changes
Example:
Consider the three-node topology shown in
Figure 4.30. Rather than having the link costs
shown in Figure 4.30, the link costs are c(x,y) =
3, c(y,z) = 6, c(z,x) = 4. Compute the distance
tables after the initialization step and after each
iteration of a synchronous version of the
distance-vector algorithm
[Figure: three-node topology with c(x,y) = 3, c(y,z) = 6, c(z,x) = 4]
Comparison of LS and DV algorithms
Interconnected ASes
[Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c); inside a router, the intra-AS and inter-AS routing algorithms both feed the forwarding table]
forwarding table configured by both intra- and inter-AS routing algorithm
  intra-AS sets entries for internal dests
  inter-AS & intra-AS sets entries for external dests
Inter-AS tasks
suppose router in AS1 receives datagram destined outside of AS1:
  router should forward packet to gateway router, but which one?
AS1 must:
  1. learn which dests are reachable through AS2, which through AS3
  2. propagate this reachability info to all routers in AS1
Job of inter-AS routing!
Example: Setting forwarding table in router
1d
suppose AS1 learns (via inter-AS protocol) that
subnet x reachable via AS3 (gateway 1c) but not
via AS2.
inter-AS protocol propagates reachability info to all
internal routers.
router 1d determines from intra-AS routing info that
its interface I is on the least cost path to 1c.
installs forwarding table entry (x,I)
[Figure: the three-AS topology (AS1, AS2, AS3), with subnet x reachable through AS3]
Example: Choosing among multiple ASes
now suppose AS1 learns from inter-AS protocol
that subnet x is reachable from AS3 and from
AS2.
to configure forwarding table, router 1d must
determine towards which gateway it should
forward packets for dest x.
this is also job of inter-AS routing protocol!
hot potato routing: send packet towards closest
of two routers.
RIP (Routing Information Protocol)
distance vector algorithm
included in BSD-UNIX Distribution in 1982
distance metric: # of hops (max = 15 hops)
[Figure: routers A, B, C, D and subnets u, v, w, x, y, z]

  destination   hops
  u             1
  v             2
  w             2
  x             3
  y             3
  z             2
RIP advertisements
distance vectors: exchanged among
neighbors every 30 sec via Response
Message (also called advertisement)
each advertisement: list of up to 25
destination subnets within AS
RIP: Example
[Figure: routers A, B, C, D and subnets w, x, y, z]

Routing/Forwarding table in D

  Destination Network   Next Router   Num. of hops to dest.
  w                     A             2
  y                     B             2
  z                     B             7
  x                     --            1
  ....                  ....          ....

RIP: Example
Advertisement from A to D

  Dest   Next   hops
  w      -      1
  x      -      1
  z      C      4
  ....   ...    ...

Routing/Forwarding table in D after the advertisement

  Destination Network   Next Router   Num. of hops to dest.
  w                     A             2
  y                     B             2
  z                     A (was B)     5 (was 7)
  x                     --            1
  ....                  ....          ....
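The table change in D can be sketched as a single update rule: a route through the advertising neighbor costs one hop more than the neighbor's own count. This is a simplified sketch, not a full RIP implementation (no timers, no poison reverse); the function name and data layout are illustrative.

```python
INFINITY = 16   # RIP treats 16 hops as unreachable

def process_advertisement(table, neighbor, advert):
    """Return a new routing table after hearing advert from neighbor.

    table:  {destination: (next_router, hops)}, next_router None if directly attached
    advert: {destination: hops as counted by the neighbor}
    """
    updated = dict(table)
    for dest, hops in advert.items():
        new_hops = min(hops + 1, INFINITY)
        current = updated.get(dest)
        # Adopt the route if it is new, shorter, or a refresh from the
        # neighbor already used as next router for this destination.
        if current is None or new_hops < current[1] or current[0] == neighbor:
            updated[dest] = (neighbor, new_hops)
    return updated

table_d = {"w": ("A", 2), "y": ("B", 2), "z": ("B", 7), "x": (None, 1)}
advert_from_a = {"w": 1, "x": 1, "z": 4}
table_d_after = process_advertisement(table_d, "A", advert_from_a)
print(table_d_after)
```

Only the entry for z changes: A reaches z in 4 hops, so D can reach it in 5 via A, which beats the 7 hops via B.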
RIP: Link Failure and Recovery
If no advertisement heard after 180 sec -->
neighbor/link declared dead
routes via neighbor invalidated
new advertisements sent to neighbors
neighbors in turn send out new
advertisements (if tables changed)
link failure info quickly (?) propagates to
entire net
poison reverse used to prevent ping-pong
loops (infinite distance = 16 hops)
RIP Table processing
[Figure: two protocol stacks side by side, each with physical, link, network (IP) with forwarding table, and Transport (UDP) layers]
OSPF (Open Shortest Path First)
“open”: publicly available
uses Link State algorithm
LS packet dissemination
topology map at each node
route computation using Dijkstra’s algorithm
[Figure: the three-AS topology (AS1: 1a, 1b, 1c, 1d; AS2: 2a, 2b, 2c; AS3: 3a, 3b, 3c) with eBGP sessions and iBGP sessions marked]
Path attributes & BGP routes
advertised prefix includes BGP attributes.
  prefix + attributes = “route”
two important attributes:
  AS-PATH: contains ASs through which prefix advertisement has passed: e.g., AS 67, AS 17
  NEXT-HOP: indicates specific internal-AS router to next-hop AS. (may be multiple links from current AS to next-hop-AS)
when gateway router receives route advertisement, uses import policy to accept/decline.
BGP route selection
router may learn about more than 1
route to some prefix. Router must
select route.
elimination rules:
1. local preference value attribute: policy
decision
2. shortest AS-PATH
3. closest NEXT-HOP router: hot potato
routing
4. additional criteria
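The first three elimination rules can be sketched as a lexicographic comparison. The Route fields local_pref and cost_to_next_hop, and all sample values, are assumptions made for illustration; the "additional criteria" of rule 4 are not modelled. A higher local preference is treated as better, as in BGP.

```python
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str
    as_path: tuple          # ASs the advertisement has passed through
    next_hop: str           # router leading to the next-hop AS
    local_pref: int         # policy value; higher is preferred
    cost_to_next_hop: int   # intra-AS least cost to the NEXT-HOP router

def select_route(routes):
    """Apply rules 1-3 in order: local preference, AS-PATH length, hot potato."""
    return min(
        routes,
        key=lambda r: (-r.local_pref, len(r.as_path), r.cost_to_next_hop),
    )

# Hypothetical candidate routes to the same prefix.
r1 = Route("200.23.18.0/23", (67, 17), "1c", local_pref=100, cost_to_next_hop=2)
r2 = Route("200.23.18.0/23", (67,), "1b", local_pref=100, cost_to_next_hop=5)
r3 = Route("200.23.18.0/23", (67, 17, 5), "1a", local_pref=200, cost_to_next_hop=9)

best_overall = select_route([r1, r2, r3])   # policy (rule 1) decides
best_of_two = select_route([r1, r2])        # tie on rule 1, rule 2 decides
```

Note that r3 wins despite the longest AS-PATH: a later rule is consulted only when all earlier rules tie.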
BGP messages
BGP messages exchanged using TCP.
BGP messages:
OPEN: opens TCP connection to peer and
authenticates sender
UPDATE: advertises new path (or withdraws
old)
KEEPALIVE keeps connection alive in
absence of UPDATES; also ACKs OPEN
request
NOTIFICATION: reports errors in previous
msg; also used to close connection
BGP routing policy
[Figure: networks A, B, C, W, X; legend: provider network, customer network]
A advertises path AW to B
B advertises path BAW to X
Should B advertise path BAW to C?
  No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers
  B wants to force C to route to W via A
  B wants to route only to/from its customers!
Why different Intra- and Inter-AS routing ?
Policy:
Inter-AS: admin wants control over how its traffic
routed, who routes through its net.
Intra-AS: single admin, so no policy decisions
needed
Scale:
hierarchical routing saves table size, reduced
update traffic
Performance:
Intra-AS: can focus on performance
Broadcast Routing
deliver packets from source to all other nodes
source duplication is inefficient:
[Figure: source duplication (duplicate creation/transmission at R1) versus in-network duplication among routers R1, R2, R3, R4]
[Figure: (a) Broadcast initiated at A, (b) Broadcast initiated at D; nodes A through G]
Spanning Tree: Creation
Center node
Each node sends unicast join message to
center node
Message forwarded until it arrives at a node already
belonging to spanning tree
[Figure: (a) Stepwise construction of spanning tree, with join steps numbered 1 to 5; (b) Constructed spanning tree; nodes A through G]
Multicast Routing: Problem Statement
Goal: find a tree (or trees) connecting routers having local mcast group members
  tree: not all paths between routers used
  source-based: different tree from each sender to rcvrs
  shared-tree: same tree used by all group members
[Figure legend: S: source; router with attached group member; no downstream group members; routers R1, R4]
Dense:
  group members densely packed, in “close” proximity.
  bandwidth more plentiful
Sparse:
  # networks with group members small wrt # interconnected networks
  group members “widely dispersed”
  bandwidth not plentiful
Consequences of Sparse-Dense Dichotomy
Dense:
  group membership by routers assumed until routers explicitly prune
  data-driven construction on mcast tree (e.g., RPF)
  bandwidth and non-group-router processing profligate
Sparse:
  no membership until routers explicitly join
  receiver-driven construction of mcast tree (e.g., center-based)
  bandwidth and non-group-router processing conservative
PIM - Dense Mode
which distributes down RP-rooted tree
RP can extend mcast tree upstream to source
RP can send stop msg if no attached receivers
  “no one is listening!”
[Figure: routers R1 through R7 send join messages toward the rendezvous point; all data multicast from rendezvous point]
Summary
4.1 Introduction
4.2 What’s inside a router
4.3 IP: Internet Protocol
  Datagram format
  IPv4 addressing
  ICMP
  IPv6
4.4 Routing algorithms
  Link state
  Distance Vector
  Hierarchical routing
4.5 Routing in the Internet
  RIP
  OSPF
  BGP