Lecture 03 Thu Fri Routing Isis BGP
Overview
Desirable Characteristics
of Dynamic Routing
Automatically detect and adapt to
topology changes
Provide optimal routing
Scalability
Robustness
Simplicity
Rapid convergence
Some control of routing choices
E.g. which links we prefer to use
Interplay between routing
& forwarding
[Figure: the routing algorithm feeds forwarding; the value in the arriving packet's header (0111) selects the outgoing link]
IP Routing – finding the
path
Path is derived from information
received from the routing protocol
Several alternative paths may exist
best next hop stored in forwarding table
Decisions are updated periodically or as
topology changes (event driven)
Decisions are based on:
topology, policies and metrics (hop count,
filtering, delay, bandwidth, etc.)
Convergence – why do I
care?
Convergence is when all the routers
have a stable view of the network
When a network is not converged there
is network downtime
Packets don’t get to where they are
supposed to go
Black holes (packets “disappear”)
Routing Loops (packets go back and forth between
the same devices)
Occurs when there is a change in the status of
a router or its links
Internet Routing Hierarchy
The Internet is composed of
Autonomous Systems
Each Autonomous System is an
administrative entity that
Uses Interior Gateway Protocols (IGPs) to
determine routing within the Autonomous
System
Uses Exterior Gateway Protocols (EGPs) to
interact with other Autonomous Systems
Internet Routing
Architecture
[Figure: two interconnected Autonomous Systems (AS)]
Routing = building maps and giving directions
Forwarding = moving packets between interfaces according to
the “directions”
IP Forwarding
[Figure: path from Source S to Destination D across several IP subnets]
Forwarding decisions:
Destination address
class of service (fair queuing, precedence, others)
local requirements (packet filtering)
Routing Tables Feed the
Forwarding Table
Forwarding Information Base (FIB)
Static Routes
RIB Construction
Each routing protocol builds its own
Routing Information Base (RIB)
Each protocol has its own “view” of
“costs”
e.g., IS-IS uses administrative weights
e.g., BGP4 uses Autonomous System path
length
FIB Construction
There is only ONE forwarding table!
An algorithm is used to choose one
next-hop toward each IP destination
known by any routing protocol
the set of IP destinations present in any RIB
are collected
if a particular IP destination is present in
only one RIB, that RIB determines the next
hop forwarding path for that destination
FIB Construction
Choosing FIB entries, cont..
if a particular IP destination is present in
multiple RIBs, then a precedence is defined
to select which RIB entry determines the
next hop forwarding path for that
destination
This process normally chooses exactly one
next-hop toward a given destination
There are no standards for this; it is an
implementation (vendor) decision
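The FIB construction rules above can be sketched as follows. This is a minimal illustration, not any vendor's algorithm: the RIB contents and next-hop names are hypothetical, and the precedence values are an assumption that mirrors common Cisco IOS defaults (static 1, eBGP 20, IS-IS 115).

```python
# Hypothetical RIBs: one dict per routing protocol, prefix -> next hop.
ribs = {
    "static": {"0.0.0.0/0": "R1"},
    "isis": {"10.0.0.0/8": "R3", "10.1.0.0/16": "R4"},
    "bgp": {"10.1.0.0/16": "R9", "20.0.0.0/8": "R5"},
}

# Assumed precedence, lower wins. Each vendor defines its own values.
precedence = {"static": 1, "bgp": 20, "isis": 115}


def build_fib(ribs, precedence):
    """Collect every destination known to any RIB and pick one next hop each."""
    fib = {}
    for protocol, rib in ribs.items():
        for prefix, next_hop in rib.items():
            current = fib.get(prefix)
            # A prefix present in only one RIB is taken as-is; otherwise the
            # protocol with the better (lower) precedence value wins.
            if current is None or precedence[protocol] < precedence[current[0]]:
                fib[prefix] = (protocol, next_hop)
    return fib


fib = build_fib(ribs, precedence)
for prefix, (protocol, next_hop) in sorted(fib.items()):
    print(prefix, next_hop, protocol)
```

Here 10.1.0.0/16 is present in two RIBs, so the precedence table decides; every other prefix comes from the single RIB that knows it.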
FIB Contents
IP subnet and mask (or length) of
destinations
can be the “default” IP subnet
IP address of the “next hop” toward
that IP subnet
Interface id of the subnet associated
with the next hop
Optional: cost metric associated with
this entry in the forwarding table
IP routing
Default route
where to send packets if there is no entry
for the destination in the routing table
most machines have a single default route
often referred to as a default gateway
0.0.0.0/0
matches all possible destinations, but is usually
not the longest match
IP route lookup:
Longest match routing
Packet: Destination IP address: 10.1.1.1
[Figure: R1, R2, R3 and R4; R3 serves most of 10.0.0.0/8 except for 10.1.0.0/16, which is served by R4]
R2’s IP forwarding table (lookup is based on destination IP address):
10.0.0.0/8 R3
10.1.0.0/16 R4
20.0.0.0/8 R5
0.0.0.0/0 R1
Each entry is compared against the destination:
10.0.0.0/8: 10.1.1.1 & FF.00.00.00 vs. 10.0.0.0 & FF.00.00.00 Match! (length 8)
10.1.0.0/16: 10.1.1.1 & FF.FF.00.00 vs. 10.1.0.0 & FF.FF.00.00 Match! (length 16)
20.0.0.0/8: 10.1.1.1 & FF.00.00.00 vs. 20.0.0.0 & FF.00.00.00 No Match!
0.0.0.0/0: 10.1.1.1 & 00.00.00.00 vs. 0.0.0.0 & 00.00.00.00 Match! (length 0)
10.1.0.0/16 is the longest matching prefix (length 16). “R2” will send the packet to “R4”.
IP route lookup:
Longest match routing
Most specific/longest match always
wins!!
Many people forget this, even experienced
ISP engineers
Default route is 0.0.0.0/0
Can handle it using the normal longest
match algorithm
Matches everything. Always the shortest
match.
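The longest-match rule can be sketched with Python's standard ipaddress module. The table is R2's forwarding table from the example; the linear scan is only for illustration, since real routers use tries or hardware lookups.

```python
import ipaddress

# R2's forwarding table from the example: prefix -> next hop.
forwarding_table = {
    "10.0.0.0/8": "R3",
    "10.1.0.0/16": "R4",
    "20.0.0.0/8": "R5",
    "0.0.0.0/0": "R1",
}


def lookup(table, destination):
    """Return (network, next_hop) for the longest prefix containing destination."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        # Keep the matching entry with the largest prefix length.
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best


network, next_hop = lookup(forwarding_table, "10.1.1.1")
print(network, next_hop)
```

The default route 0.0.0.0/0 needs no special case: it matches every destination with length 0, so any other match beats it.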
Graph abstraction
[Figure: weighted graph of routers u, v, w, x, y, z with a cost on each link]
Graph: G = (N,E)
N = set of routers = { u, v, w, x, y, z }
E = set of links ={ (u,v), (u,x), (v,x),
(v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
Path: Sequence of edges (routers)
Define
d_x(y) := cost of least-cost path from x to y
Then
d_x(y) = min_v { c(x,v) + d_v(y) }
where the minimum is taken over all neighbors v of x
Distributed:
each node notifies neighbors only when its
Distance Vector changes
neighbors then notify their
neighbors if necessary
if the Distance Vector to any dest has changed, notify neighbors
[Figure: three nodes with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]
Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2
Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3
Each table gives the cost to x, y, z (columns) from x, y, z (rows) at three successive times.
node x table
from x: 0 2 7 | 0 2 3 | 0 2 3
from y: ∞ ∞ ∞ | 2 0 1 | 2 0 1
from z: ∞ ∞ ∞ | 7 1 0 | 3 1 0
node y table
from x: ∞ ∞ ∞ | 0 2 7 | 0 2 3
from y: 2 0 1 | 2 0 1 | 2 0 1
from z: ∞ ∞ ∞ | 7 1 0 | 3 1 0
node z table
from x: ∞ ∞ ∞ | 0 2 7 | 0 2 3
from y: ∞ ∞ ∞ | 2 0 1 | 2 0 1
from z: 7 1 0 | 3 1 0 | 3 1 0
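The exchange in the x, y, z example can be reproduced with a small synchronous simulation. This is a sketch under simplifying assumptions: all nodes exchange vectors in lock-step rounds, whereas a real protocol is asynchronous and event driven.

```python
INF = float("inf")

# Link costs from the example: c(x,y) = 2, c(y,z) = 1, c(x,z) = 7.
cost = {
    "x": {"y": 2, "z": 7},
    "y": {"x": 2, "z": 1},
    "z": {"x": 7, "y": 1},
}


def distance_vector(cost):
    """Iterate the Bellman-Ford update until no node's vector changes."""
    nodes = sorted(cost)
    # Initially each node knows only itself and its directly connected links.
    dv = {n: {d: (0 if d == n else cost[n].get(d, INF)) for d in nodes} for n in nodes}
    rounds = 0
    changed = True
    while changed:
        changed = False
        snapshot = {n: dict(dv[n]) for n in nodes}  # vectors as last advertised
        for n in nodes:
            for d in nodes:
                if d == n:
                    continue
                # d_n(d) = min over neighbours v of c(n,v) + d_v(d)
                best = min(cost[n][v] + snapshot[v][d] for v in cost[n])
                if best != dv[n][d]:
                    dv[n][d] = best
                    changed = True
        rounds += 1
    return dv, rounds


dv, rounds = distance_vector(cost)
print(dv["x"], dv["z"], rounds)
```

After the first round x has learned the cheaper path to z via y (cost 3 instead of 7), matching the tables above; the second round changes nothing, so the algorithm stops.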
Distance Vector (DV): link cost
changes
Link cost changes:
node detects local link cost change
updates routing info, recalculates
distance vector
if DV changes, notify neighbors
[Figure: link x-y cost changes from 4 to 1; c(y,z) = 1, c(x,z) = 50]
[Tables over time: node y's cost to x drops from 4 to 1; node z then updates its cost to x from 5 to 2]
Distance Vector: link cost
changes
[Tables over time: node y's cost to x rises from 4 to 6 and then to 8; node z's cost to x rises from 5 to 7]
Distance Vector: link cost
changes
Link cost changes:
good news travels fast
bad news travels slow: “count to infinity” problem!
44 iterations before algorithm
stabilizes.
[Figure: link x-y cost changes from 4 to 60; c(y,z) = 1, c(x,z) = 50]
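The slow climb can be reproduced with a minimal sketch in which y and z alternate updates of their cost to x. The link costs are those of the example; the strict alternation and the way updates are counted are assumptions of this sketch, so the number of steps it reports need not equal the iteration count quoted above.

```python
def count_to_infinity(c_xy, c_yz, c_xz, dz_x):
    """Alternate y's and z's Bellman-Ford update of their cost to x.

    dz_x is the (stale) cost to x that z last advertised to y.
    Returns the final costs and the list of (node, new_cost) updates.
    """
    history = []
    dy_x = None
    while True:
        new_dy = min(c_xy, c_yz + dz_x)  # y: direct link, or via z
        y_changed = new_dy != dy_x
        dy_x = new_dy
        if y_changed:
            history.append(("y", dy_x))
        new_dz = min(c_xz, c_yz + dy_x)  # z: direct link, or via y
        z_changed = new_dz != dz_x
        dz_x = new_dz
        if z_changed:
            history.append(("z", dz_x))
        if not y_changed and not z_changed:
            return dy_x, dz_x, history


# Link x-y jumps from 4 to 60 while z still advertises its old cost 5.
dy_x, dz_x, history = count_to_infinity(c_xy=60, c_yz=1, c_xz=50, dz_x=5)
print(dy_x, dz_x, len(history))
print(history[:4])
```

Each update raises the estimate by only the cost of the y-z link, which is why the bad news takes dozens of steps to settle at z's direct cost of 50.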
Poisoned reverse:
If Z routes through Y to get to X:
Z tells Y its (Z’s) distance to
X is infinite (so Y won’t route
to X via Z)
will this completely solve count
to infinity problem?
RIP (Routing Information
Protocol)
Distance vector algorithm
Included in BSD-UNIX Distribution in 1982
Distance metric: # of hops (max = 15
hops)
From router A to subnets:
[Figure: routers A, B, C, D and subnets u, v, w, x, y, z]
destination hops
u 1
v 2
w 2
x 3
y 3
z 2
RIP advertisements
Distance vectors: exchanged among
neighbors every 30 sec via Response
Message (also called advertisement)
Each advertisement: list of up to 25
destination nets within AS
RIP: link failure and
recovery
If no advertisement heard after 180
sec, neighbor/link declared dead
Routes via the neighbor are invalidated
New advertisements sent to neighbors
Neighbors in turn send out new
advertisements (if their tables changed)
Link failure info quickly propagates to entire
net
Poison reverse used to prevent ping-pong
loops (infinite distance = 16 hops)
Why not use RIP?
RIP is a Distance Vector Algorithm
Listen to neighbouring routes
Install all routes in routing table
Lowest hop count wins
Advertise all routes in table
Very simple, very stupid
Only metric is hop count
Network diameter is max 15 hops, 16 means unreachable (not large enough)
Slow convergence (routing loops)
Poor robustness
EIGRP
“Enhanced Interior Gateway Routing Protocol”
Predecessor was IGRP which was classful
IGRP developed by Cisco in mid 1980s to overcome
scalability problems with RIP
Cisco proprietary routing protocol
Distance Vector Routing Protocol
Has very good metric control
[Figure: weighted graph of routers A, B, C, D, E, F with a cost on each link]
Dijkstra’s algorithm:
example (2)
Resulting shortest-path tree from A:
[Figure: shortest-path tree rooted at A and spanning B, C, D, E, F]
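A compact sketch of Dijkstra's algorithm follows. The link costs are an assumption chosen for illustration, because the costs in the original figure cannot be read reliably from this text; the function returns both the least costs and the parent pointers that form the shortest-path tree.

```python
import heapq

# Assumed link costs between routers A..F (illustrative only).
graph = {
    "A": {"B": 2, "C": 5, "D": 1},
    "B": {"A": 2, "C": 3, "D": 2},
    "C": {"A": 5, "B": 3, "D": 3, "E": 1, "F": 5},
    "D": {"A": 1, "B": 2, "C": 3, "E": 1},
    "E": {"C": 1, "D": 1, "F": 2},
    "F": {"C": 5, "E": 2},
}


def dijkstra(graph, source):
    """Return (dist, parent): least cost to each node and the tree edges."""
    dist = {source: 0}
    parent = {}
    heap = [(0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        for neighbour, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                parent[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))
    return dist, parent


dist, parent = dijkstra(graph, "A")
print(dist)
print(parent)
```

Every link-state router runs this computation on its own copy of the topology database, with itself as the source.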
Intermediate-System to Intermediate-System
IS-IS Overview
The Intermediate System to Intermediate
System Routing Protocol (IS-IS) was originally
designed to route the ISO Connectionless
Network Protocol (CLNP) (ISO 10589 or RFC
1142)
Adapted for routing IP in addition to CLNP
(RFC 1195) as Integrated or Dual IS-IS
IS-IS is a Link State Protocol similar to
Open Shortest Path First (OSPF). OSPF
supports only IP
IS-IS Overview
3 network layer protocols play together to
deliver the ISO defined Connectionless
Network Service
CLNP
IS-IS
ES-IS – End System to Intermediate System
All 3 protocols run independently over layer 2
IS-IS Overview
CLNP is the ISO equivalent of IP for
datagram delivery services (ISO 8473, RFC
994)
ES-IS is designed for routing between
network hosts and routers (ISO 9542, RFC
995).
IS-IS is used for layer 3 routing between routers
(ISO 10589/RFC 1142). Integrated IS-IS
(RFC 1195) works within the ISO CLNS
framework even when used for routing only
IP.
IS-IS Overview
End System Hellos (ESH) from Hosts and
Intermediate System Hellos (ISH) from
Routers used for ES-IS neighbor discovery
Intermediate System to Intermediate
System Hellos (IIH) are used for
establishing IS-IS layer 3 adjacencies
ES-IS is tied into IS-IS layer 3
adjacency discovery. ES-IS is enabled
automatically when IS-IS is configured on
Cisco routers
Link State Algorithm
Each router contains a database
containing a map of the whole topology
Links
Their state (including cost)
All routers have the same information
All routers calculate the best path to
every destination
Any link state changes are flooded
across the network
“Global spread of local knowledge”
ISIS Levels
ISIS has a 2 layer hierarchy
Level-2 (the backbone)
Level-1 (the areas)
A router can be
Level-1 (L1) router
Level-2 (L2) router
Level-1-2 (L1L2) router
L1, L2, and L1L2 Routers
[Figure: four areas (Area-1 to Area-4) containing L1-only routers, joined by L1L2 routers and an L2-only router that together form the backbone]
IS-IS Protocol Concepts
IS-IS Packet Types
IS-IS Hello Packets (IIH)
Level 1 LAN IS-IS Hello
Level 2 LAN IS-IS Hello
Point-to-point Hello
Link State Packets (LSP)
Level 1 and Level 2
Complete Sequence Number packets (CSNP)
Level 1 and Level 2
Partial Sequence Number Packets (PSNP)
Level 1 and Level 2
Backbone & Areas
ISIS does not have a backbone area as
such (like OSPF)
Instead the backbone is the contiguous
collection of Level-2 capable routers
ISIS area borders are on links, not
routers
Each router is identified with Network
Entity Title (NET)
NET is an NSAP where the n-selector is 0
3. CLNS Addressing
NSAP Format
AFI Values
Requirements and Caveats
Examples
Globally unique NSAPs
CLNS Addressing
NSAP Format
AFI Values
X.121: 37
ISO DCC: 39
ISO 6523: 47
Local: 49
Example 1
47.0001.aaaa.bbbb.cccc.00
Area = 47.0001, SysID = aaaa.bbbb.cccc, NSel = 00
Example 2
39.0f01.0002.0000.0c00.1111.00
Area = 39.0f01.0002, SysID = 0000.0c00.1111, NSel = 00
Example 3
49.0002.0000.0000.0007.00
Area = 49.0002, SysID = 0000.0000.0007, NSel = 00
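The three examples can be split mechanically by reading the NET from the right. A minimal sketch, assuming the usual 6-byte System ID and 1-byte NSEL; parse_net is a hypothetical helper name, not a router command.

```python
def parse_net(net):
    """Split a NET into area, system ID and NSEL, reading from the right."""
    digits = net.replace(".", "")
    nsel = digits[-2:]        # last byte: n-selector (00 for a NET)
    sysid = digits[-14:-2]    # 6 bytes before it: System ID (assumed length)
    area = digits[:-14]       # everything left, AFI included, is the area

    def dotted(hex_digits):
        return ".".join(hex_digits[i:i + 4] for i in range(0, len(hex_digits), 4))

    # The AFI is one byte, so the area is written as AFI.xxxx.xxxx...
    area_text = area[:2] + ("." + dotted(area[2:]) if len(area) > 2 else "")
    return {"area": area_text, "sysid": dotted(sysid), "nsel": nsel}


for net in (
    "47.0001.aaaa.bbbb.cccc.00",
    "39.0f01.0002.0000.0c00.1111.00",
    "49.0002.0000.0000.0007.00",
):
    print(net, parse_net(net))
```

Reading from the right matters because the area is variable length, while the System ID and NSEL have fixed sizes.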
An Addressing Example
Area 1
49.0f01.0001.1111.1111.1111.00
49.0f01.0001.2222.2222.2222.00
Area 2
49.0f01.0002.3333.3333.3333.00
49.0f01.0002.4444.4444.4444.00
Area 3
49.0f01.0003.6666.6666.6666.00
Area 4
49.0f01.0004.7777.7777.7777.00
49.0f01.0004.8888.8888.8888.00
CLNS Addressing
How do most ISPs define System IDs?
Interface Loopback 0
IP address 192.168.3.25
Router isis
Net 49.0001.1921.6800.3025.00
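The convention shown above pads each octet of the loopback address to three digits and regroups the twelve digits into three blocks of four. A small sketch; the function names and the default area are assumptions taken from the example.

```python
def loopback_to_system_id(ip):
    """192.168.3.25 -> 192.168.003.025 -> 1921.6800.3025"""
    padded = "".join(f"{int(octet):03d}" for octet in ip.split("."))
    return ".".join(padded[i:i + 4] for i in range(0, 12, 4))


def loopback_to_net(ip, area="49.0001"):
    """Build the NET: area, System ID derived from the loopback, NSEL 00."""
    return f"{area}.{loopback_to_system_id(ip)}.00"


print(loopback_to_net("192.168.3.25"))
```

Because loopback addresses are already unique within the network, the derived System IDs are unique too, and an operator can map a System ID back to a router by eye.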
[Figure: routers RTA, RTB, RTC, RTD, RTE; LSP: RTA.00-00]
Hello packet fields (length in bytes):
Source ID: ID Length
Holding Time: 2
PDU Length: 2
Local Circuit ID: 1
[Figure: LSP RTA.00-00 (SEQ#100) and the matching PSNP (RTA.00-00, SEQ#100) on Interface 2 and on Interface 3]
IS-IS Database Timers
Timer Default Value Cisco IOS Command
Wide-Area-Network speeds
OC-1 is a SONET line with transmission speeds of
up to 51.84 Mbit/s.
OC-3 / STM-1x : 155.52 Mbit/s
OC-12 / STM-4x : 622.08 Mbit/s
OC-48 / STM-16x / 2.5G SONET
OC-192 / STM-64x / 10G SONET
OC-768 / STM-256x / 40G
Basic Configuration
[Figure: GSR1 (.8) and GSR2 (.2, e0) on 12.1.1.0/24; GSR2 (.5, Pos1/0) and GSR4 (.6, Pos1/0) on 198.168.1.4/30]
GSR2:
hostname GSR2
clns routing
!
interface Loopback0
ip address 13.1.1.2 255.255.255.0
ip router isis
!
interface Ethernet0
ip address 12.1.1.2 255.255.255.0
ip router isis
!
interface POS2/0
ip address 10.1.1.1 255.255.255.252
ip router isis
!
router isis
net 49.0001.0000.0000.0002.00
!
clns host GSR1 49.0001.0000.0000.0008.00
GSR4:
hostname GSR4
clns routing
!
interface Loopback0
ip address 13.1.1.2 255.255.255.0
ip router isis
!
interface POS2/0
ip address 10.1.1.2 255.255.255.0
ip router isis
!
router isis
net 49.0002.0000.0000.0004.00
Verifying Operation
show clns neighbors
GSR2#show clns neighbors
GSR2#sh ip route
Codes: C - connected, S - static,
i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
Summarization
hostname RTB
!
interface Ethernet0
ip address 172.170.1.1 255.255.255.0
ip router isis
!
router isis
summary-address 172.170.0.0 255.255.0.0
net 49.0001.0000.0000.0001.00
RTE#sh ip route
Gateway of last resort is not set
i L2 172.170.0.0/16 [115/20] via 172.16.5.5, Serial 0
172.16.0.0/16 is subnetted, 1 subnets
C 172.16.5.4/30 is directly connected, Serial0
172.170.1.0/24 172.16.2.0/24
172.80.1.1/24
RTE
router ospf 1
network 172.16.2.0 0.0.0.255 area 0
!
router isis
redistribute ospf 1 metric 20 metric-type internal level-2
net 49.0002.0000.0000.0002.00
Troubleshooting
CLNS Commands
show clns int
show clns protocol
show clns neighbors detail
show clns is-neighbors
show clns es-neighbors
show clns route
show clns cache
show clns traffic
show isis spf-log
show isis database detail
show isis database <lspid>
show isis route
show isis database L1|L2
Troubleshooting
SPF Logs
[Figure: SPF log output for Router-B and Router-C (Rtr-B, Rtr-C)]
router isis
passive-interface default
no passive-interface GigabitEthernet 0/0
interface GigabitEthernet 0/0
ip router isis
isis metric 1 level-2
Network Design Issues
As in all IP network designs, the key issue is
the addressing lay-out
ISIS supports a large number of routers in a
single area
When using areas, use summary-addresses
>400 routers in the backbone is quite doable
Border Gateway Protocol
Introduction
BGP Protocol Basics
Peering
[Figure: AS 100 (routers A, B) peering with AS 101 (routers C, D)]
[Figure: a small ISP]
Static routes or IGP routes to small customers
Static default route to provider
Static or IGP routes inside
What happens with other
ISPs in the same
region/country
Similar setup
Traffic between you and them goes over
Your expensive line
Their expensive line
Traffic can be significant
Your customers want to talk to their
customers
Same language/culture
Local email, discussion lists, web sites
Keeping Local Traffic Local
[Figure: two small ISPs in the UK, each connected to an upstream ISP in mainland Europe or the USA]
Consider a larger ISP with
multiple upstreams
Large ISP multi-homes to two or more
upstream providers
multiple connections
to achieve:
redundancy
connection diversity
increased speeds
Use BGP to choose a different upstream for
different destination addresses
A Large ISP with more
than one upstream
provider
[Figure: an ISP in the UK connected to two upstream ISPs, one in the USA and one in mainland Europe]
Terminology: “Policy”
Where do you want your traffic to go?
It is difficult to get what you want, but you
can try
Control of how you accept and send
routing updates to neighbours
Prefer cheaper connections
Prefer connections with better latency
Load-sharing, etc
“Policy” (continued)
Implementing policy:
Accepting routes from some ISPs and not
others
Sending some routes to some ISPs and not
to others
Preferring routes from some ISPs over
those from other ISPs
“Policy” Implementation
You want to use a local line to talk to
the customers of other local ISPs
local peering
You do not want other local ISPs to use
your expensive international lines
no free transit!
So you need some sort of control over
routing policies
BGP can do this
Terminology:
“Peering” and “Transit”
Peering: getting connectivity to the
network of other ISPs
… and just that network, no other networks
Usually at zero cost (zero-settlement)
Transit: getting connectivity through the
other ISP to other ISP networks
… getting connectivity to rest of world (or part
thereof)
Usually at cost (customer-provider
relationship)
Terminology:
“Aggregation”
Combining of several smaller blocks of
address space into a larger block
For example:
192.168.4.0/24 and 192.168.5.0/24 are
contiguous address blocks
They can be combined and represented as
192.168.4.0/23…
…with no loss of information!
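The aggregation example can be checked with the standard ipaddress module. The second call is an added counter-example: two adjacent /24 blocks only merge when together they form an aligned /23.

```python
import ipaddress


def aggregate(prefixes):
    """Merge contiguous, aligned blocks into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# The pair from the example merges into a single /23.
print(aggregate(["192.168.4.0/24", "192.168.5.0/24"]))

# Adjacent but not aligned on a /23 boundary: no aggregation is possible.
print(aggregate(["192.168.5.0/24", "192.168.6.0/24"]))
```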
Customers and Providers
[Figure: provider-customer links, where the customer pays the provider ($$$), and peer-peer links]
Peers provide transit between
their respective customers
[Figure: Internet hierarchy: global transit / “tier-1” core; large ISPs (ISP 1, ISP 2); content, consumer, hosting and CDN networks; customer IP networks]
Summary:
Why do I need BGP?
Multi-homing – connecting to multiple
providers
upstream providers
local networks – regional peering to get
local traffic
Policy discrimination
controlling how traffic flows
do not accidentally provide transit to non-
customers
BGP Part II
AS 100
AS1
BGP
Peers Exchange
All Routes
UPDATE:
“Announcement”: prefix is reachable
“Withdraw”: prefix is not reachable
KEEPALIVE:
keeps connection alive in absence of UPDATES
serves as ACK to an OPEN request
NOTIFICATION:
reports errors in previous msg;
closes a connection
BGP Attributes
... | Next Hop | AS Path | Local-Pref. | MED | Community | ...
Attributes are “knobs” for
traffic engineering
capacity planning
BGP Protocol Basics
Uses Incremental updates
sends one copy of the RIB at the beginning,
then sends changes as they happen
Path Vector protocol
keeps track of the AS path of routing
information
Many options for policy enforcement
Terminology
Neighbour
Configured BGP peer
NLRI/Prefix
NLRI – network layer reachability information
Reachability information for an IP address & mask
Router-ID
32 bit integer to uniquely identify router
Comes from Loopback or Highest IP address
configured on the router
Route/Path
NLRI advertised by a neighbour
Terminology
Transit – carrying network traffic across a
network, usually for a fee
Peering – exchanging routing information and
traffic
your customers and your peers’ customers network
information only.
not your peers’ peers; not your peers’ providers.
Peering also has another meaning:
BGP neighbour, whether or not transit is provided
[Figure: AS 100 (routers A, B; 100.100.8.0/24), AS 101 (routers C, D; 100.100.16.0/24) and AS 102 (router E; 100.100.32.0/24), joined by eBGP TCP/IP peer connections]
BGP speakers
are called peers
Peers in different AS’s
are called External Peers
Note: eBGP Peers normally should be directly connected.
BGP Peers – Internal
(iBGP)
[Figure: AS 100 (routers A, B; 100.100.8.0/24), AS 101 (routers C, D; 100.100.16.0/24) and AS 102 (router E; 100.100.32.0/24), with iBGP TCP/IP peer connections inside each AS]
BGP speakers
are called peers
Peers in the same AS
are called Internal Peers
Note: iBGP Peers don’t have to be directly connected.
Configuring eBGP peers
BGP peering sessions are established using the
BGP “neighbor” command
eBGP is configured when AS numbers are different
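[Figure: AS 100 (routers A, B) and AS 101 (routers C, D); A (.2) and B (.1) on 100.100.8.0/30, B (.2) and C (.1) on 110.110.10.0/30, C (.2) and D (.1) on 100.100.16.0/30]
[Figure: AS 100 with routers A, B and C joined by iBGP TCP/IP peer connections]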
Configuring iBGP peers:
Loopback interface
Loopback interfaces are normally used as the
iBGP peer connection end-points
[Figure: AS 100 with routers A (105.10.7.1), B (105.10.7.2) and C (105.10.7.3) joined by iBGP TCP/IP peer connections between their loopbacks]
Configuring iBGP peers
[Figure: AS 100 with routers A (105.10.7.1), B (105.10.7.2) and C (105.10.7.3) joined by iBGP TCP/IP peer connections]
Router A:
interface loopback 0
ip address 105.10.7.1 255.255.255.255
Router B:
interface loopback 0
ip address 105.10.7.2 255.255.255.255
router bgp 100
network 105.10.7.0 mask 255.255.255.0
neighbor 105.10.7.1 remote-as 100
neighbor 105.10.7.1 update-source loopback0
neighbor 105.10.7.3 remote-as 100
neighbor 105.10.7.3 update-source loopback0
Router C:
interface loopback 0
ip address 105.10.7.3 255.255.255.255
router bgp 100
network 105.10.7.0 mask 255.255.255.0
neighbor 105.10.7.1 remote-as 100
neighbor 105.10.7.1 update-source loopback0
neighbor 105.10.7.2 remote-as 100
neighbor 105.10.7.2 update-source loopback0
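The neighbor statements follow a regular pattern: each router peers with the loopback of every other router in the AS. A sketch that generates them, using the loopbacks from the example; ibgp_config and session_count are hypothetical helper names.

```python
loopbacks = {"A": "105.10.7.1", "B": "105.10.7.2", "C": "105.10.7.3"}


def ibgp_config(asn, loopbacks, local):
    """Emit the iBGP neighbor statements for one router in a full mesh."""
    lines = [f"router bgp {asn}"]
    for name, address in loopbacks.items():
        if name == local:
            continue  # a router does not peer with itself
        lines.append(f" neighbor {address} remote-as {asn}")
        lines.append(f" neighbor {address} update-source loopback0")
    return "\n".join(lines)


def session_count(routers):
    """Number of iBGP sessions in a full mesh of the given size."""
    return routers * (routers - 1) // 2


print(ibgp_config(100, loopbacks, "B"))
print(session_count(3), session_count(100))
```

The session count grows quadratically with the number of routers, which is the scaling problem that route reflectors address.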
Route Reflectors
• Route reflectors can pass on iBGP updates to clients
• Each RR passes along ONLY best routes
• ORIGINATOR_ID and CLUSTER_LIST attributes are needed to avoid loops
[Figure: three route reflectors (RR) with their clients]
BGP Part III
[Figure: AS-Path example with AS 300, AS 400 (150.10.0.0/16) and AS 500]
Network Path
180.10.0.0/16 300 200 100
170.10.0.0/16 300 200
150.10.0.0/16 300 400
AS-Path (with 16 and 32-bit ASNs)
Internet with 16-bit and 32-bit ASNs
AS-PATH length maintained
[Figure: AS 70000 (170.10.0.0/16), AS 80000 (180.10.0.0/16), AS 300, AS 400 (150.10.0.0/16) and AS 90000]
Paths as seen by a 16-bit-only speaker:
180.10.0.0/16 300 23456 23456
170.10.0.0/16 300 23456
Paths as seen by AS 90000:
180.10.0.0/16 300 70000 80000
170.10.0.0/16 300 70000
150.10.0.0/16 300 400
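The substitution shown in the tables can be sketched in a few lines: every ASN that does not fit in 16 bits is replaced by the reserved value 23456 (AS_TRANS), so the path keeps its length. This is a simplified illustration; the real mechanism also carries the true path in a separate attribute, which is not modelled here.

```python
AS_TRANS = 23456  # reserved placeholder ASN shown to 16-bit-only speakers


def as_path_for_16bit_speaker(as_path):
    """Replace each 32-bit-only ASN by AS_TRANS, keeping the path length."""
    return [asn if asn <= 65535 else AS_TRANS for asn in as_path]


print(as_path_for_16bit_speaker([300, 70000, 80000]))
print(as_path_for_16bit_speaker([300, 70000]))
print(as_path_for_16bit_speaker([300, 400]))
```

Because the length is preserved, path selection based on AS-path length gives the same answer for old and new speakers.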
Shorter Doesn’t Always
Mean Shorter
Mr. BGP says that path 4 1 is better than path 3 2 1. Duh!
In fairness: could you do this “right” and still scale?
Exporting internal state would dramatically increase global
instability and amount of routing state
[Figure: AS 1, AS 2, AS 3 and AS 4; the AS-path length does not reflect the number of router hops inside each AS]
ASPATH Padding
[Figure: 192.0.2.0/24 is announced with ASPATH = 2 on one link (towards AS 1) and with the padded ASPATH = 2 2 2 on the other link (towards the provider)]
Next Hop Attribute
[Figure: AS 100 (160.10.0.0/16), AS 200 (150.10.0.0/16) and AS 300 (140.10.0.0/16) with routers A to E; C (.1) and D (.2) share 192.10.1.0/30; a BGP Update is passed between the ASes]
Next hop to reach a network
Usually a local network is the next hop in eBGP session
Next Hop updated between eBGP Peers
Next hop not changed between iBGP peers
Next Hop Attribute (more)
IGP is used to carry route to next hops
Recursive route look-up
BGP looks into IGP to find out next hop
information
BGP is not permitted to use a BGP route as
the next hop
Unlinks BGP from actual physical
topology
Allows IGP to make intelligent
forwarding decision
Next Hop Best Practice
Cisco IOS default is for external next-hop
to be propagated unchanged to iBGP peers
This means that IGP has to carry external
next-hops
Forgetting means external network is invisible
With many eBGP peers, it is extra load on IGP
ISP best practice is to change external
next-hop to be that of the local router
neighbor x.x.x.x next-hop-self
Community Attribute
32-bit number
Conventionally written as two 16-bit
numbers separated by colon
First half is usually an AS number
ISP determines the meaning (if any) of the
second half
Carried in BGP protocol messages
Used by administratively-defined filters
Not directly used by BGP protocol (except
for a few “well known” communities)
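The two notations for a community are easy to convert between: the first 16 bits hold the AS number and the last 16 bits the value chosen by the ISP. A small sketch with hypothetical helper names:

```python
def encode_community(text):
    """'ASN:value' -> 32-bit number."""
    asn, value = (int(part) for part in text.split(":"))
    return (asn << 16) | value


def decode_community(number):
    """32-bit number -> 'ASN:value'."""
    return f"{number >> 16}:{number & 0xFFFF}"


print(hex(encode_community("65535:65281")))
print(decode_community(0xFFFFFF02))
print(encode_community("200:120"))
```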
BGP Updates:
Withdrawn Routes
Used to “withdraw” network reachability
Each withdrawn route is composed of:
Network Prefix
Mask Length
BGP Updates:
Withdrawn Routes
[Figure: AS 123 (.1) and AS 321 (.2) connected over 192.168.10.0/24; connectivity to 192.192.25.0/24 is lost, so a BGP Update message is sent with Withdraw Routes: 192.192.25.0/24]
• Best paths installed in routing table if:
• prefix and prefix length are unique
• lowest “protocol distance”
Route Table
D 10.1.2.0/24
D 160.10.1.0/24
D 160.10.3.0/24
R 153.22.0.0/16
S 192.1.1.0/24
B 173.21.0.0/16
An Example…
[Figure: 35.0.0.0/8 and routers A to F across AS3561, AS200, AS21, AS101 and AS675]
Routing Policy
Filtering
Terminology: “Policy”
Where do you want your traffic to go?
It is difficult to get what you want, but you can try
How?
AS based route filtering – filter list
Prefix based route filtering – prefix list
BGP attribute modification – route maps
Complex route filtering – route maps
Import Routes
[Figure: provider routes, peer routes, customer routes and ISP routes imported from providers, peers and customers]
Export Routes
[Figure: provider routes, peer routes, customer routes and ISP routes exported to providers, peers and customers; filters block some routes]
Filter list rules:
Regular Expressions
Regular Expression is a pattern to
match against an input string
Used to match against AS-path attribute
ex: ^3561_.*_100_.*_1$
Flexible enough to generate complex
filter list rules
Regular expressions (cisco
specific)
^ matches start
$ matches end
_ matches start, or end, or space (boundary
between words or numbers)
.* matches anything (0 or more characters)
.+ matches anything (1 or more characters)
[0-9] matches any single digit between 0 and 9
^$ matches an empty AS path (routes originated by the local AS)
There are many more possibilities
Filter list – using as-path
access list
Listen to routes originated by AS 3561.
Implicit deny everything else inbound.
Don’t announce routes originated by AS 35,
but announce everything else (outbound).
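The behaviour of these patterns can be tried out by translating them to Python regular expressions. This is an approximation for experimenting, not the router's own matcher: the only Cisco-specific token handled is "_", which is rewritten as "start, end, or a delimiter"; as_path_matches is a hypothetical helper name.

```python
import re

# Cisco "_" matches the start or end of the string or a delimiter
# between AS numbers (space, comma, braces, parentheses).
BOUNDARY = r"(?:^|$|[ ,{}()])"


def as_path_matches(cisco_regex, as_path):
    """Return True if the AS path (space separated) matches the pattern."""
    pattern = cisco_regex.replace("_", BOUNDARY)
    return re.search(pattern, as_path) is not None


# Routes originated by AS 3561: the path ends in 3561.
print(as_path_matches("_3561$", "701 3561"))
print(as_path_matches("_3561$", "701 13561"))

# Routes originated by AS 35 must not be confused with AS 3561.
print(as_path_matches("_35$", "100 3561"))

# The longer example pattern from the slides.
print(as_path_matches("^3561_.*_100_.*_1$", "3561 701 100 200 1"))
```

The "_" boundary is what keeps AS 35 from matching inside 3561 or 13561, a common mistake when a filter is written without it.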
BGP Attributes
Synchronization
Path Selection
BGP Path Attributes:
Why ?
Encoded as Type, Length & Value (TLV)
Transitive/Non-Transitive attributes
Some are mandatory
Used in path selection
To apply policy for steering traffic
BGP Attributes
Used to convey information associated
with NLRI
AS path
Next hop
Local preference
Multi-Exit Discriminator (MED)
Community
Origin
Aggregator
Local Preference
Not used by eBGP, mandatory for iBGP
Default value of 100 on Cisco IOS
Local to an AS
Used to prefer one exit over another
Path with highest local preference wins
Local Preference
[Figure: 160.10.0.0/16 in AS 100 is reachable from AS 400 via AS 200 (router D, local preference 500) and via AS 300 (router E, local preference 800)]
160.10.0.0/16 500
> 160.10.0.0/16 800
Multi-Exit Discriminator
Non-transitive
Represented as a numerical value
Range 0x0 – 0xffffffff
[Figure: AS 201 announces 192.68.1.0/24 to AS 200 (router C) with MED 2000 via router A and MED 1000 via router B; the path with MED 1000 is preferred]
Origin
Conveys the origin of the prefix
Historical attribute
Three values:
IGP – from BGP network statement
E.g. – network 35.0.0.0
EGP – redistributed from EGP (not used
today)
Incomplete – redistributed from another
routing protocol
E.g. – redistribute static
IGP < EGP < incomplete
Lowest origin code wins
Weight
Not really an attribute
Used when there is more than one route
to same destination
Local to the router on which it is assigned,
and not propagated in routing updates
Default is 32768 for paths that the router
originates and zero for other paths
Routes with a higher weight are preferred
when there are multiple routes to the
same destination
Communities
Transitive, Non-mandatory
Represented as a numeric value
0x0 – 0xffffffff
Internet convention is ASn:<0-65535>
Community: Local Preference
200:90: 90
200:120: 120
[Figure: customer AS 201 (routers A, B) announces 192.168.1.0/24 to service provider AS 200 (routers C, D), tagged Community:200:90 on one link and Community:200:120 on the other]
Import Routes
[Figure: provider routes, peer routes, customer routes and ISP routes imported from providers, peers and customers]
Export Routes
[Figure: provider routes, peer routes, customer routes and ISP routes exported to providers, peers and customers; filters block some routes]
How Can Routes be
Colored?
BGP Communities!
A community value is 32 bits
Used for signalling within and between ASes
Import (AS 1 tags routes as they are received):
Customer routes: 1:100
Peer routes: 1:200
Provider routes: 1:300
Export (AS 1 announces routes carrying these tags):
To Customers: 1:100, 1:200, 1:300
To Peers: 1:100
To Providers: 1:100
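The colouring scheme amounts to two small lookup tables: one tag applied on import, and one set of allowed tags per neighbour type on export. A sketch using the community values from the slide; the prefixes are made up for illustration.

```python
# Tag applied on import, by the kind of neighbour the route came from.
IMPORT_TAG = {"customer": "1:100", "peer": "1:200", "provider": "1:300"}

# Tags that may be exported to each kind of neighbour.
EXPORT_ALLOWED = {
    "customer": {"1:100", "1:200", "1:300"},
    "peer": {"1:100"},
    "provider": {"1:100"},
}


def routes_to_export(routes, neighbour_type):
    """Keep only the routes whose community is allowed towards this neighbour."""
    allowed = EXPORT_ALLOWED[neighbour_type]
    return [prefix for prefix, community in routes if community in allowed]


# Hypothetical routes, already tagged on import.
routes = [
    ("10.1.0.0/16", IMPORT_TAG["customer"]),
    ("10.2.0.0/16", IMPORT_TAG["peer"]),
    ("10.3.0.0/16", IMPORT_TAG["provider"]),
]

print(routes_to_export(routes, "customer"))
print(routes_to_export(routes, "peer"))
print(routes_to_export(routes, "provider"))
```

Only customer routes reach peers and providers, which is exactly the "no free transit" policy: peers and providers never learn a path through this AS to another peer or provider.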
Well-Known Communities
Several well known communities
www.iana.org/assignments/bgp-well-known-communities
no-export 65535:65281
do not advertise to any eBGP peers
no-advertise 65535:65282
do not advertise to any BGP peer
no-export-subconfed 65535:65283
do not advertise outside local AS (only used with
confederations)
no-peer 65535:65284
do not advertise to bi-lateral peers (RFC3765)
No-Export Community
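[Figure: AS 100 (routers A, B, C) announces the aggregate 105.7.0.0/16 and more specific 105.7.X.X routes tagged No-Export to AS 200 (routers D, E, F); only 105.7.0.0/16 is passed on to router G]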
BGP route selection
(bestpath)
Largest weight
Largest local preference
Locally sourced
Via redistribute or network statement
Shortest AS path
Lowest origin
IGP < EGP < incomplete
Lowest MED
Compared from paths from the same AS
BGP route selection
(bestpath)
External before internal
Choose external path before internal
Closest next-hop
Lower IGP metric, nearest exit to router
Lowest router ID
Lowest IP address of neighbour
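The selection steps can be expressed as one sort key, evaluated left to right. This is a simplified sketch, not a router implementation: it includes the AS-path length step of the usual Cisco ordering, compares MED across all paths rather than only between paths from the same AS, and the Path fields and their defaults are assumptions made for the example.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    name: str
    weight: int = 0
    local_pref: int = 100
    locally_sourced: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    external: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def sort_key(path):
    """Smaller key = better path; earlier fields dominate later ones."""
    return (
        -path.weight,                # largest weight
        -path.local_pref,            # largest local preference
        not path.locally_sourced,    # locally sourced first
        len(path.as_path),           # shortest AS path
        ORIGIN_RANK[path.origin],    # lowest origin: IGP < EGP < incomplete
        path.med,                    # lowest MED (simplified: compared globally)
        not path.external,           # external before internal
        path.igp_metric,             # closest next hop
        tuple(int(o) for o in path.router_id.split(".")),  # lowest router ID
    )


def best_path(paths):
    return min(paths, key=sort_key)


via_200 = Path("via AS 200", local_pref=500, as_path=(200, 100))
via_300 = Path("via AS 300", local_pref=800, as_path=(300, 100))
print(best_path([via_200, via_300]).name)

plain = Path("plain", as_path=(2,))
padded = Path("padded", as_path=(2, 2, 2))
print(best_path([padded, plain]).name)
```

The second comparison shows why AS-path padding works: with everything else equal, the padded announcement loses on path length.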
BGP Part VI
Configuring BGP
Basic commands
Getting routes into BGP
Basic BGP commands
Configuration commands
router bgp <AS-number>
no auto-summary
no synchronization
neighbor <ip address> remote-as <as-number>
Show commands
show ip bgp summary
show ip bgp neighbors
show ip bgp neighbor <ip address>
Inserting prefixes into BGP
Two main ways to insert prefixes into
BGP
network command
redistribute static
Both require the prefix to be in the
routing table
Configure iBGP
The two routers in your AS should talk
iBGP to each other
no filtering here
use “update-source loopback 0”
“network” command
Configuration Example
router bgp 1
network 105.32.4.0 mask 255.255.254.0
ip route 105.32.4.0 255.255.254.0 serial 0
matching route must exist in the routing table
before network is announced!
Prefix will have Origin code set to “IGP”
“redistribute static”
Configuration Example:
router bgp 1
redistribute static
ip route 105.32.4.0 255.255.254.0 serial0
Static route must exist before redistribute
command will work
Forces origin to be “incomplete”
Care required!
This will redistribute all static routes into BGP
Redistributing without using a filter is dangerous
“redistribute static”
Care required with redistribution
redistribute <routing-protocol> means
everything in the <routing-protocol> will be
transferred into the current routing protocol
will not scale if uncontrolled
best avoided if at all possible
redistribute normally used with “route-maps”
and under tight administrative
control
“route-map” is used to apply policies in BGP, so is
a kind of filter
BGP Part VII
Policy Interactions
[Figure: AS1 originates prefix p. Paths seen: AS2 p: 21; AS3 p: 321; AS6 p: 61; AS7 p: 71; AS8 p: 871; AS4 p: 4321 (preferred) or p: 461 (less preferred); AS5 p: 5871]
Only policy: AS 4 prefers path over AS 3 instead of AS 6!
Link failure / depeering / something between AS 2 and AS 3
AS4 falls back to p: 461
AS5: Old: 5 8 7 1, New: 5 4 6 1, based on ‘event’ between 2 and 3
Tim Griffin: “BGP Wedgies”
[Figure, repeated on each of the following slides: ISP A and ISP B (Tier 1), ISP D (Tier 2), ISP C and Customer E; Customer E has a primary link and a backup link]
Desired Situation…
Desired Situation via communities
Primary link fails…
Primary link recovers…
Summary
We have learned:
Why we use BGP
About the difference between Forwarding
and Routing
About Interior and Exterior Routing
What the BGP Building Blocks are
How to configure BGP
Where complexity comes from…
Limitations of the “Internet”