6 TrafficControl
Olivier Bonaventure
http://inl.info.ucl.ac.be/
These slides are licensed under the Creative Commons Attribution Share-Alike license 3.0. You can obtain detailed information about this license at http://creativecommons.org/licenses/by-sa/3.0/
Course outline
Objective
Understand the protocols and mechanisms that
are required to support current and emerging
applications in IP-based networks
Topics covered
Traffic control in IP networks
IPv6
IP Multicast
MultiProtocol Label Switching (MPLS)
Virtual Private Networks
Traffic control in IP Networks
Outline
Applications
Packet-level traffic control mechanisms
Flow identification
Packet Marking
Buffer acceptance
Scheduling
Taxonomy of applications
Streaming applications
a minimum amount of resources is required for one
streaming application
if the minimum amount is available, the application works well
if the minimum amount is not available, the application doesn't work
Elastic applications : a few examples
Request-response applications
client server, NFS, Remote Procedure Calls,
distributed computing, ...
P2P search for file
Batch applications
ftp, remote backup, long http transactions, news
transfers, ...
P2P transfers
Streaming applications : a few examples
Conversational multimedia applications
voice or video over IP
Applications and protocol stack
UDP
Objectives
provide an unreliable connectionless packet
service over the unreliable IP service
Mechanisms
simple packet format with optional checksum
communicating applications are identified by
source IP address
destination IP address
source port number
destination port number
does not include mechanisms to ensure
reliable delivery
bit errors can be detected but not corrected
in-sequence delivery
UDP was initially defined in [RFC768] in 1980 and has not been modified since then. It relies on an eight-byte header that contains the following information :
- 16 bits source port number
- 16 bits destination port number
- 16 bits UDP length
- 16 bits UDP checksum
The maximum length of a UDP packet corresponds to the maximum length of an IP packet, namely 64 KBytes. Even if UDP supports large packets, most applications avoid sending UDP packets larger than the IP Maximum Transmission Unit.
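The eight-byte header described above can be illustrated with a short Python sketch. The helper names (`build_udp_header`, `parse_udp_header`) are hypothetical, and the checksum is simply left at zero (not computed) to keep the example small.

```python
import struct

# Four 16-bit fields in network byte order: source port, destination port,
# UDP length, UDP checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_header(src_port, dst_port, payload, checksum=0):
    """Return the 8-byte header; checksum 0 means 'not computed' in this sketch."""
    length = UDP_HEADER.size + len(payload)  # the length covers header + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)


def parse_udp_header(segment):
    """Split a UDP segment into its header fields and its payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    fields = {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
    }
    return fields, segment[UDP_HEADER.size:length]


if __name__ == "__main__":
    data = b"hello"
    segment = build_udp_header(5004, 53, data) + data
    print(parse_udp_header(segment))
```

Since the length field is only 16 bits wide, a UDP segment cannot exceed 65535 bytes, which matches the 64 KBytes limit mentioned above.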
TCP
Objectives
provide a reliable connection-oriented byte
stream service over the unreliable packet-based
IP service
Mechanisms
single packet format protected with checksum
connection establishment and release
a TCP connection is identified by the four-tuple
source IP address
destination IP address
source port number
destination port number
reliable data transfer
acknowledgements and retransmissions of lost packets
flow and congestion control
J.Postel. Transmission control protocol, protocol specification. Internet RFC 793, September 1981.
V.Jacobson. Congestion avoidance and control. In Proc. ACM SIGCOMM88, pages 314--329, August 1988.
W.Stevens. TCP slow start, congestion avoidance, fast retransmit and fast recovery algorithms. Internet RFC 2001, January 1997.
M.Allman, V.Paxson, and W.Stevens. TCP congestion control. Internet RFC 2581, April 1999.
V.Jacobson, B.Braden, and D.Borman. TCP extensions for high-performance. Internet RFC 1323, May 1992.
W.Stevens. TCP/IP Illustrated, volume 1 : The protocols. Addison-Wesley, 1994.
G.Wright and R.Stevens. TCP/IP Illustrated Vol. 2, The Implementation. Addison-Wesley, 1995.
S.Floyd. A report on recent developments in TCP congestion control. IEEE Communications Magazine, 39(4):84--90, April 2001.
TCP : summary
TCP : summary (2)
Internet traffic characteristics
K.Claffy, G.Miller, and K.Thompson. The nature of the beast : recent traffic measurements from an Internet backbone. In INET98, 1998. available from http://www.caida.org/Papers.
R.Koga and S.McCreary. Traffic workload overview. available from http://www.caida.org/Learn/Flow/tcpudp.html, June 1999.
G.Miller, K.Thompson, and R.Wilder. Performance measurement
K.Thompson, G.Miller, and R.Wilder. Wide-area internet traffic patterns and characteristics. IEEE Network Magazine, 11(6), November/December 1997. also available from http://www.vbns.net/presentations/papers.
National Laboratory for Applied Networking Research. Tutorial: Insight into current internet traffic workloads. available from http://www.nlanr.net/NA/tutorial.html, 1997.
A.Mena and J.Heidemann. An empirical study of real audio traffic. In INFOCOM2000, March 2000.
S.McCreary and K.Claffy. Trends in wide area IP traffic patterns : a view from Ames Internet Exchange. available from http://www.caida.org/outreach/papers/AIX0005/, 2000.
Internet traffic characteristics (3)
Source :
http://ipmon.sprintlabs.com/packstat/packetoverview.php
Stefan Saroiu, Krishna P. Gummadi, Richard J. Dunn, Steven D. Gribble, Henry M. Levy: An Analysis of Internet Content Delivery Systems.
Proceedings of 5th Symposium on Operating Systems Design and Implementation (OSDI) 2002, Boston, MA, USA, December 2002.
Internet traffic characteristics (4)
Source : http://ipmon.sprintlabs.com/packstat/viewresult.php?16:appsbreakdown:nyc-27.0-021121:
Predicted IP traffic growth
[Figure: predicted IP traffic in TB/month for 2005 to 2011; vertical axis from 0 to 22500000 TB/month]
Source for this data : Global IP traffic forecast and methodology, 2006-2011, Cisco 2007 white paper, [online version]
http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/net_implementation_white_paper0900aecd806a81aa.pdf
Predicted IP traffic growth (2)
[Figure: predicted IP traffic growth in TB/month for 2005 to 2011; vertical axis from 0 to 20000000 TB/month]
Predicted IP traffic growth (3)
[Figure: consumer IP traffic growth in TB/month for 2005 to 2011; vertical axis from 0 to 8000000 TB/month]
Peer-to-peer file sharing
Peer-to-peer file sharing (2)
Today, Napster does not work anymore because of copyright violations.
One of the most efficient file transfer protocols used today is Bittorrent. Bittorrent also divides files in blocks and allows files to be downloaded from several nodes at the same time. This provides good redundancy in case of node/link failures, but also allows an efficient utilisation of the available link bandwidth by using uncongested paths (the node with the highest bandwidth will automatically serve blocks faster than a congested node). A Bittorrent node will not necessarily receive blocks in sequence. Furthermore, to ensure that all Bittorrent users contribute to the system, Bittorrent implementations apply the tit-for-tat principle, which implies that once a node has received a block, it must serve this block to other nodes before being allowed to download new blocks.
[Figure: ring with nodes 5, 7 and 12]
CNPP/2008.6. © O. Bonaventure, 2008
Several Distributed Hash Tables have been proposed in the literature. One of the first and most influential ones is Chord :
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan, “Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications”, ACM SIGCOMM 2001
How to store files ?
Principle
File FOOBAR is stored on the node whose id is
the successor of hash(“FOOBAR”) on the ring
A node on the ring uses its successors to find the
responsible node for a given file
Examples
Hash(FOOBAR)=13
Hash(BARFOO)=5
[Figure: Chord ring with nodes 1, 5, 7, 12 and 15]
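The storage rule can be sketched in Python with the node identifiers and hash values of the example. The 4-bit identifier space (`M = 4`) is an assumption that fits identifiers 0 to 15, and `lookup_hops` is a hypothetical helper that uses a global view of the ring only to count how many successor pointers a real lookup would follow.

```python
import bisect

NODES = [1, 5, 7, 12, 15]  # node identifiers taken from the example
M = 4                      # assumed number of identifier bits (ring of 16 ids)


def successor(node_ids, key, m):
    """First node whose id is >= key on the ring, wrapping around after 2**m - 1."""
    ring = sorted(node_ids)
    idx = bisect.bisect_left(ring, key % (2 ** m))
    return ring[idx % len(ring)]


def lookup_hops(node_ids, start, key, m):
    """Number of successor hops from node `start` to the node responsible for key."""
    ring = sorted(node_ids)
    target = successor(ring, key, m)
    i = ring.index(start)
    hops = 0
    while ring[i] != target:
        i = (i + 1) % len(ring)  # follow the successor pointer
        hops += 1
    return hops


if __name__ == "__main__":
    print("FOOBAR (hash 13) is stored on node", successor(NODES, 13, M))
    print("BARFOO (hash 5) is stored on node", successor(NODES, 5, M))
    print("hops from node 1 to key 13:", lookup_hops(NODES, 1, 13, M))
```

With successor pointers only, the number of hops grows linearly with the number of nodes on the ring.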
How to find files faster ?
How to improve ?
Allow nodes to know additional pointers to other
nodes on the Chord ring to speed up lookup
m = number of bits in the key/node identifiers
Each node maintains a routing table (finger table) of m entries
The ith entry in the table at node n contains the identity
of the first node, s, that succeeds n by at least 2^(i−1) on
the identifier circle, i.e., s = successor(n + 2^(i−1))
arithmetic modulo 2^m is used
A finger table entry includes both the Chord identifier
and the IP address (and port number) of the relevant node.
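As an illustration, the finger table can be computed for the five-node ring of the previous example, reading the rule as s = successor(n + 2^(i−1)) with arithmetic modulo 2^m. This is a sketch: the value m = 4 is an assumption, and a real entry would also hold the IP address and port number of the node.

```python
import bisect


def successor(node_ids, key, m):
    """First node whose id is >= key on the ring (arithmetic modulo 2**m)."""
    ring = sorted(node_ids)
    idx = bisect.bisect_left(ring, key % (2 ** m))
    return ring[idx % len(ring)]


def finger_table(node_ids, n, m):
    """Entries 1..m of node n: entry i points to successor(n + 2**(i - 1))."""
    return [successor(node_ids, n + 2 ** (i - 1), m) for i in range(1, m + 1)]


if __name__ == "__main__":
    nodes = [1, 5, 7, 12, 15]
    for node in nodes:
        print(node, finger_table(nodes, node, 4))
```

The last entries point roughly half a ring away, which is what allows a lookup to halve the remaining distance at each hop.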
Scalable lookup with Chord
[Figure: lookup on a Chord ring; nodes 29, 37 and 46 are shown]
Audio : a closer look
Physical characteristics
sound is composed of acoustical waves travelling
through a physical medium (air)
An acoustical wave is characterised by its
frequency
expressed in Hertz [Hz]
amplitude or power conveyed by the wave
expressed in decibels
Limitation
human ear is only sensitive between
20 Hz and 22000 Hz
If we are only interested in humans, we can
neglect frequencies outside this range
but multimedia applications for dogs would have
different requirements ...
Audio : from sound to bits
How can we digitise sound ?
Nyquist theorem
any signal can be digitised by sampling provided that exact samples are taken at least at twice the highest frequency of the signal
[Figure: signal versus time, showing the samples and the sampling interval]
Quantification
exact samples are impossible in practice and cannot be transmitted
we accept a small degradation of the signal quality by using N bits to represent each sample
Audio : from sound to bits (2)
Telephone
human voice comprises sounds in the frequency
range 0-4000 Hz
historically, phone lines have been artificially
limited by low-pass filters in this frequency range
to improve transmission of signals on long links
although the telephone invented by Bell was initially
sold as a way to remotely listen to Opera...
Audio : from sound to bits (3)
Music
Good quality music should convey the complete
signal (0 - 22000 Hz) for human ear
lower qualities (e.g. AM radio) are possible with lower
frequency ranges
Samples should be accurately quantified
Stereo means two different channels
Example : CD
Sampling frequency : 44.1 kHz
16 bits per sample
two separate channels for CD-quality stereo music
Required bandwidth : 44100*16*2 = 1.4 Mbps
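The arithmetic behind this figure can be written as a small Python sketch, using the exact CD sampling frequency of 44.1 kHz. The function name `pcm_bit_rate` is hypothetical, and the telephone case assumes 8000 samples per second (twice the 4000 Hz voice band) with one byte per sample.

```python
def pcm_bit_rate(sampling_hz, bits_per_sample, channels=1):
    """Bit rate in bits per second of an uncompressed sampled signal."""
    return sampling_hz * bits_per_sample * channels


if __name__ == "__main__":
    cd = pcm_bit_rate(44_100, 16, channels=2)
    phone = pcm_bit_rate(8_000, 8)
    print(f"CD stereo : {cd / 1e6:.2f} Mbps")
    print(f"Telephone : {phone / 1e3:.0f} kbps")
```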
Audio : reducing bandwidth requirements
Nyquist based digitisation provides an
almost perfect encoding of the signal
drawback is the relatively high required bandwidth
[Figure: voice activity alternating between speaking and silent periods]
Audio : lossy compression
Human ear
a high amplitude signal at frequency f will
completely mask lower amplitude signals whose
frequency is close to f during some time
masked signals can be neglected without any loss of
audio quality
Audio signal
amplitude does not change quickly from one
sample to the next one
fewer bits can be used to encode each sample
Define models for human voice or music
fit audio signal to model and select appropriate
parameters and/or model
only transmit parameters and model id
Issues for multimedia transmission
How to packetise multimedia information?
samples are a few bits with audio
a single image can be very large for video
How does a destination distinguish different
media and encodings ?
How to recover from the limitations of IP ?
packet loss
packet reordering
packet duplication
How can a source know that her information
is received correctly ?
How to support multiparty communications ?
Transporting multimedia information
Can we rely on TCP ?
TCP provides a reliable transfer, but its usage of
retransmissions and timeouts may cause large
delays when losses occur
audio and video can easily survive with limited losses
TCP does not support multicast
Initially, TCP was rarely used by multimedia
applications
except when the application transfers "files"
Today, some multimedia applications rely on
TCP,
e.g. Skype, YouTube, distribution of Video on
Demand movies
main advantage of TCP : NAT and firewall
traversal
Transporting multimedia information (2)
UDP-based multimedia distribution
UDP supports multicast but lacks mechanisms to
reorder information received out of sequence
detect losses
recover from delay variations
identify multimedia information inside UDP segments
[Figure: protocol stack on both endsystems: Application, RTP, UDP (transport), IP, lower layers]
RTP is used by most conversational multimedia applications such as interactive Voice or Video over IP
RTP
RTP can be considered as a framework over
which various applications can be built
RTP provides the basic mechanisms needed by
most multimedia applications
two sub protocols
RTP which deals with the flow of data packets
RTCP which "controls" the flow of data packets
RTP uses an even UDP port number, RTCP uses the next (odd) UDP port number
another protocol is used to establish session
applications may add additional mechanisms
above RTP if needed
Source for this figure : W. Stallings, High-Speed networks : TCP/IP and ATM design principles, Prentice Hall, 1998
RTP is defined in
RFC1889 RTP: A Transport Protocol for Real-Time Applications. Audio-Video Transport Working Group, H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson. January 1996.
RFC1890 RTP Profile for Audio and Video Conferences with Minimal Control. Audio-Video Transport Working Group, H. Schulzrinne. January 1996.
[Figure: Alice and Bob communicating through the network]
RTCP
From time to time, provide feedback about quality of reception
Dealing with losses and reordering
Problem
RTP segments can be lost or reordered on their
way towards the destination
[Figure: RTP segments lost or reordered inside the network]
Classical solution
insert sequence number inside RTP header since UDP
does not contain any sequence number
Dealing with delay variations
Problem
Synchronous signal : one packet every T seconds
Network may introduce variable delays even when no
packets are lost
for example some packets may have to wait in the buffers of
an intermediate router
[Figure: packets sent every T seconds reach the destination after variable delays D1 to D5]
Dealing with delay variations (2)
Solution
introduce additional delay at destination
[Figure: distribution of packet delays (#packets versus delay) between the fixed delay and the maximum delay; packets arriving later than M are discarded]
playback buffer
incoming packets are stored inside a buffer
one packet is read by the application every T seconds
the size of the buffer is chosen to emulate a fixed
transmission delay of M seconds
packets whose transmission delay is larger than M seconds are
discarded inside the playback buffer
M is a compromise between low delay and low losses
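The behaviour of the playback buffer can be sketched with a small Python simulation. It assumes that packet i is sent at time i*T and played at time i*T + M; the function name `playout` and the delay values are illustrative only, with all times expressed in milliseconds.

```python
def playout(delays_ms, period_ms, playout_delay_ms):
    """Return (play_times, late) for packets sent every period_ms.

    Packet i is played at i * period_ms + playout_delay_ms, which emulates
    a fixed transmission delay of M = playout_delay_ms. A packet whose
    network delay exceeds M misses its slot and is discarded.
    """
    play_times, late = {}, []
    for seq, delay in enumerate(delays_ms):
        if delay <= playout_delay_ms:
            play_times[seq] = seq * period_ms + playout_delay_ms
        else:
            late.append(seq)
    return play_times, late


if __name__ == "__main__":
    delays = [30, 45, 120, 35, 60]  # hypothetical network delays
    for m in (50, 100, 150):
        _, late = playout(delays, period_ms=20, playout_delay_ms=m)
        print(f"M = {m} ms : late packets {late}")
```

Running the loop with several values of M shows the compromise: a larger M discards fewer packets but delays every packet.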
Dealing with delay variations (3)
[Figure: playback buffer placed between the network and the audio/video decoding]
Dealing with delay variations (4)
What happens with silence suppression ?
Problem
an expected packet can be late because either
the packet was delayed inside the network
no packet was generated due to silence suppression
when does the destination restart playback after
a period of silence ?
Solution
insert timestamp inside each packet
Packetising audio
Unit of information : audio sample
First solution
each sample is placed inside one RTP PDU
overhead would be prohibitive with one byte per sample
and 20 bytes of IP header, 8 bytes of UDP header and
12 bytes of RTP header
Better solution
N samples are placed inside one RTP PDU
we need N*sampling_interval seconds to prepare a complete RTP
PDU
Packetisation may cause a large delay inside source
64 Kbps voice with 10 samples per packet
packetisation delay : 1.25 msec
overhead RTP/UDP/IP : 80% (40 bytes out of 50)
the chosen value for N is always a compromise between overhead and
packetisation delay
common choice is to put 20 msec of audio in one packet
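The compromise can be checked numerically with a short Python sketch. It assumes 64 Kbps voice encoded as 8000 one-byte samples per second and the header sizes given above; the function name `packetisation` is hypothetical.

```python
HEADER_BYTES = 20 + 8 + 12  # IP + UDP + RTP headers


def packetisation(samples_per_packet, sampling_hz=8000, bytes_per_sample=1):
    """Return (packetisation delay in msec, fraction of the packet used by headers)."""
    delay_ms = 1000 * samples_per_packet / sampling_hz
    payload = samples_per_packet * bytes_per_sample
    overhead = HEADER_BYTES / (HEADER_BYTES + payload)
    return delay_ms, overhead


if __name__ == "__main__":
    for n in (1, 10, 160, 320):
        delay_ms, overhead = packetisation(n)
        print(f"N = {n:3d} : delay {delay_ms:6.3f} msec, overhead {overhead:.0%}")
```

With N = 160 samples (20 msec of audio), the headers account for 40 bytes out of 200.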
A common choice is to pack 20 milliseconds worth of audio inside one packet, but larger values are sometimes used as well, depending on the amount of overhead that can be tolerated.
RTP/UDP PDU format
UDP header (8 bytes), in 32 bits words
Source port | Destination port
Length | Checksum
RTP header (12 bytes), in 32 bits words
V | P | X | CC | M | PType | Sequence number
Timestamp
Synchronization source (SSRC) identifier
Possible header extension
Payload
Audio or video information
format is application dependent
Marker (M)
can be used to indicate frame or talk spurt boundaries
PType
Indicates the type (e.g. encoding) of audio/video data inside payload
Sequence number
monotonically increasing sequence number (+1 for each RTP PDU)
Timestamp
Time at which the information contained in the PDU was produced. Several PDUs may have same timestamp. The clock used for this timestamp is application dependent.
SSRC
Identification of the source that created the RTP PDU
This identification is chosen randomly by the source of the RTP PDU
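The 12 bytes fixed RTP header can be illustrated with a Python sketch. The bit widths used here (2 bits for V, 1 bit each for P, X and M, 4 bits for CC, 7 bits for PType) are those of the RTP specification; the helper names are hypothetical and the header extension is ignored.

```python
import struct

# Two bytes of flags, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC.
RTP_HEADER = struct.Struct("!BBHII")


def build_rtp_header(ptype, sequence, timestamp, ssrc, marker=0, version=2):
    """Build a fixed RTP header with P, X and CC left at zero."""
    byte0 = version << 6
    byte1 = (marker << 7) | ptype
    return RTP_HEADER.pack(byte0, byte1, sequence, timestamp, ssrc)


def parse_rtp_header(pdu):
    """Decode the fixed RTP header found at the start of pdu."""
    byte0, byte1, sequence, timestamp, ssrc = RTP_HEADER.unpack_from(pdu)
    return {
        "version": byte0 >> 6,
        "padding": (byte0 >> 5) & 1,
        "extension": (byte0 >> 4) & 1,
        "csrc_count": byte0 & 0x0F,
        "marker": byte1 >> 7,
        "ptype": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    header = build_rtp_header(ptype=0, sequence=1000, timestamp=160,
                              ssrc=0x12345678, marker=1)
    print(len(header), parse_rtp_header(header))
```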
Registered payload types are defined in RFC1890 and the updated list is available at http://www.isi.edu/in-notes/iana/assignments/rtp-parameters
RTCP
Three main objectives
Quality of service and congestion control
RTCP segments can be sent by a receiver as a kind of
low frequency ACK segments that are used to indicate
the quality of the reception
based on the receiver reports, the sender may adapt its
encoding (e.g. lower bandwidth during congestion)
Identification
provide more information about the application/user
that are sending RTP segments
Estimate the number of participants in multicast
sessions
RTCP feedback should only consume a fraction of the
bandwidth consumed by RTP
for multicast session, an estimate of the number of
receivers is needed to limit the RTCP bandwidth
RTCP (2)
RTP entities regularly send reports to provide
feedback about quality of transmission
Sender report
used by a sender of RTP packets to summarize
the amount of information it has sent recently
each report contains
absolute timestamp of the report
indicates the absolute transmission time of the report
64 bits NTP timestamp
timestamp relative to the flow of RTP packets
indicates the position of the report in the RTP flow
amount of data sent since beginning of session
total number of RTP segments sent
total number of octets sent
RTCP (3)
Receiver reports
used to inform senders about the quality of
reception
one report for each source (SSRC) heard
Each receiver report contains
indication of source (SSRC)
timestamp of last sender report received from this source
delay (seconds) since last sender report from this source
highest RTP sequence number received from this source
number of lost RTP packets for this source
fraction of lost RTP packets for this source
estimation of delay jitter for RTP packets from this source
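The jitter estimate carried in receiver reports can be sketched as follows. The smoothing rule J = J + (|D| - J)/16, where D is the difference between the transit times of two consecutive packets, is the one given in the RTP specification; the function name and the sample values are illustrative, and both timestamps are assumed to use the same clock units.

```python
def interarrival_jitter(packets):
    """Estimate the delay jitter from (send_timestamp, receive_timestamp) pairs."""
    jitter = 0.0
    prev_transit = None
    for sent, received in packets:
        transit = received - sent
        if prev_transit is not None:
            d = abs(transit - prev_transit)  # change in transit time
            jitter += (d - jitter) / 16      # low-pass filter with gain 1/16
        prev_transit = transit
    return jitter


if __name__ == "__main__":
    trace = [(0, 50), (160, 210), (320, 390), (480, 535)]
    print(interarrival_jitter(trace))
```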
RTCP (4)
Additional features
Information about applications/user
Source description (SDES) RTCP segments
can be used by senders or receivers in a session
CNAME : identification of user
typically user@host
NAME
real name of the person using the application
EMAIL
PHONE
LOC
geographic location of user
TOOL
application sending RTP segments
NOTE
any comment
Application requirements
End-to-end delay
Bandwidth
Packet loss
Ideal world
0 end-to-end delay
0% packet loss
infinite bandwidth
Real world
optimise for delay, bandwidth or loss, but not all !
Example streaming application
packetized voice
[Figure: voice as an acoustic signal, acoustical/electrical conversion into an electrical signal, then sampling, quantification and packetization into voice packets]
End-to-end packetized voice delay
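[Figure: voice path through AEC, Q, C and P at the source, the network, then P-1, C-1, Q-1 and EAC at the destination, with the delay added by each stage, including the decompression delay]
[Figure: path through three routers (R), showing propagation delay and queuing delay]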
Applications : summary
Application requirements
delay
(and delay jitter)
bandwidth
loss ratio
Traffic control in IP Networks
Outline
Applications
Packet-level traffic control mechanisms
Flow identification
Packet Marking
Buffer acceptance
Scheduling
First generation router
[Figure: router built around a single CPU + RAM]
CPU involved in
Packet reception (Interface -> RAM)
Packet queuing (in RAM)
Packet forwarding
Routing protocols to maintain forwarding table
Packet transmission (RAM -> interface)
http://www.cisco.com/en/US/products/hw/routers/ps233/ps235/index.html
http://www.cisco.com/en/US/products/hw/routers/ps986/index.html
[Figure: router with a main CPU + RAM and a forwarding cache (Fcache) on each interface]
Interfaces
Packets can be directly sent from input IF to output IF
Forwarding cache allows packets to bypass the CPU
Functions of the main CPU
Routing protocols with update of Forwarding cache
Treatment of packets not handled by Forwarding cache
Examples :
Cisco 7200
http://www.cisco.com/en/US/products/hw/routers/ps341/index.html
Current high-end routers
[Figure: interfaces interconnected by a bus or switching matrix, with a separate CPU + RAM for the routing protocols]
Interfaces
Contain full forwarding table
Contain memory for queuing + other mechanisms
Functions of the main CPU
Routing protocols with update of interfaces
Treatment of "abnormal" packets not handled by IF
In this tutorial, we place, for pedagogical reasons, the traffic control functions on the router's output ports. It should however be noted that on some router architectures, some of these functions may be placed elsewhere.
Alcatel R7770
http://www.alcatel.com/products/productsummary.jhtml?_DARGS=/common/opg/products/include/productbrief.jhtml_A&_DAV=/x/opgproduct/a7770obx.jhtml
Juniper M160
http://www.juniper.net/products/ip_infrastructure/m_series/index.html
Cisco 12000
http://www.cisco.com/en/US/products/hw/routers/ps167/index.html
What is a flow ?
Definition
a flow is a sequence of packets with one
common "characteristic"
characteristic can be based on any field of the packets
a flow usually exists for some period of time
Simple router v1
[Figure: IP packets arrive on input links 1 to N, the classifier attaches an internal ID to each packet, and the IP packets leave on the output link]
Layer-two flow
ATM
VCI/VPI
Frame-relay
DLCI
802.x LANs
VLANs, 802.1q
Layer-three flow
32 bits
Source IP address
Destination IP address
Payload
Layer-three flow (2)
[Figure: two networks, AS1 and AS2, each containing ER routers and interconnected through BR routers]
IP packet marking
ToS octet (bits 0 to 7)
bits 0-2 : Precedence (relative priority)
bits 3-6 : Type of Service
1000 minimize delay
0100 maximize throughput
0010 maximize reliability
0001 minimize monetary cost
0000 normal service
bit 7 : 0
current status
definition of ToS Octet changed several times
Precedence is used in some networks
ToS field is rarely used
Using the ToS Octet for marking
advantage : easy to implement
disadvantage : limited number of marked flows
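The layout of the ToS octet can be illustrated with a few lines of Python. Bit 0 is the most significant bit of the octet; the function name `decode_tos` and the label used for other bit combinations are hypothetical.

```python
TOS_MEANINGS = {
    0b1000: "minimize delay",
    0b0100: "maximize throughput",
    0b0010: "maximize reliability",
    0b0001: "minimize monetary cost",
    0b0000: "normal service",
}


def decode_tos(octet):
    """Split the ToS octet into (precedence, meaning of the Type of Service bits)."""
    precedence = octet >> 5    # bits 0-2, the three most significant bits
    tos = (octet >> 1) & 0x0F  # bits 3-6
    return precedence, TOS_MEANINGS.get(tos, "other combination")


if __name__ == "__main__":
    for value in (0x00, 0x10, 0xC8):
        print(f"{value:#04x} ->", decode_tos(value))
```

Three precedence bits only allow eight different markings, which is the limitation noted above.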
Layer-four flow
[Figure: IP header (Ver, IHL, ToS, Total length, ...) followed by the transport header (TCP)]
Identifying applications
Simple solution
Look at TCP/UDP port numbers
The problems
Not all applications use well-known port numbers
FTP
server and client may negotiate other non-default port numbers
than 20/21 for some file transfers
P2P and other new applications
rarely utilise well-known port numbers
Deployment of encrypted tunnels (IPSEC, L2TP,
PPTP) hides TCP/UDP headers from routers
if special treatment is required inside network, packets
should be marked at layer 3 before being encrypted
deployment of encryption together with traffic control
should be carefully done
Accurately identifying apps is difficult and often
requires specialised hardware that looks at
packet content
but encryption is more and more used by P2P apps
An analysis of P2P traffic by IPOQUE in 2007, see Internet Study 2007, http://www.ipoque.com, revealed that for edonkey and bittorrent 20%
of the traffic was encrypted
Traffic control in IP Networks
Outline
Applications
Packet-level traffic control mechanisms
Flow identification
Packet Marking
Buffer acceptance
Scheduling
Packet marking algorithms
Objectives
Identify inside the network packets that belong to
a class of applications and should receive some
kind of service
Marking is performed by classifier
Measuring the rate of a flow
Simple solution
time divided in fixed length intervals
Traffic contract defined as N bytes per interval
[Figure: time axis divided in successive fixed length intervals]
Drawback
starting time of first interval may have an influence over which packets are
accepted
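This drawback can be reproduced with a small Python sketch. The function name `accepted_packets` and the packet trace are illustrative; times are in milliseconds and the contract is N bytes per interval.

```python
def accepted_packets(arrivals, interval, budget, start=0):
    """Return the (time, size) packets accepted under a budget-per-interval contract.

    The byte counter is reset at the beginning of each interval; a packet is
    accepted if it still fits inside the budget of its interval.
    """
    accepted = []
    current, used = None, 0
    for time, size in arrivals:
        index = (time - start) // interval  # interval containing this arrival
        if index != current:
            current, used = index, 0
        if used + size <= budget:
            used += size
            accepted.append((time, size))
    return accepted


if __name__ == "__main__":
    trace = [(600, 500), (800, 500), (1100, 500), (1300, 500)]
    for start in (0, 500):
        kept = accepted_packets(trace, interval=1000, budget=1000, start=start)
        print(f"first interval starts at {start} ms : {len(kept)} packets accepted")
```

The same four packets are all accepted when the first interval starts at 0 ms, but only two of them are accepted when it starts at 500 ms.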
Measuring the rate of a flow (2)
Improved solution
when a packet arrives, the rate is equal to the
number of bytes received during the last W sec
divided by W (W: time window)
time
Drawback
not really implementable in practice
Time sliding window
Time sliding window
remembering packet arrivals is not possible
estimate average rate
on packet arrival, assume that flow was fluid and
sending at estimated average rate during the last W
seconds (W : sliding window)
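A minimal Python sketch of such an estimator is shown below. It follows the usual formulation of the time sliding window rate estimator; the class and attribute names are hypothetical, and the initial rate of zero is an assumption.

```python
class TimeSlidingWindow:
    """Average rate estimator that only remembers the rate and the last arrival time."""

    def __init__(self, window, initial_rate=0.0, now=0.0):
        self.window = window          # W, in seconds
        self.avg_rate = initial_rate  # estimated average rate, in bytes/sec
        self.t_front = now            # time of the last packet arrival

    def update(self, now, size):
        """Account for a packet of `size` bytes arriving at time `now`."""
        # Assume the flow was fluid at avg_rate during the last W seconds.
        bytes_in_window = self.avg_rate * self.window
        new_bytes = bytes_in_window + size
        self.avg_rate = new_bytes / (now - self.t_front + self.window)
        self.t_front = now
        return self.avg_rate


if __name__ == "__main__":
    tsw = TimeSlidingWindow(window=1.0)
    for second in range(1, 11):
        print(second, tsw.update(float(second), 1000))
```

For a flow sending 1000 bytes every second, the estimate converges towards 1000 bytes/sec without storing any packet arrival.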
Token Bucket
Token bucket
R : average rate in bytes/sec
B : size of the token bucket
C : number of tokens inside the bucket (maximum : B tokens)
Bucket filling
Initialisation
    C=B;
    every 1/R second do
    {
        if(C<B)
            C=C+1;
    }
Arrival of packet P of length L
    if (L <= C)
    { /* packet is accepted */
        C=C-L;
    }
    else
    {
        /* packet is discarded */
    }
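The pseudocode above can be turned into a runnable Python sketch. Instead of a timer that adds one token every 1/R second, this version adds the tokens earned since the previous arrival, which is an implementation shortcut and not part of the original algorithm; the class name is hypothetical.

```python
class TokenBucket:
    """Token bucket policer with rate R (bytes/sec) and bucket size B (bytes)."""

    def __init__(self, rate, size, now=0):
        self.rate = rate    # R
        self.size = size    # B
        self.tokens = size  # C, initialised to B
        self.last = now     # time of the last refill

    def _refill(self, now):
        # Tokens earned since the last refill, capped at B.
        self.tokens = min(self.size, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def accept(self, now, length):
        """Return True if a packet of `length` bytes arriving at `now` is conforming."""
        self._refill(now)
        if length <= self.tokens:
            self.tokens -= length  # packet is accepted
            return True
        return False               # packet is discarded


if __name__ == "__main__":
    bucket = TokenBucket(rate=1000, size=1500)
    for when, length in [(0, 1500), (0, 100), (1, 1000), (1, 1)]:
        print(when, length, bucket.accept(when, length))
```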
Token Bucket (2)
Advantages
can be used by a network provider to enforce a
traffic contract since it provides a precise
algorithmic definition for
conforming packets
non-conforming packets
Token Bucket in shaping mode
Problem
How can we ensure that one particular flow is
conforming to a contract ?
utilise modified token bucket in shaping mode (shaper)
1 token every 1/R seconds
C : number of tokens inside the bucket (maximum : B tokens)
[Figure: incoming packets wait in a buffer in front of the token bucket before leaving as outgoing packets]
Arrival of packet of size L
    if (L <= C)
    { /* packet arrived on time */
        C=C-L;
        transmit_packet();
    }
    else
    { /* packet arrived too early
       * delay packet inside buffer
       * until it becomes conforming
       */
        while (C<L)
        { /* wait */ }
        /* now C=L and packet is conforming */
        C=C-L;
        transmit_packet();
    }
Traffic at the output of the shaper is
conforming with R,B traffic contract
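The effect of the shaper can be sketched in Python by computing the departure time of each packet instead of busy-waiting. The function name `shape` is hypothetical; the sketch assumes a FIFO buffer of unlimited size and packets no longer than B, since a longer packet could never become conforming.

```python
def shape(arrivals, rate, size):
    """Return the departure time of each (arrival_time, length) packet.

    rate is R in bytes/sec and size is B in bytes. Packets leave in order,
    as soon as the bucket holds enough tokens.
    """
    tokens, last = float(size), 0.0
    departures = []
    for arrival, length in arrivals:
        if length > size:
            raise ValueError("packet longer than the bucket size")
        now = max(arrival, last)  # cannot leave before the previous packet
        tokens = min(size, tokens + (now - last) * rate)
        if length > tokens:
            now += (length - tokens) / rate  # wait until C reaches L
            tokens = length
        tokens -= length
        last = now
        departures.append(now)
    return departures


if __name__ == "__main__":
    burst = [(0, 1000), (0, 1000), (0, 500)]
    print(shape(burst, rate=1000, size=1000))
```

A burst that arrives at once leaves the shaper spaced out at the contracted rate.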
Deterministic marking
Principle
Modify token bucket to mark non-conforming
packets instead of discarding them
must specify bucket size in
addition to minimum bandwidth
C : number of tokens inside the bucket (maximum : B tokens)
Bucket filling
Initialisation
    C=B;
    every 1/R second do
    {
        if(C<B)
            C=C+1;
    }
Arrival of packet P of length L
    if (L <= C)
    { /* packet is guaranteed */
        C=C-L;
    }
    else
    {
        /* packet is in excess */
    }
This marker can also be modified to support more than three types of packets.
See J. Heinanen and R. Guerin, A Single Rate Three Color Marker, RFC 2697, Sept. 1999
J. Heinanen and R. Guerin, A Two Rate Three Color Marker, RFC 2698, Sept. 1999
Extensions to token bucket
See also
O.Bonaventure and S.De Cnodder. A rate adaptive shaper for differentiated services. Internet RFC2963, October 2000.
for a shaper that can be used to improve the performance of TCP with such markers
Cisco routers have a different way to implement this kind of token bucket with two burst sizes. See
S. Vegesna, IP Quality of Service, Cisco Press, 2001
Extensions to token bucket (2)
Improved router
[Figure: input links, classification, policing, buffer, shaping and output link]
Classifier + Marker
Layer 4
Layer 3 - ToS byte
Layer 2.5 - MPLS
Policer
Token bucket, Three Color Marker (token bucket based or time sliding window based)
Shaper
Token bucket
Traffic control in IP Networks
Outline
Applications
Packet-level traffic control mechanisms
Flow identification
Packet Marking
Buffer acceptance
Scheduling
Improved router
Packet treatment inside router's output port
[Figure: input links, classification, policing, buffer, shaping and output link]
Buffer acceptance algorithms
Buffer acceptance algorithms (2)
Objectives
control the amount of packets in the buffer to
efficiently support best-effort traffic
should provide a fair utilisation of the routers buffers
[Figure: link efficiency, on a scale up to 100%]
Tail drop
Advantages
easy to implement
can limit the number of packet losses with a large buffer
Disadvantages
no distinction between the various flows
not the best solution for TCP traffic
Random Early Detection
Goals
should be easily implemented in simple routers
with a single logical queue
achieve a low, but non-zero, average buffer
occupancy
low average occupancy provides low delay for
interactive applications and ensures fast TCP response
non-zero average occupancy ensures an efficient
utilization of the output link
approximate a fair discard of packets among the
active flows without identifying them
discard packets in a TCP friendly way
we should avoid discarding bursts of packets since
TCP reacts severely to burst losses
See also : S. Floyd. Red (Random Early detection) queue management. available from http://www.aciri.org/floyd/red.html
Random Early Detection (2)
Principle
How can we detect congestion ?
measure average buffer occupancy by using a low-
pass filter
buffer is considered congested when its average
occupancy is above a configured threshold
threshold value usually around 10%- 20% of buffer size
What do we do in case of congestion ?
Probabilistic drop for incoming packet
drop will force TCP to slow down
drop probability should increase with congestion level
Random Early Detection (3)
Implementation
suitable for routers with a single queue
[Figure: output buffer and drop probability Pa as a function of Avg, with thresholds Min_th and Max_th and levels Maxp and 1]
Packet arrival :
    compute average queue occupancy, avg
    if (avg < min_th)
        // no congestion
        accept packet
    else if (min_th <= avg < max_th)
        // near congestion, probabilistic drop
        calculate probability Pa
        with probability Pa
            discard packet
        else with probability (1-Pa)
            accept packet
    else if avg >= max_th
        discard packet
Steady state : average buffer occupancy between Min_th and Max_th
The computation of the average queue occupancy is based on a low-pass filter. On each packet arrival, the average is computed as :
average = (average * (1-wq)) + (buffer_occupancy * wq)
where wq < 1 and usually of the form 1/2^n for implementation reasons
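The low-pass filter and the drop decision can be combined in a short Python sketch. The linear increase of the drop probability from 0 at min_th to max_p at max_th is an assumption of this sketch (it is the simplest form of the RED probability, without the correction that spaces drops evenly); the class name and the parameter values are illustrative.

```python
import random


class Red:
    """Simplified Random Early Detection for a single queue."""

    def __init__(self, min_th, max_th, max_p, wq, rng=None):
        self.min_th, self.max_th = min_th, max_th
        self.max_p = max_p
        self.wq = wq   # filter weight, e.g. 1/2**9
        self.avg = 0.0
        self.rng = rng or random.Random()

    def drop_probability(self):
        """Pa for the current average occupancy (assumed linear ramp)."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def on_arrival(self, buffer_occupancy):
        """Update the average and return True if the arriving packet is accepted."""
        self.avg = self.avg * (1 - self.wq) + buffer_occupancy * self.wq
        return self.rng.random() >= self.drop_probability()


if __name__ == "__main__":
    red = Red(min_th=20, max_th=60, max_p=0.1, wq=1 / 2 ** 9, rng=random.Random(1))
    accepted = sum(red.on_arrival(50) for _ in range(5000))
    print(f"avg = {red.avg:.1f}, accepted {accepted} packets out of 5000")
```

Because wq is small, the average reacts slowly: a short burst that fills the buffer for a few packets does not immediately cause drops.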
Issues with RED
A description of FRED
D. Lin and R. Morris. Dynamics of random early detection. In SIGCOMM 97, pages 137--145, Cannes, France, September 1997.
Characterising best-effort service
What should be the goal of a best-effort
service ?
The network should provide a fair service to all its
best-effort users
Fairness definition for a single bottleneck link
[Figure: users 1 to N sharing a single bottleneck link behind router R]
Max-min fairness
Property
a max-min fair allocation is such that in order to
increase the bandwidth allocated to one source, it is
necessary to decrease the bandwidth allocated to
another source which already receives a lower
allocation
Max-min fairness : example
[Figure: sources S1 to S8 and destinations D1 to D8 connected through routers (R) over Link1 to Link4; link capacities of 100 Mbps and 34 Mbps are shown]
Algorithm [Bertsekas & Gallager, Data Networks, 2nd edition, Prentice Hall 1992]
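For the special case of a single bottleneck link, a max-min fair allocation can be computed by progressive filling, as in the Python sketch below. This is only an illustration of the property stated earlier, not the general network algorithm from the reference; the function name, the notion of a per-user demand and the numbers are assumptions.

```python
def max_min_share(capacity, demands):
    """Max-min fair allocation of `capacity` among users with the given demands.

    Users are served in increasing order of demand. A user asking for less
    than the current fair share gets its demand; what it leaves unused is
    shared among the remaining users.
    """
    allocation = {}
    remaining = capacity
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        fair = remaining / len(pending)
        user, demand = pending[0]
        if demand <= fair:
            allocation[user] = demand
            remaining -= demand
            pending.pop(0)
        else:
            # Every remaining user wants more than the fair share.
            for user, _ in pending:
                allocation[user] = fair
            break
    return allocation


if __name__ == "__main__":
    print(max_min_share(100, {"A": 10, "B": 40, "C": 80}))
```

In this example, increasing the allocation of C would require decreasing the allocation of A or B, which already receive less than C.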
Another way to control congestion
[Figure: path from A to D through routers R1, R2 and R3]
Alternatives to TCP
congestion control
Other methods to detect congestion
DECBit
proposed at the same conference as TCP's congestion
control mechanism
routers tag packets when they are congested
destination returns congestion indications in acks
Frame relay
FECN : congestion indicated in forward direction on VC
BECN : congestion indicated in backward direction on VC
ATM
Available Bit Rate
Bit-based congestion indication (EFCI bit)
Explicit Rate congestion indication (RM cells)
How to provide Explicit Congestion
Notification to TCP ?
Problems to solve
1. How to detect congestion before it occurs ?
Difficult with tail-drop since buffer becomes full
easy with RED, since the buffer is never full
K.Ramakrishnan and S.Floyd. A proposal to add Explicit Congestion Notification (ECN) to IP. Internet RFC 2481, January 1999.
K.Ramakrishnan, S.Floyd, and D.Black. The addition of explicit congestion notification (ECN) to IP. Internet RFC 3168, September 2001.
[Figure: path from A to D through routers R1 and R2]
Potential problems
What happens if the returning ECN-echo ack is lost ?
How can we deploy such a solution when 99.99% of the TCP
sources/destinations are not ECN capable ?
Solution
allow sender to confirm notification to receiver
[Figure: path from A to D through routers R1 and R2]
Deployment of ECN
How can we support ECN in endsystems ?
TCP
TCP must be modified to support the new flags
at connection establishment time, the utilisation of ECN
will be negotiated during the three way handshake
if both endsystems support ECN, it will be used
if one of the endsystems does not support ECN, fallback to
normal TCP congestion control
96
Ten years ago, upgrading hosts was considered to be a serious deployment problem. Today, with the regular distribution of patches and security fixes from all major OS vendors and the use of automatic upgrades, a very large number of hosts can support a new feature or a new protocol within a few months.
TCP is still the dominant transport protocol in today's Internet. UDP has not been changed to support ECN given its datagram nature. However, some applications running on top of UDP have been enhanced to support ECN. Several researchers have proposed solutions to provide TCP-like or TCP-compatible congestion control for UDP-based applications.
SCTP is another transport protocol that was developed in recent years. It is rarely used by classical applications, but some operating systems already support it.
Deployment of ECN (2)
How can we deploy ECN-aware routers ?
to work properly, a router should be able to
distinguish between :
IP packets from ECN-capable flows
these flows should be notified when congestion occurs
IP packets from non-ECN capable flows
packets from these flows should be discarded during congestion
ECT bit in IP header
[Figure: hosts A and D connected through routers R1 and R2]
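The router behaviour described above can be sketched as follows. The slide speaks of a single ECT bit; this sketch assumes the two-bit ECN field of RFC 3168 instead (Not-ECT, ECT(1), ECT(0), CE). The function name and the returned tuples are hypothetical.

```python
# Two-bit ECN field codepoints as defined in RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def handle_congestion(ecn_field):
    """Decide what a congested router does with one packet (sketch).

    Returns (action, new_ecn_field).
    """
    if ecn_field == NOT_ECT:
        # Flow is not ECN-capable: the only possible signal is a loss.
        return ("drop", ecn_field)
    # Flow is ECN-capable: keep the packet and mark Congestion Experienced.
    return ("forward", CE)
```

A packet that already carries CE stays marked and is forwarded, so a marking made by an upstream router is not lost.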
97
As a side note, it is interesting to note what happened when ECN was added to the Linux kernel. The first tests with this kernel revealed some subtle problems when using ECN on the Internet. Apparently, some firewalls were not ECN-aware and dropped ECN packets, considering them as invalid, irrespective of other firewall rules. Affected vendors have since fixed their firewall software, but this indicates that subtle problems may appear during deployment.
Packet discard preferences
Problem
two types of packets
high and low priority packets
carry preferably high priority packets
Solution
Discard less-important packets earlier than others
[Figure: Partial Buffer Sharing; below the threshold the buffer accepts H+L packets, above the threshold it accepts H packets only]
Arrival of packet
if (Packet.Type == H)
{ /* packet is high priority */
  if (Buf.Length < Buf.Size)
    accept_packet();
  else
    discard_packet();
}
else
{ /* packet is low priority */
  if (Buf.Length < Buf.Threshold)
    accept_packet();
  else
    discard_packet();
}
98
Partial Buffer Sharing can easily be extended to support N different drop priorities.
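As an illustration of this extension, the sketch below uses one threshold per drop priority, the threshold of the highest priority being the buffer size. The function name, the priority labels and the threshold values are hypothetical.

```python
def accept(buf_length, priority, thresholds):
    """Partial Buffer Sharing with N drop priorities (sketch).

    thresholds[p] is the queue length from which packets of priority p
    are discarded; for the highest priority it equals the buffer size.
    """
    return buf_length < thresholds[priority]


# Hypothetical configuration with three drop priorities.
THRESHOLDS = {"H": 100, "M": 70, "L": 40}
```

With these values, a buffer holding 50 packets still accepts H and M packets but discards L packets.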
Packet discard preferences (2)
Weighted RED
extension of RED to support N packet discard
preferences
99
Several variants of RIO have been proposed and implemented since then.
Weighted RED
Principles
compute two averages : Avg(H) and Avg(H+L)
Apply conservative RED for H packets with large thresholds
Apply aggressive RED for L packets with small thresholds
[Figure: RED drop probability curves Pa(L) and Pa(H), with maximum probabilities Maxp(L) and Maxp(H)]
100
Weighted RED (2)
Weighted RED
can be used to support N drop priorities
Principles
compute a single average for all packets in buffer
conservative RED algorithm for high priority packets
large values for min_th(H) and max_th(H)
aggressive RED for low priority packets
small values for min_th(L) and max_th(L), and higher drop prob.
[Figure: drop probabilities Pa(L) and Pa(H) as a function of Avg(All), with thresholds Min_th(L), Max_th(L), Min_th(H), Max_th(H) and maximum probabilities Maxp(L), Maxp(H)]
101
The configuration guidelines for WRED on Cisco routers propose to use the same value for max_th and maxp for all classes and to perform the
differentiation only on the basis of min_th. See
http://www.cisco.com/univercd/cc/td/doc/product/software/ios120/12cgcr/qos_c/qcpart3/qcwred.htm
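The single-average variant of Weighted RED can be sketched as follows. The sketch only computes the basic RED drop probability as a function of the average queue length and ignores the refinement that spaces drops according to the number of packets accepted since the last drop. The profile values and function names are hypothetical.

```python
def red_drop_probability(avg, min_th, max_th, max_p):
    """Basic RED drop probability for a given average queue length."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


# Hypothetical profiles: (min_th, max_th, max_p) per drop priority.
PROFILES = {
    "H": (40, 60, 0.05),   # conservative: large thresholds, low max_p
    "L": (10, 30, 0.20),   # aggressive: small thresholds, higher max_p
}


def wred_drop_probability(avg_all, priority):
    """Single average over all packets, one RED profile per priority."""
    min_th, max_th, max_p = PROFILES[priority]
    return red_drop_probability(avg_all, min_th, max_th, max_p)
```

With these hypothetical profiles, an average of 20 packets already causes L packets to be dropped while H packets are all accepted.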
Traffic control in IP Networks
Outline
Applications
Packet-level traffic control mechanisms
Flow identification
Packet Marking
Buffer acceptance
Scheduling
102
Scheduler
Function
among all the logical queues containing at least
one packet, select the packet that will be
transmitted on the output link
A scheduler should ...
be easy to implement in hardware
support best-effort and guaranteed services
provide fairness for best-effort traffic
max-min fairness is the desired goal
provide protection
one flow should not be able to steal bandwidth from
other existing flows
provide statistical or deterministic guarantees
bandwidth, delay
103
H.Zhang. Service disciplines for guaranteed performance service in packet-switching networks. Proceedings of the IEEE, 83(10), October
1995.
QoS capable router
[Figure: input links, classification, policing, shaping, queues Q[1] to Q[N], output link]
Classifier
Identifies the flow to which the arriving packet belongs
Policer
Verifies whether the incoming flow follows some rules
Shaper
Delays flows which do not follow some rules
Buffer acceptance
Accepts or rejects an incoming packet
Queuing strategy
Logical organisation of the router's buffers
Scheduler
Chooses the packet to be transmitted first on the output link
104
In practice, the shaper could also be located on the output link, but we don't address this issue here to keep the picture simple and
understandable.
Queuing strategies
105
Scheduling best-effort flows
Design goals
Provide a fair distribution of bandwidth between
active flows to support max-min fairness at
network level
fairness should not depend on behaviour of congestion
control mechanisms inside endsystems
Provide protection between flows
a potentially misbehaving flow should not be able to
consume most of the available bandwidth
scheduler ensures distribution of output link bandwidth
packet discard mechanism should ensure that one flow cannot
consume all the available buffer space
Implementable at high speeds
106
H.Zhang. Service disciplines for guaranteed performance service in packet-switching networks. Proceedings of the IEEE, 83(10), October
1995.
Processor Sharing
[Figure: flows 1 to 5 each have their own queue; the scheduler serves each queue like a fluid flow]
107
Processor Sharing : example
[Figure: Processor Sharing timeline for Flow1 (L=1), Flow2 (L=2) and Flow3 (L=1); labels: 33% share per flow, periods where only flows 1 and 2 are active]
108
Processor Sharing
Advantage
if no packets are discarded, a network of PS
schedulers will provide a max-min fair service
the fairness does not depend on any congestion control
mechanism
fairness is achieved for TCP and UDP flows !
if packets are discarded, then packet discarding must
be performed in a fair manner
Disadvantage
Ideal "mathematical" solution, not implementable
in practice
approximations to PS are implementable
109
Round Robin
[Figure: flows 1 to N each have their own queue; the scheduler visits the queues F1, F2, F3, F4, ..., FN in a cycle]
Principle
serve the active queues one after the other
Advantages
can be easily implemented in hardware
provides protection for best-effort traffic
provides fair distribution of bandwidth with fixed-size packets
but fairness is only provided at timescales larger than one round of the schedule
Disadvantages
unfairness with variable length packets
110
Round Robin : example
[Figure: Round Robin timeline for Flow1 (L=1), Flow2 (L=2) and Flow3 (L=1)]
111
Deficit Round Robin
[Figure: flows 1 to N each have their own queue; the scheduler visits the queues F1, F2, F3, F4, ..., FN in a cycle]
Idea
Round-Robin + variable length packets
Principle
associate counter d[i] to each queue
increase d[i] by quantum every time a non-empty queue[i] is visited
while (queue[i] is not empty and first_packet of queue[i] is not larger than d[i])
{
  packet is transmitted on output link;
  d[i]=d[i]- packet length;
}
if queue[i] is empty { d[i]=0; }
/* otherwise the first packet stays in queue[i] and d[i] is kept for the next visit */
112
M.Shreedhar and G.Varghese. Efficient fair queueing using deficit round robin. In Proc. ACM SIGCOMM'95, pages 231--242, 1995.
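A minimal sketch of Deficit Round Robin is given below. It follows the usual formulation in which a visited queue sends as many packets as its deficit counter allows. The function name, the representation of queues as deques of packet lengths and the fixed number of rounds are choices made for this illustration only.

```python
from collections import deque


def drr(queues, quantum, rounds):
    """Deficit Round Robin over queues of packet lengths (sketch).

    queues:  list of deques, each holding the lengths of queued packets
    Returns the transmission order as a list of (queue index, length).
    """
    deficit = [0] * len(queues)
    sent = []
    for _ in range(rounds):
        for i, q in enumerate(queues):
            if not q:
                continue                  # empty queues earn no credit
            deficit[i] += quantum
            # Send packets while the head packet fits in the deficit.
            while q and q[0] <= deficit[i]:
                length = q.popleft()
                deficit[i] -= length
                sent.append((i, length))
            if not q:
                deficit[i] = 0            # no credit is kept while idle
    return sent


# Flows 1 and 3 send packets of size 1, flow 2 sends packets of size 2.
order = drr([deque([1, 1, 1, 1]), deque([2, 2]), deque([1, 1])],
            quantum=1, rounds=4)
```

With a quantum of 1, the flow sending packets of size 2 is served every second round, so while all three queues are backlogged each flow gets the same number of bytes.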
Deficit Round Robin : example
[Figure: Deficit Round Robin timeline for Flow1 (L=1), Flow2 (L=2) and Flow3 (L=1)]
Flow1 and Flow3 send packets of size 1
Flow 2 sends packets of size 2
113
Frame discard mechanisms
114
Scheduling guaranteed flows
Design goals
Efficiently support flows with minimum and
maximum guaranteed bandwidth
provide bandwidth guarantees
provide delay guarantees
115
Priority-based scheduler
116
Generalised Processor Sharing
[Figure: flows 1 to 5 each have their own queue; the scheduler serves each queue like a fluid flow]
117
Generalised Processor Sharing (2)
Advantages
provides per-flow bandwidth guarantee
through one GPS scheduler
through a network of GPS schedulers
provides per-flow delay guarantee for token-bucket
(R,B) constrained flows
through one GPS scheduler
through a network of GPS schedulers
provides bound on buffer utilisation
provides protection among the different flows
a flow cannot jeopardise the guarantees for another flow
trivial guarantee on delay jitter ([0,Dmax])
Disadvantage
ideal scheduler not implementable
118
Weighted Round Robin
119
Weighted Fair Queuing
Objective
Define an implementable approximation for GPS
Idea
simulate GPS on a per-packet basis
serve the packets in (approximately) the same
order as the one they would be served with GPS
How to do this ?
Compute time at which GPS would serve each
packet (finish time)
Serve packets in order of finish times
120
A.Parekh and R.Gallager. A generalized processor sharing approach to flow control : the single node case. IEEE/ACM Transactions on
Networking, 1(3):346--357, 1993.
A.Parekh and R.Gallager. A generalized processor sharing approach to flow control - the multiple node case. IEEE/ACM Transactions on
Networking, 2(2):137--150, 1994.
Virtual Clock
Approximation of GPS
Idea
associate one timestamp to each arriving packet
scheduler selects among all the queued packets the
packet with the smallest timestamp
First algorithm
D[i] : bandwidth associated with Queue[i]
V[i] : state variable associated with Queue[i]
Arrival of a P bytes long packet in Queue[i]
V[i] = V[i] + ( P / D[i] )
associate V[i] with the packet
Scheduler
select the packet with the smallest timestamp for transmission
121
L.Zhang. VirtualClock: A new traffic control algorithm for packet switching. ACM Transactions on Computer Systems, 9(2):101--124, May
1991.
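The first Virtual Clock algorithm can be sketched as follows. The class name, the heap-based packet store and the sequence number used to break ties between equal timestamps are assumptions of this sketch, not part of the algorithm as stated on the slide.

```python
import heapq


class VirtualClock:
    """First Virtual Clock algorithm (illustrative sketch)."""

    def __init__(self, rates):
        self.rates = rates                 # D[i]: bandwidth of Queue[i]
        self.v = [0.0] * len(rates)        # V[i]: state variable of Queue[i]
        self.heap = []                     # (timestamp, seq, queue, length)
        self.seq = 0                       # tie-breaker: arrival order

    def arrive(self, i, length):
        # V[i] = V[i] + P / D[i], then stamp the packet with V[i].
        self.v[i] += length / self.rates[i]
        heapq.heappush(self.heap, (self.v[i], self.seq, i, length))
        self.seq += 1

    def transmit(self):
        # Select the queued packet with the smallest timestamp.
        stamp, _, i, length = heapq.heappop(self.heap)
        return i, length, stamp
```

For example, with rates 0.5 and 0.25, two packets of length 1 on queue 0 get timestamps 2 and 4, and one packet of length 1 on queue 1 gets timestamp 4.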
Virtual Clock (2)
First example
[Figure: timeline for Flow1, Flow2 and Flow3, each with D[i]=1/3 (output labelled PCR)]
122
Virtual Clock (3)
Second example
[Figure: timeline for F1, F2 and F3, each with D[i]=1/3 (labels VC3 and PCR)]
123
Virtual Clock (4)
Second algorithm
arrival of a P bytes long packet at time t
V[i] = max ( V[i] , t )+ ( P / D[i] )
Example
[Figure: timeline for F1, F2 and F3, each with D[i]=1/3 (output labelled PCR)]
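The effect of the max() in the second algorithm can be checked with a small sketch. The function name and the numeric values are hypothetical.

```python
def virtual_clock_stamp(v_i, t, length, rate):
    """Second Virtual Clock algorithm: V[i] = max(V[i], t) + P / D[i]."""
    return max(v_i, t) + length / rate


# A queue that stayed idle until t = 10 restarts from the current time,
# so it cannot use the bandwidth it left unused while idle.
idle_queue = virtual_clock_stamp(v_i=0.0, t=10.0, length=1, rate=0.5)

# A backlogged queue whose V[i] is ahead of real time is unaffected.
busy_queue = virtual_clock_stamp(v_i=14.0, t=10.0, length=1, rate=0.5)
```

With the first algorithm, the idle queue would have received the timestamp 2 and would have been served before every packet stamped while it was idle.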
124
Virtual Clock (5)
Third example
[Figure: timeline for F1, F2 and F3, each with D[i]=1/3 (output labelled PCR)]
125
SCFQ / Virtual Spacing
Approximation of GPS
Principle
associate one timestamp to each arriving packet
scheduler selects packet with smallest timestamp
Algorithm
D[i] : bandwidth associated with Queue[i]
V[i] : state variable associated with Queue[i]
V : state variable associated with the scheduler
at all times, V is equal to the timestamp of the packet being
transmitted
Arrival of a packet of P bytes in Queue[i]
V[i] = max(V[i], V ) + ( P / D[i] )
V[i] is associated with the arriving packet
Scheduler
select the packet with the smallest timestamp for transmission
126
J.Roberts. Virtual spacing for flexible traffic control. International Journal of Communication Systems, 7:307--318, 1994.
J.Roberts, U.Mocci, and J.Virtamo, editors. Weighted Fair Queueing, chapter 6, pages 173--187. Number 1155 in Lecture Notes in Computer
Science. Springer Verlag, 1996.
S.Golestani. A self-clocked fair queuing scheme for broadband applications. In IEEE INFOCOM94, pages 636--646, 1994.
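The SCFQ algorithm above differs from Virtual Clock only in the reference used inside the max(): the timestamp V of the packet in service replaces real time. A minimal sketch, with a hypothetical class name and a heap plus sequence number as assumed implementation details:

```python
import heapq


class SCFQ:
    """Self-Clocked Fair Queuing / Virtual Spacing (illustrative sketch)."""

    def __init__(self, rates):
        self.rates = rates                   # D[i]
        self.v_queue = [0.0] * len(rates)    # V[i]
        self.v = 0.0                         # V: stamp of packet in service
        self.heap = []                       # (timestamp, seq, queue, length)
        self.seq = 0                         # tie-breaker: arrival order

    def arrive(self, i, length):
        # V[i] = max(V[i], V) + P / D[i]
        self.v_queue[i] = max(self.v_queue[i], self.v) + length / self.rates[i]
        heapq.heappush(self.heap, (self.v_queue[i], self.seq, i, length))
        self.seq += 1

    def transmit(self):
        # Serve the smallest timestamp; V follows the packet in service.
        stamp, _, i, length = heapq.heappop(self.heap)
        self.v = stamp
        return i, length, stamp
```

No real-time clock is needed: a queue that becomes active late starts from the scheduler's current V instead of from its own outdated V[i].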
Guarantees
Schedulers supporting per-flow bandwidth
guarantees and protection between flows
GPS
WFQ/PGPS
SCFQ
Deficit-WRR
127
Guarantees (2)
128
Source : H. Zhang, Service disciplines for guaranteed performance service in packet switching networks, Proc. IEEE, Vol 83, No 10, October
1995
[Figure: QoS capable router: input links, classification, policing, shaping, queues Q[1] to Q[N], output link]
Classifier + Marker
Layer 4
Layer 3 - ToS byte
Layer 2.5 - MPLS
Policer
Token bucket, Three Color Marker (token bucket based or time sliding window based)
Buffer acceptance
Tail drop, Random Early Detection, WRED, Drop from front, ...
Explicit Congestion Notification
129
What kind of guarantees ?
130
What kind of guarantees ? (2)
131
Best effort versus guaranteed service
132
Providing bandwidth guarantees
How can we provide bandwidth guarantees
in a packet-based network ?
[Figure: a router receiving two input flows with rates I[1] and I[2] and sending on an output link with rate Out]
133
Providing bandwidth guarantees (2)
Problem
In packet based networks, traffic is a flow of
variable length packets and not a fluid flow
134
Limiting rate of incoming flows
135
Traffic control and QoS in IP Networks
Outline
Applications
Packet-level traffic control mechanisms
Best-effort service
Maximum bandwidth service
Minimum bandwidth service
Delay guarantees
Standardized Services
136
How to provide
minimum guaranteed bandwidth ?
Problem
within each flow, we now have 2 types of packets
IP packets that are part of the minimum guaranteed
bandwidth for the flow
these packets cannot be discarded inside the router
IP packets that are in excess of the minimum
guaranteed bandwidth
these packets should be treated as best-effort packets and can
be discarded if necessary to preserve the guarantees
Principle
identify the two types of packets
discard preferably the non-guaranteed packets
when congestion occurs inside router
137
Bound on buffer occupancy in routers
[Figure: buffer occupancy as a function of time; the occupancy first grows with slope N×LR-LR, then evolves with slope ∑R[i]-LR]
Flow N : R[N], B[N]
Simplifications
- A single flow per input link
- All links have same line rate (LR)
- Assume all N flows have same R and B
Worst case traffic into empty buffer
All flows are together sending their worst case traffic
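Under the simplifications above, the worst case can be worked out numerically. The sketch below assumes that each flow is constrained by a token bucket (R, B), so that it can send at most min(LR×t, B + R×t) during an interval of length t, and that N×R does not exceed LR. The function names and the numeric values are hypothetical.

```python
def buffer_occupancy(t, n, lr, r, b):
    """Backlog at time t when n identical (r, b) flows start their worst
    case burst together into an empty buffer drained at rate lr."""
    arrived = n * min(lr * t, b + r * t)
    return max(0.0, arrived - lr * t)


def peak_occupancy(n, lr, r, b):
    """Backlog at the end of the line-rate burst (slope n*lr - lr)."""
    t_burst = b / (lr - r)        # instant at which lr*t = b + r*t
    return (n - 1) * lr * t_burst


# Hypothetical values: 4 flows, LR = 100, R = 20, B = 80.
peak = peak_occupancy(4, 100, 20, 80)
```

After the burst, the occupancy decreases with slope N×R-LR, which is -20 per time unit with these values.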
138
Impact of multiplexing
...
139
It should be noted that having a bound on the buffer occupancy implies that there will also be a bound on the amount of delay encountered by
one flow through a router. However, the bound on delay jitter will only be [0, Dmax]
Identification of the guaranteed packets
Principle
Measure the rate of the incoming flow
Identify the packets within the minimum bandwidth
Identify the packets in excess of the min. bandwidth
Packets may be explicitly or internally marked
[Figure: Classifier, followed by Marker, followed by Buffer Acceptance]
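One possible way to implement such a marker is a token bucket, sketched below. The class name, the units (bytes and seconds) and the two labels returned are assumptions made for this illustration.

```python
class TokenBucketMarker:
    """Marks packets as within or in excess of a minimum bandwidth (sketch)."""

    def __init__(self, rate, depth):
        self.rate = rate          # guaranteed rate, in bytes per second
        self.depth = depth        # bucket depth, in bytes
        self.tokens = depth       # the bucket starts full
        self.last = 0.0           # arrival time of the previous packet

    def mark(self, now, length):
        # Add the tokens earned since the previous packet, up to the depth.
        self.tokens = min(self.depth,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if length <= self.tokens:
            self.tokens -= length
            return "guaranteed"   # part of the minimum bandwidth
        return "excess"           # treated as best-effort
```

The buffer acceptance mechanism can then preferably discard the packets marked as excess when congestion occurs.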
140
Traffic control and QoS in IP Networks
Outline
Applications
Packet-level traffic control mechanisms
Best-effort service
Maximum bandwidth service
Minimum bandwidth service
Delay guarantees
Standardized Services
141
Towards multiservice networks
Problems
How can we multiplex on a single link through
one router classical best-effort traffic packets and
packets from guaranteed flows ?
guaranteed packets should not be perturbed by best-
effort packets
best-effort packets should be able to utilize the output
link when there is no guaranteed traffic
142
Simple router v3
[Figure: input links, classification, policing, shaping, output link]
Classifier
Identifies the flow to which the arriving packet belongs
Policer
Verifies whether the incoming flow follows some rules
Shaper
Delays flows which do not follow some rules
Buffer acceptance
Accepts or rejects packets
143
The shaping mechanism can be placed either at the output of the classifier or directly upstream of the output link depending on whether
incoming or outgoing traffic is shaped.
Multiplexing with Simple router v3
Best-effort, min and max bandwidth flows
Discard low preference packets
earlier than high preference ones
[Figure: Simple router v3: input links, classification, policing, shaping, output link]
144
What about delay guarantees ?
[Figure: Simple router v3: input links, classification, policing, shaping, output link; some packets are discarded]
What can we do to ensure that packets from
interactive streaming application will be sent
earlier than packets from batch application ?
145
What about delay guarantees (2)
Solution
Add delay differentiation to loss differentiation
some packets should be sent earlier than others
Replace FIFO buffer by set of queues and scheduler
[Figure: a single FIFO queue feeding the output link, compared with queues Q[1] to Q[N] served by a scheduler]
FIFO
packets from all flows are placed inside the same queue
transmit packet at head of the FIFO queue
Queues and scheduler
All packets from one flow are placed in the same queue.
Packets from different flows may enter in different queues.
Scheduler
Selects the packet to be transmitted first on the output link
146
QoS capable router
[Figure: input links, classification, policing, shaping, queues Q[1] to Q[N], output link]
Classifier
Identifies the flow to which the arriving packet belongs
Policer
Verifies whether the incoming flow follows some rules
Shaper
Delays flows which do not follow some rules
Buffer acceptance
Accepts or rejects an incoming packet
Queuing strategy
Logical organization of the router's buffers
Scheduler
Chooses the packet to be transmitted first on the output link
147
In practice, the shaper could also be located on the output link, but we don't address this issue here to keep the picture simple and
understandable.