TCP Congestion Control: Algorithms and Analysis: Try Homework Problem at
Little’s Law
Average population = (average delay) x (throughput)
average delay $= \frac{1}{N} \sum_{i=1}^{N} \text{delay}_i$
where N is number of departures
throughput = N/T
where T is duration of observation
average population (to be defined)
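A minimal numerical check of Little's Law, assuming a system that is empty at time 0 and again at the end of the observation window. The function names and the sample arrival and departure times are hypothetical and chosen only for illustration.

```python
def average_population(arrivals, departures, duration):
    """Time average of n(t) over [0, duration], computed by sweeping events."""
    events = sorted([(t, 1) for t in arrivals] + [(t, -1) for t in departures])
    area, in_system, last_time = 0.0, 0, 0.0
    for time, step in events:
        area += in_system * (time - last_time)  # n(t) is constant between events
        in_system += step
        last_time = time
    return area / duration


def littles_law_rhs(arrivals, departures, duration):
    """(average delay) x (throughput), using the N departures in the window."""
    n = len(departures)
    average_delay = sum(d - a for a, d in zip(arrivals, departures)) / n
    throughput = n / duration
    return average_delay * throughput


# Hypothetical sample: three customers, system empty at t = 0 and at t = 6.
ARRIVALS = [0.0, 1.0, 2.0]
DEPARTURES = [3.0, 4.0, 6.0]
```

With these sample times both functions give the same value, because every customer contributes exactly its delay to the area under n(t).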
[Figure: number in system n(t) versus time t]
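[Figure: sliding windows. Sender P1 (source), sequence numbers 0, 1, 2, ..., a–1, a, ..., s–1, s: acknowledged and unacknowledged data units. Receiver P2 (sink), sequence numbers 0, 1, 2, ..., r: received and delivered data units, then the receive window from the next expected number up to r + RW – 1.]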
Sliding Windows in Action
Data unit r has just been received by P2
Receive window slides forward
P2 sends cumulative ack with sequence number it expects to receive next (r+3)
[Figure: source P1 (send window over sequence numbers 0, 1, 2, ..., a–1, a, ..., s–1, s; acknowledged and unacknowledged data units) and sink P2 (delivered data units 0, 1, 2, ..., r; receive window from the next expected number up to r + RW – 1), with the cumulative ack r+3 marked at the sender.]
Window Flow Control
[Figure: timing diagram. In each RTT the source sends data packets 1, 2, ..., W and the destination returns the ACKs.]
Throughput $= \frac{W}{RTT}$ packets/sec
$= \frac{W \times MSS}{RTT}$ bytes/sec
Clarifications
Average number in the send buffer is typically
less than W unless packet arrival rate to send
buffer is infinite -> previous formula provides a
throughput upper bound
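A small sketch of the window-limited upper bound, W/RTT packets/sec or W x MSS/RTT bytes/sec. The function name and the sample values in the usage note are hypothetical.

```python
def window_throughput_bound(window_packets, mss_bytes, rtt_seconds):
    """Return the upper bound as (packets/sec, bytes/sec) for a window of W packets."""
    packets_per_sec = window_packets / rtt_seconds
    bytes_per_sec = packets_per_sec * mss_bytes
    return packets_per_sec, bytes_per_sec
```

For example, a window of 10 packets of 1500 bytes over a 100 ms RTT is bounded by about 100 packets/sec, or about 150,000 bytes/sec.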
Effect of Congestion
W too big for each of many flows -> congestion
Packet loss -> transmissions on links prior to packet
loss are wasted
Congestion collapse due to too many retransmissions
and too much waste
October 1986, Internet had its first congestion
collapse
[Figure: goodput versus load]
TCP Congestion Control (Simon Lam) 10
TCP Window Control
Network Congestion Control
Sender calculates cwnd from indications of
network congestion
Congestion indications
timeout (loss)
dupACK (loss likely)
queueing delay
mark (needs ECN)
TCP algorithms to calculate cwnd
Tahoe, Reno, Vegas, …
Link algorithms:
RED, REM …
[Figure labels: p_l(t) and x_i(t)]
TCP Congestion Control
Tahoe (Jacobson 1988)
Slow Start
Congestion Avoidance
Fast Retransmit
Reno (Jacobson 1990)
Fast Recovery
Its variants: NewReno, SACK
Vegas (Brakmo & Peterson 1994)
New Congestion Avoidance
AQM
RED (Floyd & Jacobson 1993)
• Probabilistic marking or dropping
REM (Athuraliya & Low 2000)
• Clear buffer, match rate
Others…
Slow Start
Start with cwnd = 1
On each successful ACK, increment cwnd
cwnd <- cwnd + 1
Exponential growth of cwnd
each RTT: cwnd <- 2 x cwnd
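A minimal sketch of why adding 1 per ACK doubles cwnd every RTT, assuming no losses and one ACK for every packet sent in a round. The function name is hypothetical.

```python
def slow_start(rounds, cwnd=1):
    """Return cwnd at the start of each RTT, incrementing cwnd by 1 per ACK."""
    history = [cwnd]
    for _ in range(rounds):
        acks_this_round = cwnd  # assumption: every packet in the round is ACKed
        for _ in range(acks_this_round):
            cwnd += 1
        history.append(cwnd)
    return history
```

Calling `slow_start(4)` returns `[1, 2, 4, 8, 16]`.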
Slow Start
[Figure: sender and receiver timeline for slow start, with data packets and ACKs exchanged once per RTT and cwnd values 1 through 8 marked]
Congestion Avoidance
CA starts when cwnd >= ssthresh
On each successful ACK:
cwnd <- cwnd + 1/cwnd
each RTT:
cwnd <- cwnd + 1
[Figure: sender and receiver timeline for congestion avoidance, with data packets, ACKs, 1 RTT, and cwnd values 1 through 4 marked]
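A minimal sketch of one round of congestion avoidance, assuming no losses and one ACK per packet in the window. It shows that adding 1/cwnd per ACK grows cwnd by roughly (slightly less than) 1 per RTT. The function name is hypothetical.

```python
def congestion_avoidance_round(cwnd):
    """Apply cwnd += 1/cwnd once for each ACK received during one RTT."""
    acks_this_round = int(cwnd)  # assumption: one ACK per packet in the window
    for _ in range(acks_this_round):
        cwnd += 1.0 / cwnd
    return cwnd
```

Starting from cwnd = 10, one round yields a value just under 11, since each increment after the first is slightly smaller than 1/10.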
Packet Loss
Assumption: loss indicates congestion
Packet loss detected by
Retransmission timeout (RTO timer)
Duplicate ACKs (at least 3)
Packets
1 2 3 4 5 6 7
Acknowledgements
1 2 3 3 3 3
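A small sketch of duplicate-ACK counting, applied to the acknowledgement sequence above. The function name and the default threshold parameter are hypothetical; the threshold of 3 follows the slide.

```python
def third_dupack_index(acks, threshold=3):
    """Return the index of the ACK that completes `threshold` duplicates, or None."""
    last_ack, duplicates = None, 0
    for index, ack in enumerate(acks):
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                return index
        else:
            last_ack, duplicates = ack, 0  # a new ACK number resets the count
    return None
```

For the sequence 1 2 3 3 3 3, the first ACK carrying 3 is new and the next three are duplicates, so the loss is signaled at the last ACK (index 5).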
Fast Retransmit
A timeout is quite long (> RTT)
Upon receiving 3 dupACKs, immediately
retransmit without waiting for timeout
Adjusts ssthresh
ssthresh <- max(flightsize/2, 2)
where flightsize is number of outstanding packets,
which may be less than W = min(rwnd, cwnd)
TCP Tahoe (Jacobson 1988)
[Figure: Tahoe cwnd versus time (in RTTs), showing SS and CA phases]
Successive Timeouts
When there is another timeout, double the
timeout value
Keep doing so for each additional loss-
retransmission
Exponential backoff up to
max timeout value equal
to 64 times initial timeout
value
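A minimal sketch of the timeout doubling described above, capped at 64 times the initial timeout value. The function name and parameter names are hypothetical.

```python
def backoff_timeouts(initial_timeout, count, max_factor=64):
    """Return the timeout used for each of `count` successive timeouts."""
    return [initial_timeout * min(2 ** k, max_factor) for k in range(count)]
```

With an initial timeout of 1 second, nine successive timeouts use 1, 2, 4, 8, 16, 32, 64, 64, 64 seconds.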
Summary: Tahoe
Basic ideas
Probe network for spare capacity during SS and
CA and increase send rate
Drastically reduce rate on congestion indication
Self-clocking
Error recovery by retransmission
Round trip time estimation (to get TO value)
for every ACK {
if (W < ssthresh) then W++ (SS)
else W += 1/W (CA)
}
for every loss indication {
ssthresh = W/2
W = 1
}
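The pseudocode above can be sketched as runnable Python. This is an illustration only: the class name, method names, and the initial ssthresh value are assumptions, and retransmission, timers, and rwnd are left out.

```python
class TahoeWindow:
    """Window update rules from the Tahoe summary (W in packets)."""

    def __init__(self, ssthresh=64.0):  # initial ssthresh is an assumed value
        self.w = 1.0
        self.ssthresh = ssthresh

    def on_ack(self):
        if self.w < self.ssthresh:
            self.w += 1.0            # slow start (SS)
        else:
            self.w += 1.0 / self.w   # congestion avoidance (CA)

    def on_loss(self):
        self.ssthresh = self.w / 2.0
        self.w = 1.0
```

For example, with ssthresh = 4, three ACKs take W from 1 to 4, a fourth ACK adds 1/4, and a loss indication then sets ssthresh to 2.125 and W back to 1.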
[Figure: Tahoe cwnd versus time, showing SS and CA phases]
TCP Reno (Jacobson 1990)
[Figure: Reno cwnd versus time, showing SS and CA phases. Annotations: initial slow start; cwnd halved on 3 dupACKs; slow start until cwnd reaches ssthresh.]
Fast recovery (in detail)
Example: FR/FR
[Figure: FR/FR timeline.]
S (sender): 1 2 3 4 5 6 7 8 1 9 10 11
R (receiver ACKs): 0 0 0 0 0 0 0 8
cwnd: 8 7 9 11 4
ssthresh: 4 4 4 4
Exit FR/FR is marked on the time axis.
Above scenario: Packet 1 is lost, packets 2, 3, and
4 are received; dupACKs with seq. no. 0 returned
Fast retransmit
Retransmit on 3 dupACKs
Fast recovery
Inflate window such that new packets 9, 10, and 11 can be
sent while repairing loss
Summary: Reno
Basic ideas
Fast recovery avoids slow start
dupACKs: fast retransmit + fast recovery
Timeout: fast retransmit + slow start
[Figure: diagram with labels dupACKs, congestion avoidance, FR/FR, and timeout; plot versus time with axis ticks at 8, 16, and 24 Kbytes]
TCP throughput (send rate)
throughput = W/RTT packets/sec
W changes with the arrival of each
congestion indication
To calculate (average) send rate, we need
the average value of W
Q: W is a function of what parameter?
First approximation
M. Mathis, et al., “The Macroscopic Behavior of the TCP Congestion
Avoidance Algorithm,” ACM Computer Communication Review, 27(3), 1997.
No slow-start, no timeout, long-lived TCP
connection
Independent identically distributed “periods”
Each packet may be lost with probability p
Geometric Distribution
Ave. no. of transmissions to get first loss
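$$\bar{n} = \sum_{i=1}^{\infty} i\, b_i = \sum_{i=1}^{\infty} i\,(1-p)^{i-1} p = p \sum_{i=1}^{\infty} i\,(1-p)^{i-1}$$
$$= -p \frac{d}{dp} \sum_{i=1}^{\infty} (1-p)^{i} = -p \frac{d}{dp} \sum_{i=0}^{\infty} (1-p)^{i}$$
$$= -p \frac{d}{dp} \left[ \frac{1}{1-(1-p)} \right] = -p \left( -\frac{1}{p^{2}} \right) = \frac{1}{p}$$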
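A quick numerical check of the 1/p result, summing the series directly with a finite number of terms. The function name and the truncation length are assumptions made for illustration.

```python
def mean_transmissions_to_first_loss(p, terms=10_000):
    """Truncated sum of i * (1 - p)**(i - 1) * p for i = 1 .. terms."""
    return sum(i * (1 - p) ** (i - 1) * p for i in range(1, terms + 1))
```

For p = 0.01 the truncated sum is within rounding error of 1/p = 100, since the neglected tail is vanishingly small.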
TCP ACK generation [RFC 1122, RFC 2581]
Arrival of out-of-order segment with higher-than-expected seq. # (gap detected) -> immediately send duplicate ACK, indicating seq. # of next expected byte
$$\frac{1}{RTT} \sqrt{\frac{3}{2bp}}$$
Modeling TCP Throughput: A Simple
Model and its Empirical Validation,
Proc. ACM SIGCOMM, 1998
Jitendra Padhye, Victor Firoiu,
Don Towsley, and Jim Kurose
Motivation
Previous formulas not so accurate when
loss rates are high
TCP traces show that there are more loss
indications due to timeouts (TO) than due
to triple dupACKs (TD)
Objectives
More accurate steady-state throughput
formula as a function of loss rate and RTT
by also accounting for TO behavior of a
TCP connection
Formula applicable over a wider range of
loss rates
Explicit statements of assumptions and
approximations used in derivation of
throughput formula
Formula to include the impact of a small
rwnd
A3. Time to send W packets is less than RTT
• ACK reception marks the end of current round and beginning of next round.
• Approximation: For b > 1, ACK is not received immediately after one RTT, but it is so assumed in the analysis
[Figure: space-time diagram marking the start of round and end of round]
Markov regenerative assumption
For the i-th TD period, Wi is window size at
the end of the period, Yi is the number of
packets sent in the period
A6. Assume {Wi} to be a Markov
regenerative process with rewards {Yi}
Given A6, the steady-state TCP throughput
is
$$B = \lim_{t \to \infty} B_t = \lim_{t \to \infty} \frac{N_t}{t} = \frac{E[Y_i]}{E[A_i]} = \frac{E[Y]}{E[A]}$$
(slide annotation: when ACK of last packet is received)
Loss assumptions
A7. Losses in different rounds are
independent
approximation
A8. Losses within the same round are
correlated as follows: If a packet is lost,
all remaining packets transmitted until the
end of that round are also lost
approximation: bursty loss behavior, but only
within the same round
all lost packets in the same round are counted
as a single loss indication when estimating p
AIMD throughput derivation (2)
Another way to compute E[Y]
$$Y_i = \sum_{k=0}^{X_i/b - 1} \left( \frac{W_{i-1}}{2} + k \right) b + \beta_i$$
$$= \frac{X_i W_{i-1}}{2} + \frac{X_i}{2} \left( \frac{X_i}{b} - 1 \right) + \beta_i$$
$$= \frac{X_i}{2} \left( \frac{W_{i-1}}{2} + W_i - 1 \right) + \beta_i$$
Let $E[\beta]$ be $E[W]/2$ and we have
<- A9. Assume that {Xi} and {Wi} are mutually independent i.i.d. sequences of random variables
$$E[Y] = \frac{E[X]}{2} \left( \frac{E[W]}{2} + E[W] - 1 \right) + E[\beta]$$
$$= \frac{b\,E[W]}{4} \left( \frac{E[W]}{2} + E[W] - 1 \right) + \frac{E[W]}{2}$$
$$\text{send rate } B(p) = \frac{\frac{1-p}{p} + E[W]}{E[A]}$$
AIMD throughput (4)
To get a simple formula, collect terms that are o(1/sqrt(p))
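$$E[W] = \sqrt{\frac{8}{3bp}} + o(1/\sqrt{p})$$
$$E[X] = \frac{b}{2} E[W] = \sqrt{\frac{2b}{3p}} + o(1/\sqrt{p})$$
$$\text{send rate } B(p) = \frac{1/p + o(1/\sqrt{p})}{RTT \left( \sqrt{\frac{2b}{3p}} + o(1/\sqrt{p}) \right)} = \frac{1}{RTT} \sqrt{\frac{3}{2bp}} + o(1/\sqrt{p})$$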
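A minimal sketch that evaluates the leading term of the simple formula, (1/RTT) sqrt(3/(2bp)), and drops the o(1/sqrt(p)) remainder. The function name is hypothetical; the result is in packets per second.

```python
import math


def aimd_send_rate(p, rtt_seconds, b):
    """Leading term of B(p): (1 / RTT) * sqrt(3 / (2 * b * p)), in packets/sec."""
    return math.sqrt(3.0 / (2.0 * b * p)) / rtt_seconds
```

For example, p = 0.01, RTT = 100 ms, and b = 1 give roughly 122 packets/sec; quadrupling p halves the rate.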
AIMD with TO
Throughput of AIMD with TO (1)
$$E[M] = E[n]\,E[Y] + E[R]$$
$$E[S] = E[n]\,E[A] + E[Z^{TO}]$$
Assumption of Markov regenerative process again.
$$\text{send rate } B = \frac{E[M]}{E[S]} = \frac{E[n]\,E[Y] + E[R]}{E[n]\,E[A] + E[Z^{TO}]} = \frac{E[Y] + Q\,E[R]}{E[A] + Q\,E[Z^{TO}]}$$
where $Q = \frac{1}{E[n]}$ <- Probability that a given loss indication is a TO
$$E[R] = \frac{1}{1-p}$$
with $Q$ and $E[Z^{TO}]$ to be determined
Approximate solution for Q (cont.)
$$A(w, k) = \frac{(1-p)^k\, p}{1 - (1-p)^w}$$
<- penultimate round of w packets, first k packets ack’d given there is a loss
$$\hat{Q}(w) = 1 \quad \text{if } w \le 3$$
<- at most 2 dupACKs
$$\hat{Q}(w) = \sum_{k=0}^{2} A(w, k) + \sum_{k=3}^{w} A(w, k) \sum_{m=0}^{2} C(k, m) \quad \text{if } w \ge 4$$
<- probability of fewer than 3 packets sent successfully in penultimate round or less than 3 acks in last round
$Q$ is $E[\hat{Q}(W)]$
But we don’t know the probability distribution of Wi
Approximation
$$Q \approx \hat{Q}(E[W]) \approx \min\left(1, \frac{3}{E[W]}\right) = \min\left(1, 3\sqrt{\frac{3bp}{8}}\right)$$
Throughput of AIMD with TO (2)
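$$P[R = k] = p^{k-1} (1-p) \quad \text{for } k = 1, 2, \ldots$$
$$L_k = \begin{cases} (2^k - 1)\, T_O & \text{for } k \le 6 \\ (63 + 64(k-6))\, T_O & \text{for } k \ge 7 \end{cases}$$
<- duration of k TOs in a row
$$E[Z^{TO}] = T_O\, \frac{1 + p + 2p^2 + 4p^3 + 8p^4 + 16p^5 + 32p^6}{1-p} = T_O\, \frac{f(p)}{1-p}$$
$$\approx T_O\,(1 + 32p^2)$$
<- approximation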
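A numerical sanity check, as a sketch: summing L_k P[R = k] term by term should agree with the closed form T_O f(p)/(1 - p). The function names and the truncation length are assumptions made for illustration.

```python
def timeout_run_duration(k, timeout):
    """L_k: total duration of k timeouts in a row, with backoff capped at 64x."""
    if k <= 6:
        return (2 ** k - 1) * timeout
    return (63 + 64 * (k - 6)) * timeout


def f(p):
    """f(p) = 1 + p + 2p^2 + 4p^3 + 8p^4 + 16p^5 + 32p^6."""
    return 1 + p + 2 * p**2 + 4 * p**3 + 8 * p**4 + 16 * p**5 + 32 * p**6


def expected_timeout_duration(p, timeout, terms=2000):
    """Truncated sum of L_k * P[R = k], with P[R = k] = p^(k-1) * (1 - p)."""
    return sum(
        timeout_run_duration(k, timeout) * p ** (k - 1) * (1 - p)
        for k in range(1, terms + 1)
    )
```

For p = 0.1 and a timeout of 1 second, the truncated sum and `f(0.1) / 0.9` agree to within floating-point error.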
$$\text{send rate } B(p) = \frac{E[Y] + Q\,E[R]}{E[A] + Q\,E[Z^{TO}]}$$
$$B(p) = \frac{\frac{1-p}{p} + E[W] + \hat{Q}(E[W])\, \frac{1}{1-p}}{RTT\,(E[X] + 1) + \hat{Q}(E[W])\, T_O\, \frac{f(p)}{1-p}}$$
$$B(p) \approx \frac{1}{RTT \sqrt{\frac{2bp}{3}} + T_O \min\left(1, 3\sqrt{\frac{3bp}{8}}\right) p\,(1 + 32p^2)}$$
<- Eq. (29), most well-known version of throughput formula
Impact of receiver’s rwnd limitation
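Full model Eq. (31)
Compute $E[W]$. If $E[W] < W_{max}$, use Eq. (27):
$$B(p) = \frac{\frac{1-p}{p} + E[W] + \hat{Q}(E[W])\, \frac{1}{1-p}}{RTT\,(E[X] + 1) + \hat{Q}(E[W])\, T_O\, \frac{f(p)}{1-p}} \quad \text{if } E[W] < W_{max}$$
otherwise, use $W_{max}$ for $E[W]$ and recompute $E[X]$ (derivation omitted):
$$B(p) = \frac{\frac{1-p}{p} + W_{max} + \hat{Q}(W_{max})\, \frac{1}{1-p}}{RTT \left( \frac{b}{8} W_{max} + \frac{1-p}{p\,W_{max}} + 2 \right) + \hat{Q}(W_{max})\, T_O\, \frac{f(p)}{1-p}}$$
Approximate model:
$$B(p) \approx \min\left( \frac{W_{max}}{RTT},\ \frac{1}{RTT \sqrt{\frac{2bp}{3}} + T_O \min\left(1, 3\sqrt{\frac{3bp}{8}}\right) p\,(1 + 32p^2)} \right)$$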
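A minimal sketch of the approximate model, the minimum of W_max/RTT and the Eq. (29) expression. The function and parameter names are hypothetical; the result is in packets per second.

```python
import math


def approximate_send_rate(p, rtt_seconds, timeout_seconds, w_max, b):
    """min(W_max / RTT, Eq. (29)) for loss rate p, in packets/sec."""
    td_term = rtt_seconds * math.sqrt(2.0 * b * p / 3.0)
    to_term = (
        timeout_seconds
        * min(1.0, 3.0 * math.sqrt(3.0 * b * p / 8.0))
        * p
        * (1.0 + 32.0 * p * p)
    )
    return min(w_max / rtt_seconds, 1.0 / (td_term + to_term))
```

At very low loss rates the result is capped by W_max/RTT; at higher loss rates the timeout term pulls it below the value given by the simpler formula without timeouts.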
Summary data from traces (1 hour)
Saturated TCP sender
p computed from dividing total no. of loss indications by total number of packets sent
RTT and TO values are averaged over entire 1-hour trace
Experimental comparison (1)
[Figure: experimental comparison, two panels: Wmax = 33 and Wmax = 44]
Experimental comparison (3)
[Figure: experimental comparison, two panels: Wmax = 8 and Wmax = 48]
Average errors
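$$\text{ave. error} = \frac{\sum_{\text{observations}} \frac{|N_{\text{predicted}} - N_{\text{observed}}|}{N_{\text{observed}}}}{\text{number of observations}}$$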
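A small sketch of the average error metric, assuming it is the mean over observations of the relative error |N_predicted - N_observed| / N_observed. The function name and the sample numbers in the usage note are hypothetical.

```python
def average_error(predicted, observed):
    """Mean of |predicted - observed| / observed over paired observations."""
    relative_errors = [abs(p - o) / o for p, o in zip(predicted, observed)]
    return sum(relative_errors) / len(relative_errors)
```

For example, predictions of 110 and 90 packets against observations of 100 and 100 give an average error of 0.1.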
Conclusions
A much more rigorous analysis than the one by
Mathis et al.
Numerous assumptions and approximations used
but (almost) all of them are explicitly stated
Large amount of experimental measurements on
the Internet to validate accuracy of the full model
(less for the approximate model)
Throughput formula accounts for loss indications
due to TO as well as rwnd restriction
Using the formula requires accurate measurements of
loss rate and RTT values (which could be tricky)
For TCP Reno and drop-tail router
Accuracy (like beauty) is in the eye of the
beholder. What do you think?
TCP Throughput limited by loss rate
TCP average throughput (approximate) in
terms of loss rate, p:
$$\frac{1.22 \cdot MSS}{RTT \sqrt{p}}$$
Example: 1500-byte segments, 100ms RTT,
to get 10 Gbps throughput, loss rate needs
to be very low
$p = 2 \times 10^{-10}$
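The example can be reproduced by solving 1.22 x MSS / (RTT sqrt(p)) for p. This is a sketch; the function name is hypothetical, and MSS is converted from bytes to bits so the units match a throughput in bits per second.

```python
def required_loss_rate(target_bps, mss_bytes, rtt_seconds):
    """Solve target = 1.22 * MSS / (RTT * sqrt(p)) for the loss rate p."""
    sqrt_p = 1.22 * mss_bytes * 8 / (rtt_seconds * target_bps)
    return sqrt_p ** 2
```

With 1500-byte segments, a 100 ms RTT, and a 10 Gbps target, this gives p of about 2.1 x 10^-10, matching the rounded figure above.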
The End