Lecture21 Congestion Control
Computer Networks:
Architecture and Protocols
Lecture 21
TCP and Congestion Control
Rachit Agarwal
Recap: WHYs behind TCP design
• Started from first principles
• Correctness condition for reliable transport
Recap: Transport layer
• Transport layer offers a “pipe” abstraction to applications
• Data goes in one end of the pipe and emerges from the other
• Connection oriented
• Explicit set-up and tear-down of TCP session
• Flow control
• Ensures the sender does not overwhelm the receiver
• Congestion control
• Dynamic adaptation to network path’s capacity
Any Questions?
From design to implementation: major notation change
• Previously we focused on packets
• Packets had numbers
• ACKs referred to those numbers
• Window sizes expressed in terms of # of packets
• Retransmissions
• Can’t be correct without retransmitting lost/corrupted data
• TCP retransmits based on timeouts and duplicate ACKs
• Timeouts based on estimate of RTT
• Flow Control
• Congestion Control
Segments, Sequence Numbers and ACKs
TCP “Stream of Bytes” Service
[Figure: Application @ Host A writes Byte 0, Byte 1, Byte 2, Byte 3, …, Byte 80 into the stream; Application @ Host B reads Byte 0, Byte 1, Byte 2, Byte 3, …, Byte 80 from it]
TCP “Stream of Bytes” Service
[Figure: the bytes (Byte 0, Byte 1, Byte 2, Byte 3, …, Byte 80) from Application @ Host A are carried as TCP Data and emerge as the same bytes at Application @ Host B]
TCP Segment
[Figure: packet layout: IP Hdr followed by IP data (datagram); the IP data consists of the TCP Hdr followed by TCP data (segment)]
• IP Packet
• No bigger than Maximum Transmission Unit (MTU)
• E.g., up to 1500 bytes with Ethernet
• TCP Packet
• IP packet with a TCP header and data inside
• TCP header >= 20 bytes long
• TCP Segment
• No more than MSS (Maximum Segment Size) bytes
• E.g., up to 1460 consecutive bytes from the stream
• MSS = MTU - IP header - TCP header
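The MSS formula above can be checked with a few lines of Python. This is a minimal sketch: the function name `max_segment_size` is hypothetical, and the 20-byte defaults assume an IP header and a TCP header without options.

```python
def max_segment_size(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """MSS = MTU - IP header - TCP header (all values in bytes)."""
    mss = mtu - ip_header - tcp_header
    if mss <= 0:
        raise ValueError("headers do not fit inside the MTU")
    return mss


# Ethernet MTU with minimal headers gives the 1460 bytes quoted above.
print(max_segment_size(1500))                  # 1460
# A longer TCP header (e.g. 12 bytes of options) leaves less room for data.
print(max_segment_size(1500, tcp_header=32))   # 1448
```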
Sequence Numbers
[Figure: Host A sends a segment (TCP Hdr + TCP Data) to Host B]
• Sequence number = 1st byte in segment = ISN + k
ACKing and Sequence Numbers
• Sender sends segments (byte stream)
• Data starts with Initial Sequence Number (ISN): X
• Packet contains B bytes
• X, X+1, X+2, …, X+B-1
TCP Header
[Figure: TCP header fields: Sequence Number; Acknowledgement; HdrLen, 0, Flags, Advertised Window; Checksum, Urgent Pointer; Options (variable); Data]
• Sequence Number: starting byte offset of data carried in this segment
• Acknowledgement: gives the sequence number just beyond the highest sequence number received in order (“What byte is next”)
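To make the byte-based numbering concrete, here is a small sketch of how a segment's sequence number and length determine the bytes it carries and the ACK that follows. The function names and the values of `X` and `B` are hypothetical and chosen only for illustration.

```python
def segment_byte_range(seq: int, length: int) -> tuple:
    """First and last sequence numbers carried by a segment of `length` bytes."""
    return seq, seq + length - 1


def ack_after(seq: int, length: int) -> int:
    """ACK number once this segment (and everything before it) arrived in order:
    the sequence number just beyond the last byte received."""
    return seq + length


X, B = 1000, 100  # hypothetical starting sequence number and payload size
print(segment_byte_range(X, B))  # (1000, 1099), i.e. X .. X+B-1
print(ack_after(X, B))           # 1100, i.e. X+B: "what byte is next"
```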
TCP Connection Establishment
and Initial Sequence Numbers
Initial Sequence Number (ISN)
• Sequence number for the very first byte
• Why not just use ISN = 0?
• Practical issue
• IP addresses and port #s uniquely identify a connection
• Eventually, though, these port #s do get used again
• … small chance an old packet is still in flight
[Figure: connection-establishment timeline between the two hosts]
• Each host tells its ISN to the other host
• SYN + ACK, SeqNum = y, Ack = x + 1
• ACK, Ack = y + 1
• accept()
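The exchange can be sketched as a list of (flags, SeqNum, Ack) triples. This is an illustration only: the function name is hypothetical, the initial SYN carrying SeqNum = x is inferred from the Ack = x + 1 that answers it, and the random 32-bit ISNs are one possible choice rather than what any particular stack does.

```python
import random

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def three_way_handshake(isn_a: int, isn_b: int) -> list:
    """Return the (flags, seq, ack) triples exchanged during connection setup."""
    syn = ("SYN", isn_a, None)                                 # A -> B, SeqNum = x
    syn_ack = ("SYN+ACK", isn_b, (isn_a + 1) % SEQ_SPACE)      # B -> A, SeqNum = y, Ack = x + 1
    ack = ("ACK", (isn_a + 1) % SEQ_SPACE, (isn_b + 1) % SEQ_SPACE)  # A -> B, Ack = y + 1
    return [syn, syn_ack, ack]


# Hypothetical ISNs picked at random instead of always starting at 0.
x, y = random.getrandbits(32), random.getrandbits(32)
for flags, seq, ack in three_way_handshake(x, y):
    print(flags, seq, ack)
```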
Any Questions?
TCP Retransmission
Two Mechanisms for Retransmissions
• Duplicate ACKs
• Timeouts
Loss with Cumulative ACKs
• Sender sends packets of 100 B each, with seqnos
• 100, 200, 300, 400, 500, 600, 700, 800, 900
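The following sketch shows what a cumulative-ACK receiver would send back for this packet sequence if one packet went missing. It assumes, purely for illustration, that the packet with seqno 500 is lost and that the receiver buffers out-of-order packets; the function name is hypothetical.

```python
def cumulative_acks(arrivals, first_seq=100, size=100):
    """ACK numbers sent by a receiver that buffers out-of-order packets.

    Each ACK names the next byte the receiver still expects in order.
    """
    expected = first_seq
    buffered = set()
    acks = []
    for seq in arrivals:
        if seq >= expected:
            buffered.add(seq)
        # Deliver every packet that is now contiguous with what we already have.
        while expected in buffered:
            buffered.remove(expected)
            expected += size
        acks.append(expected)
    return acks


arrivals = [100, 200, 300, 400, 600, 700, 800, 900]  # 500 is lost (assumption)
print(cumulative_acks(arrivals))
# [200, 300, 400, 500, 500, 500, 500, 500]

# Once 500 is retransmitted, the ACK jumps past everything that was buffered.
print(cumulative_acks(arrivals + [500])[-1])  # 1000
```

The repeated ACK 500 is exactly the duplicate-ACK signal the sender can use to detect the loss without waiting for a timeout.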
[Figure: two timelines comparing the timeout value with the RTT]
• Timeout too long -> inefficient
• Timeout too short -> duplicate packets
Setting RTO value
• Many ideas
• See backup slides for some examples (not needed for exams)
Implementing Sliding Window
• Sender maintains a window
• Data that has been sent out but not yet ACK’ed
• What’s missing?
• Approach
• Gentle increase when un-congested (exploration)
• Rapid decrease when congested
Additive Increase, Multiplicative Decrease (AIMD)
• Additive increase
• On success of last window of data, increase by one MSS
• If W packets in a row have been ACKed, increase W by one
• i.e., +1/W per ACK
• Multiplicative decrease
• On packet loss detected by DupACKs, halve the congestion window
• Special case: on timeout, reduce congestion window to one MSS
AIMD
• ACK: CWND -> CWND + 1/CWND
• When CWND is measured in MSS
• Note: after a full window, CWND increases by 1 MSS
• Thus, CWND increases by 1 MSS per RTT
[Figure: CWND versus time t; at each loss the window is halved]
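The AIMD rules can be written as three small update functions, with CWND measured in MSS. This is a sketch, not a TCP implementation: the function names are hypothetical, and the floor of 1 MSS after halving is an added assumption.

```python
def on_ack(cwnd: float) -> float:
    """Additive increase: each new ACK adds 1/CWND."""
    return cwnd + 1.0 / cwnd


def on_triple_dupack(cwnd: float) -> float:
    """Multiplicative decrease: halve the window (floor of 1 MSS is an assumption)."""
    return max(cwnd / 2.0, 1.0)


def on_timeout(cwnd: float) -> float:
    """Special case: a timeout resets the window to one MSS."""
    return 1.0


def one_window_of_acks(cwnd: float) -> float:
    """Apply on_ack once for every packet in the current window (about one RTT)."""
    for _ in range(int(cwnd)):
        cwnd = on_ack(cwnd)
    return cwnd


print(one_window_of_acks(10.0))  # slightly under 11: roughly +1 MSS per RTT
print(on_triple_dupack(10.0))    # 5.0
print(on_timeout(10.0))          # 1.0
```

The per-RTT gain is slightly under 1 MSS because CWND grows while the window's ACKs are still arriving, so later ACKs add a little less than the first one.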
Any Questions?
Slow Start
AIMD Starts Too Slowly
• Need to start with a small CWND to avoid overloading the network
• Consider
• RTT = 100ms, MSS=1000bytes
• Window size to fill 1Mbps of BW = 12.5 MSS
• Window size to fill 1 Gbps = 12,500 MSS
• With just AIMD, it takes about 12,500 RTTs to get to this window size!
• ~21 mins
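The numbers above follow from window = bandwidth × RTT. A short sketch of the arithmetic, with hypothetical function and variable names:

```python
def window_in_mss(bandwidth_bps: int, rtt_ms: int, mss_bytes: int) -> float:
    """Window (in MSS) needed to keep a link of the given bandwidth full."""
    bits_in_flight = bandwidth_bps * rtt_ms / 1000  # bandwidth x RTT
    return bits_in_flight / 8 / mss_bytes


RTT_MS, MSS = 100, 1000
print(window_in_mss(1_000_000, RTT_MS, MSS))      # 12.5 MSS for 1 Mbps
print(window_in_mss(1_000_000_000, RTT_MS, MSS))  # 12500.0 MSS for 1 Gbps

# Additive increase gains about 1 MSS per RTT, so reaching 12,500 MSS
# takes about 12,500 RTTs of 100 ms each.
minutes = 12_500 * RTT_MS / 1000 / 60
print(round(minutes, 1))  # 20.8, i.e. about 21 minutes
```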
“Slow Start” Phase
• Start with a small congestion window
• Initially, CWND is 1 MSS
• So, initial sending rate is MSS/RTT
[Figure: Src/Dst timeline of data packets (D) and ACKs (A) during slow start; the number of packets sent per round trip grows each RTT]
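For comparison with the roughly 12,500 RTTs that additive increase alone would need, the sketch below counts RTTs when the window starts at 1 MSS and doubles every RTT (which is what increasing CWND by 1 MSS per ACK amounts to). The function name is hypothetical, and the model ignores losses.

```python
def rtts_to_reach(target_mss: float, cwnd: float = 1.0) -> int:
    """RTTs needed for a doubling window to reach at least `target_mss`."""
    rtts = 0
    while cwnd < target_mss:
        cwnd *= 2  # every ACK adds 1 MSS, so a full window of ACKs doubles CWND
        rtts += 1
    return rtts


print(rtts_to_reach(12.5))    # 4  (1 -> 2 -> 4 -> 8 -> 16)
print(rtts_to_reach(12_500))  # 14 (2**14 = 16384 is the first power of two >= 12500)
```

At RTT = 100 ms, 14 RTTs is 1.4 seconds, which is why the "slow" start is in fact far faster than AIMD at ramping up.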
Slow Start and the TCP Sawtooth
[Figure: window size over time (the TCP sawtooth)]
• If timer expires
• Set SSTHRESH <- CWND/2 (“Slow Start Threshold”)
• Set CWND <- 1 (MSS)
• Retransmit first lost packet
• Execute Slow Start until CWND > SSTHRESH
• After which switch to Additive Increase
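The timeout steps above can be traced at RTT granularity. This is a coarse sketch under stated assumptions: CWND is in MSS, each RTT is loss-free, slow start doubles the window per RTT, additive increase adds 1 MSS per RTT, and all names are hypothetical.

```python
def after_timeout(cwnd: float) -> tuple:
    """On timer expiry: SSTHRESH <- CWND/2, CWND <- 1 (MSS)."""
    ssthresh = cwnd / 2
    return 1.0, ssthresh


def grow_one_rtt(cwnd: float, ssthresh: float) -> float:
    """One loss-free RTT of growth."""
    if cwnd <= ssthresh:   # slow start runs until CWND > SSTHRESH
        return cwnd * 2
    return cwnd + 1        # then additive increase


def trace(cwnd_at_timeout: float, rtts: int) -> tuple:
    """Return SSTHRESH and the CWND value at the start of each RTT after a timeout."""
    cwnd, ssthresh = after_timeout(cwnd_at_timeout)
    history = [cwnd]
    for _ in range(rtts):
        cwnd = grow_one_rtt(cwnd, ssthresh)
        history.append(cwnd)
    return ssthresh, history


print(trace(20, 6))
# (10.0, [1.0, 2.0, 4.0, 8.0, 16.0, 17.0, 18.0])
```

In the trace, the window doubles while it is at or below SSTHRESH = 10 and grows by 1 MSS per RTT once it has crossed the threshold.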
Summary of Increase
• “Slow start”: increase CWND by 1 (MSS) for each ACK
• A factor of 2 per RTT
• Events at sender
• ACK (new data)
• dupACK (duplicate ACK for old data)
• Timeout
Remains in congestion avoidance after fast retransmission
[Figure: time diagram of the window]
• TCP Reno (our default assumption)
• CWND = 1 on timeout
• CWND = CWND/2 on triple dupACK
• TCP-newReno
• TCP-Reno + improved fast recovery
• TCP-SACK
• Incorporates selective acknowledgements
Done!
Next lecture: Critical Analysis of TCP
TCP Back up slides
Could Base RTO on RTT Estimation
• Use exponential averaging of RTT samples
[Figure: SampleRTT and EstimatedRTT plotted against time]
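A minimal sketch of exponential averaging follows. The slide does not spell out the update rule, so the code assumes the common form EstimatedRTT = α × EstimatedRTT + (1 − α) × SampleRTT, where a larger α gives more weight to history. The function names and the sample values are hypothetical.

```python
def update_estimate(estimated_rtt: float, sample_rtt: float, alpha: float) -> float:
    """One exponential-averaging step (assumed form; alpha weights the old estimate)."""
    return alpha * estimated_rtt + (1 - alpha) * sample_rtt


def smooth(samples, alpha: float) -> list:
    """Run the average over a list of RTT samples, seeded with the first sample."""
    estimate = samples[0]
    history = [estimate]
    for sample in samples[1:]:
        estimate = update_estimate(estimate, sample, alpha)
        history.append(estimate)
    return history


samples_ms = [100, 200, 200]      # hypothetical RTT samples: a sudden jump
print(smooth(samples_ms, 0.5))    # [100, 150.0, 175.0]
print(smooth(samples_ms, 0.8))    # rises more slowly: about 120, then about 136
```

With α = 0.8 the estimate reacts more slowly to the jump than with α = 0.5, which is the trade-off between stability and responsiveness.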
Exponential Averaging Example
[Figure: RTT and EstimatedRTT for α = 0.5 and α = 0.8, plotted over time 0 to 9]
Exponential Averaging in Action
[Figure: two timelines, each showing an Original Transmission, a Retransmission, an ACK, and the resulting Sampled RTT]
TCP Timers
• Two important quantities
• RTO: value you set timer to for timeouts
• ETO: current estimate of appropriate “raw” timeout
• Receive A200
• A200 means bytes up to 199 rec’d, expecting 200 next
• Clean sample
• … Connection continues…
Example
• Consider a TCP connection with:
• CWND = 10 packets
• Last ACK was for packet # 101
• i.e., receiver expecting next packet to have seq no 101
• If dupACKcount = 3
• ssthresh = CWND / 2
• CWND = ssthresh + 3
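The dupACK rule in this example can be sketched as a tiny sender object. The class and attribute names are hypothetical, CWND is counted in packets as in the example, and the sketch covers only the step shown here (what happens on the third duplicate ACK).

```python
class Sender:
    """Tracks duplicate ACKs and applies the dupACKcount = 3 rule."""

    def __init__(self, cwnd: float, last_ack: int):
        self.cwnd = cwnd
        self.ssthresh = None
        self.last_ack = last_ack
        self.dup_ack_count = 0

    def on_ack(self, ack_no: int):
        """Process one ACK; return the packet number to retransmit, if any."""
        if ack_no == self.last_ack:
            self.dup_ack_count += 1
            if self.dup_ack_count == 3:
                self.ssthresh = self.cwnd / 2
                self.cwnd = self.ssthresh + 3
                return ack_no  # the receiver is still waiting for this packet
        elif ack_no > self.last_ack:
            self.last_ack = ack_no
            self.dup_ack_count = 0
        return None


sender = Sender(cwnd=10, last_ack=101)
for _ in range(3):
    retransmit = sender.on_ack(101)  # three duplicate ACKs for packet 101
print(sender.ssthresh, sender.cwnd, retransmit)  # 5.0 8.0 101
```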
• Congestion Avoidance
• Leave when timeout
• Fast recovery
• Enter when dupACK=3
• Leave when New ACK or Timeout
TCP State Machine
[Figure: state machine diagram including the fast recovery state; edges are labelled new ACK, dupACK, dupACK=3, CWND > ssthresh, and Timeout]
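The transitions named in this lecture can be collected into one function. This is a sketch of the state changes only (no window updates), pieced together from the rules stated earlier: a timeout returns to slow start, dupACK=3 enters fast recovery, a new ACK leaves fast recovery for congestion avoidance, and slow start ends once CWND > ssthresh. All identifiers and the event strings are hypothetical.

```python
from enum import Enum


class State(Enum):
    SLOW_START = "slow start"
    CONGESTION_AVOIDANCE = "congestion avoidance"
    FAST_RECOVERY = "fast recovery"


def next_state(state, event, cwnd=0.0, ssthresh=0.0, dup_acks=0):
    """Return the state after `event` ("new_ack", "dup_ack", or "timeout").

    `dup_acks` is the duplicate-ACK count including the current event.
    """
    if event == "timeout":
        return State.SLOW_START                  # timeout: restart from slow start
    if state is State.FAST_RECOVERY:
        # Leave fast recovery on a new ACK; further dupACKs keep us there.
        return State.CONGESTION_AVOIDANCE if event == "new_ack" else state
    if event == "dup_ack" and dup_acks == 3:
        return State.FAST_RECOVERY               # enter when dupACK=3
    if state is State.SLOW_START and event == "new_ack" and cwnd > ssthresh:
        return State.CONGESTION_AVOIDANCE        # slow start ends once CWND > ssthresh
    return state


state = State.SLOW_START
state = next_state(state, "new_ack", cwnd=9, ssthresh=8)
print(state.value)  # congestion avoidance
state = next_state(state, "dup_ack", dup_acks=3)
print(state.value)  # fast recovery
state = next_state(state, "new_ack")
print(state.value)  # congestion avoidance
```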