0% found this document useful (0 votes)
2 views

ACN_Week_8

Chapter 3 discusses transport-layer services, focusing on congestion control principles and mechanisms such as TCP and UDP. It highlights the causes and costs of congestion, including packet loss and delays, and explores various congestion control strategies like AIMD and delay-based approaches. The chapter also addresses TCP's behavior in response to congestion and fairness among multiple TCP sessions sharing bandwidth.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

ACN_Week_8

Chapter 3 discusses transport-layer services, focusing on congestion control principles and mechanisms such as TCP and UDP. It highlights the causes and costs of congestion, including packet loss and delays, and explores various congestion control strategies like AIMD and delay-based approaches. The chapter also addresses TCP's behavior in response to congestion and fairness among multiple TCP sessions sharing bandwidth.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 46

Chapter 3: roadmap

 Transport-layer services
 Multiplexing and demultiplexing
 Connectionless transport: UDP
 Principles of reliable data transfer
 Connection-oriented transport: TCP
 Principles of congestion control
 TCP congestion control
 Evolution of transport-layer
functionality
Transport Layer: 3-1
Principles of congestion control
Congestion:
 informally: “too many sources sending too much data too fast for
network to handle”
 manifestations:
• long delays (queueing in router buffers)
• packet loss (buffer overflow at routers)
 different from flow control! congestion control:
 a top-10 problem! too many senders,
sending too fast

flow control: one sender


too fast for one receiver
Transport Layer: 3-2
Causes/costs of congestion: scenario 1
original data: lin throughput: lout
Simplest scenario:
Host A
 one router, infinite buffers
 input, output link capacity: R infinite shared
output link buffers
 two flows
R R
 no retransmissions needed

Host B

R/2
Q: What happens as
lout

delay
arrival rate lin
throughput:

approaches R/2?
lin R/2 lin R/2
maximum per-connection large delays as arrival rate
throughput: R/2 lin approaches capacity
Transport Layer: 3-3
Causes/costs of congestion: scenario 2
 one router, finite buffers
 sender retransmits lost, timed-out packet
• application-layer input = application-layer output: lin = lout
• transport-layer input includes retransmissions : l’in lin

Host A lin : original data


l'in: original data, plus lout
retransmitted data

R R

Host B finite shared output


link buffers
Transport Layer: 3-4
Causes/costs of congestion: scenario 2
Idealization: perfect knowledge R/2
 sender sends only when router buffers available

lout
throughput:
Host A lin : original data lin
copy l'in: original data, plus lout R/2

retransmitted data

free buffer space!

R R

Host B finite shared output


link buffers
Transport Layer: 3-5
Causes/costs of congestion: scenario 2
Idealization: some perfect knowledge
 packets can be lost (dropped at router) due to
full buffers
 sender knows when packet has been dropped:
only resends if packet known to be lost

Host A lin : original data


copy l'in: original data, plus
retransmitted data

no buffer space!

R R

Host B finite shared output


link buffers
Transport Layer: 3-6
Causes/costs of congestion: scenario 2
Idealization: some perfect knowledge R/2
“wasted” capacity due
 packets can be lost (dropped at router) due to

lout
to retransmissions
full buffers

throughput:
when sending at
 sender knows when packet has been dropped: R/2, some packets
only resends if packet known to be lost are needed
retransmissions

Host A lin : original data lin R/2


l'in: original data, plus
retransmitted data

free buffer space!

R R

Host B finite shared output


link buffers
Transport Layer: 3-7
Causes/costs of congestion: scenario 2
Realistic scenario: un-needed duplicates R/2
 packets can be lost, dropped at router due to

lout
“wasted” capacity due
full buffers – requiring retransmissions to un-needed
retransmissions
 but sender times can time out prematurely,

throughput:
sending two copies, both of which are delivered when sending at
R/2, some packets
are retransmissions,
including needed
and un-needed
Host A lin : original data lin
copy R/2 duplicates, that are
timeo
ut l'in: original data, plus delivered!
retransmitted data

free buffer space!

R R

Host B finite shared output


link buffers
Transport Layer: 3-8
Causes/costs of congestion: scenario 2
Realistic scenario: un-needed duplicates R/2
 packets can be lost, dropped at router due to

lout
“wasted” capacity due
full buffers – requiring retransmissions to un-needed
retransmissions
 but sender times can time out prematurely,

throughput:
sending two copies, both of which are delivered when sending at
R/2, some packets
are retransmissions,
including needed
and un-needed
lin R/2 duplicates, that are
delivered!
“costs” of congestion:
 more work (retransmission) for given receiver throughput
 unneeded retransmissions: link carries multiple copies of a packet
• decreasing maximum achievable throughput

Transport Layer: 3-9


Causes/costs of congestion: scenario 3
 four senders Q: what happens as lin and lin’ increase ?
 multi-hop paths
A: as red lin’ increases, all arriving blue pkts at upper
 timeout/retransmit queue are dropped, blue throughput g 0
Host A lin : original data
Host B
l'in: original data, plus
retransmitted data
finite shared
output link buffers

Host D
lout
Host C

Transport Layer: 3-10


Causes/costs of congestion: scenario 3
R/2
lout

lin’ R/2

another “cost” of congestion:


 when packet dropped, any upstream transmission capacity and
buffering used for that packet was wasted!

Transport Layer: 3-11


Causes/costs of congestion: insights
 throughput can never exceed capacity

 delay increases as capacity approached

 loss/retransmission decreases effective


throughput
 un-needed duplicates further decreases
effective throughput
 upstream transmission capacity / buffering
wasted for packets lost downstream
Transport Layer: 3-12
Approaches towards congestion control

End-end congestion control:


 no explicit feedback from
network
 congestion inferred from ACKs
data data
ACKs
observed loss, delay
 approach taken by TCP

Transport Layer: 3-13


Approaches towards congestion control

Network-assisted congestion
control: explicit congestion info
 routers provide direct feedback
to sending/receiving hosts with data data
ACKs
flows passing through congested ACKs

router
 may indicate congestion level or
explicitly set sending rate
 TCP ECN, ATM, DECbit protocols
Transport Layer: 3-14
Chapter 3: roadmap
 Transport-layer services
 Multiplexing and demultiplexing
 Connectionless transport: UDP
 Principles of reliable data transfer
 Connection-oriented transport: TCP
 Principles of congestion control
 TCP congestion control
 Evolution of transport-layer
functionality
Transport Layer: 3-15
TCP congestion control: AIMD
 approach: senders can increase sending rate until packet loss
(congestion) occurs, then decrease sending rate on loss event
Additive Increase Multiplicative Decrease
increase sending rate by 1 cut sending rate in half at
maximum segment size every each loss event
RTT until loss detected
TCP sender Sending rate

AIMD sawtooth
behavior: probing
for bandwidth

time Transport Layer: 3-16


TCP AIMD: more
Multiplicative decrease detail: sending rate is
 Cut in half on loss detected by triple duplicate ACK (TCP Reno)
 Cut to 1 MSS (maximum segment size) when loss detected by
timeout (TCP Tahoe)

Why AIMD?
 AIMD – a distributed, asynchronous algorithm – has been
shown to:
• optimize congested flow rates network wide!
• have desirable stability properties

Transport Layer: 3-17


TCP congestion control: details
sender sequence number space
cwnd TCP sending behavior:
 roughly: send cwnd bytes,
wait RTT for ACKS, then
send more bytes
last byte
available but cwnd
ACKed sent, but not-
not used
TCP rate ~
~ bytes/sec
yet ACKed RTT
(“in-flight”) last byte sent

 TCP sender limits transmission: LastByteSent- LastByteAcked < cwnd


 cwnd is dynamically adjusted in response to observed
network congestion (implementing TCP congestion control)
Transport Layer: 3-18
TCP slow start
Host A Host B
 when connection begins,
increase rate exponentially
until first loss event:
one s e gm
ent

RTT
• initially cwnd = 1 MSS two segm
en ts
• double cwnd every RTT
• done by incrementing cwnd
for every ACK received four segm
ents

 summary: initial rate is


slow, but ramps up
exponentially fast time

Transport Layer: 3-19


TCP: from slow start to congestion avoidance

Q: when should the exponential


increase switch to linear?
X
A: when cwnd gets to 1/2 of its
value before timeout.

Implementation:
 variable ssthresh
 on loss event, ssthresh is set to
1/2 of cwnd just before loss event

* Check out the online interactive exercises for more examples: h ttp://gaia.cs.umass.edu/kurose_ross/interactive/
Transport Layer: 3-20
Summary: TCP congestion control
New
New ACK!
new ACK
duplicate ACK
dupACKcount++
ACK!
new ACK .
cwnd = cwnd + MSS (MSS/cwnd)
dupACKcount = 0
cwnd = cwnd+MSS transmit new segment(s), as allowed
dupACKcount = 0
L transmit new segment(s), as allowed
cwnd = 1 MSS
ssthresh = 64 KB cwnd > ssthresh
dupACKcount = 0
slow L congestion
start timeout avoidance
ssthresh = cwnd/2
cwnd = 1 MSS duplicate ACK
timeout dupACKcount = 0 dupACKcount++
ssthresh = cwnd/2 retransmit missing segment
cwnd = 1 MSS
dupACKcount = 0
retransmit missing segment
timeout
New
ACK!
ssthresh = cwnd/2
cwnd = 1 New ACK
dupACKcount = 0
cwnd = ssthresh dupACKcount == 3
dupACKcount == 3 retransmit missing segment dupACKcount = 0
ssthresh= cwnd/2 ssthresh= cwnd/2
cwnd = ssthresh + 3 cwnd = ssthresh + 3
retransmit missing segment
retransmit missing segment
fast
recovery
duplicate ACK
cwnd = cwnd + MSS
transmit new segment(s), as allowed

Transport Layer: 3-21


TCP CUBIC
 Is there a better way than AIMD to “probe” for usable bandwidth?
 Insight/intuition:
• Wmax: sending rate at which congestion loss was detected
• congestion state of bottleneck link probably (?) hasn’t changed much
• after cutting rate/window in half on loss, initially ramp to to Wmax faster, but then
approach Wmax more slowly

Wmax classic TCP

TCP CUBIC - higher


Wmax/2 throughput in this
example

Transport Layer: 3-22


TCP CUBIC
 K: point in time when TCP window size will reach Wmax
• K itself is tuneable
 increase W as a function of the cube of the distance between current
time and K
• larger increases when further away from K
• smaller increases (cautious) when nearer K
 TCP CUBIC default
Wmax
in Linux, most
TCP Reno
popular TCP for TCP CUBIC
popular Web TCP
sending
servers rate

time
t0 t1 t2 t3 t4
Transport Layer: 3-23
TCP and the congested “bottleneck link”
 TCP (classic, CUBIC) increase TCP’s sending rate until packet loss occurs
at some router’s output: the bottleneck link

source destination
application application
TCP TCP
network network
link link
physical physical
packet queue almost
never empty, sometimes
overflows packet (loss)

bottleneck link (almost always busy)


Transport Layer: 3-24
TCP and the congested “bottleneck link”
 TCP (classic, CUBIC) increase TCP’s sending rate until packet loss occurs
at some router’s output: the bottleneck link
 understanding congestion: useful to focus on congested bottleneck link

insight: increasing TCP sending rate will


source not increase end-end throughout destination
with congested bottleneck
application application
TCP TCP
network network
link link
physical physical

insight: increasing TCP


sending rate will
increase measured RTT
Goal: “keep the end-end pipe just full, but not fuller”
RTT
Transport Layer: 3-25
Delay-based TCP congestion control
Keeping sender-to-receiver pipe “just full enough, but no fuller”: keep
bottleneck link busy transmitting, but avoid high delays/buffering
# bytes sent in
measured last RTT interval
RTTmeasured throughput =
RTTmeasured
Delay-based approach:
 RTTmin - minimum observed RTT (uncongested path)
 uncongested throughput with congestion window cwnd is cwnd/RTTmin
if measured throughput “very close” to uncongested throughput
increase cwnd linearly /* since path not congested */
else if measured throughput “far below” uncongested throughout
decrease cwnd linearly /* since path is congested */
Transport Layer: 3-26
Delay-based TCP congestion control

 congestion control without inducing/forcing loss


 maximizing throughout (“keeping the just pipe full… ”) while keeping
delay low (“…but not fuller”)
 a number of deployed TCPs take a delay-based approach
 BBR deployed on Google’s (internal) backbone network

Transport Layer: 3-27


Explicit congestion notification (ECN)
TCP deployments often implement network-assisted congestion control:
 two bits in IP header (ToS field) marked by network router to indicate congestion
• policy to determine marking chosen by network operator
 congestion indication carried to destination
 destination sets ECE bit on ACK segment to notify sender of congestion
 involves both IP (IP header ECN bit marking) and TCP (TCP header C,E bit marking)

source TCP ACK segment


destination
application application
ECE=1
TCP TCP
network network
link link
physical physical

ECN=10 ECN=11

IP datagram
Transport Layer: 3-28
TCP fairness
Fairness goal: if K TCP sessions share same bottleneck link of
bandwidth R, each should have average rate of R/K
TCP connection 1

bottleneck
TCP connection 2 router
capacity R

Transport Layer: 3-29


Q: is TCP Fair?
Example: two competing TCP sessions:
 additive increase gives slope of 1, as throughout increases
 multiplicative decrease decreases throughput proportionally

R equal bandwidth share


Is TCP fair?
Connection 2 throughput

A: Yes, under idealized


loss: decrease window by factor of 2 assumptions:
congestion avoidance: additive increase  same RTT
loss: decrease window by factor of 2
congestion avoidance: additive increase
 fixed number of sessions
only in congestion
avoidance

Connection 1 throughput R
Transport Layer: 3-30
Fairness: must all network apps be “fair”?
Fairness and UDP Fairness, parallel TCP
 multimedia apps often do not connections
use TCP  application can open multiple
• do not want rate throttled by parallel connections between two
congestion control hosts
 instead use UDP:  web browsers do this , e.g., link of
• send audio/video at constant rate, rate R with 9 existing connections:
tolerate packet loss • new app asks for 1 TCP, gets rate R/10
 there is no “Internet police” • new app asks for 11 TCPs, gets R/2
policing use of congestion
control

Transport Layer: 3-31


Transport layer: roadmap
 Transport-layer services
 Multiplexing and demultiplexing
 Connectionless transport: UDP
 Principles of reliable data transfer
 Connection-oriented transport: TCP
 Principles of congestion control
 TCP congestion control
 Evolution of transport-layer
functionality
Transport Layer: 3-32
Evolving transport-layer functionality
 TCP, UDP: principal transport protocols for 40 years
 different “flavors” of TCP developed, for specific scenarios:
Scenario Challenges
Long, fat pipes (large data Many packets “in flight”; loss shuts down
transfers) pipeline
Wireless networks Loss due to noisy wireless links, mobility;
TCP treat this as congestion loss
Long-delay links Extremely long RTTs
Data center networks Latency sensitive
Background traffic flows Low priority, “background” TCP flows

 moving transport–layer functions to application layer, on top of UDP


• HTTP/3: QUIC
Transport Layer: 3-33
QUIC: Quick UDP Internet Connections
 application-layer protocol, on top of UDP
• increase performance of HTTP
• deployed on many Google servers, apps (Chrome, mobile YouTube app)

HTTP/2 HTTP/2 (slimmed)


Application HTTP/3
TLS QUIC

Transport TCP UDP

Network IP IP

HTTP/2 over TCP HTTP/2 over QUIC over UDP

Transport Layer: 3-34


QUIC: Quick UDP Internet Connections
adopts approaches we’ve studied in this chapter for
connection establishment, error control, congestion control
• error and congestion control: “Readers familiar with TCP’s loss
detection and congestion control will find algorithms here that parallel
well-known TCP ones.” [from QUIC specification]
• connection establishment: reliability, congestion control,
authentication, encryption, state established in one RTT

 multiple application-level “streams” multiplexed over single QUIC


connection
• separate reliable data transfer, security
• common congestion control
Transport Layer: 3-35
QUIC: Connection establishment

TCP handshake
(transport layer) QUIC handshake

data
TLS handshake
(security)
data

TCP (reliability, congestion control QUIC: reliability, congestion control,


state) + TLS (authentication, crypto authentication, crypto state
state)
 1 handshake
 2 serial handshakes

Transport Layer: 3-36


QUIC: streams: parallelism, no HOL blocking

HTTP HTTP
GET GET HTTP
application

GET
HTTP HTTP
GET GET
HTTP
GET QUIC QUIC QUIC QUIC QUIC QUIC
encrypt encrypt encrypt encrypt encrypt encrypt
QUIC QUIC QUIC QUIC QUIC QUIC
TLS encryption TLS encryption RDT RDT RDT RDT
error!
RDT RDT

QUIC Cong. Cont. QUIC Cong. Cont.


transport

TCP RDT TCP


error! RDT

TCP Cong. Contr. TCP Cong. Contr. UDP UDP

(a) HTTP 1.1 (b) HTTP/2 with QUIC: no HOL blocking


Transport Layer: 3-37
Chapter 3: summary
 principles behind transport Up next:
layer services:  leaving the network
• multiplexing, demultiplexing “edge” (application,
• reliable data transfer transport layers)
• flow control  into the network “core”
• congestion control
 two network-layer
 instantiation, implementation chapters:
in the Internet • data plane
• UDP • control plane
• TCP

Transport Layer: 3-38


Additional Chapter 3 slides

Transport Layer: 3-39


Go-Back-N: sender extended FSM
rdt_send(data)
if (nextseqnum < base+N) {
sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
udt_send(sndpkt[nextseqnum])
if (base == nextseqnum)
start_timer
nextseqnum++
}
L else
refuse_data(data)
base=1
nextseqnum=1
timeout
start_timer
Wait udt_send(sndpkt[base])
rdt_rcv(rcvpkt) udt_send(sndpkt[base+1])
&& corrupt(rcvpkt) …
udt_send(sndpkt[nextseqnum-
1])
rdt_rcv(rcvpkt) &&
notcorrupt(rcvpkt)
base = getacknum(rcvpkt)+1
If (base == nextseqnum)
stop_timer
else
start_timer
Transport Layer: 3-40
Go-Back-N: receiver extended FSM
any other event
udt_send(sndpkt) rdt_rcv(rcvpkt)
&& notcorrupt(rcvpkt)
L && hasseqnum(rcvpkt,expectedseqnum)
expectedseqnum=1 Wait extract(rcvpkt,data)
sndpkt = deliver_data(data)
make_pkt(expectedseqnum,ACK,chksum) sndpkt = make_pkt(expectedseqnum,ACK,chksum)
udt_send(sndpkt)
expectedseqnum++

ACK-only: always send ACK for correctly-received packet with highest


in-order seq #
• may generate duplicate ACKs
• need only remember expectedseqnum
 out-of-order packet:
• discard (don’t buffer): no receiver buffering!
• re-ACK pkt with highest in-order seq # Transport Layer: 3-41
TCP sender (simplified)
data received from application above
create segment, seq. #: NextSeqNum
pass segment to IP (i.e., “send”)
NextSeqNum = NextSeqNum + length(data)
if (timer currently not running)
L start timer
NextSeqNum = InitialSeqNum wait
SendBase = InitialSeqNum for
event timeout
retransmit not-yet-acked
segment with
smallest seq. #
ACK received, with ACK field value y start timer

if (y > SendBase) {
SendBase = y
/* SendBase–1: last cumulatively ACKed byte */
if (there are currently not-yet-acked segments)
start timer
else stop timer
}
Transport Layer: 3-42
TCP 3-way handshake FSM
closed
Socket connectionSocket =
welcomeSocket.accept();
L Socket clientSocket =
newSocket("hostname","port number");
SYN(x)
SYNACK(seq=y,ACKnum=x+1) SYN(seq=x)
create new socket for
communication back to client
listen

SYN
SYN sent
rcvd
SYNACK(seq=y,ACKnum=x+1)
ESTAB
ACK(ACKnum=y+1) ACK(ACKnum=y+1)
L

Transport Layer: 3-43


Closing a TCP connection
client state server state
ESTAB ESTAB
clientSocket.close()
FIN_WAIT_1 can no longer FINbit=1, seq=x
send but can
receive data CLOSE_WAIT
ACKbit=1; ACKnum=x+1
can still
FIN_WAIT_2 wait for server send data
close

LAST_ACK
FINbit=1, seq=y
TIMED_WAIT can no longer
send data
ACKbit=1; ACKnum=y+1
timed wait
for 2*max CLOSED
segment lifetime

CLOSED

Transport Layer: 3-44


TCP throughput
 avg. TCP thruput as function of window size, RTT?
• ignore slow start, assume there is always data to send
 W: window size (measured in bytes) where loss occurs
• avg. window size (# in-flight bytes) is ¾ W
• avg. thruput is 3/4W per RTT
3 W
avg TCP thruput = bytes/sec
4 RTT
W

W/2
TCP over “long, fat pipes”
 example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
 requires W = 83,333 in-flight segments
 throughput in terms of segment loss probability, L [Mathis 1997]:

1.22 . MSS
TCP throughput =
RTT L
➜ to achieve 10 Gbps throughput, need a loss rate of L = 2·10-10 – a very
small loss rate!
 versions of TCP for long, high-speed scenarios

Transport Layer: 3-46

You might also like