Slide Deck - Transport Layer - New
[Figure: Protocols @ different layers. Application (message); Transport (UDP / TCP segment): User Datagram Protocol (UDP), Transmission Control Protocol (TCP); Network; Data Link; Physical]
Source: http://walkwidnetwork.blogspot.com/2013/04/application-layer-internet-protocol.html
Transport Layer Protocols
• Well-known port numbers (0 to 1,023): used by processes that provide widely used types of network services.
• Registered port numbers (1,024 to 49,151): assigned by IANA (Internet Assigned Numbers Authority, a function of ICANN) for a specific service upon application by a requesting entity.
• Dynamic port numbers (49,152 to 65,535): used for private or customized services, for temporary purposes, and for automatic allocation.
https://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers#Well-known_ports
TCP/IP Protocol Suite
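The three port ranges can be checked mechanically. The sketch below is only an illustration; the function name classify_port is hypothetical, and the range boundaries are the standard IANA ones (0 to 1,023, 1,024 to 49,151, 49,152 to 65,535).

```python
def classify_port(port: int) -> str:
    """Return the IANA range name for a TCP/UDP port number."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"


# A few sample ports, one or more from each range.
for p in (13, 80, 8080, 52100):
    print(p, classify_port(p))
```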
Socket Pair
[Figure: two hosts connected through the Internet. On each host the application process and its socket are controlled by the app developer; the transport, network, link, and physical layers are controlled by the OS]
▪ logical end-end transport between processes running on different hosts
▪ transport protocols actions in end systems:
• sender: breaks application messages into segments, passes to network layer
• receiver: reassembles segments into messages, passes to application layer
[Figure: end-to-end path from a home network through a local or regional ISP to a content provider network and a datacenter network]
Sender:
▪ is passed an application-layer message
▪ determines segment header field values
▪ creates segment
Receiver:
▪ receives segment from IP
▪ checks header values
▪ extracts application-layer message
▪ demultiplexes message up to application via socket
[Figure: on each host the app. msg moves between application, transport, network (IP), link, and physical layers; the transport layer adds or removes the transport header]
▪ TCP: Transmission Control Protocol
• congestion control
• flow control
• connection setup
▪ UDP: User Datagram Protocol
• unreliable, unordered delivery
• no-frills extension of “best-effort” IP
[Figure: HTTP message encapsulation at the client. The transport layer adds header Ht to the HTTP msg and the network layer adds header Hn as it moves down the stack]
[Figure: Multiplexing at the sender and Demultiplexing at the receiver, between the application and transport layers]
How demultiplexing works
▪ host receives IP datagrams
• each datagram has source IP address, destination IP address
• each datagram carries one transport-layer segment
• each segment has source, destination port number
▪ host uses IP addresses & port numbers to direct segment to appropriate socket
[Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #, other header fields, application data (payload)]
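A minimal sketch of the lookup a host might perform, assuming two tables: one keyed by the full (source IP, source port, destination IP, destination port) tuple, and a fallback keyed by destination port only. The class name, method names, socket labels, and addresses are all hypothetical and chosen for illustration.

```python
from typing import Dict, Optional, Tuple

# Key: (source IP, source port, destination IP, destination port)
FourTuple = Tuple[str, int, str, int]


class Demultiplexer:
    """Directs an incoming segment to a socket using IP addresses and ports."""

    def __init__(self) -> None:
        self.connected: Dict[FourTuple, str] = {}   # exact four-tuple matches
        self.by_dest_port: Dict[int, str] = {}      # destination-port-only matches

    def bind(self, dest_port: int, socket_name: str) -> None:
        self.by_dest_port[dest_port] = socket_name

    def connect(self, four_tuple: FourTuple, socket_name: str) -> None:
        self.connected[four_tuple] = socket_name

    def deliver(self, src_ip: str, src_port: int,
                dst_ip: str, dst_port: int) -> Optional[str]:
        key = (src_ip, src_port, dst_ip, dst_port)
        if key in self.connected:
            return self.connected[key]
        # Fall back to a socket bound only to the destination port.
        return self.by_dest_port.get(dst_port)


demux = Demultiplexer()
demux.bind(53, "dns-socket")
demux.connect(("10.0.0.1", 5000, "10.0.0.9", 80), "web-connection-1")
print(demux.deliver("10.0.0.1", 5000, "10.0.0.9", 80))
print(demux.deliver("10.0.0.2", 6000, "10.0.0.9", 53))
```

A segment whose destination port matches no socket returns None here; what a real host does in that case is outside this sketch.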
Error Control
1. Checksum
2. Acknowledgement
3. Retransmission
Error Control Service is responsible for:
1. Detecting and discarding corrupted packets.
2. Keeping track of lost and discarded packets and resending them.
3. Recognizing duplicate packets and discarding them.
4. Buffering out-of-order packets until the missing packets arrive.
✔Simple Protocol
✔ Stop-and-Wait Protocol
✔ Go-Back-N Protocol
✔ Selective-Repeat Protocol
Flow Control and Error Control Protocol
Cont..
Flow control: coordinates the amount of data that can be sent before receiving an ACK.
• Stop and Wait
• Sliding window
Error control: refers to the methods of error detection and retransmission. The most popular retransmission scheme is Automatic Repeat Request (ARQ). Three popular ARQ techniques are Stop-and-Wait, Go-Back-N, and Selective-Repeat.
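As an illustration of the simplest ARQ scheme, Stop-and-Wait, the sketch below sends one frame at a time and retransmits after a timeout. It is a toy model under stated assumptions: losses are given up front as a set of (frame index, attempt number) pairs rather than occurring randomly, and the function name and log format are hypothetical.

```python
def stop_and_wait(frames, lost_attempts):
    """Send frames one at a time; resend whenever an attempt is lost.

    lost_attempts is a set of (frame_index, attempt_number) pairs whose
    frame (or its ACK) is assumed lost, which causes a timeout.
    Returns a log of (frame_index, attempt_number, outcome) tuples.
    """
    log = []
    for index, _frame in enumerate(frames):
        attempt = 1
        while (index, attempt) in lost_attempts:
            log.append((index, attempt, "timeout, retransmit"))
            attempt += 1
        # The sender moves on only after this frame is acknowledged.
        log.append((index, attempt, "ACK received"))
    return log


log = stop_and_wait(["a", "b", "c"], lost_attempts={(1, 1)})
for entry in log:
    print(entry)
```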
Chapter 3: roadmap
● Transport-layer services
● Connection-oriented transport:
TCP
• segment structure
• reliable data transfer
• flow control
• connection management
TCP: overview (RFCs: 793, 1122, 2018, 5681, 7323)
▪ point-to-point:
• one sender, one receiver
▪ reliable, in-order byte stream:
• no “message boundaries”
▪ full duplex data:
• bi-directional data flow in same connection
• MSS: maximum segment size
▪ cumulative ACKs
▪ pipelining:
• TCP congestion and flow control set window size
▪ connection-oriented:
• handshaking (exchange of control messages) initializes sender, receiver state before data exchange
▪ flow controlled:
• sender will not overwhelm receiver
Stream delivery
✔Numbering System
✔ Flow Control
✔ Error Control
✔ Congestion Control
TCP sequence numbers, ACKs
Sequence numbers:
• byte stream “number” of first byte in segment’s data
Acknowledgements:
• seq # of next byte expected from other side
[Figure: outgoing segment from sender with fields source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg data pointer; below it, the sender sequence number space with window size N]
TCP sequence numbers
• User types ‘C’: Seq=42, ACK=79, data = ‘C’
• host ACKs receipt of ‘C’, echoes back ‘C’: Seq=79, ACK=43, data = ‘C’
• host ACKs receipt of echoed ‘C’: Seq=43, ACK=80
• Suppose a TCP connection is transferring a file of 5,000 bytes. The first byte is numbered 10,001.
What are the sequence numbers for each segment if data are sent in five segments, each carrying
1,000 bytes?
Solution
The following shows the sequence number for each segment:
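The arithmetic can be worked through with a short sketch. The function name is hypothetical; the inputs (first byte 10,001, 5,000 bytes, 1,000 bytes per segment) come from the question above. Each segment's sequence number is the number of its first byte.

```python
def segment_sequence_numbers(first_byte, total_bytes, segment_size):
    """Return (first_byte_number, last_byte_number) for each segment."""
    ranges = []
    start = first_byte
    end_of_data = first_byte + total_bytes - 1
    while start <= end_of_data:
        last = min(start + segment_size - 1, end_of_data)
        ranges.append((start, last))
        start = last + 1
    return ranges


ranges = segment_sequence_numbers(10_001, 5_000, 1_000)
for number, (first, last) in enumerate(ranges, start=1):
    print(f"Segment {number}: sequence number {first} (bytes {first} to {last})")
```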
TCP segment format
• Before discussing TCP in more detail, let us discuss the TCP packets themselves. A packet in
TCP is called a segment.
[Figure: TCP segment structure. Labels: TCP options; RST, SYN, FIN: connection management; data (variable length): application data sent by application into TCP socket]
TCP Flag Bits
In practice, the PSH, URG, and the urgent pointer are not used.
Connection establishment using three-way handshaking
● A SYN segment cannot carry data,
but it consumes one sequence
number.
● A SYN + ACK segment cannot carry
data, but does consume one
sequence number.
● An ACK segment, if carrying no
data, consumes no sequence
number.
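The sequence-number rules above can be traced with a small sketch. The initial sequence numbers used here (521 for the client, 2000 for the server) are arbitrary illustrative values, and the dictionary layout is a hypothetical representation, not a real TCP header.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dictionaries."""
    syn = {"flags": "SYN", "seq": client_isn}
    # The SYN consumed one sequence number, so the server expects client_isn + 1.
    syn_ack = {"flags": "SYN+ACK", "seq": server_isn, "ack": client_isn + 1}
    # The SYN+ACK also consumed one sequence number.
    ack = {"flags": "ACK", "seq": client_isn + 1, "ack": server_isn + 1}
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=521, server_isn=2000)
for segment in segments:
    print(segment)

# A pure ACK consumes no sequence number, so the client's first data
# segment reuses the sequence number carried by that ACK.
first_data_seq = segments[2]["seq"]
print("first data segment seq:", first_data_seq)
```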
Connection establishment using three-way handshaking
1. Sender starts the process with the following:
● Sequence number (Seq=521): contains the random initial sequence number generated at the sender side.
● SYN flag (SYN=1): requests the receiver to synchronize its sequence number with the sequence number provided above.
● Maximum segment size (MSS=1460 B): sender announces its maximum segment size so that the receiver sends segments that will not require fragmentation. The MSS field is carried in the Options field of the TCP header.
● Window size (window=14600 B): sender advertises its buffer capacity for storing messages from the receiver.
https://www.geeksforgeeks.org/tcp-connection-establishment/
Connection establishment using three-way handshaking
2. TCP is a full-duplex protocol so both sender and receiver require
a window for receiving messages from one another.
1. When one end sends a data segment to the other end, it must
include an ACK. That gives the next sequence number it expects to
receive. (Piggyback)
Rules for Generating the ACKs Cont..
● Receipt of three duplicate ACKs indicates 3 segments received after a missing segment: lost segment is likely. So retransmit!
● Retransmission after 3 duplicate acknowledgements (or) early retransmission
Lost acknowledgment
Window size or Advertisement Window
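● A sender should never send more than the receiver can accept.
● During connection establishment phase, both sides advertise window size.
● Importance of persistence timer (PT): when the receiver advertises a zero window, the sender uses this timer to probe periodically, so a lost window update cannot stall the connection.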
TCP round trip time, timeout
Q: how to set TCP timeout value?
▪ longer than RTT, but RTT varies!
▪ too short: premature timeout, unnecessary retransmissions
▪ too long: slow reaction to segment loss
Q: how to estimate RTT?
▪ SampleRTT: measured time from segment transmission until ACK receipt
• ignore retransmissions
▪ SampleRTT will vary, want estimated RTT “smoother”
• average several recent measurements, not just current SampleRTT
TCP round trip time, timeout
EstimatedRTT = (1- α)*EstimatedRTT + α*SampleRTT
▪ exponential weighted moving average (EWMA)
▪ influence of past sample decreases exponentially fast
▪ typical value: α = 0.125
[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. SampleRTT and EstimatedRTT (milliseconds) plotted against time (seconds)]
TCP round trip time, timeout
▪ timeout interval: EstimatedRTT plus “safety margin”
• large variation in EstimatedRTT: want a larger safety margin
TimeoutInterval = EstimatedRTT + 4*DevRTT
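The two formulas can be combined into a small estimator. This is a sketch under assumptions not stated on the slides: DevRTT is updated as (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT| with β = 0.25, the deviation is updated before the estimate, and the first sample initializes EstimatedRTT with DevRTT set to half of it (the conventions of RFC 6298). The class and attribute names are hypothetical.

```python
class RttEstimator:
    """EWMA-based RTT estimator with a deviation-based safety margin."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha                  # weight of the newest sample
        self.beta = beta                    # assumed value, not on the slides
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2     # assumed initialization

    def update(self, sample_rtt):
        # Update the deviation first, using the previous estimate.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)   # milliseconds
estimator.update(120.0)
print(estimator.estimated_rtt, estimator.dev_rtt, estimator.timeout_interval())
```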
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
Chapter 3: roadmap
● Transport-layer services
● Multiplexing and demultiplexing
● Connectionless transport: UDP
● Principles of reliable data transfer
● Connection-oriented transport:
TCP
● Principles of congestion control
● TCP congestion control
● Evolution of transport-layer
functionality
Principles of congestion control
Congestion:
▪ informally: “too many sources sending too much data too fast for
network to handle”
• long delays (queueing in router buffers)
• packet loss (buffer overflow at routers)
Approaches towards congestion control
Network-assisted congestion control:
▪ routers provide direct feedback to sending/receiving hosts with flows passing through congested router
▪ may indicate congestion level or explicitly set sending rate
● TCP ECN (Explicit Congestion Notification)
[Figure: explicit congestion info sent by the congested router to hosts exchanging data and ACKs]
Chapter 3: roadmap
● Transport-layer services
● Multiplexing and demultiplexing
● Connectionless transport: UDP
● Principles of reliable data transfer
● Connection-oriented transport:
TCP
● Principles of congestion control
● TCP congestion control
● Evolution of transport-layer
functionality
TCP: Triggering congestion control
● Two ways to trigger a congestion notification in TCP: (1) RTO (retransmission timeout), (2) Duplicate ACK
● Duplicate ACK: Receiver sends a duplicate ACK when it receives an out-of-order segment
○ A loose way of indicating congestion
○ TCP arbitrarily assumes that THREE duplicate ACKs (DUPACKs) imply that a
packet has been lost – triggers congestion control mechanism
○ The identity of the lost packet can be inferred – the very next packet in
sequence
○ Retransmit the lost packet and trigger congestion control
TCP congestion control: AIMD
▪ approach: senders can increase sending rate until packet loss (congestion)
occurs, then decrease sending rate on loss event.
TCP AIMD: more
Multiplicative decrease detail: sending rate is
▪ Cut to 1 MSS (maximum segment size) when loss detected by timeout (TCP Tahoe)
▪ Cut in half on loss detected by triple duplicate ACK (TCP Reno)
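A minimal sketch of the AIMD rule, with the window measured in MSS. The additive step of 1 MSS per loss-free RTT is the usual convention and is assumed here, since the slides only say the rate increases until loss; the event names and the function name are hypothetical.

```python
def aimd_step(cwnd, event, mss=1):
    """Return the next congestion window after one event."""
    if event == "ack_rtt":
        return cwnd + mss            # additive increase (assumed 1 MSS per RTT)
    if event == "triple_dup_ack":
        return max(cwnd / 2, mss)    # cut in half (TCP Reno)
    if event == "timeout":
        return mss                   # cut to 1 MSS (TCP Tahoe)
    raise ValueError(f"unknown event: {event}")


cwnd = 8
trace = [cwnd]
for event in ("ack_rtt", "ack_rtt", "triple_dup_ack", "ack_rtt", "timeout"):
    cwnd = aimd_step(cwnd, event)
    trace.append(cwnd)
print(trace)
```

Plotting such a trace over many loss events gives the sawtooth pattern usually associated with AIMD.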
TCP Congestion Control
TCP slow start
▪ When connection begins:
• initially cwnd = 1 MSS
• double cwnd every RTT
Implementation:
▪ variable ssthresh
▪ on loss event, ssthresh is set to 1/2 of cwnd just before loss event
[Figure: Host A and Host B exchanging segments over successive RTTs, with two segments sent in the second RTT]
Slow Start Cont..
• Slow start causes exponential growth, so eventually it will send too many packets into the network too quickly.
• To keep slow start under control, the sender keeps a threshold for the connection called the slow start threshold (ssthresh).
Congestion Control: Slow start, exponential increase
In the congestion avoidance algorithm the size of the congestion window increases additively until congestion is detected.
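The two growth phases can be traced per RTT with the sketch below. It assumes no loss, measures cwnd in MSS, grows by 1 MSS per RTT in congestion avoidance, and caps the last doubling at ssthresh; the cap and the function name are illustrative assumptions rather than something the slides specify.

```python
def cwnd_per_rtt(rounds, ssthresh, cwnd=1):
    """Return cwnd (in MSS) at the start of each RTT, assuming no loss."""
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double every RTT
        else:
            cwnd += 1                        # congestion avoidance: additive
    return trace


trace = cwnd_per_rtt(rounds=8, ssthresh=16)
print(trace)
```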
Fast Recovery – TCP Reno
● Once congestion is detected through 3 DUPACKs, does TCP really need to set cwnd = 1 MSS?
● DUPACK means that some segments are still flowing in the network: a signal of temporary congestion, but not a prolonged one
Fast Recovery – TCP Reno
● Fast recovery:
1. Set ssthresh to half of the current congestion window. Retransmit the missing segment.
2. Set cwnd = ssthresh + 3 (in units of MSS).
3. Each time another duplicate ACK arrives, set cwnd = cwnd + 1. Then send a new data segment if allowed by the value of cwnd.
4. On receiving a new ACK (an ACK which acknowledges all intermediate segments sent between the lost packet and the receipt of the first duplicate ACK), exit fast recovery: set cwnd to ssthresh (the value from step 1), then continue with the linear increase of the congestion avoidance algorithm.
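The fast recovery rules above can be sketched as a tiny state holder. Values are in units of MSS; the class and method names are hypothetical, and retransmission and sending of new segments are left out so that only the window arithmetic is shown.

```python
class RenoFastRecovery:
    """Tracks cwnd and ssthresh (in MSS) from entry to exit of fast recovery."""

    def __init__(self, cwnd):
        # Entered after three duplicate ACKs.
        self.ssthresh = cwnd / 2            # half of the current window
        self.cwnd = self.ssthresh + 3       # inflate by the three DUPACKs
        self.in_fast_recovery = True

    def on_duplicate_ack(self):
        if self.in_fast_recovery:
            self.cwnd += 1                  # one more segment has left the network

    def on_new_ack(self):
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh       # deflate the window
            self.in_fast_recovery = False   # continue in congestion avoidance


recovery = RenoFastRecovery(cwnd=20)
print(recovery.ssthresh, recovery.cwnd)
recovery.on_duplicate_ack()
recovery.on_duplicate_ack()
print(recovery.cwnd)
recovery.on_new_ack()
print(recovery.cwnd, recovery.in_fast_recovery)
```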
Example: Fast Recovery – TCP Reno
Summary: TCP congestion control
[Figure: finite state machine with states slow start, congestion avoidance, fast recovery]
Initial transition (Λ): cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; enter slow start
slow start:
• new ACK: cwnd = cwnd + MSS, dupACKcount = 0, transmit new segment(s), as allowed
• duplicate ACK: dupACKcount++
• cwnd > ssthresh: go to congestion avoidance
• timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment
• dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3, retransmit missing segment; go to fast recovery
congestion avoidance:
• new ACK: cwnd = cwnd + MSS (MSS/cwnd), dupACKcount = 0, transmit new segment(s), as allowed
• duplicate ACK: dupACKcount++
• timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment; go to slow start
• dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3, retransmit missing segment; go to fast recovery
fast recovery:
• duplicate ACK: cwnd = cwnd + MSS, transmit new segment(s), as allowed
• new ACK: cwnd = ssthresh, dupACKcount = 0; go to congestion avoidance
• timeout: ssthresh = cwnd/2, cwnd = 1, dupACKcount = 0, retransmit missing segment; go to slow start
TCP Congestion Control Algorithms
Chapter 3: roadmap
● Transport-layer services
● Connectionless transport: UDP
● Connection-oriented transport:
TCP
• segment structure
• reliable data transfer
• flow control
• connection management
● Principles of congestion control
● TCP congestion control
UDP: User Datagram Protocol
▪ Simple and quick Internet transport protocol
▪ “best effort” service, UDP segments may be:
• lost
• delivered out-of-order to app
▪ connectionless:
• no handshaking between UDP sender, receiver
• each UDP segment handled independently of others
Why is there a UDP?
▪ no connection establishment (which can add RTT delay)
▪ simple: no connection state at sender, receiver
▪ small header size
▪ no congestion control
• UDP can blast away as fast as desired!
• can function in the face of congestion
UDP: User Datagram Protocol
▪ UDP use:
▪ streaming multimedia apps (loss tolerant, rate sensitive)
▪ DNS
▪ Simple Network Management Protocol (SNMP)
▪ HTTP/3
▪ if reliable transfer needed over UDP (e.g., HTTP/3):
▪ add needed reliability at application layer
▪ add congestion control at application layer
UDP: User Datagram Protocol [RFC 768]
UDP Header
[Figure: UDP segment format. The payload carries data to/from application layer]
Questions
Solution
a. The source port number is the first four hexadecimal digits (CB84)₁₆ or 52100.
b. The destination port number is the second four hexadecimal digits (000D)₁₆ or 13.
c. The third four hexadecimal digits (001C)₁₆ define the length of the whole UDP packet as 28 bytes.
d. The length of the data is the length of the whole packet minus the length of the header, or 28 – 8 = 20 bytes.
e. Since the destination port number is 13 (well-known port), the packet is from the client to the server.
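The same fields can be extracted programmatically. In this sketch the first three 16-bit values (CB84, 000D, 001C) are taken from the solution above; the fourth value, the checksum, is not given there, so 001C is used only as an assumed placeholder. The function name and dictionary keys are hypothetical.

```python
import struct


def parse_udp_header(header: bytes) -> dict:
    """Unpack the 8-byte UDP header (four 16-bit big-endian fields)."""
    source_port, dest_port, length, checksum = struct.unpack("!HHHH", header[:8])
    return {
        "source_port": source_port,
        "dest_port": dest_port,
        "length": length,              # header plus data, in bytes
        "checksum": checksum,
        "data_length": length - 8,     # the UDP header is always 8 bytes
    }


# The last two bytes (checksum) are an assumed placeholder value.
header = bytes.fromhex("CB84000D001C001C")
fields = parse_udp_header(header)
print(fields)
```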
[Figure: checksum check at the receiver. Received: 4 6 11; the receiver-computed checksum is compared with the sender-computed checksum (as received)]
UDP checksum
Goal: detect errors (i.e., flipped bits) in transmitted segment
sender:
▪ treat contents of UDP segment (including UDP header fields and IP addresses) as sequence of 16-bit integers
▪ checksum: addition (one’s complement sum) of segment content
▪ checksum value put into UDP checksum field
receiver:
▪ compute checksum of received segment
▪ check if computed checksum equals checksum field value:
• Not equal - error detected
• Equal - no error detected. But maybe errors nonetheless? More later ….
Internet checksum: an example
example: add two 16-bit integers
1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
Note: when adding numbers, a carryout from the most significant bit needs to be
added to the result
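The example can be reproduced with a few lines of code. This is a sketch of the one's complement sum with wraparound for a list of 16-bit integers; the function names are hypothetical, and the two input values are the ones from the example above.

```python
def ones_complement_sum(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # wraparound
    return total


def internet_checksum(words):
    """Return the one's complement of the one's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF


a = 0b1110011001100110
b = 0b1101010101010101
print(format(ones_complement_sum([a, b]), "016b"))
print(format(internet_checksum([a, b]), "016b"))
```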
Internet checksum: weak protection!
example: add two 16-bit integers, with the last two bits of each flipped
1 1 1 0 0 1 1 0 0 1 1 0 0 1 0 1 (last two bits flipped from 1 0 to 0 1)
1 1 0 1 0 1 0 1 0 1 0 1 0 1 1 0 (last two bits flipped from 0 1 to 1 0)
wraparound 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
Even though numbers have changed (bit flips), no change in checksum!
UDP Checksum Calculations
Cont..
At the receiver, add all 16-bit words including the checksum and complement the result; if the result is zero, the packet is accepted as correctly received.
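A sketch of the receiver-side check, reusing the two 16-bit values and the checksum from the earlier Internet checksum example (0xE666, 0xD555, checksum 0x4443). The function name is hypothetical. The last call shows a pair of compensating bit flips that the check cannot detect.

```python
def receiver_accepts(words_with_checksum):
    """Sum all 16-bit words including the checksum; accept if the complement is 0."""
    total = 0
    for word in words_with_checksum:
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # wrap the carry around
    return (~total & 0xFFFF) == 0


intact = receiver_accepts([0xE666, 0xD555, 0x4443])
one_bit_flipped = receiver_accepts([0xE667, 0xD555, 0x4443])
# One word decreased by 1 and the other increased by 1: the sum is unchanged.
compensating_flips = receiver_accepts([0xE665, 0xD556, 0x4443])
print(intact, one_bit_flipped, compensating_flips)
```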
Source: TCP/IP Protocol Suite, Forouzan
Point to Note
Introduction: Change
• Increasing scale of ............ everything
• Flow size changes
• Flow count increases (e.g., web pages)
• Flow diversity increases (e.g., web pages)
https request
o Multiple connections
4x in 4 years
HTTP Network Stack
Stack (top to bottom): HTTP, TLS 1.2+, TCP, IP
HTTP/1.1
• January 1997
• Many parallel TCP connections (6 connections per host name)
• HTTP head-of-line blocking
HTTP/2
• May 2015
• Single connection per host
• Many parallel streams
• TCP head-of-line blocking
TLS: Transport Layer Security
TCP: Transmission Control Protocol
IP: Internet Protocol
https://http3-explained.haxx.se/en/why-quic
QUIC: Quick UDP Internet Connections
● application-layer protocol, on top of UDP
○ increase performance of HTTP
○ deployed on many Google servers, apps (Chrome, mobile YouTube app)
QUIC: Quick UDP Internet Connections
adopts approaches we’ve studied in this chapter for connection establishment,
error control, congestion control
• error and congestion control: “Readers familiar with TCP’s loss detection and
congestion control will find algorithms here that parallel well-known TCP ones.”
[from QUIC specification]
• connection establishment: reliability, congestion control, authentication,
encryption, state established in one RTT
QUIC: Connection establishment
TCP (reliability, congestion control state) + TLS (authentication, crypto state)
▪ 2 serial handshakes: TCP handshake (transport layer), then TLS handshake (security), then data
QUIC: reliability, congestion control, authentication, crypto state
▪ 1 handshake: QUIC handshake, then data
QUIC: streams: parallelism, no HOL blocking
[Figure: two stacks compared. With TCP, several HTTP GETs from the application share one TLS encryption layer and one TCP RDT in the transport layer, so an error stalls all of them. With QUIC, each HTTP GET has its own QUIC encrypt and QUIC RDT stream above a shared QUIC Cong. Cont., so an error affects only its own stream]
RFC 9000
Source: SIGCOMM 2020, QUIC Tutorial, Link: https://conferences.sigcomm.org/sigcomm/2020/tutorial-quic.ht