Lecture - 6 UDP TCP (Final)

Chapter 3

Transport Layer

Computer Networking: A Top Down Approach, 5th edition.
Jim Kurose, Keith Ross
Addison-Wesley, April 2009.

A note on the use of these ppt slides:
We’re making these slides freely available to all (faculty, students, readers). They’re in PowerPoint form so you can add, modify, and delete slides (including this one) and slide content to suit your needs. They obviously represent a lot of work on our part. In return for use, we only ask the following:
o If you use these slides (e.g., in a class) in substantially unaltered form, that you mention their source (after all, we’d like people to use our book!)
o If you post any slides in substantially unaltered form on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material.
Thanks and enjoy! JFK/KWR

All material copyright 1996-2009
J.F Kurose and K.W. Ross, All Rights Reserved
Transport Layer 3-1
Chapter 3: Transport Layer
Our goals:
o understand principles behind transport layer services:
  o multiplexing/demultiplexing
  o reliable data transfer
  o flow control
  o congestion control
o learn about transport layer protocols in the Internet:
  o UDP: connectionless transport
  o TCP: connection-oriented transport
  o TCP congestion control


Chapter 3 outline
o 3.1 Transport-layer services
o 3.2 Multiplexing and demultiplexing
o 3.3 Connectionless transport: UDP
o 3.4 Principles of reliable data transfer
o 3.5 Connection-oriented transport: TCP
  o segment structure
  o reliable data transfer
  o flow control
  o connection management
o 3.6 Principles of congestion control
o 3.7 TCP congestion control


Transport services and protocols
o provide logical communication between app processes running on different hosts
o transport protocols run in end systems
  o send side: breaks app messages into segments, passes to network layer
  o rcv side: reassembles segments into messages, passes to app layer
o more than one transport protocol available to apps
  o Internet: TCP and UDP
[Figure: two end systems, each with an application / transport / network / data link / physical stack, joined by logical end-end transport]


Transport vs. network layer
o network layer: logical communication between hosts
o transport layer: logical communication between processes
  o relies on, enhances, network layer services
Household analogy: 12 kids sending letters to 12 kids
o processes = kids
o app messages = letters in envelopes
o hosts = houses
o transport protocol = Ann and Bill
o network-layer protocol = postal service


Internet transport-layer protocols
o reliable, in-order delivery (TCP)
  o congestion control
  o flow control
  o connection setup
o unreliable, unordered delivery: UDP
  o no-frills extension of “best-effort” IP
o services not available:
  o delay guarantees
  o bandwidth guarantees
[Figure: logical end-end transport between two end systems (application / transport / network / data link / physical) across routers that implement only network / data link / physical]




Multiplexing/demultiplexing
Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
Demultiplexing at rcv host: delivering received segments to correct socket
[Figure: host 1, host 2 and host 3, each with an application / transport / network / link / physical stack; processes P1 to P4 in the application layer, each attached to a socket]
How demultiplexing works
o host receives IP datagrams
  o each datagram has source IP address, destination IP address
  o each datagram carries 1 transport-layer segment
  o each segment has source, destination port number
o host uses IP addresses & port numbers to direct segment to appropriate socket

TCP/UDP segment format (32 bits wide):
source port # | dest port #
other header fields
application data (message)


Connectionless demultiplexing
o Create sockets with port numbers:
DatagramSocket mySocket1 = new DatagramSocket(12534);
DatagramSocket mySocket2 = new DatagramSocket(12535);
o UDP socket identified by two-tuple:
(dest IP address, dest port number)
o When host receives UDP segment:
  o checks destination port number in segment
  o directs UDP segment to socket with that port number
o IP datagrams with different source IP addresses and/or source port numbers directed to same socket
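The two-tuple rule can be illustrated with a small model. This is only a sketch: the class `UdpDemux` and its methods are hypothetical names, not a real socket API, and the table stands in for whatever the operating system keeps internally.

```python
# Hypothetical model of connectionless demultiplexing (not a real socket API).
class UdpDemux:
    def __init__(self):
        # (dest IP, dest port) -> list of (source address, payload)
        self.sockets = {}

    def bind(self, ip, port):
        """Create a 'socket' identified only by its two-tuple."""
        self.sockets[(ip, port)] = []

    def deliver(self, src_ip, src_port, dst_ip, dst_port, payload):
        """Direct a segment to the socket with that destination port."""
        key = (dst_ip, dst_port)
        if key not in self.sockets:
            return False  # no socket bound there: segment is dropped
        # The source address is not used for lookup; it is only kept
        # as the "return address" for a reply.
        self.sockets[key].append(((src_ip, src_port), payload))
        return True


demux = UdpDemux()
demux.bind("C", 6428)
demux.deliver("A", 9157, "C", 6428, b"from A")
demux.deliver("B", 5775, "C", 6428, b"from B")
```

Both segments land in the same socket even though they come from different source addresses and ports.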


Connectionless demux (cont)
DatagramSocket serverSocket = new DatagramSocket(6428);

[Figure: clients at IP: A and IP: B exchange segments with the server at IP: C]
client IP: A to server: SP: 9157, DP: 6428; reply: SP: 6428, DP: 9157
client IP: B to server: SP: 5775, DP: 6428; reply: SP: 6428, DP: 5775

SP provides “return address”


Connection-oriented demux
o TCP socket identified by 4-tuple:
  o source IP address
  o source port number
  o dest IP address
  o dest port number
o receiving host uses all four values to direct segment to appropriate socket
o Server host may support many simultaneous TCP sockets:
  o each socket identified by its own 4-tuple
o Web servers have different sockets for each connecting client
  o non-persistent HTTP will have different socket for each request
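By contrast with UDP, the lookup key for TCP is the full 4-tuple. The sketch below is a hypothetical model (the class `TcpDemux` is illustrative, and creating the table entry on first arrival stands in for the connection handshake); the addresses and ports are the ones used in the figure that follows in these slides.

```python
# Hypothetical model of connection-oriented demultiplexing.
class TcpDemux:
    def __init__(self):
        self.listening = set()   # (dest IP, dest port) pairs accepting connections
        self.connections = {}    # 4-tuple -> list of payloads for that socket

    def listen(self, ip, port):
        self.listening.add((ip, port))

    def deliver(self, src_ip, src_port, dst_ip, dst_port, payload):
        """Use all four values to pick the socket; return its key."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.connections:
            if (dst_ip, dst_port) not in self.listening:
                return None              # nobody listening: segment rejected
            self.connections[key] = []   # stands in for handshake + new socket
        self.connections[key].append(payload)
        return key


server = TcpDemux()
server.listen("C", 80)
server.deliver("A", 9157, "C", 80, b"GET /a")
server.deliver("B", 9157, "C", 80, b"GET /b")
server.deliver("B", 5775, "C", 80, b"GET /c")
```

All three segments have destination port 80, yet each ends up in its own socket because the source IP address or source port differs.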


Connection-oriented demux (cont)
[Figure: clients at IP: A and IP: B, server at IP: C with processes P4, P5 and P6, one per connection. Segments arriving at the server:]
S-IP: A, SP: 9157, D-IP: C, DP: 80
S-IP: B, SP: 9157, D-IP: C, DP: 80
S-IP: B, SP: 5775, D-IP: C, DP: 80


Connection-oriented demux: Threaded Web Server
[Figure: clients at IP: A and IP: B, server at IP: C with a single process P4 handling every connection. Segments arriving at the server:]
S-IP: A, SP: 9157, D-IP: C, DP: 80
S-IP: B, SP: 9157, D-IP: C, DP: 80
S-IP: B, SP: 5775, D-IP: C, DP: 80




UDP: User Datagram Protocol [RFC 768]
o “no frills,” “bare bones” Internet transport protocol
o “best effort” service, UDP segments may be:
  o lost
  o delivered out of order to app
o connectionless:
  o no handshaking between UDP sender, receiver
  o each UDP segment handled independently of others
Why is there a UDP?
o no connection establishment (which can add delay)
o simple: no connection state at sender, receiver
o small segment header
o no congestion control: UDP can blast away as fast as desired


UDP: more
o often used for streaming multimedia apps
  o loss tolerant
  o rate sensitive
o other UDP uses
  o DNS
  o SNMP
o reliable transfer over UDP: add reliability at application layer
  o application-specific error recovery!

UDP segment format (32 bits wide):
source port # | dest port #
length | checksum
application data (message)
(length: length, in bytes, of UDP segment, including header)


Introduction
o The transport layer is responsible for the delivery of a message from one process to another.
o The Internet model has three protocols at the transport layer: UDP, TCP, and SCTP.
  o UDP: the simplest of the three.
  o TCP: a complex transport layer protocol.
  o SCTP: the new transport layer protocol that is designed for specific applications such as multimedia.
Connectionless Versus Connection-Oriented Service
o A transport layer protocol can either be connectionless or connection-oriented.
o Connectionless Service
  o In a connectionless service, the packets are sent from one party to another with no need for connection establishment or connection release.
  o The packets are not numbered; they may be delayed or lost or may arrive out of sequence.
  o There is no acknowledgment.
o UDP is a connectionless transport layer protocol.
Connectionless Versus Connection-Oriented Service
o Connection-Oriented Service
  o In a connection-oriented service, a connection is first established between the sender and the receiver.
  o Data are transferred.
  o At the end, the connection is released.
o TCP and SCTP are connection-oriented protocols.
Reliable Versus Unreliable
o The transport layer service can be reliable or unreliable.
o For reliability, we implement flow and error control at the transport layer.
o This means a slower and more complex service.
o UDP is connectionless and unreliable;
o TCP and SCTP are connection-oriented and reliable.
o These three protocols can respond to the demands of the application layer programs.
Error control
If the data link layer is reliable and has flow and error control, do we need this at the transport layer, too? Yes
User Datagram Protocol (UDP)
o UDP is a connectionless, unreliable transport protocol.
o It does not add anything to the services of IP except to provide process-to-process communication instead of host-to-host communication.
o UDP is a very simple protocol using a minimum of overhead.
o If a process wants to send a small message and does not care much about reliability, it can use UDP.
o Sending a small message by using UDP takes less interaction between the sender and receiver than using TCP or SCTP.
User Datagram
o UDP packets, called user datagrams, have a fixed-size header of 8 bytes.
o Source port number: This is the port number used by the process running on the source host.
o Destination port number: This is the port number used by the process running on the destination host.
o Length: This is a 16-bit field that defines the total length of the user datagram.
o Checksum: This field is used to detect errors over the entire user datagram (header plus data). The inclusion of the checksum in the UDP datagram is optional.
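The 8-byte header described above can be packed and unpacked with Python's standard `struct` module. This is a minimal sketch: the helper names `build_datagram` and `parse_datagram` are illustrative, and the checksum is simply left at 0 here (which, over IPv4, signals that no checksum was computed).

```python
import struct

# Four 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_datagram(src_port, dst_port, payload, checksum=0):
    """Prepend the 8-byte header; length covers header plus data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_datagram(datagram):
    """Split a user datagram back into its header fields and data."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    return src_port, dst_port, length, checksum, datagram[UDP_HEADER.size:length]


dgram = build_datagram(9157, 6428, b"hello")
fields = parse_datagram(dgram)
```

Here `fields` holds the two port numbers, the total length of 13 bytes (8 header bytes plus 5 data bytes), the checksum field, and the original payload.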
UDP Operation
UDP provides a connectionless service:
o There is no relationship between the different user datagrams, even if they are coming from the same source process and going to the same destination program.
o Also, there is no connection establishment and no connection termination.
o Each user datagram can travel on a different path.
o The user datagrams are not numbered.
o Each request must be small enough to fit into one user datagram.
o Only those processes sending short messages should use UDP.


UDP Operation
Flow and Error Control
o UDP is a very simple, unreliable transport protocol.
o There is no flow control: The receiver may overflow with incoming messages.
o There is no error control mechanism in UDP except for the checksum.
o The sender does not know if a message has been lost or duplicated.
o When the receiver detects an error through the checksum, the user datagram is discarded.
Use of UDP
o UDP is suitable for a process that requires simple request-response communication with little concern for flow and error control.
o UDP is suitable for a process with internal flow and error control mechanisms.
  o For example, the Trivial File Transfer Protocol (TFTP) process includes flow and error control.
o UDP is a suitable transport protocol for multicasting. Multicasting capability is embedded in the UDP software but not in the TCP software.
o UDP is used for management processes such as SNMP.
o UDP is used for some route updating protocols such as Routing Information Protocol (RIP).
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment

Sender:
o treat segment contents as sequence of 16-bit integers
o checksum: addition (1’s complement sum) of segment contents
o sender puts checksum value into UDP checksum field
Receiver:
o compute checksum of received segment
o check if computed checksum equals checksum field value:
  o NO - error detected
  o YES - no error detected. But maybe errors nonetheless? More later ….


Internet Checksum Example
o Note
  o When adding numbers, a carryout from the most significant bit needs to be added to the result
o Example: add two 16-bit integers

               1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
               1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

wraparound   1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1

sum            1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum       0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
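The wraparound step can be checked with a few lines of code. The sketch below uses illustrative function names (`ones_complement_sum16`, `internet_checksum`) and operates on a list of 16-bit integers rather than on a real UDP segment; the two inputs are the integers from the example above.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into bit 0."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wraparound
    return total


def internet_checksum(words):
    """The checksum is the 1's complement of the 1's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


a = 0b1110011001100110
b = 0b1101010101010101
total = ones_complement_sum16([a, b])   # 1011101110111100
checksum = internet_checksum([a, b])    # 0100010001000011

# Receiver check: summing the data words and the checksum gives all 1s.
receiver_sum = ones_complement_sum16([a, b, checksum])
```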
Checksum
• It is a value computed from the bits of a transmission unit and sent along with the unit so that the receiver can recompute it and check whether the data arrived unchanged.
• If the computed and received values agree, it’s assumed that the complete transmission was received correctly.
• If not, an error is detected.
At the Sender Side
• Divide the data into K chunks of n bits each.
• Add all chunks using 1’s complement addition.
• Take the complement of the result.
• The result is sent with the data as the checksum.
• 1’s complement addition: if the result has a carry bit, add the carry to the LSB (Least Significant Bit) of the result.
Example
• Suppose the data is 01110011 and we divide it into 4-bit chunks, so n = 4. Since the data has 8 bits in total, there will be 2 chunks (0111 and 0011).
• Adding both chunks:
0111
0011
----
1010
----
• The result (1010) has no carry bit. Now take the 1’s complement, which is 0101. So, checksum = 0101.
The data that will be sent = 011100110101
• Note: if the result has a carry bit after addition, the carry is wrapped around and added to the LSB, as 1’s complement addition requires.
At the Receiver Side
• After receiving the data from the sender, the receiver divides it into n-bit chunks. The checksum is appended to the data, so there will be K + 1 chunks.
• Add all chunks using 1’s complement addition.
• Then take the complement of the result. If all the bits in the result are 0, accept the data. Otherwise, discard the data because there is an error.
• In the previous example, the sender sent 011100110101 and the data were divided into 4-bit chunks. The receiver therefore divides what it received into 4-bit chunks: 0111, 0011 and 0101.
• Adding all chunks using 1’s complement addition:
0111
0011
0101
----
1111
----
Now, taking the complement of the result (1111), we get 0000. So, there is no error.
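The sender and receiver steps can be sketched for arbitrary chunk sizes. The function names below are illustrative, the data is passed as a string of '0' and '1' characters, and the sketch assumes the data length is a multiple of n.

```python
def chunk_sum(bits, n):
    """1's complement sum of the n-bit chunks of a bit string."""
    mask = (1 << n) - 1
    total = 0
    for i in range(0, len(bits), n):
        total += int(bits[i:i + n], 2)
        total = (total & mask) + (total >> n)  # add any carry to the LSB
    return total


def make_codeword(data, n):
    """Sender: append the complement of the sum as an n-bit checksum."""
    checksum = ~chunk_sum(data, n) & ((1 << n) - 1)
    return data + format(checksum, f"0{n}b")


def accept(codeword, n):
    """Receiver: accept only if the complemented sum is all zeros."""
    return (~chunk_sum(codeword, n) & ((1 << n) - 1)) == 0


sent = make_codeword("01110011", 4)    # the example above
ok = accept(sent, 4)
corrupted = accept("011000110101", 4)  # one data bit flipped
```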


Principles of Reliable data transfer
o important in app., transport, link layers
o top-10 list of important networking topics!

o characteristics of unreliable channel will determine complexity of reliable data transfer protocol
(rdt)



Reliable data transfer: getting started
send side:
o rdt_send(): called from above, (e.g., by app.). Passed data to deliver to receiver upper layer
o udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
o rdt_rcv(): called when packet arrives on rcv-side of channel
o deliver_data(): called by rdt to deliver data to upper layer


Reliable data transfer: getting started
We’ll:
o incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
o consider only unidirectional data transfer
  o but control info will flow on both directions!
o use finite state machines (FSM) to specify sender, receiver
  o state: when in this “state”, next state uniquely determined by next event
  o each transition from state 1 to state 2 is labeled with the event causing the state transition and the actions taken on the state transition


Rdt1.0: reliable transfer over a reliable channel
o underlying channel perfectly reliable
  o no bit errors
  o no loss of packets
o separate FSMs for sender, receiver:
  o sender sends data into underlying channel
  o receiver reads data from underlying channel

sender
state: Wait for call from above
  rdt_send(data)
    packet = make_pkt(data)
    udt_send(packet)

receiver
state: Wait for call from below
  rdt_rcv(packet)
    extract (packet,data)
    deliver_data(data)


Rdt2.0: channel with bit errors
o underlying channel may flip bits in packet
o checksum to detect bit errors

o the question: how to recover from errors:


o acknowledgements (ACKs): receiver explicitly tells sender
that pkt received OK
o negative acknowledgements (NAKs): receiver explicitly
tells sender that pkt had errors
o sender retransmits pkt on receipt of NAK
o new mechanisms in rdt2.0 (beyond rdt1.0):
o error detection
o receiver feedback: control msgs (ACK,NAK) rcvr->sender



rdt2.0: FSM specification
sender
state: Wait for call from above
  rdt_send(data)
    sndpkt = make_pkt(data, checksum)
    udt_send(sndpkt)
    next state: Wait for ACK or NAK
state: Wait for ACK or NAK
  rdt_rcv(rcvpkt) && isNAK(rcvpkt)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && isACK(rcvpkt)
    (no action)
    next state: Wait for call from above

receiver
state: Wait for call from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    udt_send(NAK)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    udt_send(ACK)


rdt2.0: operation with no errors
[Figure: the rdt2.0 sender and receiver FSMs. Error-free path: rdt_send(data) at the sender; rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) with udt_send(ACK) at the receiver; rdt_rcv(rcvpkt) && isACK(rcvpkt) back at the sender]


rdt2.0: error scenario
[Figure: the rdt2.0 sender and receiver FSMs. Error path: rdt_send(data) at the sender; rdt_rcv(rcvpkt) && corrupt(rcvpkt) with udt_send(NAK) at the receiver; rdt_rcv(rcvpkt) && isNAK(rcvpkt) with udt_send(sndpkt) at the sender]


rdt2.0 has a fatal flaw!
o What happens if ACK/NAK corrupted?
  o sender doesn’t know what happened at receiver!
  o can’t just retransmit: possible duplicate
o Handling duplicates:
  o sender retransmits current pkt if ACK/NAK garbled
  o sender adds sequence number to each pkt
  o receiver discards (doesn’t deliver up) duplicate pkt

stop and wait
Sender sends one packet, then waits for receiver response


rdt2.1: sender, handles garbled ACK/NAKs
state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    next state: Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    (no action)
    next state: Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    next state: Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    (no action)
    next state: Wait for call 0 from above


rdt2.1: receiver, handles garbled ACK/NAKs
state: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    next state: Wait for 1 from below
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt))
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
state: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    next state: Wait for 0 from below
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt))
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)


rdt2.1: discussion
Sender:
o seq # added to pkt
o two seq. #’s (0,1) will suffice. Why?
o must check if received ACK/NAK corrupted
o twice as many states
  o state must “remember” whether “current” pkt has 0 or 1 seq. #
Receiver:
o must check if received packet is duplicate
  o state indicates whether 0 or 1 is expected pkt seq #
o note: receiver can not know if its last ACK/NAK received OK at sender


rdt2.2: a NAK-free protocol
o same functionality as rdt2.1, using ACKs only
o instead of NAK, receiver sends ACK for last pkt received OK
  o receiver must explicitly include seq # of pkt being ACKed
o duplicate ACK at sender results in same action as NAK: retransmit current pkt


rdt2.2: sender, receiver fragments
sender FSM fragment
state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    next state: Wait for ACK 0
state: Wait for ACK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    (no action)

receiver FSM fragment
state: Wait for 0 from below
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
    udt_send(sndpkt)
transition into Wait for 0 from below:
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK1, chksum)
    udt_send(sndpkt)


rdt3.0: channels with errors and loss
o New assumption: underlying channel can also lose packets (data or ACKs)
  o checksum, seq. #, ACKs, retransmissions will be of help, but not enough
o Approach: sender waits “reasonable” amount of time for ACK
  o retransmits if no ACK received in this time
  o if pkt (or ACK) just delayed (not lost):
    o retransmission will be duplicate, but use of seq. #’s already handles this
    o receiver must specify seq # of pkt being ACKed
  o requires countdown timer


rdt3.0 sender
state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    next state: Wait for ACK0
  rdt_rcv(rcvpkt)
    (no action)
state: Wait for ACK0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    next state: Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    next state: Wait for ACK1
  rdt_rcv(rcvpkt)
    (no action)
state: Wait for ACK1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
    (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    next state: Wait for call 0 from above
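The behaviour of the rdt3.0 sender and its receiver can be sketched as a small simulation. This is an illustration under simplifying assumptions, not the FSM itself: losses are scripted by transmission index, a timeout is modelled as an immediate retransmission, and bit corruption is not modelled. The function name `stop_and_wait` and its parameters are hypothetical.

```python
def stop_and_wait(messages, data_lost, ack_lost):
    """Alternating-bit stop-and-wait over a channel with scripted losses.

    data_lost / ack_lost: sets of 1-based transmission indices for which
    the data packet or its ACK is dropped by the channel.
    Returns (messages delivered to the upper layer, total transmissions).
    """
    delivered = []
    expected = 0   # receiver: seq # it is waiting for
    seq = 0        # sender: seq # of the current packet
    tx = 0         # number of data packets put on the channel
    for msg in messages:
        while True:
            tx += 1
            if tx in data_lost:
                continue                 # no ACK arrives: timeout, resend
            if seq == expected:          # new packet: deliver it
                delivered.append(msg)
                expected ^= 1
            # a duplicate is not delivered again, but is still ACKed
            if tx in ack_lost:
                continue                 # ACK lost: timeout, resend
            break                        # ACK for seq received
        seq ^= 1                         # move on to the other seq #
    return delivered, tx


result = stop_and_wait(["a", "b", "c"], data_lost={2}, ack_lost={3})
```

In this run the second transmission (packet "b") is lost and the ACK for the third transmission is lost, so "b" is sent three times but delivered to the upper layer only once.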


rdt3.0 in action
[Figures: sender/receiver timeline diagrams]


Performance of rdt3.0
o rdt3.0 works, but performance stinks
o ex: 1 Gbps link, 15 ms prop. delay, 8000 bit packet:

    d_trans = L / R = 8000 bits / 10^9 bps = 8 microseconds

o U_sender: utilization, the fraction of time sender busy sending

    U_sender = (L / R) / (RTT + L / R) = .008 / 30.008 = 0.00027   (times in msec)

o 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
o network protocol limits use of physical resources!
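The numbers on this slide can be reproduced directly. The sketch below uses an illustrative function name and takes RTT as 30 msec (twice the 15 ms propagation delay).

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    d_trans = packet_bits / rate_bps
    return d_trans / (rtt_s + d_trans)


L = 8000      # packet size in bits
R = 1e9       # link rate in bits per second
RTT = 0.030   # round-trip time in seconds

u_sender = stop_and_wait_utilization(L, R, RTT)
# One packet per (RTT + L/R) seconds, expressed in bytes per second.
throughput_bytes_per_s = (L / 8) / (RTT + L / R)
```

`u_sender` comes out at about 0.00027 and the throughput at roughly 33 kB/sec, matching the figures above.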


rdt3.0: stop-and-wait operation
sender:
o first packet bit transmitted, t = 0
o last packet bit transmitted, t = L / R
o ACK arrives, send next packet, t = RTT + L / R
receiver:
o first packet bit arrives
o last packet bit arrives, send ACK

U_sender = (L / R) / (RTT + L / R) = .008 / 30.008 = 0.00027   (times in msec)


Pipelined protocols
o Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
  o range of sequence numbers must be increased
  o buffering at sender and/or receiver
o Two generic forms of pipelined protocols: Go-Back-N, selective repeat
Pipelining: increased utilization
sender:
o first packet bit transmitted, t = 0
o last bit transmitted, t = L / R
o ACK arrives, send next packet, t = RTT + L / R
receiver:
o first packet bit arrives
o last packet bit arrives, send ACK
o last bit of 2nd packet arrives, send ACK
o last bit of 3rd packet arrives, send ACK

Increase utilization by a factor of 3!

U_sender = (3 * L / R) / (RTT + L / R) = .024 / 30.008 = 0.0008   (times in msec)
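The factor-of-3 claim follows from putting the number of in-flight packets in the numerator. A brief sketch, with an illustrative function name and the same link parameters as the stop-and-wait example (the cap at 1.0 is an added assumption: the sender cannot be busy more than all of the time):

```python
def pipelined_utilization(n_pkts, packet_bits, rate_bps, rtt_s):
    """Sender utilization with n_pkts packets sent per RTT + L/R."""
    d_trans = packet_bits / rate_bps
    return min(1.0, n_pkts * d_trans / (rtt_s + d_trans))


u1 = pipelined_utilization(1, 8000, 1e9, 0.030)  # stop-and-wait
u3 = pipelined_utilization(3, 8000, 1e9, 0.030)  # three packets in flight
```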
Pipelining Protocols
Go-back-N: overview
o sender: up to N unACKed pkts in pipeline
o receiver: only sends cumulative ACKs
  o doesn’t ACK pkt if there’s a gap
o sender: has timer for oldest unACKed pkt
  o if timer expires: retransmit all unACKed packets
Selective Repeat: overview
o sender: up to N unACKed packets in pipeline
o receiver: ACKs individual pkts
o sender: maintains timer for each unACKed pkt
  o if timer expires: retransmit only unACKed packet


Go-Back-N
o Sender:
  o k-bit seq # in pkt header
  o “window” of up to N, consecutive unACKed pkts allowed
o ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
  o may receive duplicate ACKs (see receiver)
o timer for oldest in-flight pkt
o timeout(n): retransmit pkt n and all higher seq # pkts in window


GBN in action
[Figure: sender/receiver timeline diagram]


GBN: sender extended FSM
initially:
  base=1
  nextseqnum=1
state: Wait
  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)
  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    (no action)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer


GBN: receiver extended FSM
initially:
  expectedseqnum=1
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)
state: Wait
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++
  default
    udt_send(sndpkt)

o ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
  o may generate duplicate ACKs
  o need only remember expectedseqnum
o out-of-order pkt:
  o discard (don’t buffer) -> no receiver buffering!
  o Re-ACK pkt with highest in-order seq #
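The interplay of the GBN sender and receiver can be sketched as a round-based simulation. The assumptions are deliberate simplifications: data-packet losses are scripted by transmission index, ACKs are never lost, and a round that brings no new cumulative ACK is treated as a timeout. The function name `go_back_n` is hypothetical.

```python
def go_back_n(data, window, lost):
    """Round-based Go-Back-N sketch.

    lost: set of 1-based transmission indices dropped by the channel.
    Returns (data delivered in order, total data-packet transmissions).
    """
    base, nextseqnum = 1, 1    # sender state, as in the FSM
    expectedseqnum = 1         # receiver state
    delivered = []
    tx = 0
    n = len(data)
    while base <= n:
        acks = []
        # Sender fills the window.
        while nextseqnum < base + window and nextseqnum <= n:
            tx += 1
            if tx not in lost:
                if nextseqnum == expectedseqnum:       # in-order: deliver
                    delivered.append(data[nextseqnum - 1])
                    expectedseqnum += 1
                # Out-of-order packets are discarded; either way the
                # receiver ACKs the highest in-order seq # so far.
                acks.append(expectedseqnum - 1)
            nextseqnum += 1
        new_base = max(acks, default=base - 1) + 1
        if new_base > base:
            base = new_base        # cumulative ACK slides the window
        else:
            nextseqnum = base      # timeout: resend everything from base
    return delivered, tx


result = go_back_n(list("abcde"), window=3, lost={2})
```

With packet 2 lost on its first transmission, packets 3 and 4 arrive out of order and are discarded, so the sender goes back and resends 2, 3 and 4: eight transmissions for five packets.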


Selective Repeat
o receiver individually acknowledges all correctly
received pkts
o buffers pkts, as needed, for eventual in-order delivery
to upper layer
o sender only resends pkts for which ACK not
received
o sender timer for each unACKed pkt
o sender window
o N consecutive seq #’s
o again limits seq #s of sent, unACKed pkts



Selective repeat: sender, receiver windows
[Figure: sender and receiver views of the sequence number space]


Selective repeat
sender
o data from above:
  o if next available seq # in window, send pkt
o timeout(n):
  o resend pkt n, restart timer
o ACK(n) in [sendbase,sendbase+N]:
  o mark pkt n as received
  o if n smallest unACKed pkt, advance window base to next unACKed seq #
receiver
o pkt n in [rcvbase, rcvbase+N-1]
  o send ACK(n)
  o out-of-order: buffer
  o in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
o pkt n in [rcvbase-N,rcvbase-1]
  o ACK(n)
o otherwise:
  o ignore
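The receiver rules above can be sketched as a small class. The class `SrReceiver` is a hypothetical illustration, and for simplicity it uses unbounded sequence numbers, so wraparound of a finite sequence space is not modelled.

```python
class SrReceiver:
    def __init__(self, window):
        self.window = window   # N
        self.rcv_base = 0
        self.buffer = {}       # out-of-order packets awaiting delivery
        self.delivered = []    # data passed to the upper layer, in order

    def on_packet(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcv_base <= n < self.rcv_base + self.window:
            self.buffer[n] = data
            # Deliver the in-order prefix and advance the window.
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1
            return n
        if self.rcv_base - self.window <= n < self.rcv_base:
            return n           # already delivered: re-ACK only
        return None            # otherwise: ignore


rcvr = SrReceiver(window=4)
arrivals = [(0, "a"), (2, "c"), (3, "d"), (1, "b"), (0, "a")]
acks = [rcvr.on_packet(n, d) for n, d in arrivals]
```

Packets 2 and 3 are buffered until packet 1 arrives; the late duplicate of packet 0 is acknowledged again but not delivered a second time.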


Selective repeat in action
[Figure: sender/receiver timeline diagram]


Selective repeat: dilemma
o Example:
  o seq #’s: 0, 1, 2, 3
  o window size=3
o receiver sees no difference in two scenarios!
o incorrectly passes duplicate data as new in (a)
o Q: what relationship between seq # size and window size?
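One way to explore the question is to check when a retransmission from the sender's old window can be mistaken for a packet in the receiver's next window. The sketch below is a hypothetical helper (`sr_ambiguous`) that considers only the worst case, in which the receiver has advanced past a full window while none of its ACKs reached the sender; under that assumption it reproduces the standard result that the window must be at most half the sequence-number space.

```python
def sr_ambiguous(seq_space, window):
    """True if an old, retransmitted seq # also lies in the receiver's new window."""
    # Seq #s the sender may still retransmit (its ACKs were all lost).
    old = {n % seq_space for n in range(0, window)}
    # Seq #s the receiver now accepts as new after advancing a full window.
    new = {n % seq_space for n in range(window, 2 * window)}
    return bool(old & new)


slide_example = sr_ambiguous(seq_space=4, window=3)  # seq #s 0..3, window 3
safe_example = sr_ambiguous(seq_space=4, window=2)
```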


TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581
o point-to-point:
  o one sender, one receiver
o reliable, in-order byte stream:
  o no “message boundaries”
o pipelined:
  o TCP congestion and flow control set window size
o send & receive buffers
o full duplex data:
  o bi-directional data flow in same connection
  o MSS: maximum segment size
o connection-oriented:
  o handshaking (exchange of control msgs) init’s sender, receiver state before data exchange
o flow controlled:
  o sender will not overwhelm receiver
[Figure: application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the application reads data through its socket door]


Transmission Control Protocol (TCP)
o TCP, like UDP, is a process-to-process (program-to-program) protocol that uses port numbers.
o Unlike UDP, TCP is a connection-oriented protocol; it creates a virtual connection between two TCPs to send data.
o TCP uses flow and error control mechanisms at the transport level.
Stream Delivery Service
o TCP allows the sending process to deliver data as a stream of bytes and allows the receiving process to obtain data as a stream of bytes.
o TCP creates an environment in which the two processes seem to be connected by an imaginary “tube” that carries their data across the Internet. The sending process produces (writes to) the stream of bytes, and the receiving process consumes (reads from) it.
Sending and receiving buffers
o The sending and the receiving processes may not write or read data at the same speed, so TCP needs buffers for storage.
o There are two buffers, the sending buffer and the receiving buffer, one for each direction.
o These buffers are also necessary for the flow and error control mechanisms used by TCP.
o One way to implement a buffer is to use a circular array of 1-byte locations.
Sending and receiving buffers
The sending buffer has three types of chambers:
o The white section:
  o Contains empty chambers that can be filled by the sending process.
o The gray section:
  o Holds bytes that have been sent but not yet acknowledged.
  o TCP keeps these bytes in the buffer until it receives an acknowledgment.
o The colored section:
  o Contains bytes to be sent by the sending TCP.
  o TCP may be able to send only part of this colored section.
  o This could be due to the slowness of the receiving process or to congestion in the network.
o After the bytes in the gray chambers are acknowledged, the chambers are recycled and available for use by the sending process.
Sending and receiving buffers
The circular buffer at the receiver is divided into two areas:
o The white area:
  o Contains empty chambers to be filled by bytes received from the network.
o The colored sections:
  o Contain received bytes that can be read by the receiving process.
o When a byte is read by the receiving process, the chamber is recycled and added to the pool of empty chambers.
Segments
o The IP layer, as a service provider for TCP, needs to send data in packets, not as a stream of bytes.
o At the transport layer, TCP groups a number of bytes together into a packet called a segment.
o TCP adds a header to each segment and delivers the segment to the IP layer for transmission.
o TCP offers full-duplex service, in which data can flow in both directions at the same time.
  o Each TCP then has a sending and receiving buffer, and segments move in both directions.
Connection-oriented Services
When a process at site A wants to send and receive data from another process at site B, the following occurs:
1. The two processes establish a connection between them (a virtual connection, not a physical connection).
2. Data are exchanged in both directions.
3. The connection is terminated.
Reliable Transport Protocol
 TCP is a reliable transport protocol.
It uses an acknowledgment mechanism to check the
safe arrival of data.
 Flow Control:
The receiver of the data controls the amount of data
that are to be sent by the sender.
 Error Control:
To provide reliable service, TCP implements an error
control mechanism.
 Congestion Control:
The amount of data sent by a sender is controlled by
the level of congestion in the network.
TCP Features
Numbering System:
 There are two fields called the sequence number and the
acknowledgment number.
 These two fields refer to the byte number and not the segment
number.
 TCP numbers all data bytes that are transmitted in a
connection.
 Numbering is independent in each direction.
 When TCP receives bytes of data from a process, it stores
them in the sending buffer and numbers them.
 The numbering does not necessarily start from 0.
 TCP generates a random number between 0 and 2^32 - 1 for
the number of the first byte.
 Byte numbering is used for flow and error control. For example:
if the random number happens to be 1057 and the total data to
be sent are 6000 bytes, the bytes are numbered from 1057 to
7056.
Notes
o The bytes of data being transferred in each
connection are numbered by TCP.
o The numbering starts with a randomly
generated number.
o After the bytes have been numbered, TCP
assigns a sequence number to each segment
that is being sent.
o The sequence number for each segment is
the number of the first byte carried in that
segment
Example:
Suppose a TCP connection is transferring a file of 5000 bytes. The first
byte is numbered 10,001. What are the sequence numbers for each
segment if data are sent in five segments, each carrying 1000 bytes?
The following shows the sequence number for each segment:
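The arithmetic for this example can be worked through in a few lines. This is a sketch that reads the first byte number as 10,001; the function name `segment_sequence_numbers` is made up for illustration.

```python
def segment_sequence_numbers(first_byte: int, total_bytes: int, segment_size: int):
    """Return (sequence number, last byte number) for each segment."""
    segments = []
    seq = first_byte
    remaining = total_bytes
    while remaining > 0:
        size = min(segment_size, remaining)
        # The sequence number is the number of the first byte in the segment.
        segments.append((seq, seq + size - 1))
        seq += size
        remaining -= size
    return segments


for i, (seq, last) in enumerate(segment_sequence_numbers(10001, 5000, 1000), start=1):
    print(f"Segment {i}: sequence number {seq} (bytes {seq} to {last})")
```

Each segment's sequence number is 1000 higher than the previous one, because each segment carries 1000 bytes.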
Acknowledgment Number
 Each party uses an acknowledgment number to confirm the
bytes it has received.
 The acknowledgment number defines the number of the next
byte that the party expects to receive.
 The acknowledgment number is cumulative, which means that
the party takes the number of the last byte that it has
received safely and adds 1 to it. This sum is the acknowledgment
number.
 For example: if the receiver of the segment has successfully received
byte number x from the other party, it defines x + 1 as the
acknowledgment number.
 The term cumulative here means that if a party uses 5643 as an
acknowledgment number, it has received all bytes from the
beginning up to 5642.
 Note that this does not mean that the party has received 5642
bytes because the first byte number does not have to start
from 0.
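A small sketch of how a cumulative acknowledgment number could be derived from the byte ranges received so far. The helper `cumulative_ack` and its arguments are hypothetical, and sequence-number wraparound is ignored.

```python
def cumulative_ack(next_expected: int, received: list[tuple[int, int]]) -> int:
    """Advance next_expected over contiguous (first, last) byte ranges."""
    for first, last in sorted(received):
        if first > next_expected:
            break  # gap: bytes before `first` are still missing
        next_expected = max(next_expected, last + 1)
    return next_expected


# All bytes from 1057 through 5642 have arrived, so the ACK number is 5643.
print(cumulative_ack(1057, [(1057, 2056), (2057, 5642)]))
```

If a range is missing in the middle, the acknowledgment number stops at the first missing byte, no matter how much later data has arrived.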
TCP Segment Format
 The segment consists of a 20- to 60-byte header, followed by data from
the application program.
 Source port address:
This 16-bit field defines the port
number of the application program in the host
that is sending the segment.
 Destination port address:
This 16-bit field defines the port
number of the application program in the host
that is receiving the segment.
TCP Segment Format
 Sequence number: This 32-bit field defines the
number assigned to the first byte of data
contained in this segment.
 Acknowledgment number: This 32-bit field
defines the number of the next byte a party
expects to receive.
 Header length: A 4-bit field that indicates the
number of 4-byte words in the TCP header. The
length of the header can be between 20 and 60
bytes, so the value of this field is between 5 and 15.
 Up to 40 bytes are for options. If there are no options, the
header is 20 bytes; otherwise it can be at most 60 bytes.
 Reserved: This is a 6-bit field reserved for
future use.
TCP Segment Format
 Control: This field defines 6 different
control bits or flags (URG, ACK, PSH, RST, SYN, and FIN). One or more of
these bits can be set at a time. These bits enable
flow control, connection establishment and
termination, connection abortion, and the
mode of data transfer in TCP.
TCP Segment Format
 Window size: Defines the size of the window, in
bytes, that the other party must maintain. The
length of this field is 16 bits, which means that
the maximum size of the window is 65,535 bytes.
This value is normally referred to as the
receiving window (rwnd) and is determined by the
receiver. The sender must obey the dictation of
the receiver in this case.
 Checksum: This 16-bit field contains the
checksum. The inclusion of the checksum for TCP
is mandatory.
 Options: There can be up to 40 bytes of optional
information in the TCP header.
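The field layout described on these slides can be illustrated by decoding the fixed 20-byte part of a header with Python's standard `struct` module. This is a sketch only: the function name `parse_tcp_header` and the dictionary keys are made up for illustration, options are not decoded, and the 6 reserved bits are simply ignored.

```python
import struct

# Flag names from the lowest control bit upward.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are skipped)."""
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_length = (offset_flags >> 12) * 4   # 4-bit field counts 4-byte words
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if offset_flags & (1 << bit)]
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "sequence_number": seq,
        "acknowledgment_number": ack,
        "header_length": header_length,
        "flags": flags,
        "window_size": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# Build a SYN header by hand: header length 5 words, only the SYN bit set.
raw = struct.pack("!HHIIHHHH", 1234, 80, 14534, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(raw))
```

The `!` in the format string selects network (big-endian) byte order, which is how the header fields are laid out on the wire.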
TCP Segment Format
 Urgent data :
 The data are presented from the application program to TCP as
a stream of bytes.
 Each byte of data has a position in the stream.
 On occasion, an application program needs to send urgent bytes.
 This means that the sending application program wants a piece
of data to be read out of order by the receiving application
program.
 The sending TCP creates a segment and inserts the urgent data
at the beginning of the segment.
 The rest of the segment can contain normal data from the
buffer
 The urgent pointer field in the header defines the end of the
urgent data and the start of normal data.
TCP Segment Format
Urgent pointer:
 This 16-bit field, which is valid only if the urgent flag is
set, is used when the segment contains urgent data.
 It defines the number that must be added to the
sequence number to obtain the number of the last urgent
byte in the data section of the segment.
 When the receiving TCP receives a segment with the URG bit
set, it extracts the urgent data from the segment using
the value of the urgent pointer, and delivers them, out of
order, to the receiving application program.
TCP Connection
 A connection-oriented transport protocol
establishes a virtual path between the source
and destination.
 In TCP, connection-oriented transmission
requires three phases:
1. connection establishment
2. data transfer
3. connection termination
Connection Establishment
 TCP transmits data in full-duplex mode. When two TCPs in two
machines are connected, they are able to send segments to each
other simultaneously.
 Each party must initialize communication and get approval from
the other party before any data are transferred.
 The connection establishment in TCP is called three-way
handshaking.
Example: Client-server communication using TCP as the transport
layer protocol.
1. The server issues a request for a passive open: The server
program tells its TCP that it is ready to accept a connection.
2. The client program issues a request for an active open: A
client that wishes to connect to an open server tells its TCP
that it needs to be connected to that particular server. TCP can
now start the three-way handshaking process.
Three-way handshaking process
1. The client sends the first segment, a SYN segment, in which only
the SYN flag is set.
 This segment is for synchronization of sequence numbers. It consumes one
sequence number.
 When the data transfer starts, the sequence number is incremented by 1.
 The SYN segment carries no real data
2. The server sends the second segment, a SYN+ACK segment, with 2
flag bits set: SYN and ACK.
 This segment has a dual purpose. It is a SYN segment for communication in the
other direction and serves as the acknowledgment for the SYN segment.
 It consumes one sequence number.
3. The client sends the third segment. This is just an ACK segment.
 It acknowledges the receipt of the second segment with the ACK flag and
acknowledgment number field.
 The sequence number in this segment is one more than the one in the SYN
segment, because the SYN consumed one sequence number.
 The ACK segment does not consume any sequence numbers.
Connection establishment using three-way
handshaking
Data transfer
 After connection is established, bidirectional data transfer can
take place. The client and server can both send data and
acknowledgments.
 The acknowledgment is piggybacked with the data.
Example:
 After connection is established, the client sends 2000 bytes of
data in two segments.
 The server then sends 2000 bytes in one segment.
 The client sends one more acknowledgment segment.
 The first three segments carry both data and acknowledgment,
but the last segment carries only an acknowledgment because
there are no more data to be sent.
Data transfer
 A TCP opens a connection using an initial sequence number (ISN) of
14534. The other party opens the connection with an ISN of 21732. Show
the three TCP segments during the connection establishment.
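One way to work this exercise is sketched below. It assumes the RFC 793 convention that a SYN consumes one sequence number, so the final ACK carries the client's ISN + 1 as its sequence number. The function name `three_way_handshake` and the tuple layout are illustrative only.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake segments as (flags, seq, ack) tuples."""
    syn = ("SYN", client_isn, None)                     # no ACK number yet
    syn_ack = ("SYN+ACK", server_isn, client_isn + 1)   # acknowledges the SYN
    ack = ("ACK", client_isn + 1, server_isn + 1)       # acknowledges the SYN+ACK
    return [syn, syn_ack, ack]


for flags, seq, ack in three_way_handshake(14534, 21732):
    print(f"{flags}: seq={seq}, ack={ack}")
```

With these ISNs the segments are SYN (seq 14534), SYN+ACK (seq 21732, ack 14535), and ACK (seq 14535, ack 21733).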
Connection Termination Using
Three-way Handshaking
Either of the two parties involved in exchanging data (client or server) can close the
connection using three-way handshaking.
Three-way Handshaking steps
1. The client TCP, after receiving a close command from the client process,
sends the first segment, a FIN segment in which the FIN flag is set.
 A FIN segment can include the last chunk of data sent by the client, or it
can be just a control segment.
 If it is only a control segment, it consumes only one sequence number.
2. The server TCP, after receiving the FIN segment, sends the second segment,
a FIN+ACK segment, to confirm the receipt of the FIN segment from the
client and at the same time to announce the closing of the connection in the
other direction.
 This segment can also contain the last chunk of data from the server.
 If it does not carry data, it consumes only one sequence number.
3. The client TCP sends the last segment, an ACK segment, to confirm the
receipt of the FIN segment from the server TCP.
 This segment contains the acknowledgment number, which is 1 plus the
sequence number received in the FIN segment from the server.
 This segment cannot carry data and consumes no sequence numbers.
NOTE
 The FIN segment consumes one sequence
number if it does not carry data.
 The FIN+ACK segment consumes one
sequence number if it does not carry data.
 No retransmission timer is set for an ACK
segment.
 Data may arrive out of order and be
temporarily stored by the receiving TCP, but
TCP guarantees that no out-of-order segment
is delivered to the process.
TCP seq. #’s and ACKs
Seq. #’s:
 byte stream “number” of first byte in segment’s data
ACKs:
 seq # of next byte expected from other side
 cumulative ACK
Q: how receiver handles out-of-order segments
 A: TCP spec doesn’t say, up to implementer
[Figure: simple telnet scenario. User at Host A types ‘C’: A sends Seq=42, ACK=79, data=‘C’. Host B ACKs receipt of ‘C’ and echoes back ‘C’: Seq=79, ACK=43, data=‘C’. Host A ACKs receipt of echoed ‘C’: Seq=43, ACK=80.]


TCP Round Trip Time and Timeout
Q: how to set TCP timeout value?
 longer than RTT
   but RTT varies
 too short: premature timeout
   unnecessary retransmissions
 too long: slow reaction to segment loss
Q: how to estimate RTT?
 SampleRTT: measured time from segment transmission until ACK receipt
   ignore retransmissions
 SampleRTT will vary, want estimated RTT “smoother”
   average several recent measurements, not just current SampleRTT


TCP Round Trip Time and Timeout
EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT
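The moving average above can be sketched in code. The slides do not give a value for α or a formula for the timeout, so the following borrows the commonly recommended RFC 6298 choices as assumptions: α = 0.125, a deviation estimate DevRTT with β = 0.25, and TimeoutInterval = EstimatedRTT + 4 * DevRTT. The class name `RttEstimator` is hypothetical.

```python
class RttEstimator:
    """Exponentially weighted moving average of RTT samples (times in ms)."""

    def __init__(self, first_sample: float, alpha: float = 0.125,
                 beta: float = 0.25) -> None:
        self.alpha = alpha                 # weight of the newest sample (assumed)
        self.beta = beta                   # weight for the deviation (assumed)
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2    # initial deviation, as in RFC 6298

    def update(self, sample_rtt: float) -> None:
        # Deviation is updated first, using the previous EstimatedRTT.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self) -> float:
        # EstimatedRTT plus a safety margin that grows when samples vary.
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(100.0)
est.update(200.0)
print(est.estimated_rtt, est.timeout_interval())
```

A single outlier moves EstimatedRTT only one eighth of the way toward the new sample, which is what makes the estimate smoother than SampleRTT.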

Example RTT estimation:
[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; two series: SampleRTT and EstimatedRTT.]


TCP sender events:
o data rcvd from app:
  o create segment with seq #
  o seq # is byte-stream number of first data byte in segment
  o start timer if not already running (think of timer as for oldest
    unACKed segment)
  o expiration interval: TimeOutInterval
o timeout:
  o retransmit segment that caused timeout
  o restart timer
o ACK rcvd:
  o if acknowledges previously unACKed segments
    o update what is known to be ACKed
    o start timer if there are outstanding segments
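The three sender events can be sketched as a toy model. This is a simplification under stated assumptions: the timer is a boolean flag rather than a real clock, sequence-number wraparound is ignored, and the class `SimpleTcpSender` with its attribute names is hypothetical.

```python
class SimpleTcpSender:
    """Toy model of the three sender events; timers are flags, not clocks."""

    def __init__(self, initial_seq: int) -> None:
        self.send_base = initial_seq   # oldest unACKed byte
        self.next_seq = initial_seq    # seq # for the next new byte
        self.unacked = {}              # seq # -> payload not yet ACKed
        self.timer_running = False
        self.sent = []                 # log of (seq #, payload) transmissions

    def on_app_data(self, payload: bytes) -> None:
        # Create a segment whose seq # is the number of its first data byte.
        self.unacked[self.next_seq] = payload
        self.sent.append((self.next_seq, payload))
        self.next_seq += len(payload)
        if not self.timer_running:
            self.timer_running = True

    def on_timeout(self) -> None:
        # Retransmit the oldest unACKed segment and restart the timer.
        if self.unacked:
            oldest = min(self.unacked)
            self.sent.append((oldest, self.unacked[oldest]))
        self.timer_running = bool(self.unacked)

    def on_ack(self, ack_num: int) -> None:
        # Only an ACK that covers previously unACKed bytes changes state.
        if ack_num > self.send_base:
            self.send_base = ack_num
            done = [s for s in self.unacked
                    if s + len(self.unacked[s]) <= ack_num]
            for seq in done:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sender = SimpleTcpSender(92)
sender.on_app_data(b"x" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)   # Seq=100, 20 bytes
sender.on_ack(120)              # cumulative ACK covers both segments
print(sender.send_base, sender.timer_running)
```

Because ACKs are cumulative, a single ACK=120 clears both outstanding segments and stops the timer.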


TCP: retransmission scenarios
[Figure, left: lost ACK scenario. Host A sends Seq=92, 8 bytes data; Host B's ACK=100 is lost (X); A's timer expires and it resends Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.]
[Figure, right: premature timeout. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; Host B replies ACK=100 and ACK=120, but the Seq=92 timer expires first, so A resends Seq=92, 8 bytes data; B replies ACK=120. SendBase moves from 100 to 120.]
TCP retransmission scenarios (more)
[Figure: cumulative ACK scenario. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; Host B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted; SendBase = 120.]


TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP Receiver action: Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

Event at Receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP Receiver action: Immediately send single cumulative ACK, ACKing both in-order segments

Event at Receiver: Arrival of out-of-order segment higher-than-expected seq. #. Gap detected
TCP Receiver action: Immediately send duplicate ACK, indicating seq. # of next expected byte

Event at Receiver: Arrival of segment that partially or completely fills gap
TCP Receiver action: Immediately send ACK, provided that segment starts at lower end of gap
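The four rows of the ACK generation table can be read as a small decision function. This is a sketch, not the RFC text: the function name, its parameters, and the returned strings are invented for illustration, and the branch for a segment below the expected sequence number is an assumption, since the table does not cover that case.

```python
def receiver_ack_action(seg_seq: int, expected_seq: int,
                        ack_pending: bool, gap_buffered: bool) -> str:
    """Pick the receiver action for an arriving segment.

    ack_pending:  an earlier in-order segment is still waiting for its ACK.
    gap_buffered: out-of-order data above expected_seq is already buffered.
    """
    if seg_seq > expected_seq:
        return "duplicate ACK now"      # gap detected
    if seg_seq < expected_seq:
        return "duplicate ACK now"      # assumption: not covered by the table
    if gap_buffered:
        return "ACK now"                # segment starts at lower end of gap
    if ack_pending:
        return "cumulative ACK now"     # ACKs both in-order segments
    return "delayed ACK"                # wait up to 500 ms for next segment


print(receiver_ack_action(100, 100, ack_pending=False, gap_buffered=False))
```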


Fast Retransmit
o time-out period often relatively long:
  o long delay before resending lost packet
o detect lost segments via duplicate ACKs.
  o sender often sends many segments back-to-back
  o if segment is lost, there will likely be many duplicate ACKs for that
    segment
o If sender receives 3 ACKs for same data, it assumes that segment after
  ACKed data was lost:
  o fast retransmit: resend segment before timer expires
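Duplicate-ACK counting can be sketched as below. The sketch reads "3 ACKs for same data" as three duplicates arriving after the original ACK, matching the "triple duplicate ACKs" label used for fast retransmit; the function name `fast_retransmit_points` is hypothetical.

```python
def fast_retransmit_points(acks, dup_threshold: int = 3):
    """Return the ACK numbers at which a fast retransmit would fire."""
    last_ack = None
    dup_count = 0
    fired = []
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                # The segment starting at `ack` is presumed lost: resend it
                # now instead of waiting for the timer.
                fired.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return fired


# One original ACK followed by three duplicates triggers one retransmit.
print(fast_retransmit_points([100, 100, 100, 100]))
```

Further duplicates for the same ACK number do not fire again, because the counter passes the threshold only once.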


[Figure: fast retransmit. Host A sends seq # x1 through x5; seq # x2 is lost (X). Host B replies ACK x1 to x1 and again to each of x3, x4, and x5. After the triple duplicate ACKs, A resends seq # x2 before the timeout expires.]


Chapter 3 outline
o 3.1 Transport-layer services
o 3.2 Multiplexing and demultiplexing
o 3.3 Connectionless transport: UDP
o 3.4 Principles of reliable data transfer
o 3.5 Connection-oriented transport: TCP
  o segment structure
  o reliable data transfer
  o flow control
  o connection management
o 3.6 Principles of congestion control
o 3.7 TCP congestion control


TCP Flow Control
o receive side of TCP connection has a receive buffer:
[Figure: IP datagrams arrive into the receive buffer, which holds TCP data (in buffer) and (currently) unused buffer space; the application process reads from the buffer.]
o app process may be slow at reading from buffer
flow control: sender won’t overflow receiver’s buffer by transmitting too
much, too fast
o speed-matching service: matching send rate to receiving application’s
  drain rate
TCP Flow control: how it works
[Figure: receive buffer of size RcvBuffer, holding TCP data (in buffer) and (currently) unused buffer space of size rwnd; IP datagrams enter, the application process reads.]
o (suppose TCP receiver discards out-of-order segments)
o unused buffer space:
  = rwnd
  = RcvBuffer - [LastByteRcvd - LastByteRead]
o receiver: advertises unused buffer space by including rwnd value in
  segment header
o sender: limits # of unACKed bytes to rwnd
  o guarantees receiver’s buffer doesn’t overflow
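The flow-control bookkeeping reduces to two small formulas, sketched here. The function names are illustrative, the numbers in the example are made up, and sequence-number wraparound is ignored.

```python
def advertised_window(rcv_buffer: int, last_byte_rcvd: int,
                      last_byte_read: int) -> int:
    """rwnd = RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent: int, last_byte_acked: int,
                    rwnd: int, nbytes: int) -> bool:
    """True if sending nbytes more keeps the unACKed bytes within rwnd."""
    return (last_byte_sent - last_byte_acked) + nbytes <= rwnd


# 4096-byte buffer with 2000 bytes received but not yet read by the app.
rwnd = advertised_window(4096, last_byte_rcvd=3000, last_byte_read=1000)
print(rwnd, sender_may_send(5000, 4000, rwnd, 1000))
```

As the application reads more data, LastByteRead grows and the advertised window opens up again.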




TCP Connection Management
o Recall: TCP sender, receiver establish “connection” before exchanging
  data segments
o initialize TCP variables:
  o seq. #s
  o buffers, flow control info (e.g. RcvWindow)
o client: connection initiator
  Socket clientSocket = new Socket("hostname","port number");
o server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();
o Three way handshake:
o Step 1: client host sends TCP SYN segment to server
  o specifies initial seq #
  o no data
o Step 2: server host receives SYN, replies with SYNACK segment
  o server allocates buffers
  o specifies server initial seq. #
o Step 3: client receives SYNACK, replies with ACK segment, which may
  contain data


TCP Connection Management (cont.)
Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: client/server timeline. Client: close, sends FIN. Server: replies ACK, then close, sends FIN. Client: replies ACK, timed wait, closed.]


TCP Connection Management (cont.)
o Step 3: client receives FIN, replies with ACK.
  o Enters “timed wait” - will respond with ACK to received FINs
o Step 4: server, receives ACK. Connection closed.
o Note: with small modification, can handle simultaneous FINs.
[Figure: client/server timeline. Client: closing, sends FIN. Server: replies ACK, then closing, sends FIN. Client: replies ACK, timed wait, closed. Server: closed.]


TCP Connection Management (cont)
[Figures: TCP server lifecycle and TCP client lifecycle state diagrams.]




Principles of Congestion Control

o Congestion:
o informally: “too many sources sending too much
data too fast for network to handle”
o different from flow control!
o manifestations:
o lost packets (buffer overflow at routers)
o long delays (queueing in router buffers)
o a top-10 problem!



Causes/costs of congestion: scenario 1
 two senders, two receivers
 one router, infinite buffers
 no retransmission
[Figure: Host A and Host B each send λin (original data) into one router with unlimited shared output link buffers; the output rate is λout.]
 large delays when congested
 maximum achievable throughput
Causes/costs of congestion: scenario 2
o one router, finite buffers
o sender retransmission of lost packet
[Figure: Host A and Host B send into one router with finite shared output link buffers. λin: original data; λ'in: original data, plus retransmitted data; output rate λout.]


Causes/costs of congestion: scenario 2
o always: λin = λout (goodput)
o “perfect” retransmission only when loss: λ'in > λout
o retransmission of delayed (not lost) packet makes λ'in larger (than
  perfect case) for same λout
[Figure: three panels a., b., c. plotting λout against the input rate, with both axes running to R/2; panel b. also marks R/3 and panel c. marks R/4 on the λout axis.]
o “costs” of congestion:
  o more work (retrans) for given “goodput”
  o unneeded retransmissions: link carries multiple copies of pkt


Causes/costs of congestion: scenario 3
o four senders
o multihop paths
o timeout/retransmit
Q: what happens as λin and λ'in increase?
[Figure: Host A and Host B among four senders on multihop paths through routers with finite shared output link buffers. λin: original data; λ'in: original data, plus retransmitted data; output rate λout.]


Causes/costs of congestion: scenario 3
[Figure: Host A to Host B output rate λout.]
o another “cost” of congestion:
  o when packet dropped, any “upstream” transmission capacity used
    for that packet was wasted!


Approaches towards congestion control
two broad approaches towards congestion control:
o end-end congestion control:
  o no explicit feedback from network
  o congestion inferred from end-system observed loss, delay
  o approach taken by TCP
o network-assisted congestion control:
  o routers provide feedback to end systems
    o single bit indicating congestion
    o explicit rate sender should send at




o The approach taken by TCP is to have each
sender limit the rate at which it sends
traffic into its connection as a function of
perceived network congestion.
o If a TCP sender perceives that there is
little congestion on the path between itself
and the destination, then the TCP sender
increases its send rate;
o if the sender perceives that there is
congestion along the path, then the sender
reduces its send rate.
o But this approach raises three questions.
o First, how does a TCP sender limit the rate at
which it sends traffic into its connection?
o Second, how does a TCP sender perceive that
there is congestion on the path between itself
and the destination?
o Third, what algorithm should the sender use to
change its send rate as a function of perceived
end-to-end congestion?


o Each side of a TCP connection consists of a
  o receive buffer,
  o send buffer,
  o and several variables (LastByteRead, rwnd, and so on).
o The congestion window, cwnd, imposes a constraint on the rate at which a
TCP sender can send traffic into the network:
LastByteSent − LastByteAcked ≤ min{cwnd, rwnd}
o At the beginning of every RTT, the constraint
permits the sender to send cwnd
bytes of data into the connection; at the end of
the RTT the sender receives acknowledgments for
the data.


o Thus the sender’s send rate is roughly
cwnd/RTT bytes/sec.
o By adjusting the value of cwnd, the sender
can therefore adjust the rate at which it
sends data into its connection.
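The window constraint and the resulting rate can be written out directly. This is a sketch with made-up numbers; the function names are illustrative and not part of any TCP API.

```python
def send_rate_bytes_per_sec(cwnd_bytes: float, rtt_seconds: float) -> float:
    """Rough send rate: one window of cwnd bytes per round-trip time."""
    return cwnd_bytes / rtt_seconds


def bytes_allowed_now(last_byte_sent: int, last_byte_acked: int,
                      cwnd: int, rwnd: int) -> int:
    """How many more bytes may be sent without violating
    LastByteSent - LastByteAcked <= min(cwnd, rwnd)."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, min(cwnd, rwnd) - in_flight)


# cwnd = 500 bytes and RTT = 200 ms give about 2500 bytes/sec (20 kbps).
print(send_rate_bytes_per_sec(500, 0.2))
print(bytes_allowed_now(3000, 2000, cwnd=4000, rwnd=1500))
```

Whichever of cwnd and rwnd is smaller is the binding limit, so a generous congestion window does not help if the receiver advertises a small window.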


o When there is excessive congestion, then
one (or more) router buffers along the
path overflows, causing a datagram
(containing a TCP segment) to be dropped.
o The dropped datagram, in turn, results in a
loss event at the sender (either a timeout
or the receipt of three duplicate ACKs),
which is taken by the sender to be an
indication of congestion on the
sender-to-receiver path.


How then do the TCP senders determine their
sending rates such that they don’t congest the
network but at the same time make use of all the
available bandwidth?
 A lost segment implies congestion, and
hence, the TCP sender’s rate should be
decreased when a segment is lost.
 An acknowledged segment indicates that
the network is delivering the sender’s
segments to the receiver, and hence, the
sender’s rate can be increased when an
ACK arrives for a previously
unacknowledged segment.
TCP congestion control:
o Goal: TCP sender should transmit as fast as possible, but without congesting network
o Q: how to find rate just below congestion level
o Decentralized: each TCP sender sets its own rate, based on implicit feedback:
o ACK: segment received (a good thing!), network not congested, so increase sending rate
o lost segment: assume loss due to congested network, so decrease sending rate



TCP congestion control: bandwidth probing
o “probing for bandwidth”: increase transmission rate on receipt of
ACK, until loss occurs, then decrease transmission rate
  o continue to increase on ACK, decrease on loss (since available bandwidth is
    changing, depending on other connections in network)
[Figure: TCP’s “sawtooth” behavior. Sending rate versus time: the rate rises while ACKs are being received (so increase rate) and drops at each loss, marked X (so decrease rate).]
o Q: how fast to increase/decrease?
  o details to follow
TCP Congestion Control: details
o sender limits rate by limiting number of unACKed bytes “in pipeline”:
  LastByteSent - LastByteAcked ≤ cwnd
  o cwnd: differs from rwnd (how, why?)
  o sender limited by min(cwnd,rwnd)
o roughly, rate = cwnd / RTT bytes/sec
o cwnd is dynamic, function of perceived network congestion
[Figure: sender transmits cwnd bytes, then receives the ACK(s) one RTT later.]


TCP Congestion Control: more details
segment loss event: reducing cwnd
o timeout: no response from receiver
  o cut cwnd to 1
o 3 duplicate ACKs: at least some segments getting through (recall
  fast retransmit)
  o cut cwnd in half, less aggressively than on timeout
ACK received: increase cwnd
o slowstart phase:
  o increase exponentially fast (despite name) at connection start, or
    following timeout
o congestion avoidance:
  o increase linearly


TCP Slow Start
o when connection begins, cwnd = 1 MSS
  o example: MSS = 500 bytes & RTT = 200 msec
  o initial rate = 20 kbps
o available bandwidth may be >> MSS/RTT
  o desirable to quickly ramp up to respectable rate
o increase rate exponentially until first loss event or when threshold
  reached
  o double cwnd every RTT
  o done by incrementing cwnd by 1 MSS for every ACK received
[Figure: Host A sends one segment in the first RTT, two segments in the second, and four segments in the third; Host B ACKs each.]
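The doubling behavior follows from the per-ACK increment, which the sketch below simulates. It assumes no loss, one ACK per segment, and a hypothetical `ssthresh` cap; the function name `slow_start` is illustrative.

```python
def slow_start(mss: int, ssthresh: int, rounds: int) -> list[int]:
    """Return cwnd (bytes) at the start and after each RTT round."""
    cwnd = mss
    history = [cwnd]
    for _round in range(rounds):
        acks_this_round = cwnd // mss    # one ACK per segment sent this RTT
        for _ack in range(acks_this_round):
            if cwnd >= ssthresh:
                break                    # threshold reached: stop doubling
            cwnd += mss                  # +1 MSS per ACK doubles cwnd per RTT
        history.append(cwnd)
    return history


# MSS = 500 bytes as in the example; ssthresh = 4000 bytes is made up.
print(slow_start(500, 4000, 4))
```

With MSS = 500 bytes and RTT = 200 msec, the first window gives 500 * 8 / 0.2 = 20,000 bits per second, the 20 kbps initial rate quoted in the example.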


o In this version, if a time-out occurs, TCP moves to
the slow-start state (or starts a new round if it
is already in this state);
o on the other hand, if three duplicate ACKs arrive,
TCP moves to the fast-recovery state and
remains there as long as more duplicate ACKs
arrive.
o The fast-recovery state is a state somewhere
between the slow-start and the
congestion-avoidance states.
  o It behaves like the slow start, in which the cwnd grows
    exponentially, but the cwnd starts with the value of
    ssthresh plus 3 MSS (instead of 1).
o When TCP enters the fast-recovery state,
three major events may occur.
o If duplicate ACKs continue to arrive, TCP stays in this
state, but the cwnd grows exponentially.
o If a time-out occurs, TCP assumes that there is real
congestion in the network and moves to the slow-start
state.
o If a new (nonduplicate) ACK arrives, TCP moves to the
congestion-avoidance state, but deflates the size of the
cwnd to the ssthresh value, as though the three
duplicate ACKs have not occurred, and transition is from
the slow-start state to the congestion-avoidance state.

TCP: congestion avoidance
o when cwnd > ssthresh grow cwnd linearly
  o increase cwnd by 1 MSS per RTT
  o approach possible congestion slower than in slowstart
  o implementation: cwnd = cwnd + MSS/cwnd for each ACK received
AIMD
o ACKs: increase cwnd by 1 MSS per RTT: additive increase
o loss: cut cwnd in half (non-timeout-detected loss): multiplicative
  decrease
AIMD: Additive Increase Multiplicative Decrease
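A sketch of the two AIMD rules, with cwnd kept in bytes. In that unit the per-ACK increase is MSS * (MSS / cwnd); the slide's shorthand cwnd + MSS/cwnd corresponds to measuring cwnd in units of MSS. The function names and the floor of one MSS after halving are assumptions made for illustration.

```python
def on_ack_congestion_avoidance(cwnd: float, mss: int) -> float:
    """Additive increase: about 1 MSS per RTT, spread over that RTT's ACKs."""
    return cwnd + mss * (mss / cwnd)


def on_triple_duplicate_ack(cwnd: float, mss: int) -> float:
    """Multiplicative decrease: halve cwnd, never below 1 MSS (assumed floor)."""
    return max(cwnd / 2, mss)


mss = 1000
cwnd = 10 * mss
for _ack in range(10):                  # one RTT's worth of ACKs
    cwnd = on_ack_congestion_avoidance(cwnd, mss)
print(round(cwnd))                      # close to 11 * MSS
print(on_triple_duplicate_ack(cwnd, mss))
```

The growth over one RTT comes out slightly under one full MSS because cwnd is already growing while the ACKs arrive.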
