FAST TCP:
Motivation, Architecture, Algorithms, Performance
David X. Wei Cheng Jin Steven H. Low Sanjay Hegde
Engineering & Applied Science, Caltech
http://netlab.caltech.edu
Abstract— We describe FAST TCP, a new TCP congestion control algorithm for high-speed long-latency networks, from design to implementation. We highlight the approach taken by FAST TCP to address the four difficulties, at both packet and flow levels, which the current TCP implementation has at large windows. We describe the architecture and summarize some of the algorithms implemented in our prototype. We characterize the equilibrium and stability properties of FAST TCP. We provide experimental evaluation of our first prototype in terms of throughput, fairness, stability, and responsiveness.

To appear in IEEE/ACM Transactions on Networking, 2007. An abridged version appears in [25] and an expanded version in [24].

I. INTRODUCTION AND SUMMARY

Congestion control is a distributed algorithm to share network resources among competing users. It is important in situations where the availability of resources and the set of competing users vary over time unpredictably, yet efficient and fair sharing is desired. These constraints – unpredictable supply and demand and the desire for efficient distributed operation – necessarily lead to feedback control as the preferred approach, where traffic sources dynamically adapt their rates to congestion in their paths. On the Internet, this is performed by the Transmission Control Protocol (TCP) in the source and destination computers involved in data transfers.

The congestion control algorithm in the current TCP, which we refer to as Reno in this paper, was developed in 1988 [20] and has gone through several enhancements since, e.g., [21], [58], [47], [18], [1], [14]. It has performed remarkably well and is generally believed to have prevented severe congestion as the Internet scaled up by six orders of magnitude in size, speed, load, and connectivity. It is also well known, however, that as bandwidth-delay product continues to grow, TCP Reno will eventually become a performance bottleneck itself. The following four difficulties contribute to the poor performance of TCP Reno in networks with large bandwidth-delay products:

1) At the packet level, linear increase by one packet per Round-Trip Time (RTT) is too slow, and multiplicative decrease per loss event is too drastic.
2) At the flow level, maintaining large average congestion windows requires an extremely small equilibrium loss probability.
3) At the packet level, oscillation in the congestion window is unavoidable because TCP uses a binary congestion signal (packet loss).
4) At the flow level, the dynamics is unstable, leading to severe oscillations that can only be reduced by the accurate estimation of packet loss probability and a stable design of the flow dynamics.

We explain these difficulties in detail in Section II, and motivate a delay-based solution. Delay-based congestion control has been proposed, e.g., in [23], [69], [3], [72], [12]. See [5], [68], [28], [27], [56], [75], [36], [4] for other recent proposals.

Using queueing delay as a congestion measure has two advantages. First, queueing delay can be more accurately estimated than loss probability, both because packet losses in networks with large bandwidth-delay product are rare events under TCP Reno (e.g., probability on the order of 10⁻⁷ or smaller), and because loss samples provide coarser information than queueing delay samples. Indeed, measurements of delay are noisy, just as those of loss probability. Each measurement of packet loss (whether a packet is lost) provides one bit of information for the filtering of noise, whereas each measurement of queueing delay provides multi-bit information. This makes it easier for an equation-based implementation to stabilize a network into a steady state with a target fairness and high utilization. Second, based on the commonly used ordinary differential equation model of TCP/AQM, the dynamics of queueing delay has the right scaling with respect to network capacity. This helps maintain stability as a network scales up in capacity [51], [8], [53].

In Section III, we lay out an architecture to implement our design, and present an overview of some of the algorithms implemented in our current prototype. Even though the discussion is in the context of FAST TCP, the architecture can also serve as a general framework to guide the design of other congestion control mechanisms, not necessarily limited to TCP, for high-speed networks. The main components in the architecture can be designed separately and upgraded asynchronously.

We evaluate FAST TCP both analytically and experimentally. In Section III-B, we present a mathematical model of the window control algorithm. We prove that FAST TCP has the same equilibrium properties as TCP Vegas [50], [44]. In particular, it does not penalize flows with large propagation delays, and it achieves weighted proportional fairness [31]. For the special case of a single bottleneck link with heterogeneous flows, we prove that the window control algorithm of FAST is locally asymptotically stable in the absence of feedback delay.

In Section IV, we present both experimental and simulation results to illustrate the throughput, fairness, stability, and responsiveness of FAST TCP, in the presence of delay and in heterogeneous and dynamic environments where flows of different delays join and depart asynchronously. It is important to evaluate a congestion control algorithm not only in terms of the throughput achieved, but also in terms of what it does to network queues and how that affects other applications sharing the same queue. We compare the performance of FAST TCP with Reno, HSTCP (HighSpeed TCP [15]), STCP (Scalable TCP [32]), and BIC TCP [75], using their default parameters.
In Section V, we summarize open issues and provide references for proposed solutions.

II. MOTIVATIONS

A congestion control algorithm can be designed at two levels. The flow-level (macroscopic) design aims to achieve high utilization, low queueing delay and loss, fairness, and stability. The packet-level design implements these flow-level goals within the constraints imposed by end-to-end control. Historically for TCP Reno, the packet-level implementation was introduced first. The resulting flow-level properties, such as fairness, stability, and the relationship between equilibrium window and loss probability, were then understood as an afterthought. In contrast, the packet-level designs of HSTCP [15], STCP [32], and FAST TCP are explicitly guided by flow-level goals.

A. Packet and flow level modeling

The congestion avoidance algorithm of TCP Reno and its variants has the form of AIMD [20]. The pseudo code for window adjustment is:

    Ack:  w ← w + 1/w
    Loss: w ← w − (1/2)w

This is a packet-level model, but it induces certain flow-level properties such as throughput, fairness, and stability. These properties can be understood with a flow-level model of the AIMD algorithm, e.g., [29], [19], [39], [41]. The window wi(t) of source i increases by 1 packet per RTT,¹ and decreases per unit time by

    (1/2) · xi(t)qi(t) · (4/3)wi(t)   packets

where xi(t) := wi(t)/Ti(t) pkts/sec, Ti(t) is the round-trip time, and qi(t) is the (delayed) end-to-end loss probability, in period t.² Here, 4wi(t)/3 is the peak window size that gives the "average" window of wi(t). Hence, a flow-level model of AIMD is:

    ẇi(t) = 1/Ti(t) − (2/3) · xi(t)qi(t)wi(t)                          (1)

Setting ẇi(t) = 0 in (1) yields the well-known 1/√q formula for TCP Reno discovered in [48], [37], which relates loss probability to window size in equilibrium:

    qi* = 3/(2wi*²)                                                    (2)

In summary, (1) and (2) describe the flow-level dynamics and equilibrium, respectively, for TCP Reno.

Even though Reno, HSTCP, STCP, and FAST look different at the packet level, they have similar equilibrium and dynamic structures at the flow level; see [24] for detailed derivations. The congestion windows in these algorithms all evolve according to:

    ẇi(t) = κi(t) · (1 − qi(t)/ui(t))                                  (3)

where κi(t) := κi(wi(t), Ti(t)) and ui(t) := ui(wi(t), Ti(t)). They differ only in the choice of the gain function κi(wi, Ti), the marginal utility function ui(wi, Ti), and the end-to-end congestion measure qi. Within this structure, at the flow level, there are thus only three design decisions:

• κi(wi, Ti): the choice of the gain function κi determines dynamic properties such as stability and responsiveness, but does not affect the equilibrium properties.
• ui(wi, Ti): the choice of the marginal utility function ui determines equilibrium properties such as the equilibrium rate allocation and its fairness.
• qi: in the absence of explicit feedback, the choice of congestion measure qi is limited to loss probability or queueing delay. The dynamics of qi(t) is determined inside the network.

At the flow level, a goal is to design a class of function pairs, ui(wi, Ti) and κi(wi, Ti), so that the feedback system described by (3), together with the link dynamics of qi(t) and the interconnection, has an equilibrium that is fair and efficient, and so that the equilibrium is stable in the presence of feedback delay. The design choices in FAST, Reno, HSTCP, and STCP are shown in Table I. These choices produce the equilibrium characterizations shown in Table II.

             κi(wi, Ti)                            ui(wi, Ti)      qi
    FAST     γαi/τ                                 αi/xi           queueing delay
    Reno     1/Ti                                  1.5/wi²         loss probability
    HSTCP    0.16 b(wi)wi^0.80 / ((2 − b(wi))Ti)   0.08/wi^1.20    loss probability
    STCP     a·wi/Ti                               ρ/wi            loss probability

    TABLE I: Common dynamic structure: wi is source i's window size, Ti is its round-trip time, qi is the congestion measure, xi = wi/Ti; a, b(wi), ρ, γ, αi, τ are protocol parameters; see [24].

    FAST     xi = αi/qi
    Reno     xi = (1/Ti) · (αi/qi^0.50)
    HSTCP    xi = (1/Ti) · (αi/qi^0.84)
    STCP     xi = (1/Ti) · (αi/qi)

    TABLE II: Common equilibrium structure: xi is source i's throughput in packets/sec, Ti is the equilibrium round-trip time, qi is the end-to-end congestion measure in equilibrium. The parameters are: α = 1.225 for Reno, α = 0.120 for HSTCP, and α = 0.075 for STCP. For FAST, αi should vary with link capacity.

We next illustrate the equilibrium and dynamics problems of TCP Reno, at both the packet and flow levels, as bandwidth-delay product increases.

B. Reno's problems at large window

The equilibrium problem at the flow level is expressed in (2): the end-to-end loss probability must be exceedingly small to sustain a large window size, making the equilibrium difficult to maintain in practice as bandwidth-delay product increases.

¹ It should be (1 − qi(t)) packets, where qi(t) is the end-to-end loss probability. This is roughly 1 when qi(t) is small.
² This model assumes that the window is halved on each packet loss. It can be modified to model the case where the window is halved at most once in each RTT. This does not qualitatively change the following discussion.
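To see how demanding (2) becomes, here is a small Python sketch (ours, with illustrative link numbers, not from the paper) that evaluates the loss probability Reno needs to sustain a given window:

```python
# Evaluating equation (2): q* = 3 / (2 w*^2) is the equilibrium loss
# probability TCP Reno needs to sustain a window of w* packets.
# The 10 Gbps / 100 ms / 1500-byte figures below are illustrative.

def reno_equilibrium_loss(w_star: float) -> float:
    return 3.0 / (2.0 * w_star ** 2)

# Window needed to fill a 10 Gbps path with 100 ms RTT, 1500-byte packets:
w_star = 10e9 * 0.100 / (1500 * 8)            # ~83,000 packets
q_star = reno_equilibrium_loss(w_star)
print(f"required loss probability: {q_star:.1e}")      # ~2e-10
print(f"packets between losses:    {1 / q_star:.1e}")  # ~5e9
```

At these speeds a loss may occur only once every few billion packets, which is why the equilibrium is hard to maintain in practice.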
Indeed, from (2), qi*wi* = 1.5/wi*, i.e., the average number of packet losses (or loss events) per window decreases in inverse proportion to the equilibrium window size for Reno. From Table II, this number for HSTCP is qi*wi* = 0.0789/wi*^0.1976. Hence it also decreases with the equilibrium window, but more slowly than for TCP Reno. For STCP, this number is qi*wi* = a(1 − b/2)/b, which is independent of, and hence scalable with, the equilibrium window size. The recommended values in [32] for the constants are a = 0.01 and b = 0.125, yielding an average loss of 0.075 per window. Even though equilibrium is a flow-level notion, this problem with Reno manifests itself at the packet level, where a source increases its window too slowly and decreases it too drastically. In contrast, HSTCP and STCP increase more aggressively and decrease less drastically.

The causes of the oscillatory behavior of TCP Reno lie in its design at both the packet and flow levels. At the packet level, the choice of a binary congestion signal necessarily leads to oscillation in congestion windows and bottleneck queues, and the parameter setting in Reno worsens the situation as bandwidth-delay product increases. At the flow level, the system dynamics given by (1) is unstable at large bandwidth-delay products [19], [39]. These must be addressed by different means.

The congestion window can be stabilized only if multi-bit feedback is used.³ This is the approach taken by the equation-based algorithm in [13], where the congestion window is adjusted based on the estimated loss probability in an attempt to stabilize around a target value given by (2). This approach eliminates the oscillation due to packet-level AIMD, but two difficulties remain at the flow level.

First, equation-based control requires the explicit estimation of end-to-end loss probability. This is difficult when the loss probability is small. Second, even if the loss probability can be perfectly estimated, Reno's flow dynamics, described by equation (1), lead to a feedback system that becomes unstable as feedback delay increases, and, more strikingly, as network capacity increases [19], [39]. The instability at the flow level can lead to severe oscillations that can be reduced only by stabilizing the flow-level dynamics. We present a delay-based approach to address these problems.

C. Delay-based approach

The common model (3) can be interpreted as follows: the goal at the flow level is to equalize the marginal utility ui(t) with the end-to-end measure of congestion qi(t). This interpretation immediately suggests an equation-based packet-level implementation where the window adjustment ẇi(t) depends not only on the sign, but also on the magnitude of the difference between the ratio qi(t)/ui(t) and the target of 1. Unlike the approach taken by Reno, HSTCP, and STCP, this approach eliminates packet-level oscillations due to the binary nature of the congestion signal. It does, however, require the explicit estimation of the end-to-end congestion measure qi(t).

Without explicit feedback, qi(t) can only be loss probability, as used in TFRC [13], or queueing delay, as used in TCP Vegas [3] and FAST TCP. Queueing delay can be more accurately estimated than loss probability, both because loss samples provide coarser information than queueing delay samples, and because packet losses in networks with large bandwidth-delay products tend to be rare events under schemes such as Reno. Indeed, each measurement of packet loss (whether a packet is lost) provides one bit of information for the filtering of noise, whereas each measurement of queueing delay provides multi-bit information. This facilitates an equation-based implementation that stabilizes a network into a steady state with a target fairness and high utilization.

At the flow level, the dynamics of the feedback system must be stable in the presence of delay, as the network capacity increases. Here, again, queueing delay has an advantage over loss probability as a congestion measure: the dynamics of queueing delay have the right scaling with respect to network capacity, according to the commonly used ordinary differential equation model. This helps maintain stability as network capacity grows [51], [8], [53], [52].

It has been found that delay and packet loss can have a weak correlation, e.g., [45], especially when packet losses can be caused by reasons other than buffer overflow. This does not mean that it is futile to use delay as a measure of congestion, but rather that using delay to predict loss, in the hope of helping a loss-based algorithm adjust its window, is the wrong approach to addressing problems at large windows. A different approach that fully exploits delay as a congestion measure, augmented with loss information, is needed.

This motivates the following implementation strategy. First, by explicitly estimating how far the current state qi(t)/ui(t) is from the equilibrium value of 1, a delay-based scheme can drive the system rapidly, yet in a fair and stable manner, toward the equilibrium. The window adjustment is small when the current state is close to equilibrium and large otherwise, independent of where the equilibrium is. This is in stark contrast to the approach taken by Reno, HSTCP, and STCP, where the window adjustment depends on just the current window size and is independent of where the current state is with respect to the target (compare Figures 1 (a) and (b) in [24]). Like the equation-based scheme in [13], this approach avoids the problem of slow increase and drastic decrease in Reno as the network scales up. Second, by choosing a multi-bit congestion measure, this approach eliminates the packet-level oscillation due to binary feedback, avoiding Reno's third problem. Third, using queueing delay as the congestion measure qi(t) allows the network to stabilize in the region below the overflow point when the buffer size is sufficiently large. Stabilization at this operating point eliminates large queueing delay and unnecessary packet loss. More importantly, it makes room for buffering "mice" traffic.

To avoid the second problem in Reno, where the required equilibrium congestion measure (loss probability for Reno, and queueing delay here) is too small to estimate in practice, the algorithm must adapt its parameter αi to capacity to maintain a small but sufficient queueing delay. Finally, to avoid the fourth problem of Reno, the window control algorithm must be stable, in addition to being fair and efficient, at the flow level. The emerging theory of large-scale networks under end-to-end control, e.g., [31], [42], [35], [50], [46], [76], [44], [41], [6], [51], [63], [34], [19], [39], [33], [64], [53], [8], [52], [73], [11], [56] (see also, e.g., [43], [40], [30], [57] for recent surveys), forms the foundation of the flow-level design. The theory plays an important role by providing a framework to understand issues, clarify ideas, and suggest directions, leading to a robust and high-performance implementation.

In the next section, we lay out the architecture of FAST TCP.

³ See [70] for a discussion of congestion signal and decision function.
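The contrast between magnitude-aware and sign-only adjustment can be illustrated with a toy Python sketch (ours, not the paper's algorithm): take u = α/x as in Table I, hold the congestion measure q and the round-trip time T fixed so that the target window is αT/q, and compare an update proportional to 1 − q/u with a fixed-size signed step. All constants are made up.

```python
# Toy contrast: an equation-based update scales its step with how far
# q/u is from the target of 1; a binary update takes the same step
# everywhere. With u = alpha*T/w and q, T fixed, the target window
# is alpha*T/q. All numbers are illustrative.

ALPHA, T, Q = 50.0, 0.1, 0.02
W_TARGET = ALPHA * T / Q               # = 250 packets

def u(w):
    return ALPHA * T / w               # marginal utility alpha/x, x = w/T

def magnitude_step(w, kappa=50.0):
    return w + kappa * (1 - Q / u(w))  # small adjustment near the target

def sign_step(w, step=50.0):
    return w + step * (1 if u(w) > Q else -1)  # same size everywhere

w_mag, w_sgn, sgn_trace = 10.0, 10.0, []
for _ in range(100):
    w_mag = magnitude_step(w_mag)
    w_sgn = sign_step(w_sgn)
    sgn_trace.append(w_sgn)

print(round(w_mag, 3))   # settles at the target, 250.0
print(sgn_trace[-2:])    # keeps bouncing between 260.0 and 210.0
```

The magnitude-based iterate converges to the target, while the sign-based one ends in a limit cycle around it: the packet-level oscillation that Section II-B attributes to binary congestion signals.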
III. ARCHITECTURE AND ALGORITHMS

A. Architecture

We separate the congestion control mechanism of TCP into four components in Figure 1. These four components are functionally independent so that they can be designed separately and upgraded asynchronously.

    [Fig. 1. FAST TCP architecture: data control, window control, and burstiness control components, all fed by an estimation component, above TCP protocol processing.]

The data control component determines which packets to transmit, window control determines how many packets to transmit, and burstiness control determines when to transmit these packets. These decisions are made based on information provided by the estimation component.

More specifically, the estimation component computes two pieces of feedback information for each data packet sent – a multi-bit queueing delay and a one-bit loss-or-no-loss indication – which are used by the other three components. Data control selects the next packet to send from three pools of candidates: new packets, packets that are deemed lost (negatively acknowledged), and transmitted packets that are not yet acknowledged. Window control regulates packet transmission at the RTT timescale, while burstiness control works at a smaller timescale. Burstiness control smoothes out the transmission of packets in a fluid-like manner to track the available bandwidth. We employ two mechanisms, one to supplement self-clocking in streaming out individual packets and the other to increase the window size smoothly in smaller bursts. Burstiness reduction limits the number of packets that can be sent when an ack advances the congestion window by a large amount. Window pacing determines how to increase the congestion window over the idle time of a connection to the target determined by the window control component. It reduces burstiness with a reasonable amount of scheduling overhead. For details of these two mechanisms, see [71], [24].

An initial prototype that included some of these features was demonstrated in November 2002 at the SuperComputing Conference, and the experimental results were reported in [26]. In the following, we explain in detail the design of the window control component.

B. Window control algorithm

FAST reacts to both queueing delay and packet loss. Under normal network conditions, FAST periodically updates the congestion window based on the average RTT and average queueing delay provided by the estimation component, according to:

    w ← min { 2w, (1 − γ)w + γ ( (baseRTT/RTT) · w + α ) }

where γ ∈ (0, 1], baseRTT is the minimum RTT observed so far, and α is a positive protocol parameter that determines the total number of packets queued in routers in equilibrium along the flow's path. The window update period is 20ms in our prototype.

We now provide an analytical evaluation of FAST TCP. We present a model of the window control algorithm for a network of FAST flows. We show that, in equilibrium, the vectors of source windows and link queueing delays are the unique solutions of a pair of optimization problems (6)–(7). This completely characterizes the network equilibrium properties such as throughput, fairness, and delay. We also present a preliminary stability analysis.

We model a network as a set of resources with finite capacities cl, e.g., transmission links, processing units, memory, etc., to which we refer as "links" in our model. The network is shared by a set of unicast flows, identified by their sources. Let di denote the round-trip propagation delay of source i. Let R be the routing matrix where Rli = 1 if source i uses link l, and 0 otherwise. Let pl(t) denote the queueing delay at link l at time t. Let qi(t) = Σl Rli pl(t) be the round-trip queueing delay, or in vector notation, q(t) = Rᵀp(t). Then the round-trip time of source i is Ti(t) := di + qi(t).

Each source i adapts its window wi(t) periodically according to:⁴

    wi(t + 1) = γ ( di·wi(t)/(di + qi(t)) + αi ) + (1 − γ)wi(t)        (4)

where γ ∈ (0, 1], at time t.

A key departure of our model from those in the literature is that we assume that a source's send rate, defined as xi(t) := wi(t)/Ti(t), cannot exceed the throughput it receives. This is justified because of self-clocking: within one round-trip time after a congestion window is increased, packet transmission will be clocked at the same rate as the throughput the flow receives. See [66] for detailed justification and validation experiments. A consequence of this assumption is that the link queueing delay vector, p(t), is determined implicitly by the instantaneous window sizes in a static manner: given wi(t) = wi for all i, the link queueing delays pl(t) = pl ≥ 0 for all l are given by:

    Σi Rli · wi/(di + qi(t))   { = cl  if pl(t) > 0
                               { ≤ cl  if pl(t) = 0                    (5)

where again qi(t) = Σl Rli pl(t).

The next result says that the queueing delay is indeed well defined. All proofs are relegated to the Appendix and [24].

Lemma 1: Suppose the routing matrix R has full row rank. Given w = (wi, ∀i), there exists a unique queueing delay vector p = (pl, ∀l) that satisfies (5).

The equilibrium values of windows w* and delays p* of the network defined by (4)–(5) can be characterized as follows. Consider the utility maximization problem

    max_{x≥0} Σi αi log xi   s.t.   Rx ≤ c                             (6)

⁴ Note that (4) can be rewritten as (when αi(wi, qi) = αi, constant)

    wi(t + 1) = wi(t) + γ(αi − xi(t)qi(t))

From [44], TCP Vegas updates its window according to

    wi(t + 1) = wi(t) + (1/Ti(t)) · sgn(αi − xi(t)qi(t))

where sgn(z) = −1 if z < 0, 0 if z = 0, and 1 if z > 0. Hence FAST can be thought of as a high-speed version of Vegas.
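As a concrete illustration of (4)–(5), the following Python sketch (ours; all numbers are illustrative) simulates a single bottleneck link: each period the queueing delay p is obtained from (5) by bisection (it is unique by Lemma 1), and every source then applies the update (4).

```python
# Minimal single-bottleneck sketch of the model (4)-(5).
# All parameters below are illustrative, not from the paper.

def queue_delay(w, d, c):
    """Single-link case of (5): the p >= 0 with sum_i w_i/(d_i + p) = c."""
    if sum(wi / di for wi, di in zip(w, d)) <= c:
        return 0.0                        # link not fully utilized
    lo, hi = 0.0, 1e3
    for _ in range(100):                  # bisection; p is unique (Lemma 1)
        mid = (lo + hi) / 2
        if sum(wi / (di + mid) for wi, di in zip(w, d)) > c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def fast_step(w, d, c, alpha, gamma=0.5):
    """One synchronous application of the update (4) to every source."""
    p = queue_delay(w, d, c)
    return ([(1 - gamma) * wi + gamma * (di / (di + p) * wi + alpha)
             for wi, di in zip(w, d)], p)

# Two flows with 100 ms and 200 ms propagation delay share a
# 1000 pkt/s link; alpha = 50 packets buffered per flow.
w, d, c, alpha = [100.0, 100.0], [0.1, 0.2], 1000.0, 50.0
for _ in range(500):
    w, p = fast_step(w, d, c, alpha)
rates = [wi / (di + p) for wi, di in zip(w, d)]
print([round(x, 1) for x in rates], round(p, 4))
# Each flow converges to alpha/p = 500 pkt/s: equal rates despite
# unequal propagation delays, consistent with (6).
```

In equilibrium xi·qi = αi for every flow, so p = Σαi/c and both flows receive the same rate regardless of propagation delay, matching the Vegas-like fairness claimed in Section I.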
B. Case study: dynamic scenario I

[Figure 3 panels: per-protocol throughput (Kbps) and average queue qavg (pkt) trajectories for flows of 1 × 100ms, 1 × 150ms, and 1 × 200ms propagation delay; time axis 0–9000 sec.]

[…] delays of 100, 150, and 200ms, that started and terminated […] They are shown in Figure 3. As new flows joined or old flows […] throughput […] because the number of flows was small. HSTCP, STCP, and BIC-TCP exhibited strong oscillations that filled the buffer. […] just below the point where overflow occurs, it had the highest […] for the rate allocations for each time interval that contains more than one flow (see Figure 2). The fairness indices are […]
C. Case study: dynamic scenario II

[Figure 5 panels: per-protocol throughput (kbps) and average queue qavg (pkt) trajectories for flow pairs of 2 × 50ms, 2 × 100ms, 2 × 150ms, and 2 × 200ms propagation delay; time axis 0–2.5 × 10⁴ sec.]

[…] propagation delays, which joined and departed according […] size are more severe for loss-based protocols. Packet loss was […] in any significant way. Connections sharing the link achieved […] times, with little packet loss and high link utilization. Intra- […] variation in the fairness of FAST TCP.
    Time (sec)       Sources   FAST   Reno   HSTCP   STCP   BIC
    16200 – 18000    8         .973   .787   .816    .547   .779
    18000 – 19800    6         .982   .892   .899    .563   .894

    TABLE IV
Fig. 5. Dynamic scenario II: Throughput trajectory (left column) and Dummynet queue trajectory (right column).

D. Overall evaluation

[…] different delays, number of flows, and their arrival and departure patterns. In all these experiments, the bottleneck link capacity was 800Mbps and the buffer size 2000 packets. We present here a summary of protocol performance in terms of some quantitative measures on throughput, fairness, stability, and responsiveness.

We use the output of iperf for our quantitative evaluation. Each iperf session in our experiments produced five-second averages of its throughput. This is the data rate (i.e., goodput) that applications such as iperf receive, and it is slightly less than the bottleneck bandwidth due to packet header overheads.

Let xi(k) be the average throughput of flow i in the five-second period k. Most tests involved dynamic scenarios where flows joined and departed. For the definitions below, suppose the composition of flows changes in period k = 1, remains fixed over periods k = 1, . . . , m, and changes again in period k = m + 1, so that [1, m] is the maximum-length interval over which the same equilibrium holds. Suppose there are n active flows in this interval, indexed by i = 1, . . . , n. Let

    x̄i := (1/m) Σ_{k=1}^{m} xi(k)

be the average throughput of flow i over this interval. We now define our performance metrics for this interval [1, m] using these throughput measurements.

1) Throughput: The average aggregate throughput for the interval [1, m] is defined as E := Σ_{i=1}^{n} x̄i.

2) Intra-protocol fairness: Jain's fairness index for the interval [1, m] is defined as [22]

    F := (Σ_{i=1}^{n} x̄i)² / (n Σ_{i=1}^{n} x̄i²)

F ∈ (0, 1], and F = 1 is ideal (equal sharing).
3) Stability: The stability index of flow i is the sample on aggregate throughput. This apparent discrepancy reflects
standard deviation normalized by the average through- the fact that link utilization traces converge more quickly than
put: individual throughput traces. It also serves as a justification
v for the link model (5): the aggregate input rate to a link
m
converges more rapidly than individual rates, and hence the
u
1 ut 1
X 2
Si := (xi (k) − xi ) queue stabilizes quickly to its new level that tracks changes
xi m−1
k=1 in windows.
The smaller the stability index, the less oscillation
a source experiences. The stability index for interval
[1,
Pnm] is the average over the n active sources S := E. NS-2 simulations
i=1 Si /n.
4) Responsiveness: The responsiveness index measures Our dummynet testbed is limited to experiments with
the speed of convergence when network equilibrium single-bottleneck networks and identical protocol. We con-
changes at k P= 1, i.e., when flows join or depart. ducted NS-2 simulation to study the performance of FAST
k in more complex environments. The FAST implementation
Let xi (k) := t=1 xi (t)/k be the running average by
in NS-2 is from CUBIN Lab [78]. We set up the param-
period k ≤ m. Then xi (m) = xi is the average over
eters so that only the original FAST algorithm as used in
the entire interval [1, m].
the dummynet experiments reported above was enabled. To
Responsiveness index R1 measures how fast the run-
eliminate possible simulation artifacts, such as phase effect,
ning average xi (k) of the slowest source converges to
we introduced two-way noise traffic in the simulation, where
x i :5 a certain number of Pareto on-off flows with shape parameter
1.5 were introduced in each direction.6 When a noise flow
xi (k) − xi
R1 := max max k : > 0.1
is “on”, it transmits at a constant rate of 4Mbps. Each noise
i xi
flow has an average burst time of 100ms and an average idle
Responsiveness index R2 measures
P how fast the aggre- time of 100ms. Hence the average length of a flow is 50KB,
gate throughput converges to i xi : similar to web traffic. We repeated each scenario 20 times
P
i (xi (k) − xi )
and report both the average rate and the standard deviation
R2 := max k : P > 0.1
(error bars in the figures).
i xi Three sets of simulations were conducted: FAST with
For each TCP protocol, we obtain one set of computed different noise levels, FAST with Reno traffic, and FAST on
values for each evaluation criterion for all of our experiments. a multilink network. Due to space limitation, we only present
We plot the CDF (cumulative distribution function) of each set of values. These are shown in Figures 6–9.

From Figures 6–9, FAST has the best performance among all protocols for three evaluation criteria: fairness, stability, and responsiveness index R1. It has the second best overall throughput. More importantly, the variation in each of the distributions is smaller under FAST than under the other protocols, suggesting that FAST had fairly consistent performance in our test scenarios. We also observed that both HSTCP and STCP achieved higher throughput and improved responsiveness compared with TCP Reno. STCP had worse intra-protocol fairness compared with TCP Reno, while both BIC-TCP and HSTCP achieved comparable intra-protocol fairness to Reno. HSTCP, BIC-TCP, and STCP showed increased oscillations compared with Reno (Figures 8, 3), and the oscillations became worse as the number of sources increased (Figure 5).

From Figure 9, FAST TCP achieved a much better responsiveness index R1 (which is based on worst case individual throughput) than the other schemes. We caution however that it can be hard to quantify "responsiveness" for protocols that do not stabilize into an equilibrium point or a periodic limit cycle, and hence the unresponsiveness of Reno, HSTCP, and STCP, as measured by index R1, should be interpreted with care. Indeed, from Figure 10, all protocols except TCP Reno perform well on the responsiveness index R2, which is based ...

... a few examples from each set of simulations. See [79] for complete details.

1) FAST with noise traffic: This set of simulations repeated the scenario in Section IV-B, with different levels of noise traffic. The noise traffic was in the form of multiple Pareto on-off flows as described above. We varied the number of noise flows from 0 to 200, corresponding to an aggregate noise traffic of 0% to 50% of the bottleneck capacity. Figures 11–13 show the throughput trajectory of three cases: 0%, 10% (40 noise flows) and 30% (120 noise flows). Each point in the figures represents the average rate over a 60 second interval.

The NS-2 simulation with 0% noise (Figure 11) should be compared with the dummynet experiment in Section IV-B. Different from the dummynet experiments, the NS-2 simulation was clean, and new flows mistook queueing delay due to existing flows as part of their propagation delays, leading to unfair throughputs. However, when the noise was 10% of the capacity, such unfairness was eliminated. The queue was frequently emptied and new flows observed the correct propagation delays and converged to the correct equilibrium rates, as shown in Figure 12. Figure 13 shows the throughput when the noise was 30% of the capacity. FAST throughputs oscillated, adapting to mice that joined and left frequently. In the period of 720 to 1080 seconds, the mice traffic generated so much packet loss that the three FAST flows could not keep α packets in the queue and they behaved like an AIMD algorithm. Such AIMD behavior led to discrimination against long RTT flows (flow 1 and flow 3).

5 The natural definition of responsiveness index, as the earliest period after which the throughput xi(k) (as opposed to the running average x̄i(k) of the throughput) stays within 10% of its equilibrium value, is unsuitable for TCP protocols that do not stabilize into an equilibrium value. Hence we define it in terms of x̄i(k) which, by definition, always converges to x̄i by the end of the interval k = m. This definition captures the intuitive notion of responsiveness if xi(k) settles into a periodic limit cycle.

6 We also conducted simulations with exponential on-off traffic. The results are similar.
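The footnote's definition lends itself to a short computation. The sketch below is our own illustration (the function names and the sample trace are invented for the example, not taken from the paper): it computes the responsiveness index as the earliest interval after which the running average x̄i(k) stays within 10% of its final value x̄i(m).

```python
def running_average(samples):
    """xbar(k): average of the first k throughput samples."""
    out, total = [], 0.0
    for k, x in enumerate(samples, start=1):
        total += x
        out.append(total / k)
    return out

def responsiveness_index(samples, tolerance=0.10):
    """Earliest interval after which xbar(k) stays within `tolerance`
    of its final value xbar(m); well defined because xbar(k) equals
    xbar(m) at k = m by construction."""
    xbar = running_average(samples)
    final = xbar[-1]
    index = len(xbar)
    # scan backwards for the last interval that violates the band
    for k in range(len(xbar) - 1, -1, -1):
        if abs(xbar[k] - final) > tolerance * abs(final):
            break
        index = k + 1  # 1-based interval number
    return index

# a flow that overshoots, then settles; averaging smooths the swings
trace = [2.0, 1.0, 1.5, 1.1, 1.0, 1.0, 1.0, 1.0]
print(responsiveness_index(trace))
```

Because the index is defined on the running average rather than the instantaneous rate, it stays meaningful for protocols whose throughput only settles into a periodic limit cycle.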
[Figures omitted: CDF panels comparing FAST, Reno, HSTCP, STCP, and BIC TCP, and throughput (Kbps) vs. time (sec) trajectories.]

Fig. 11. FAST with 0% mice traffic.
Fig. 12. FAST with 10% mice traffic.
Fig. 13. FAST with 30% mice traffic.
[Figures omitted: throughput (Kbps) vs. time trajectories; curves show FAST throughput, Reno throughput, and theoretical Reno throughput.]

Fig. 14. 3 FAST flows vs 1 Reno flow.
Fig. 15. 2 FAST flows vs 2 Reno flows.
Fig. 16. 1 FAST flow vs 3 Reno flows.
B. Queueing delay measurement

[Figure omitted: "theory" vs. "simulation" curves; y-axis ×10^4, x-axis bottleneck capacity (Kbps), ×10^5.]

... burstiness of the FAST flows themselves, leading to slight unfairness among flows with different RTTs, as shown in IV-B. Such error can be greatly reduced by deploying a burstiness control algorithm in the sender, as shown in [71]. Like Vegas, FAST is affected by queueing delay in the reverse path, as shown in [4]. There are a number of ways that have ...

... RTT observed in a certain preceding period, not since the beginning of the connection, so that the estimate tracks route changes.

... problem of insufficient buffering by choosing α among a small set of pre-determined values based on achieved throughput. This can sometimes lead to unfair throughput allocation as reported in some of the literature. This version was used around early 2004, but discontinued since.
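As a rough sketch of this kind of delay-based measurement (our own illustration; the class name and window size are assumptions, not FAST's implementation), queueing delay can be estimated as the current RTT minus the minimum RTT seen over a recent window, so the propagation-delay baseline tracks route changes as described in the footnote.

```python
from collections import deque

class QueueingDelayEstimator:
    """Estimate queueing delay as rtt - baseRTT, where baseRTT is the
    minimum RTT over a sliding window of recent samples rather than
    over the whole connection lifetime."""
    def __init__(self, window=1000):
        self.samples = deque(maxlen=window)  # keeps recent RTTs only

    def update(self, rtt):
        self.samples.append(rtt)
        base_rtt = min(self.samples)   # propagation-delay estimate
        return rtt - base_rtt          # queueing-delay estimate

est = QueueingDelayEstimator(window=5)
for rtt in [100, 102, 101, 110, 108]:  # ms; queue building up
    q = est.update(rtt)
print(q)
```

If a route change raises the propagation delay, the old smaller samples eventually fall out of the window and the baseline adapts; an all-time minimum would keep overestimating the queueing delay.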
... on network parameters that guarantee (global) uniqueness of equilibrium points for general networks.

We show in [60] that any desired inter-protocol fairness is in principle achievable by an appropriate choice of FAST parameter, and that intra-protocol fairness among flows within each protocol is unaffected by the presence of the other protocol except for a reduction in effective link capacities. How to design practical distributed algorithms that use only local information to achieve a desired inter-protocol fairness is however an open problem.

Acknowledgments: We gratefully acknowledge the contributions of the FAST project team and our collaborators, at http://netlab.caltech.edu/FAST/, in particular, G. Almes, J. Bunn, D. H. Choe, R. L. A. Cottrell, V. Doraiswami, J. C. Doyle, W. Feng, O. Martin, H. Newman, F. Paganini, S. Ravot, S. Shalunov, S. Singh, J. Wang, Z. Wang, S. Yip. We thank J. Wang for pointing out several errors in an earlier version. This work is funded by NSF (grants ANI-0113425 and ANI-0230967), Caltech Lee Center for Advanced Networking, ARO (grant DAAD19-02-1-0283), AFOSR (grant F49620-03-1-0119), DARPA, and Cisco.

VI. APPENDIX: PROOFS

A. Proof of Lemma 1

Fix w.8 Define Ui(xi) = wi log xi − di xi and consider the following optimization problem:

    max_{x ≥ 0}  Σ_i Ui(xi)   subject to  Rx ≤ c        (9)

Since the objective function is strictly concave and the feasible set is compact, there exists a unique optimal solution x*. Moreover, since R has full row rank, there exists a unique Lagrange multiplier p* for the dual problem. See, e.g., [42] for details. We claim that p* is the unique solution of (5) and, for all i,

    x*_i = wi/(di + q*_i)        (10)

Now, (10) can be rewritten as, for all i,

    Σ_l R_li p*_l = q*_i = wi/x*_i − di = Ui'(x*_i)

which is the Karush-Kuhn-Tucker condition for (9). Hence (10) holds. Then (5) becomes Σ_i R_li x*_i ≤ c_l, with equality if p*_l > 0. But this is just the complementary slackness condition for (9).

B. Proof of Theorem 2

Clearly a unique solution x* for (6) and a unique solution p* for its dual exist, since the utility functions αi log xi are strictly concave and R is full rank (see e.g. [42]). We need to show that the dual problem of (6) is indeed given by (7). Now the dual objective function is given by [2]

    D(p) := max_{x_i ≥ 0}  Σ_i ( αi log xi − xi Σ_l R_li p_l ) + Σ_l c_l p_l
          = Σ_l c_l p_l − Σ_i αi log Σ_l R_li p_l + Σ_i αi (log αi − 1)

Since the last term is independent of p, minimizing D(p) over p ≥ 0 is the same as minimizing (7) over p ≥ 0. Hence there exists a unique solution (x*, p*) for (6)–(7).

We now show that (x*, p*) is the equilibrium point of (4)–(5). In equilibrium, we have wi(t + 1) = wi(t) =: wi. From (4), the corresponding queueing delays p_l uniquely defined by (5) must be such that the end-to-end queueing delays are strictly positive, i.e., qi = Σ_l R_li p_l > 0 for all i even though some p_l can be zero. Then αi(wi, qi) = αi in equilibrium, and, from (4), we have qi = Σ_l R_li p_l = αi/xi, where xi := wi/(di + qi). But this is the Karush-Kuhn-Tucker condition for (6). Moreover, (5) is the complementary slackness condition. Hence the equilibrium of (4)–(5) coincides with the optimal solution of (6)–(7), i.e. w = w* and p = p*.

C. Proof of Theorem 3

Let N be the number of sources. Let q(t) = p(t) denote the queueing delay at the single link (omitting the subscripts). It is more convenient to work with the normalized window

    yi(t) := wi(t)/di        (11)

Let Y(t) := Σ_i yi(t) be the aggregate normalized window. Then q(t) > 0 if and only if Y(t) > c.

The window control algorithm (4) can be expressed in terms of updates on y(t):

    yi(t + 1) = ( 1 − γ q(t)/(di + q(t)) ) yi(t) + γ α̂i        (12)

where α̂i := αi/di. Let α̂ := Σ_i α̂i.

We first prove that the queue is lower bounded by a positive constant after a finite time.

Theorem 5: 1) For all t > c/γα̂, we have q(t) > 0.
2) Moreover, given any ε > 0 we have

    (α̂/c) · min_i di − ε < q(t) < (α̂/c) · max_i di + ε

for all sufficiently large t.

Proof (Theorem 5). For the first claim, we will prove that the queue will be nonzero at some t > c/γα̂, and that once it is nonzero, it stays nonzero.

Suppose q(t) = 0. Summing (12) over i, we have Y(t + 1) = Y(t) + γα̂, i.e., Y(t) grows linearly in time by γα̂ in each period. Since Y(0) ≥ 0, Y(t) > c after at most c/γα̂ periods. Hence there is some t > c/γα̂ such that q(t) > 0.

We now show that q(t) > 0 implies q(t + 1) > 0. Since γ ≤ 1, we have from (12)

    yi(t + 1) ≥ ( 1 − q(t)/(di + q(t)) ) yi(t) + γ α̂i
              = ( di/(di + q(t)) ) yi(t) + γ α̂i

Summing over i gives

    Y(t + 1) ≥ Σ_i ( di/(di + q(t)) ) yi(t) + γ α̂

But q(t) > 0 if and only if

    Σ_i ( di/(di + q(t)) ) yi(t) = c        (13)

Hence

    Y(t + 1) ≥ c + γα̂ > c

i.e., q(t + 1) > 0. This proves the first claim.

For the second claim, we first prove that Y(t) converges to its limit point Y* := c + α̂ geometrically (and monotonically):

    Y(t) = Y* + (Y(0) − Y*)(1 − γ)^t        (14)

To prove (14), rewrite (12) as

    yi(t + 1) = (1 − γ) yi(t) + γ [ ( di/(di + q(t)) ) yi(t) + α̂i ]

Summing over i and using (13), we have

    Y(t + 1) = (1 − γ) Y(t) + γ (c + α̂)

from which (14) follows.

Noting that d/(d + q) is a strictly increasing function of d, we have from (13)

    ( min_i di / (min_i di + q(t)) ) · Y(t)  ≤  Σ_i ( di/(di + q(t)) ) yi(t) = c  ≤  ( max_i di / (max_i di + q(t)) ) · Y(t)

Hence

    1 + q(t)/min_i di  ≥  Y(t)/c  ≥  1 + q(t)/max_i di        (15)

From (14), we have

    Y(t)/c = 1 + α̂/c + [ (Y(0) − Y*)/c ] (1 − γ)^t

Hence, (15) becomes:

    q(t)/min_i di  ≥  α̂/c + [ (Y(0) − Y*)/c ] (1 − γ)^t  ≥  q(t)/max_i di

Since γ ∈ (0, 1], the absolute value of the term in the square bracket can be made arbitrarily small by taking sufficiently large t. Hence, given any ε' > 0,

    q(t)/min_i di ≥ α̂/c − ε'
    q(t)/max_i di ≤ α̂/c + ε'

for all sufficiently large t. This proves the second claim.9

Hence without loss of generality, we will assume

    q(t) > (α̂/2c) · min_i di    for all t ≥ 0

This implies that, for all t ≥ 0, equality holds in (5), or equivalently, (13) holds:

    G(y, q) := Σ_i di yi/(di + q) − c = 0        (16)

Lemma 1 guarantees that given any y ∈ R_+^N, there is a unique q ∈ R_+ that satisfies (16).

An important implication of Theorem 5(2) is that we can restrict our space of y to a subset of R_+^N:

    Y := { y ∈ R_+^N | the unique q(y) defined implicitly by (16) is greater than α̂ · min_i di/(2c) }        (17)

The key feature of Y we will need in Lemma 9 is that, for all y ∈ Y, q(y) is lower bounded uniformly in y. Define F : Y → Y by

    Fi(y) := ( 1 − γ q(y)/(di + q(y)) ) yi + γ α̂i        (18)

where q(y) is implicitly defined by (16). Then the evolution (12) of the normalized window is y(t + 1) = F(y(t)).

Our main result is to show that the iteration F is locally asymptotically stable by proving that the spectral radius of ∂F/∂y is strictly less than 1 on Y.

Theorem 6: Fix any γ ∈ (0, 1]. For all y ∈ Y, the spectral radius of ∂F/∂y is strictly less than 1.

Theorem 6 implies a neighborhood of the unique fixed point y* defined by y* = F(y*) such that, given any initial normalized window y(0) in this neighborhood, y(t + 1) = F(y(t)) converges to y*. This implies Theorem 3.

Sketch of proof (Theorem 6). We will show through Lemmas 7–9 that the spectral radius ρ(∂F/∂y) is uniformly bounded away from 1, i.e., given γ ∈ (0, 1], there exists η0 > 0 such that for all y ∈ Y,

    ρ(∂F/∂y) < η0 < 1        (19)

Let q(y) denote the unique solution of (16). Let

    βi := ( di yi/(di + q(y))^2 ) ( Σ_j dj yj/(dj + q(y))^2 )^(−1)        (20)

and

    μi := di/(di + q(y))        (21)

By Theorem 5(2), we have

    0 < βi, μi < 1   and   Σ_i βi = 1

Let M := diag(μi) be the diagonal matrix with μi as its nonzero entries. Let β := (βi, for all i)^T and μ := (μi, for all i)^T be column vectors.

The proof of the following lemma is straightforward and can be found in [24].

Lemma 7: For γ ∈ (0, 1],

    ∂F/∂y = γ ( M − β μ^T ) + (1 − γ) I

where I is the N × N identity matrix.

8 Cf. the proof of a similar result in [50].

9 When γ = 1, the proof shows that we can set ε = 0 in the statement of Theorem 5 after at most c/α̂ periods. Moreover, Y(t) = c + α̂ for all t ≥ 1. It also implies that, if di = d for all i, then q(t) = α̂d/c for all t ≥ c/α̂, i.e., the system converges in finite time.
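The single-link model in this proof is easy to exercise numerically. The sketch below is our own check under invented parameter values: it iterates the normalized window update (12), solving (16) for q by bisection at every step, then compares Y(t) against the limit Y* = c + α̂ from (14) and checks that q lies within the Theorem 5 bounds.

```python
def solve_queue(y, d, c, iters=80):
    """Bisection for the q >= 0 solving (16):
    sum_i d_i*y_i/(d_i + q) = c, with q = 0 when sum_i y_i <= c."""
    if sum(y) <= c:                      # link not saturated
        return 0.0
    lo, hi = 0.0, max(d) * sum(y) / c    # bracket the root
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(di * yi / (di + mid) for di, yi in zip(d, y)) > c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def step(y, d, alpha_hat_i, c, gamma):
    """One iteration of the normalized window update (12)."""
    q = solve_queue(y, d, c)
    return [(1 - gamma * q / (di + q)) * yi + gamma * ai
            for di, yi, ai in zip(d, y, alpha_hat_i)]

c, gamma = 100.0, 0.5                    # capacity, step size (illustrative)
d = [0.01, 0.05, 0.2]                    # propagation delays (s)
alpha = [0.5, 0.5, 0.5]                  # alpha_i
alpha_hat_i = [a / di for a, di in zip(alpha, d)]
alpha_hat = sum(alpha_hat_i)

y = [1.0, 1.0, 1.0]
for _ in range(200):
    y = step(y, d, alpha_hat_i, c, gamma)

Y, q = sum(y), solve_queue(y, d, c)
print(Y, c + alpha_hat)                  # Y(t) approaches Y* = c + alpha_hat
print(alpha_hat * min(d) / c <= q <= alpha_hat * max(d) / c)
```

With a single bottleneck the equilibrium queue works out to Σ_i αi/c (each flow keeps αi packets queued), which indeed falls between the two Theorem 5 bounds.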
[26] C. Jin, D. X. Wei, S. H. Low, G. Buhrmaster, J. Bunn, D. H. Choe, R. L. A. Cottrell, J. C. Doyle, W. Feng, O. Martin, H. Newman, F. Paganini, S. Ravot, and S. Singh. FAST TCP: from theory to experiments. IEEE Network, 2005.
[27] Shudong Jin, Liang Guo, Ibrahim Matta, and Azer Bestavros. A spectrum of TCP-friendly window-based congestion control algorithms. IEEE/ACM Transactions on Networking, 11(3), June 2003.
[28] D. Katabi, M. Handley, and C. Rohrs. Congestion control for high-bandwidth delay product networks. In Proc. ACM Sigcomm, August 2002. http://www.ana.lcs.mit.edu/dina/XCP/.
[29] Frank P. Kelly. Mathematical modelling of the Internet. In B. Engquist and W. Schmid, editors, Mathematics Unlimited - 2001 and Beyond, pages 685–702. Springer-Verlag, Berlin, 2001.
[30] Frank P. Kelly. Fairness and stability of end-to-end congestion control. European Journal of Control, 9:159–176, 2003.
[31] Frank P. Kelly, Aman Maulloo, and David Tan. Rate control for communication networks: shadow prices, proportional fairness and stability. Journal of Operations Research Society, 49(3):237–252, March 1998.
[32] Tom Kelly. Scalable TCP: improving performance in highspeed wide area networks. Computer Communication Review, 32(2), April 2003. http://www-lce.eng.cam.ac.uk/~ctk21/scalable/.
[33] S. Kunniyur and R. Srikant. Designing AVQ parameters for a general topology network. In Proceedings of the Asian Control Conference, September 2002.
[34] S. Kunniyur and R. Srikant. A time-scale decomposition approach to adaptive ECN marking. IEEE Transactions on Automatic Control, June 2002.
[35] S. Kunniyur and R. Srikant. End-to-end congestion control: utility functions, random losses and ECN marks. IEEE/ACM Transactions on Networking, 11(5):689–702, October 2003.
[36] A. Kuzmanovic and E. Knightly. TCP-LP: a distributed algorithm for low priority data transfer. In Proc. of IEEE Infocom, 2003.
[37] T. V. Lakshman and Upamanyu Madhow. The performance of TCP/IP for networks with high bandwidth-delay products and random loss. IEEE/ACM Transactions on Networking, 5(3):336–350, June 1997. http://www.ece.ucsb.edu/Faculty/Madhow/Publications/ton97.ps.
[38] Y. Li. Implementing High-Speed TCP. http://www.hep.ucl.ac.uk/~ytl/tcpip/hstcp/index.html.
[39] S. H. Low, F. Paganini, J. Wang, and J. C. Doyle. Linear stability of TCP/RED and a scalable control. Computer Networks Journal, 43(5):633–647, 2003. http://netlab.caltech.edu.
[40] S. H. Low and R. Srikant. A mathematical framework for designing a low-loss, low-delay internet. Networks and Spatial Economics, special issue on "Crossovers between transportation planning and telecommunications", E. Altman and L. Wynter, 4:75–101, March 2004.
[41] Steven H. Low. A duality model of TCP and queue management algorithms. IEEE/ACM Trans. on Networking, 11(4):525–536, August 2003. http://netlab.caltech.edu.
[42] Steven H. Low and David E. Lapsley. Optimization flow control, I: basic algorithm and convergence. IEEE/ACM Transactions on Networking, 7(6):861–874, December 1999. http://netlab.caltech.edu.
[43] Steven H. Low, Fernando Paganini, and John C. Doyle. Internet congestion control. IEEE Control Systems Magazine, 22(1):28–43, February 2002.
[44] Steven H. Low, Larry Peterson, and Limin Wang. Understanding Vegas: a duality model. J. of ACM, 49(2):207–235, March 2002. http://netlab.caltech.edu.
[45] Jim Martin, Arne Nilsson, and Injong Rhee. Delay-based congestion avoidance for TCP. IEEE/ACM Transactions on Networking, 11(3):356–369, June 2003.
[46] L. Massoulie and J. Roberts. Bandwidth sharing: objectives and algorithms. IEEE/ACM Transactions on Networking, 10(3):320–328, June 2002.
[47] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP Selective Acknowledgment Options. RFC 2018, October 1996.
[48] Matthew Mathis, Jeffrey Semke, Jamshid Mahdavi, and Teunis Ott. The macroscopic behavior of the TCP congestion avoidance algorithm. ACM Computer Communication Review, 27(3), July 1997. http://www.psc.edu/networking/papers/model_ccr97.ps.
[49] J. Mo, R. La, V. Anantharam, and J. Walrand. Analysis and comparison of TCP Reno and Vegas. In Proceedings of IEEE Infocom, March 1999.
[50] Jeonghoon Mo and Jean Walrand. Fair end-to-end window-based congestion control. IEEE/ACM Transactions on Networking, 8(5):556–567, October 2000.
[51] Fernando Paganini, John C. Doyle, and Steven H. Low. Scalable laws for stable network congestion control. In Proceedings of Conference on Decision and Control, December 2001. http://www.ee.ucla.edu/~paganini.
[52] Fernando Paganini, Zhikui Wang, John C. Doyle, and Steven H. Low. Congestion control for high performance, stability and fairness in general networks. IEEE/ACM Transactions on Networking, 13(1):43–56, February 2005.
[53] Fernando Paganini, Zhikui Wang, Steven H. Low, and John C. Doyle. A new TCP/AQM for stability and performance in fast networks. In Proc. of IEEE Infocom, April 2003. http://www.ee.ucla.edu/~paganini.
[54] Attila Pásztor and Darryl Veitch. PC based precision timing without GPS. In Proc. ACM Sigmetrics, June 2002.
[55] Luigi Rizzo. Dummynet. http://info.iet.unipi.it/~luigi/ip_dummynet/.
[56] R. Shorten, D. Leith, J. Foy, and R. Kilduff. Analysis and design of congestion control in synchronised communication networks. In Proc. of 12th Yale Workshop on Adaptive and Learning Systems, May 2003. www.hamilton.ie/doug_leith.htm.
[57] R. Srikant. The Mathematics of Internet Congestion Control. Birkhauser, 2004.
[58] W. Stevens. TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery algorithms. RFC 2001, January 1997.
[59] L. Tan, C. Yuan, and M. Zukerman. FAST TCP: fairness and queuing issues. To appear in IEEE Communications Letters.
[60] Ao Tang, Jiantao Wang, Sanjay Hegde, and Steven H. Low. Equilibrium and fairness of networks shared by TCP Reno and FAST. Telecommunications Systems, special issue on High Speed Transport Protocols, 2005.
[61] Ao Tang, Jiantao Wang, Steven H. Low, and Mung Chiang. Equilibrium of heterogeneous congestion control protocols. In Proc. of IEEE Infocom, March 2005.
[62] Darryl Veitch, Satish Babu, and Attila Pásztor. Robust remote synchronisation of a new clock for PCs. Preprint, January 2004.
[63] Glenn Vinnicombe. On the stability of networks operating TCP-like congestion control. In Proc. of IFAC World Congress, 2002.
[64] Glenn Vinnicombe. Robust congestion control for the Internet. Submitted for publication, 2002.
[65] Jiantao Wang, Ao Tang, and Steven H. Low. Local stability of FAST TCP. In Proc. of the IEEE Conference on Decision and Control, December 2004.
[66] Jiantao Wang, David X. Wei, and Steven H. Low. Modeling and stability of FAST TCP. In Proc. of IEEE Infocom, March 2005.
[67] Jiantao Wang, David X. Wei, and Steven H. Low. Modeling and stability of FAST TCP. In P. Agrawal, M. Andrews, P. J. Fleming, G. Yin, and L. Zhang, editors, IMA Volumes in Mathematics and its Applications, Volume 143, Wireless Communications. Springer Science, 2006.
[68] R. Wang, M. Valla, M. Sanadidi, B. Ng, and M. Gerla. Using adaptive rate estimation to provide enhanced and robust transport over heterogeneous networks. In Proc. of IEEE ICNP, 2002.
[69] Z. Wang and J. Crowcroft. Eliminating periodic packet losses in the 4.3-Tahoe BSD TCP congestion control algorithm. ACM Computer Communications Review, April 1992.
[70] David X. Wei. Congestion Control Algorithms for High Speed Long Distance TCP Connections. Master's thesis, California Institute of Technology, Pasadena, June 2004. http://netlab.caltech.edu/pub/projects/FAST/msthesis-dwei.
[71] David X. Wei, Steven H. Low, and Sanjay Hegde. A burstiness control for TCP. In Proceedings of the 3rd International Workshop on Protocols for Fast Long-Distance Networks (PFLDnet 2005), February 2005.
[72] E. Weigle and W. Feng. A case for TCP Vegas in high-performance computational grids. In Proceedings of the 9th International Symposium on High Performance Distributed Computing (HPDC'01), August 2001.
[73] J. T. Wen and M. Arcak. A unifying passivity framework for network flow control. IEEE Transactions on Automatic Control, to appear, 2003.
[74] Bartek Wydrowski. High-resolution one-way delay measurement using RFC 1323. Preprint, August 2004.
[75] Lisong Xu, Khaled Harfoush, and Injong Rhee. Binary increase congestion control for fast long distance networks. In Proc. of IEEE INFOCOM, March 2004.
[76] H. Yaiche, R. R. Mazumdar, and C. Rosenberg. A game theoretic framework for bandwidth allocation and pricing in broadband networks. IEEE/ACM Transactions on Networking, 8(5), October 2000.
[77] The Network Simulator ns-2. http://www.isi.edu/nsnam/ns/.
[78] CUBIN Lab. FAST TCP simulator module for NS-2. http://www.cubinlab.ee.mu.oz.au/ns2fasttcp/.
[79] NetLab, Caltech. NS-2 simulation results of FAST. http://netlab.caltech.edu/pub/projects/FAST/ns2-test.