
Accepted Manuscript

Zero-queue ethernet congestion control protocol based on available bandwidth estimation

Mahmoud Bahnasy, Halima Elbiaze, Bochra Boughzala

PII: S1084-8045(17)30423-X
DOI: 10.1016/j.jnca.2017.12.016
Reference: YJNCA 2036

To appear in: Journal of Network and Computer Applications

Received Date: 4 January 2017
Revised Date: 9 November 2017
Accepted Date: 23 December 2017

Please cite this article as: Bahnasy, M., Elbiaze, H., Boughzala, B., Zero-queue ethernet congestion
control protocol based on available bandwidth estimation, Journal of Network and Computer
Applications (2018), doi: 10.1016/j.jnca.2017.12.016.

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and all
legal disclaimers that apply to the journal pertain.
Zero-queue Ethernet Congestion Control Protocol based on available bandwidth estimation

Mahmoud Bahnasy (a), Halima Elbiaze (b), Bochra Boughzala (c)
(a) École de Technologie Supérieure, Montréal, Canada
(b) Université du Québec à Montréal, Canada
(c) Ericsson Research, Canada
Abstract

A router's switch fabric has strict requirements in terms of packet loss, latency, fairness, and head-of-line (HOL) blocking. Network manufacturers address these requirements using specialized, proprietary, and highly expensive switches. Simultaneously, IEEE introduces Data Center Bridging (DCB) as an enhancement to existing Ethernet bridge specifications, including technological improvements that address packet loss, HOL blocking, and latency. Motivated by the DCB enhancements, we investigate the possibility of using commodity Ethernet switches as a switch fabric for routers. We present the Ethernet Congestion Control Protocol (ECCP), which uses commodity Ethernet switches to achieve a flexible and cost-efficient switch fabric while fulfilling the strict router requirements. Furthermore, we present a mathematical model of ECCP using Delay Differential Equations (DDEs) and analyze its stability using the phase plane method. We deduce sufficient conditions for the stability of ECCP that can be used to set its parameters properly. We also show that the stability of ECCP is mainly ensured by the sliding mode motion, which causes ECCP to keep cross traffic close to the maximum link capacity and queue length close to zero. Extensive simulation scenarios are conducted to validate the analytical results of ECCP behavior. Our analysis shows that ECCP is practical in avoiding congestion and achieving minimum network latency. Moreover, to verify the performance of ECCP in real networks, we conducted a testbed implementation of ECCP using Linux machines and a 10-Gbps switch.

Keywords: Data Center Bridging, Congestion Control, Congestion Prevention, Priority-based Flow Control (PFC), Quantized Congestion Notification (QCN), Ethernet Congestion Control Protocol (ECCP)

Email addresses: [email protected] (Mahmoud Bahnasy), [email protected] (Halima Elbiaze), [email protected] (Bochra Boughzala)
1. Introduction

A router's switch fabric is an essential technology that is traditionally addressed using a custom Application-Specific Integrated Circuit (ASIC). This ASIC must fulfill particular requirements, including low packet loss, fairness between flows, and low latency [1]. The emergence of very-high-speed serial interfaces and new router architectures increases the design and manufacturing cost of the switch fabric chipset. Traditionally, the switch fabric is built using either shared memory or a crossbar switch, as shown in Fig. 1a and Fig. 1b respectively. The shared-memory architecture requires memory that works N times faster than the port speed, where N is the number of ports, which raises a scalability issue. The crossbar architecture, on the other hand, tries to keep the buffering at the edge of the router (in Virtual Output Queues, VOQs, inside the line cards). Because this architecture requires N VOQs at each ingress port and a central unit (arbiter), it also faces a scalability issue [2].

Figure 1: Router's switch fabric architectures: (a) shared-memory-based; (b) crossbar-based.


In this research, we introduce a new router architecture that uses commodity Ethernet switches as a switch fabric. In this architecture, we keep all buffering at the edge of the router, and an Ethernet switch is used as the switch fabric. IEEE has recently presented Data Center Bridging (DCB) [3], which comprises several enhancements to Ethernet networks. However, Ethernet still suffers from HOL blocking, congestion spreading, and high latency. To overcome these limitations and achieve a non-blocking switch fabric, we present the Ethernet Congestion Control Protocol (ECCP), which keeps the Ethernet network non-blocking by preserving switch queue lengths close to zero, leading to minimum latency and no HOL blocking. Unlike traditional congestion control mechanisms that use packet accumulation in buffers to trigger the rate control process, ECCP estimates the available bandwidth and uses this information to control transmission rates before link saturation or data accumulation. Accordingly, it achieves minimum latency by trading off a small margin of link capacity. Therefore, ECCP achieves (i) low queue length, (ii) low latency, and (iii) high throughput, (iv) with no switch modification. Such a mechanism could be used to build a cost-efficient router switch fabric while guaranteeing traditional router characteristics. Besides, it can be utilized as a reliable and robust layer-2 congestion control mechanism for data center applications (e.g., high-performance computing [4], remote direct memory access (RDMA) [5], and Fibre Channel over Ethernet (FCoE) [6]).
Furthermore, we introduce a mathematical model of ECCP using the phase plane method. First, we build a fluid-flow model of ECCP to derive the delay differential equations (DDEs) that represent it. Then, we sketch the phase trajectories of the rate increase and rate decrease subsystems. Consequently, we combine these phase trajectories to understand the transition between ECCP's subsystems and to obtain the phase trajectory of the global ECCP system. Subsequently, the stability of ECCP is analyzed based on this phase trajectory. Our analysis reveals that the stability of ECCP depends mainly on the sliding mode motion [7]. Thereafter, we deduce stability conditions that assist in defining proper parameters for ECCP. Besides, several simulations are conducted using OMNEST [8] to verify our mathematical analysis. Finally, a Linux-based implementation of ECCP is conducted to verify ECCP's performance experimentally.

The rest of this paper is organized as follows. Related work is introduced in Section 2. Section 3 presents the ECCP mechanism. Section 4 briefly introduces the phase plane analysis method. The mathematical model of ECCP is derived in Section 5. The stability analysis of ECCP is presented in Section 6. The Linux-based implementation is presented in Section 7. Finally, Section 8 concludes and discusses future work.

2. Related Work

In this section, we present research that is closely related to congestion control at both the Ethernet layer and the Transmission Control Protocol (TCP) layer. IEEE has recently presented Data Center Bridging (DCB) [3], which comprises several enhancements to Ethernet networks to consolidate I/O connectivity across data centers. DCB aims to eliminate packet loss due to queue overflow. Ethernet PAUSE (IEEE 802.3x) and Priority-based Flow Control (PFC) [9] are presented in DCB as link-level (hop-by-hop) mechanisms. Ethernet PAUSE was issued to solve the packet loss problem by sending a PAUSE request to the sender when the receiver buffer reaches a certain threshold. The sender then stops sending data until a local timer expires or a resume notification is received from the receiver. PFC divides the data path into eight traffic classes, each of which can be controlled individually. Yet PFC is still limited because it operates at the port-plus-priority level, which can cause congestion spreading and HOL blocking [9, 10].

Quantized Congestion Notification (QCN) [11, 12] is an end-to-end control mechanism standardized in IEEE 802.1Qau [11]. QCN aims to keep the queue length at a predefined level called the equilibrium queue length (Q_eq). QCN consists of two parts: (i) the Congestion Point (CP), in bridges, and (ii) the Reaction Point (RP), in hosts (Fig. 2). The CP measures the queue length Q and calculates a feedback value (Fb), in a probabilistic manner, to reflect the congestion severity (Equation 1).

Figure 2: QCN framework: CP in the bridge, and RP in the host's NIC

Fb = −((Q − Q_eq) + w × (Q − Q_old)).    (1)

where Q_old is the previous queue length and w is a constant which equals 2 (for more details refer to [11]). If the calculated Fb is negative, the CP creates a Congestion Notification Message (CNM) and sends it to the RP.
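As a concrete illustration, the following minimal Python sketch (our own illustrative code, not part of the QCN standard) computes Fb from sampled queue lengths and decides whether a CNM would be generated; the names mirror Equation (1).

```python
# Illustrative sketch of the QCN CP feedback rule in Equation (1).
# All names are ours; the standard specifies the behavior, not this code.

W = 2  # weight constant w (QCN uses w = 2)

def qcn_feedback(q: float, q_eq: float, q_old: float) -> float:
    """Return Fb = -((Q - Q_eq) + w * (Q - Q_old))."""
    return -((q - q_eq) + W * (q - q_old))

def should_send_cnm(q: float, q_eq: float, q_old: float) -> bool:
    """A CNM is generated only when the calculated Fb is negative."""
    return qcn_feedback(q, q_eq, q_old) < 0

# Example: queue above Q_eq and growing -> negative Fb -> CNM sent.
assert should_send_cnm(q=120.0, q_eq=100.0, q_old=110.0)
```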
QCN reduces the overhead of control-information traffic and the required computational power by calculating Fb in a probabilistic manner. At the end host, when the RP receives a CNM, it decreases its transmission rate accordingly. If no CNM is received, the RP increases its transmission rate according to a three-phase rate increase algorithm [11].

Due to the probabilistic manner of calculating Fb, QCN experiences several issues regarding fairness [13, 14] and queue length fluctuation [15]. In addition, QCN does not achieve minimum latency, as it keeps the queue length at a certain level (Q_eq).
Several research papers have discussed enhancements to QCN. For example, [15] presents the use of delay variation as an indication of congestion to address the queue fluctuation issue. Other studies, such as [13, 14], address the QCN fairness issue by using new Active Queue Management (AQM) [16] algorithms that are capable of identifying the culprit flows and therefore send CNMs for each culprit flow. These techniques achieve fairness, but they are implemented in the switch, which we aim to avoid.


Data Center TCP (DCTCP) [17] uses switches that support Explicit Congestion Notification (ECN) to mark packets that arrive while the queue length is greater than a predefined threshold. A DCTCP source reacts by reducing its window proportionally to the fraction of marked packets. Data Center QCN (DCQCN) [18] combines the characteristics of DCTCP [17] and QCN in order to achieve QCN-like behavior while using the ECN marking feature. DCQCN requires very strict parameter selection regarding the byte counter and the marking probability.

Trading a little bandwidth to achieve low queue length and low latency is discussed in a number of papers. For example, HULL (High-bandwidth Ultra-Low Latency) is presented in [19] to reduce average and tail latencies in data centers by sacrificing a small amount of bandwidth (e.g., 10%). HULL presents the Phantom Queue (PQ) as a new marking algorithm. Phantom queues simulate draining data at a fraction (< 1) of the link rate. This process generates a virtual backlog that is used to mark data packets before congestion occurs. The main challenge of HULL is that it requires switch modification.
TIMELY [20] is a congestion control scheme for data centers. It uses the deviation of the Round-Trip Time (RTT) to identify congestion, instead of the ECN marking used in DCTCP. TIMELY can significantly reduce queuing delay, and it would be interesting to compare ECCP and TIMELY in future work.

Enhanced Forward Explicit Congestion Notification (E-FECN) [21] and the proactive congestion control algorithm (PERC) [22] are congestion control mechanisms that exploit the measured available bandwidth to control data rates. However, these two methods require switch modifications, which we aim to avoid.

Few centralized solutions are proposed in the literature. For example, Fastpass [23] embraces central control for every packet transmission, which raises a scalability issue.

Another approach to enhancing the performance of TCP is to distinguish between congestive and non-congestive packet loss [24, 25], so that the TCP congestion avoidance algorithm is activated only when congestive packet loss is detected. For example, TCP INVS [24] estimates the network queue length and compares this estimate to a threshold. If the estimated queue length exceeds the threshold, the loss is attributed to congestion, and TCP INVS activates the traditional congestion avoidance algorithm. Otherwise, the loss is considered non-congestive; TCP INVS ignores it and avoids limiting the congestion window growth. In addition, [25] proposes an RTT estimation algorithm using an Autoregressive Integrated Moving Average (ARIMA) model. By analyzing the estimated RTT, one can detect sharp and sudden changes in the RTT, thereby differentiating non-congestive from congestive packet loss. While these mechanisms achieve better throughput on lossy networks, they tolerate extra packet loss, which is not suitable for a router switch fabric or a data center network.
Optimizing routing decisions to control congestion is also proposed in several research papers. Most of this research follows a key idea called the backpressure algorithm [26], where traffic is directed around a queuing network to achieve maximum network throughput. An example of this scheme is presented in [27], where the authors developed a second-order joint congestion control and routing optimization framework that aims to offer resource optimization and fast convergence. Such a scheme can significantly reduce queuing delay, and it would be interesting to investigate it in future work.

3. ECCP: Ethernet Congestion Control Protocol

In this section, we present ECCP as a distributed congestion prevention algorithm that works at the Ethernet layer. ECCP controls data traffic according to the estimated Available Bandwidth (AvBw) along a network path. ECCP strives to keep link occupancy below the maximum capacity by a percentage called the Availability Threshold (AvT). Traditional congestion control mechanisms aim to keep the queue around a target level. These mechanisms can reduce queuing latency, but they cannot eliminate it: a non-zero queue must be observed before reacting, and sources need one RTT to react to this observation, which causes data to accumulate in queues even further. ECCP, on the other hand, uses AvBw as a congestion signal to trigger the sources' reaction before data accumulates in the queue. Therefore, ECCP achieves a close-to-zero queue length, leading to minimum network latency.
ECCP estimates AvBw along a network path by periodically sending trains of probe frames through this path. The sender adds the sending time and other information, such as a train identifier and a sequence number within the train, to each probe frame. On the receiver side, ECCP receives these frames and estimates AvBw using a modified version of Bandwidth Available in Real-Time (BART) [28]. Afterward, ECCP transmits this information back to the sender. At the sender side, ECCP controls the transmission rate based on the received AvBw value. ECCP advocates rate-based control schemes instead of window-based ones, because window-based schemes encounter significant challenges, particularly as the control cycle time, defined mainly by the propagation delay, grows large compared to the transmission time in modern networks [29]. In addition, [30] and [31] state that at high line rates, queue size fluctuations become fast and difficult to control because the queuing delay is shorter than the control loop delay. Thus, rate-based control schemes are more reliable.
3.1. ECCP components

In this section, the ECCP architecture is described in detail and the interactions between its components are explained. ECCP prevents congestion by keeping a percentage of the link capacity available, called the Availability Threshold (AvT). Thus, for any link of maximum capacity C, AvT creates a bandwidth stability margin equal to AvT × C. This bandwidth stability margin allows ECCP to send probe traffic without causing queue accumulation. ECCP does not require switch modification because all its functionalities are implemented inside line cards or hosts.

Fig. 3 depicts the ECCP components(1): (1) probe sender, (2) probe receiver, (3) bandwidth estimator, and (4) rate controller. These modules are implemented in every line card in the router or in every host.

(1) The rate limiter in Fig. 3 is outside the scope of this paper.

Figure 3: ECCP components

3.1.1. ECCP probe sender


The ECCP control cycle starts with the probe sender. This module generates probe trains, each of N frames, and sends them through the network towards the destination host. As they leave the source, each probe frame is tagged with a sending time; other information, such as a sequence number and a train identifier, is added to the probes as well. The ECCP probe sender sends probe traffic at a uniformly distributed random rate µ between a fixed minimum value and R × AvT. ECCP does not try to estimate an exact value of AvBw; instead, it only estimates AvBw up to a maximum limit of R × AvT. Thus, ECCP gets enough information to control (decrease or increase) the data rate while limiting the probe rate to R × AvT. Hence, the probe traffic of M flows crossing one link (M × R × AvT) never exceeds the link's bandwidth stability margin (AvT × C).
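A minimal Python sketch of the probe-train generation logic described above; the class and function names are ours, while N, AvT, and the minimum probe rate default to the values in Table 1.

```python
import random
import time
from dataclasses import dataclass, field

@dataclass
class ProbeFrame:
    train_id: int
    seq: int                                    # sequence number within the train
    send_time: float = field(default_factory=time.time)

def build_probe_train(train_id: int, n_frames: int = 33) -> list[ProbeFrame]:
    """Tag each probe frame with a train identifier, sequence number, and send time."""
    return [ProbeFrame(train_id, seq) for seq in range(n_frames)]

def draw_probe_rate(r_current: float, avt: float = 0.1,
                    min_rate: float = 50e6) -> float:
    """Probe rate mu is uniformly distributed between a fixed minimum
    and R x AvT, so probing never exceeds the stability margin."""
    return random.uniform(min_rate, r_current * avt)
```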

3.1.2. ECCP probe receiver


The probe receiver captures probe frames, retrieves the probe information, and adds a receiving time to each probe frame. Then, the ECCP probe receiver forwards each probe train's information to the ECCP bandwidth estimator for additional processing.


3.1.3. ECCP bandwidth estimator


The bandwidth estimator estimates AvBw using a modified version of BART, which is based on a self-induced congestion model. In this model, when probe traffic of rate µ and fixed inter-frame interval ∆in is inserted along a network path, the received inter-frame interval ∆out is affected by the network state: if µ is greater than AvBw, network queues start accumulating data, which increases ∆out; otherwise, ∆out will be, on average, equal to ∆in (Fig. 4). This model does not require clock synchronization between hosts. Rather, it uses the relative queuing delay between probe frames.

Figure 4: The effect of injecting probe traffic into the network
BART derives a new metric to quantify the change of the inter-frame time, called the strain, ε = (∆out − ∆in)/∆in. For a probe rate µ that is less than AvBw, the strain will be, on average, equal to zero (ε ≈ 0). Otherwise, the strain increases proportionally to the probe rate µ (Fig. 4). This linear relation between the strain ε and the probe rate µ is represented by (2).

ε = 0            if µ ≤ AvBw
ε = α µ + β      if µ > AvBw.    (2)

Based on this linear relationship between the strain ε and the probe rate µ, ECCP estimates AvBw as the maximum probe rate that keeps the strain ε equal to zero (α × AvBw + β = 0), as in (3) (for more details see [32]).

AvBw = −β/α.    (3)

For that purpose, the bandwidth estimator calculates the strain ε_i for each probe pair {i = 1, . . . , N − 1}. Then, the calculated average ε and its variance R are forwarded to a Kalman Filter (KF). In addition, estimates of the system noise covariance Q and the measurement error P are provided. The Kalman filter works on continuous linear systems, while this model has a discontinuity separating two linear segments, as shown in Fig. 4. Thus, BART ignores the probe rates µ that are not on the sloped line, i.e., those on the horizontal segment where µ is less than the last estimated AvBw (µ < AvBw). Unlike BART, ECCP does not ignore probe train information that is not on the sloped line. Instead, it uses that probe rate µ to provide an estimate of AvBw using (4) (for more details see [32]).

AvBw = max(µ_j)         if ε < ε_t
AvBw = KF(ε_t, Q, P)    if ε ≥ ε_t    (4)

where j is the probe train number and ε_t is the strain threshold that identifies the starting point of the sloped line. After that, the Kalman filter calculates the α and β parameters, and AvBw is calculated using (4). Afterward, the bandwidth estimator sends the estimated AvBw back to the source in a CNM message.
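The per-pair strain computation and the threshold rule of (4) can be sketched as follows. This is our own illustrative Python code; the Kalman-filter step is abstracted behind a placeholder argument, since its internals follow BART [32] rather than anything shown here.

```python
def strains(send_times: list[float], recv_times: list[float]) -> list[float]:
    """Per-pair strain eps_i = (delta_out - delta_in) / delta_in, as in Eq. (2)."""
    eps = []
    for i in range(len(send_times) - 1):
        d_in = send_times[i + 1] - send_times[i]
        d_out = recv_times[i + 1] - recv_times[i]
        eps.append((d_out - d_in) / d_in)
    return eps

def estimate_avbw(mu: float, mean_eps: float, eps_threshold: float,
                  mu_history: list[float], kalman_estimate: float) -> float:
    """Rule (4): below the strain threshold the probe rates themselves
    lower-bound AvBw; above it, fall back to the Kalman-filter estimate
    (-beta/alpha), which is computed elsewhere and passed in here."""
    if mean_eps < eps_threshold:
        return max(mu_history + [mu])
    return kalman_estimate
```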
3.1.4. ECCP rate controller

The ECCP rate controller is a key component of the ECCP mechanism. It controls the transmission rate R according to an Additive Increase Multiplicative Decrease (AIMD) model after receiving the AvBw value. Based on the received estimated AvBw, the ECCP rate controller calculates the available bandwidth ratio Ar according to (5). Ar represents the ratio of the available bandwidth to the bandwidth stability margin (R × AvT) (Fig. 5).

Ar = AvBw / (R × AvT).    (5)

Figure 5: Relationship between AvBw and Ar

ECCP works on keeping Ar at an equilibrium level A_eq. Therefore, it calculates a feedback parameter Fb to represent the severity of the congestion using (6).

Fb = (Ar − A_eq) + w × (Ar − A_old)    (6)

where A_old is the previous value of Ar, and w is equal to 2 (similar to QCN); it represents a weight for (Ar − A_old), i.e., w makes the calculated Fb more sensitive to flows that change their rate aggressively than to flows with stable high rates. Consequently, ECCP uses the calculated Fb to control the hosts' transmission rates.

Furthermore, the ECCP rate controller monitors two variables: (1) the transmission rate R and (2) the target rate TR. TR is the transmission rate before congestion and represents an objective rate for the rate increase process. The ECCP rate controller uses a rate decrease process if the calculated Fb value is negative; otherwise, it uses a self-increase process, as depicted by (7) (Fig. 6).

TR ← R,  R ← R (1 + G_d × Fb)    if Fb < 0 (rate decrease process)
R ← (1/2) (R + TR)               otherwise (self-increase process)    (7)

where G_d is a fixed value chosen to make the maximum rate reduction equal to 1/2.

Figure 6: ECCP rate control stages
M
Fig. 6 shows the ECCP rate control process in detail. When ECCP calculates a negative Fb, it executes the rate decrease process. In addition, Fig. 6 depicts that ECCP divides the self-increase process into three stages: (i) Fast Recovery (FR), (ii) Active Increase (AI), and (iii) Hyper-Active Increase (HAI). ECCP determines the increase stage based on a byte counter BC and a timer T. The Fast Recovery stage consists of five cycles, where each cycle is defined by sending BC bytes of data or by the expiration of the timer T; the timer defines the end of cycles in the case of low-rate flows. At each cycle, R is updated using (7) while keeping TR unchanged. If the byte counter or the timer completes five cycles in the FR stage while no negative Fb is calculated, the rate controller enters the Active Increase (AI) stage. In this stage, TR is increased by a predefined value R_AI, and the byte counter and timer limits are set to BC/2 and T/2 respectively. Afterward, the rate controller enters the Hyper-Active Increase (HAI) stage if both the byte counter and the timer finish five cycles. In the HAI stage, TR is increased by a predefined value R_HAI, as in (8).

TR ← TR + R_AI     (AI)
TR ← TR + R_HAI    (HAI)    (8)

where R_AI is the rate increase step in the AI stage and R_HAI is the rate increase step in the HAI stage. Algorithms 1 and 2 depict the ECCP rate decrease and self-increase processes respectively; a condensed Python sketch follows Algorithm 2.

Algorithm 1: ECCP rate decrease process
Input: Available Bandwidth AvBw
Output: New sending rate R
1  Ar ← AvBw / (R × AvT);
2  Fb ← (Ar − A_eq) + w × (Ar − A_old);
3  if (Fb < 0) then
4      TR ← R;
5      R ← R (1 + G_d × Fb);          /* Rate decrease */
6      SelfIncreaseStarted ← TRUE;    /* Initialize self-increase */
7      ByteCycleCnt ← 0;
8      TimeCycleCnt ← 0;
9      ByteCnt ← BC;
10     Timer ← T;
11 end
Algorithm 2: ECCP self-increase process


Output: New sending rate R

PT
1 foreach sentFrame do
2 if Sel f IncreaseStarted == T RU E then
3 ByteCnt ← ByteCnt − Byte( f rameSize) ;

RI
4 if (ByteCnt ≤ 0) then
5 ByteC ycleCnt + + ;

SC
6 if (ByteC ycleCnt < 5) then
7 ByteCnt ← BC ;

else

U
8

9 ByteCnt ← BC/2 ;

10 Adjust Rate() ; AN
11 end
M
12 foreach timeout do
13 if Sel f IncreaseStarted == T RU E then
14 TimeC ycleCnt + + ;
D

15 if (ByteC ycleCnt > 5) or (TimeC ycleCnt > 5) then


TE

16 RestartTimer (T/2) ; /* AI or HAI stage */

17 else
18 RestartTimer (T ) ; /* FR stage */
EP

19 Adjust Rate() ;

20 end
C

21
AC

22 AdjustRate():

23 if (ByteC ycleCnt > 5) and (TimeC ycleCnt > 5) then


24 T R ← T R + 5 Mbps ; /* HAI stage */

25 else if (ByteC ycleCnt > 5) or (TimeC ycleCnt > 5) then


26 T R ← T R + 50 Mbps ; /* AI stage */

27 R ← 1/2 × (R + T R);

28 if (R > linkCapacit y) then


15
29 R ← linkCapacit y; Sel f IncreaseStarted ← F ALSE;
ACCEPTED MANUSCRIPT

4. Phase Plane Analysis

In this paper, we use the phase plane method to visually represent certain characteristics of the differential equations of ECCP. The phase plane is used to analyze the behavior of nonlinear systems. The solutions of differential equations are a set of functions which can be plotted graphically in the phase plane as a two-dimensional vector field. Given an autonomous system represented by a differential equation x''(t) = f(x(t), x'(t)), one can plot the phase trajectory of such a system by following the direction in which time increases. Fig. 7a depicts a system x(t) in the time domain, and a phase trajectory of this system is displayed in Fig. 7b. Note that x(t) and x'(t) in the time domain can be inferred from the phase trajectory plot; thus, the phase trajectory provides enough information about the behavior of the system. Moreover, sketching phase trajectories is easier than finding an analytical solution of the differential equations, which is sometimes impossible.
Figure 7: A phase trajectory example: (a) the trajectory of x(t) in the time domain; (b) the phase trajectory of x'(t) versus x(t).

Congestion control schemes in computer networks require different behaviors for the rate increase and rate decrease subsystems. In addition, the congestion state controls the transition between these subsystems. The phase plane method can link isolated subsystems and graphically present the switching process. Thus, the phase plane method is well suited for analyzing segmented systems like congestion control protocols [33].

In addition, system parameter limitations can be taken into consideration explicitly. Therefore, we should consider only the phase trajectories that satisfy our system limitations (i.e., link capacity and buffer size), even if the system is stable according to the derived stability conditions. A short sketch of how such trajectories can be traced numerically follows.
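For readers who want to reproduce such plots, here is a short sketch (our own code, not from the paper) that traces a phase trajectory of an autonomous system x'' = f(x, x') by simple Euler integration; the damped-oscillator example is purely illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def phase_trajectory(f, x0, v0, dt=1e-3, steps=20000):
    """Integrate x''(t) = f(x, x') and return the (x, x') trajectory."""
    xs, vs = [x0], [v0]
    for _ in range(steps):
        x, v = xs[-1], vs[-1]
        xs.append(x + v * dt)
        vs.append(v + f(x, v) * dt)
    return np.array(xs), np.array(vs)

# Example: a damped oscillator spirals toward its stable equilibrium.
x, v = phase_trajectory(lambda x, v: -x - 0.3 * v, x0=1.0, v0=0.0)
plt.plot(x, v)
plt.xlabel("x(t)")
plt.ylabel("x'(t)")
plt.show()
```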

SC
5. ECCP Modeling

The core element of ECCP is the rate control algorithm. By responding correctly to the calculated feedback, the network load should remain around the target point. For simplicity, we make the following assumptions:

• All sources are homogeneous, i.e., they have the same characteristics, such as the round-trip time.

• Data flows in data center networks have high rates and behave like a continuous fluid flow.

• The available bandwidth estimation error is negligible (the measured AvBw is used in the simulation to avoid estimation errors). We leave the study of the effect of the available bandwidth estimation error on ECCP stability for future work.

Given the aforementioned assumptions, ECCP can be modeled by calculating the available bandwidth using (9).

AvBw(t) = C − M × R(t)    (9)

where AvBw(t) is the available bandwidth at time t, C is the maximum link capacity, M is the number of flows that share the same bottleneck link, and R(t) is the host's transmission rate at time t.

By substituting (9) into (5) we get:

Ar(t) = C / (AvT × R(t)) − M / AvT    (10)

where Ar(t) is the available bandwidth ratio at time t.

In addition, the feedback calculation in (6) becomes:

Fb(t) = Ar(t) − A_eq + w × T × Ar'(t − T)    (11)

where T is the time interval between trains, which defines the control cycle time, and (Ar − A_old) becomes the derivative of the availability ratio, Ar', multiplied by the control cycle time T.

Given the ECCP rate update equation (7), the derivative of the transmission rate, R'(t), can be represented by the delay differential equation (12).

R'(t) = (G_d / T) R(t) Fb(t − τ)     if Fb(t − τ) < 0
R'(t) = (TR − R(t)) / (2 × T_BC)     if Fb(t − τ) ≥ 0    (12)

where τ is the propagation delay and T_BC is the BC counter time.
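To make the switched model concrete, the following sketch (our own code) integrates (10)-(12) with a simple Euler scheme, reading the delayed terms from history buffers. The parameter defaults are placeholders for illustration, and TR is held at its pre-congestion value rather than following the full AI/HAI schedule.

```python
import numpy as np

def simulate_eccp(C=10e9, M=4, avt=0.1, a_eq=0.5, w=2.0, gd=100 / 128,
                  T=1e-3, t_bc=5e-3, tau=3e-4, dt=1e-5, t_end=0.5):
    """Euler integration of the switched DDE (12), with Ar from (10)
    and Fb from (11). Returns time samples and per-source rate R(t)."""
    n = int(t_end / dt)
    d_tau = int(tau / dt)          # delay of the feedback signal
    d_T = int(T / dt)              # one control cycle, for Ar'(t - T)
    r = np.full(n, C / M)          # transmission rate history (fair share)
    ar = np.zeros(n)               # availability ratio history
    tr = C / M                     # target rate TR (simplified: constant)
    for k in range(d_tau + d_T, n - 1):
        ar[k] = C / (avt * r[k]) - M / avt                  # Eq. (10)
        dar = (ar[k - d_tau] - ar[k - d_tau - d_T]) / T     # Ar'(t - T - tau)
        fb = ar[k - d_tau] - a_eq + w * T * dar             # Eq. (11)
        if fb < 0:                                          # Eq. (12)
            tr = r[k]
            dr = (gd / T) * r[k] * fb                       # rate decrease
        else:
            dr = (tr - r[k]) / (2 * t_bc)                   # self-increase
        r[k + 1] = min(max(r[k] + dr * dt, 1e6), C)
    return np.arange(n) * dt, r
```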

6. Stability Analysis of ECCP

In this section, the phase plane is used to study the stability of ECCP. The phase plane analysis of ECCP is carried out for the self-increase and rate decrease processes separately. Next, simulation experiments are presented to verify our mathematical analysis.

6.1. Stability Analysis of ECCP Rate Decrease Subsystem

In this section, we analyze the ECCP rate decrease subsystem represented by (12), (11), and (10). For the sake of simplicity and without loss of generality, we make this linear variable substitution:

y(t) = Ar(t) − A_eq
y'(t) = Ar'(t).    (13)

Thus, from (10) we get:

y(t) = C / (AvT × R(t)) − M / AvT − A_eq

Let ζ = M/AvT + A_eq; then:

y(t) = C / (AvT × R(t)) − ζ
R(t) = C / (AvT × (y(t) + ζ))
R'(t) = − C / (AvT × (y(t) + ζ)²) × y'(t).    (14)

The feedback equation can be represented by substituting (13) into (11):

Fb(t) = y(t) + w × T × y'(t − T).    (15)

Substituting (14) and (15) into the rate decrease part of (12), we get the rate decrease subsystem equation (16):

− C / (AvT (y(t) + ζ)²) × y'(t) = (G_d / T) Fb(t − τ) × C / (AvT (y(t) + ζ))
−y'(t) / (y(t) + ζ) = (G_d / T) Fb(t − τ)
−y'(t) = (G_d / T) Fb(t − τ) (y(t) + ζ).    (16)

Thus, the ECCP rate decrease subsystem can be represented by substituting (15) into (16):

−y'(t) = (G_d / T) [ y(t − τ) + w × T × y'(t − T − τ) ] (y(t) + ζ).    (17)

Based on (17), we can state the following lemma.

Lemma 1. The ECCP rate decrease subsystem is stable when (18) is satisfied.

τ/T < min( w − 1/(G_d ζ) + √(2w² − 2/(G_d ζ)² + 4w),  w + 1/(G_d ζ),  w + √(w² + 2w) ).    (18)

For the proof, see Appendix A.
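Inequality (18) is easy to evaluate for a given configuration. The helper below (our own code) is a direct transcription of the bound, with ζ = M/AvT + A_eq as defined in Section 6.1; the default arguments are merely illustrative.

```python
from math import sqrt

def tau_over_t_bound(w=2.0, gd=100 / 128, m=4, avt=0.1, a_eq=0.5):
    """Upper bound on tau/T from inequality (18); zeta = M/AvT + A_eq."""
    gz = gd * (m / avt + a_eq)
    return min(w - 1 / gz + sqrt(2 * w**2 - 2 / gz**2 + 4 * w),
               w + 1 / gz,
               w + sqrt(w**2 + 2 * w))

# A delay tau keeps the rate decrease subsystem stable only if
# tau < tau_over_t_bound(...) * T for the chosen control cycle T.
```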


6.2. Stability Analysis of ECCP Rate Increase Subsystem

The self-increase subsystem behavior can be summarized as follows. The stability of the ECCP system mainly depends on the sliding mode motion [7] from the self-increase subsystem into the rate decrease subsystem when the system crosses the asymptotic line (Fb = 0). Thus, the ECCP system is asymptotically stable when inequality (18) is satisfied. For the proof, see Appendix B.


6.3. Verification of ECCP's stability conditions using simulation

In this section, we use discrete-event simulation to verify the mathematical analysis of ECCP. Using the OMNEST network simulation framework [8], we simulate a dumbbell topology of four data sources and four receivers connected to two 10-Gbps switches, as shown in Fig. 8. All links in this topology have a maximum capacity of 10 Gbps. We consider the worst case, which happens when all sources send at their maximum link capacity. Thus, we have four data sources that send data at the maximum line capacity (10 Gbps) toward four receivers through one bottleneck link (Fig. 8). Table 1 lists the simulation parameters.

Figure 8: Simulation topology

Based on the ECCP parameters shown in Table 1 and inequality (18), ECCP is stable for all τ < 1.482 T. Fig. 9 shows a box plot of the cross traffic. It depicts that the ECCP system reduces the cross traffic rate to a value lower than its minimum limit ((1 − AvT) × C = 9 Gbps) when τ exceeds the analytically calculated limit (1.482 T). In addition, Fig. 9 clearly shows that when τ = 1.8 ms > 1.482 T, the variation of the cross traffic exceeds the maximum allowed margin (AvT × C = 1 Gbps). One can notice that when τ = 3.3 T the average cross traffic starts to increase again; the reason is data accumulation in the queue, as shown in Fig. 10b.
Figure 9: Box plot of the cross traffic rate

Figure 10: Queue length: (a) τ = 0.3 T, 0.6 T, and 1.2 T; (b) τ = 1.8 T, 2.4 T, and 3.3 T.


Fig. 10 depicts the queue length while varying the propagation delay (τ = 0.3 T, 0.6 T, 1.2 T, 1.8 T, 2.4 T, and 3.3 T). Fig. 10a shows that if the stability conditions are satisfied (τ < 1.482 T), the ECCP system succeeds in maintaining a close-to-zero queue length. Otherwise, data start to accumulate and the queue fluctuates significantly, as shown in Fig. 10b.
Figure 11: CDF of the queue length


Fig. 11 illustrates the cumulative distribution function (CDF) of the queue length. It shows that when the stability conditions are satisfied and τ = 0.3 T, 0.6 T, 1.2 T, the 99th percentiles of the queue length are less than 6.72 KB, 6.78 KB, and 21.9 KB respectively. But when these conditions are violated, the 99th percentile of the queue length reaches up to 294.4 KB.

Fig. 12 depicts the transmission rates while varying the propagation delay. It shows that as long as τ does not exceed the stability limit (1.482 T), the ECCP system achieves fairness between flows.
Figure 12: Transmission rates while varying the propagation delay: (a) τ = 300 µs (0.3 T); (b) τ = 600 µs (0.6 T); (c) τ = 1.2 ms (1.2 T); (d) τ = 1.8 ms (1.8 T); (e) τ = 2.4 ms (2.4 T); (f) τ = 3.3 ms (3.3 T).

Table 1: Simulation parameters

Data sender parameters:
  Frame size: normal distribution (avg = 1000, σ = 150)
  Min frame size: 200 Bytes
  Max frame size: 1500 Bytes
  Propagation delay: τ = 40 µs

ECCP probing parameters:
  Number of probe frames: N = 33
  Size of probe frames: 1500 Bytes
  Time between trains: 1 ms
  Availability threshold: AvT = 0.1 (10%)
  Minimum probe rate: 50 Mbps
  Maximum probe rate: AvT × R
  System noise: Q = diag(0.00001, 0.01)
  Measurement error: P = diag(1.0, 100.0)

ECCP controller parameters:
  Equilibrium available bandwidth ratio: A_eq = 0.5 (50%)
  Rate control timer: T = 5 ms
  Rate control byte counter: BC = 750 KB
  G_d = 100/128
  Rate increase step in AI stage: R_AI = 5 Mbps
  Rate increase step in HAI stage: R_HAI = 50 Mbps



6.4. Boundary limitations

In our stability analysis, we deduced sufficient stability conditions for the core mechanism of ECCP. However, the ECCP system is also constrained by physical boundaries such as the maximum link capacity and the buffer size. For example, when the ECCP system reaches the equilibrium point, hosts keep increasing their data rates until a positive Fb is calculated. Thus, the cross traffic might reach the maximum limit, and data start to be queued in the system. To avoid this, the integral of the self-increase function from t to (t + (T + 2τ)) must be less than the available bandwidth margin (AvT × A_eq × C), where (T + 2τ) is the control cycle time. The boundary limitation of the ECCP queue system is summarized by the following lemma.

Lemma 2. ECCP keeps the queue length close to zero, thereby ensuring minimum network latency and preventing congestion, if inequality (19) is satisfied.

BC > C (T + 2τ) / (2M).    (19)
The proof of Lemma 2 is presented in Appendix C.
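Inequality (19) translates directly into a sizing rule for the byte counter. A one-line helper (our own code), with C expressed in bytes per second so that the bound comes out in bytes:

```python
def min_byte_counter(c_bytes_per_s: float, t: float, tau: float, m: int) -> float:
    """Lower bound on BC from inequality (19): BC > C (T + 2*tau) / (2*M)."""
    return c_bytes_per_s * (t + 2 * tau) / (2 * m)
```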


6.5. Verification of ECCP's boundary limitations using simulation

Based on the ECCP parameters shown in Table 1 and inequality (19), ECCP is capable of keeping the queue length close to zero when BC > 500 KB. ECCP is simulated to verify the analytical model. Figs. 13, 14, 15 and 16 depict the simulation results while varying the byte counter (BC = 150 KB, 450 KB, 600 KB, and 750 KB). Fig. 13 shows that when inequality (19) is not satisfied (BC < 500 KB), the ECCP system becomes unstable and the cross traffic variation exceeds the (AvT × C) limit (1 Gbps). It is clearly shown that reducing BC decreases the average cross traffic rate and increases its variation. One can notice that at BC = 150 KB, the average cross traffic rate starts to increase again, which is a result of data accumulation in the bottleneck link queue, as shown in Fig. 14. Besides, Fig. 14 shows that when the byte counter does not satisfy the analytically calculated limit (BC < 500 KB), the queue starts accumulating data.

Figure 13: Cross traffic statistics while varying the byte counter BC (150 KB, 450 KB, 600 KB, 750 KB)

In contrast, when the byte counter limit is satisfied (BC > 500 KB), ECCP succeeds in maintaining a close-to-zero queue length.
Figure 14: Queue length while varying the byte counter BC

Fig. 15 shows the CDF of the queue length. It depicts that when BC is equal to 750 KB and 600 KB, the 99th percentiles of the queue length are less than 6.9 KB and 6.8 KB respectively. But when inequality (19) is not satisfied (BC < 500 KB), the 99th percentile of the queue length reaches up to 299 KB.

Fig. 16 depicts the effect of varying the byte counter BC on the transmission rates. It shows that when BC ≤ 600 KB, flows with high rates start recovering faster than flows with low rates (Figs. 16a, 16b and 16c), but when BC > 600 KB, hosts recover at a relatively equal speed, which achieves fairness between


Figure 15: CDF of the queue length while varying the byte counter BC

Figure 16: Transmission rates while varying the byte counter: (a) BC = 150 KB; (b) BC = 450 KB; (c) BC = 600 KB; (d) BC = 750 KB.


flows (Fig. 16d). This limit slightly exceeds the value predicted by the analytical analysis (inequality (19)) but stays within an acceptable range.

6.6. Discussion

The time interval between trains, T, must be greater than the sending time of the whole train (N frames of 1500 Bytes each) at a rate equal to AvT × Rmin, where Rmin is the minimum probing rate:

T ≥ (N × 1500 × 8) / (AvT × Rmin).    (20)

Furthermore, T determines the control cycle, which controls the buffer boundary. For example, for a stable system of M flows, ECCP will keep the queue length close to zero. If a new flow arrives with a rate equal to R0, then R0 must satisfy:

R0 × T ≤ B.    (21)

where B is the maximum switch buffer size. In other words, the hardware buffer inside the switch must satisfy B ≥ T × R0, or any new flow has to start with a rate R0 ≤ B/T.
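Both constraints in this discussion are simple to check programmatically; the helpers below (our own code) transcribe (20) and (21) directly.

```python
def min_train_interval(n_frames: int, avt: float, r_min: float) -> float:
    """Inequality (20): T >= N * 1500 * 8 / (AvT * Rmin), 1500-Byte frames."""
    return n_frames * 1500 * 8 / (avt * r_min)

def max_new_flow_rate(buffer_bytes: float, t: float) -> float:
    """Inequality (21): a new flow must start with R0 <= B / T."""
    return buffer_bytes / t
```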

7. Linux-Based Implementation

We have implemented an ECCP testbed using 3 Linux hosts and a 10-Gbps switch. The testbed is connected as shown in Fig. 17 and is configured according to Table 1. In this implementation, we built a Java GUI to periodically collect statistics and plot the actual transmission rate R and the cross traffic rate at the receiver (Fig. 20).
Figure 17: Experiment testbed topology

In the next section, we present several experiments to validate our bandwidth estimation method; in the following section, we present the ECCP testbed implementation.


7.1. Validating the available bandwidth estimation process

ECCP's available bandwidth estimation process is tested using the aforementioned testbed. In this topology, sender 0 sends constant-bit-rate traffic to the receiver, and sender 1 sends probe traffic at a randomly generated rate µ. Fig. 18 shows the measured strains ε versus the probe rate µ at the receiver in three scenarios: (i) AvBw = 6 Gbps, (ii) AvBw = 5 Gbps, (iii) AvBw = 1.5 Gbps. Fig. 18 shows that the point where ε starts increasing is always identical to AvBw in all cases. Thus, we conclude that this method is trustworthy and can be used to estimate AvBw.
Figure 18: ECCP's available bandwidth estimation process: measured strain versus probe rate for (a) AvBw = 6 Gbps, (b) AvBw = 5 Gbps, (c) AvBw = 1.5 Gbps.


7.2. ECCP testbed implementation

In this testbed, we have implemented the ECCP rate controller using Linux Traffic Control [34]. ECCP needs to control the data traffic while the probe traffic must be forwarded without control. To achieve this, we use Hierarchy Token Bucket (HTB) [35] to create two virtual schedulers (qdiscs) with different classes on the Linux machines. HTB allows sending different classes of traffic on different simulated links over one physical link. HTB is used to ensure that the maximum service provided for each class is the minimum of the desired rate DR and the rate R assigned by ECCP. Fig. 19a shows the two classes that we create to represent the data flow and the probe flow. In addition, two virtual schedulers (qdiscs) are created and linked to these classes (Fig. 19b). Thus, ECCP can limit the data rate by setting the rate on class 1:11 to the maximum allowed rate, while keeping the probe class (class 1:22) uncontrolled. Note that these two queues have different priorities: data flows enter the low-priority queue, while probe flows are forwarded through the high-priority queue.
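The exact tc command lines are not given in the paper. The sketch below is our own reconstruction of the described hierarchy — an HTB root, a rate-limited low-priority data class 1:11, an uncontrolled high-priority probe class 1:22, and SFQ leaf qdiscs — driven from Python; device name, handles, and initial rates are illustrative assumptions.

```python
import subprocess

def run(cmd: str) -> None:
    """Invoke a tc command; raises if tc reports an error."""
    subprocess.run(cmd.split(), check=True)

def setup_htb(dev: str = "eth0", link: str = "10gbit") -> None:
    """Build the HTB hierarchy of Fig. 19 (our reconstruction, not verbatim)."""
    run(f"tc qdisc add dev {dev} root handle 1: htb default 11")
    run(f"tc class add dev {dev} parent 1: classid 1:1 htb rate {link}")
    # Data class 1:11: throttled to the ECCP-assigned rate R, low priority.
    run(f"tc class add dev {dev} parent 1:1 classid 1:11 "
        f"htb rate 1gbit ceil {link} prio 1")
    # Probe class 1:22: never throttled, high priority.
    run(f"tc class add dev {dev} parent 1:1 classid 1:22 "
        f"htb rate {link} prio 0")
    run(f"tc qdisc add dev {dev} parent 1:11 handle 10: sfq")
    run(f"tc qdisc add dev {dev} parent 1:22 handle 20: sfq")

def set_data_rate(rate_bps: int, dev: str = "eth0", link: str = "10gbit") -> None:
    """ECCP updates R by changing the data class rate."""
    run(f"tc class change dev {dev} parent 1:1 classid 1:11 "
        f"htb rate {rate_bps}bit ceil {link} prio 1")
```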

Figure 19: HTB-created virtual queues and their classes: (a) data and probe classes created by HTB; (b) virtual queues created using HTB for data and probe packets.


In this experiment, each host sends at a desired rate DR that is throttled by HTB to the sending rate R calculated by ECCP. The DRs are varied four times in this test. In the first period (0 s < t < 4 s), host 0 sends at DR = 4 Gbps while host 1 sends at DR = 1 Gbps (Fig. 20). In this period, there is no congestion and the transmission rates R are not controlled (they equal DR). In the second period (4 s < t < 12.4 s), host 1 increases its DR to 6 Gbps. Thus, ECCP starts limiting DR by setting R to a value that keeps the cross traffic close to 9.5 Gbps. One can notice that in this period, ECCP controls only the greedy flow (host 1) while allowing host 0 to send at its DR. In the third period (12.4 s < t < 14.2 s), host 0 increases its DR to 6 Gbps. Therefore, ECCP starts to control both hosts' rates severely to prevent congestion. Finally, when t > 14.2 s, host 0 decreases its DR to 3 Gbps, which ends the congestion. Thus, ECCP relaxes its control, and each host sends at its desired rate (R = DR).

Figure 20: ECCP lab implementation results


8. Conclusion

In this paper, we proposed ECCP, a distributed congestion control mechanism that is implemented in line cards or end hosts and does not require any switch modification.
We analyzed ECCP using the phase plane method while taking the propagation delay into consideration. Our stability analysis identifies sufficient conditions for ECCP system stability. In addition, this research shows that the stability of the ECCP system is ensured by the sliding mode motion. However, the stability of ECCP depends not only on its parameters but also on the network configuration.

Several simulations were conducted to verify our ECCP stability analysis. The
obtained numerical results reveal that the ECCP system is stable when the delay is bounded. Finally, a Linux-based testbed experiment was conducted to evaluate ECCP's performance.

As a perspective of this work, we are presently (i) studying the effect of the available bandwidth estimation error on ECCP stability, and (ii) evaluating ECCP in larger and more varied network topologies using our simulator.

Acknowledgment

This work is supported by Ericsson Research, the Fonds de Recherche Nature et Technologies (FRQNT), and the Natural Sciences and Engineering Research Council of Canada (NSERC). Sincere gratitude is extended to Brian Alleyne and Andre Beliveau for their help and support in this work.
M
Appendix A. Proof of lemma 1 (stability conditions of the ECCP rate decrease subsystem)

Proof. We start with the ECCP rate decrease subsystem equation (17), which can be presented as follows:

y'(t) + (G_d / T) [ y(t − τ) + w × T × y'(t − T − τ) ] (y(t) + ζ) = 0.    (A.1)
Lyapunov has shown that the stability of a nonlinear differential equation in the neighborhood of an equilibrium point can be determined from its linearized version around that point [36] when the Lipschitz condition is satisfied. For delay differential equations, [37] has proven that a delay differential equation is uniformly asymptotically stable if its linearized version is uniformly asymptotically stable and the Lipschitz condition is satisfied.

Consider the functions g1 and g2 defined as follows: g1(t) = y(t) and g2(t) = −(G_d/T)(y(t − τ) + w T y'(t − T − τ))(y(t) + ζ). Since both g1(t) and g2(t) are polynomials, for any z = (z1, z2) = (y(t), y'(t)) there exists L such that ||gi(t, z1) − gi(t, z2)|| ≤ L ||z1 − z2||, where i = 1, 2. Hence the Lipschitz condition is satisfied.

Hence, the stability of the delay differential equation is defined by the stability of the linearized part near the equilibrium point. Thus, the linear part of the rate decrease subsystem equation becomes:

y'(t) + (G_d ζ / T) [ y(t − τ) + w × T × y'(t − T − τ) ] = 0.    (A.2)
We use a Taylor series to approximate (A.2) by substituting y(t − τ) and y'(t − T − τ) using (A.3) and (A.4) respectively.

y(t − τ) ≈ y(t) − τ y'(t) + (τ²/2) y''(t).    (A.3)

y'(t − T − τ) ≈ y'(t) − (T + τ) y''(t).    (A.4)
Hence (A.2) becomes:

y'(t) + (G_d ζ / T) [ y(t) − τ y'(t) + (τ²/2) y''(t) + w T ( y'(t) − (T + τ) y''(t) ) ] ≈ 0
( τ²/(2T) − w(T + τ) ) y''(t) + ( w + 1/(G_d ζ) − τ/T ) y'(t) + (1/T) y(t) ≈ 0    (A.5)

where G_d ζ ≠ 0. Therefore, one can derive the characteristic equation of (A.5) as:
( τ²/(2T) − w(T + τ) ) λ² + ( w + 1/(G_d ζ) − τ/T ) λ + 1/T = 0.    (A.6)

By calculating the roots λ_{1,2} of the characteristic equation (A.6), we obtain:

λ_{1,2} = ( −b ± √(b² − 4ac) ) / (2a).    (A.7)

where a = τ²/(2T) − w(T + τ), b = w + 1/(G_d ζ) − τ/T, and c = 1/T.
In order to study the stability of the ECCP rate decrease subsystem, the roots of (A.6) must be either (i) complex roots with negative real parts, for a system with a stable spiral point (Fig. A.21a), or (ii) negative real roots, for a system with a stable point (Fig. A.21b).

Appendix A.1. System stability with a spiral point


For a stable system with a spiral point, λ_{1,2} must be complex numbers with negative real parts. Thus, inequalities (A.8) and (A.9) must hold.

b² < 4ac    (A.8)

Figure A.21: Phase trajectories of the rate decrease subsystem: (a) complex roots with negative real parts; (b) negative real roots.
b/a > 0    (A.9)

Substituting a, b and c into (A.8), we get:

( w + 1/(G_d ζ) − τ/T )² < 4 ( τ²/(2T) − w(T + τ) ) (1/T).
Let H = w + 1/(G_d ζ); we get:

( H − τ/T )² < 2 (τ/T)² − 4w − 4w (τ/T)
(τ/T)² + (2H − 4w)(τ/T) − H² − 4w > 0.    (A.10)
The left-hand side of (A.10) is a convex function of τ/T (its second derivative with respect to τ/T is positive), as shown in Fig. A.22. Thus, inequality (A.8) holds when this quadratic is positive, i.e., when τ/T < min(r_{1,2}) or τ/T > max(r_{1,2}). Hence, we calculate its roots r_{1,2} as:

calculate the roots r 1, 2 of the RHS as:


AC

p
−2H + 4w ± (2H − 4w) 2 + 4(H 2 + 4w)
r 1, 2 =
2
p
r 1, 2 = −H + 2w ± (H − 2w) 2 + (H 2 + 4w)
p
r 1, 2 = 2w − H ± H 2 − 4wH + 4w 2 + H 2 + 4w
1 p
r 1, 2 = w − ± 2H 2 − 4wH + 4w 2 + 4w. (A.11)
Gd ζ

34
ACCEPTED MANUSCRIPT

τ/T

PT
r1 r2

RI
Figure A.22: Roots r 1, 2 of a convex function

SC
By substituting with the value of H, we get:
s
1 2
r 1, 2 = w − ± 2w 2 − + 4w. (A.12)
Gd ζ (G d ζ ) 2

U
Thus, inequality (A.8) holds when:


T


τ
< w − 1
Gd ζ
AN

q

q
2w 2 − (G d2ζ)2 + 4w
(A.13)
 τ > w − 1 + 2w 2 − 2 2 + 4w.


 T G d ζ (G d ζ)
M
One can conclude that inequality (A.8) does not hold because τ/T by definition
520 must be limited by a certain value k (τ/T < k, where k ∈ R+ ). Therefore, (A.6)
D

does not have complex roots and ECCP rate decrease subsystem does not have
a stable spiral point.
TE

Appendix A.2. System stability with a stable point


For a stable system with a stable point, λ 1, 2 must be negative real number
EP

525 where inequalities (A.14) and (A.15) hold.

b2 > 4ac. (A.14)


C


−b ± b2 − 4ac
< 0. (A.15)
AC

2a
By treating A.14 similar to (A.8), we get:
s
1 2
w− − 2w 2 − + 4w > τ/T
Gd ζ (G d ζ ) 2
s
1 2
<w− + 2w 2 − + 4w. (A.16)
Gd ζ (G d ζ ) 2

35
ACCEPTED MANUSCRIPT

As τ and T are always greater than zero, we can consider the positive root only.

s
1 2
τ/T < w −

PT
+ 2w 2 − + 4w. (A.17)
Gd ζ (G d ζ ) 2
The second condition (Inequality A.15) can be simplified as follows:
−b

RI
(1 ± 1 − 4ac/b2) < 0.
p
(A.18)
2a
530 This condition holds in one of the following two states: (i)

SC
 −b/a > 0



 (A.19)
 1 ± 1 − 4ac/b2 < 0.

 p

U
Or:
 −b/a < 0



 AN
 1 ± 1 − 4ac/b2 > 0.



p (A.20)

The second part of the first state (A.19) does not hold in its worst case when
M
p
we consider the positive root; i.e 1 + 1 − 4ac/b2 will never be less than zero.
Thus we consider only the second state. The worst case of the second part of
D

the second state equation (A.20) could be derived as follows:


p
1 − 1 − 4ac/b2 > 0
TE

p
− 1 − 4ac/b2 > −1

1 − 4ac/b2 > 1
EP

−4ac/b2 > 0. (A.21)

For all b2 > 0 and c = 1/T > 0, we conclude that c must be greater than 0.
C

Consequently, b must be greater than zero (first part of A.20) when −a is greater
than zero.
AC

Hence, to satisfy the second inequality A.20, these conditions must hold:

−a > 0
τ2
wT + wτ − >0
2T
τ τ
−1/2( ) 2 + w + w > 0, (A.22)
T T

36
ACCEPTED MANUSCRIPT

and

$$ b > 0 $$
$$ w + \frac{1}{G_d\zeta} - \frac{\tau}{T} > 0 $$
$$ \frac{\tau}{T} < w + \frac{1}{G_d\zeta}. \quad \text{(A.23)} $$

Dissimilar to inequality (A.10), the quadratic in (A.22) represents a concave function of $\tau/T$. Thus, inequality (A.22) holds when $\min(r_{1,2}) < \frac{\tau}{T} < \max(r_{1,2})$, where the roots $r_{1,2}$ of (A.22) are:

$$ r_{1,2} = w \pm \sqrt{w^2 + 2w}. \quad \text{(A.24)} $$

Hence, inequality (A.22) holds when:

$$ w - \sqrt{w^2 + 2w} < \frac{\tau}{T} < w + \sqrt{w^2 + 2w}. \quad \text{(A.25)} $$

Because $\tau$ and $T \in \mathbb{R}^+$, we consider only the positive root; thus (A.25) becomes:

$$ \frac{\tau}{T} < w + \sqrt{w^2 + 2w}. \quad \text{(A.26)} $$
To conclude, the ECCP rate decrease subsystem is stable with a stable point (Fig. A.21b) when inequalities (A.17), (A.23) and (A.26) hold.

In summary, the ECCP rate decrease subsystem is asymptotically stable, and the phase trajectories of the rate decrease differential equation (16) are parabolas moving toward a stable node, as shown in Fig. A.21b, when

$$ \frac{\tau}{T} < \min\left(w + \frac{1}{G_d\zeta},\; w + \sqrt{w^2 + 2w},\; w - \frac{1}{G_d\zeta} + \sqrt{2H^2 - 4wH + 4w^2 + 4w}\right). $$
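As a quick numerical illustration of this summary (again with purely illustrative values for w and 1/(G_d ζ), not taken from the paper), the sketch below evaluates the three upper bounds and reports the binding one:

import math

w = 2.0                     # illustrative weight
g = 0.125                   # illustrative value of 1/(G_d * zeta)
H = w + g                   # H as defined in Appendix A.1

b_A23 = w + g                                               # bound from (A.23)
b_A26 = w + math.sqrt(w**2 + 2 * w)                         # bound from (A.26)
b_A17 = w - g + math.sqrt(2*H**2 - 4*w*H + 4*w**2 + 4*w)    # bound from (A.17)

# The rate decrease subsystem settles at a stable node whenever tau/T
# stays below the tightest of the three bounds.
print(f"tau/T < min({b_A23:.3f}, {b_A26:.3f}, {b_A17:.3f}) = {min(b_A23, b_A26, b_A17):.3f}")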
Appendix B. Stability Analysis of ECCP Rate Increase Subsystem

In this appendix, we study the stability of the rate increase subsystem by substituting (14) into the self-increase part of (12):

$$ y'(t) = \frac{M\,T_R - AvT \cdot C\,(y(t) + \zeta)}{AvT \cdot C} \times \frac{1}{2\,T_{BC}} $$
$$ y'(t) = \frac{M\,T_R}{2\,AvT \cdot C \cdot T_{BC}} - \frac{\zeta}{2\,T_{BC}} - \frac{y(t)}{2\,T_{BC}}. \quad \text{(B.1)} $$

Equation (B.1) is an inhomogeneous second order ordinary differential equation (ODE) which has a characteristic equation of the form:

$$ \lambda^2 + \frac{1}{2\,T_{BC}}\,\lambda = K \quad \text{(B.2)} $$

where:

$$ K = \frac{M\,T_R}{2\,AvT \cdot C \cdot T_{BC}} - \frac{\zeta}{2\,T_{BC}} $$
$$ \phantom{K} = \frac{M\,T_R}{2\,AvT \cdot C \cdot T_{BC}} - \frac{\frac{1}{AvT} - A_{eq}}{2\,T_{BC}} $$
$$ \phantom{K} = \frac{1}{2\,AvT \cdot C \cdot T_{BC}}\left[M\,T_R - (1 - A_{eq}\,AvT)\,C\right]. \quad \text{(B.3)} $$
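To visualize the dynamics behind (B.1)-(B.3), the following forward-Euler sketch steps (B.1) and shows y(t) relaxing toward its fixed point with time constant 2 T_BC; every parameter value is hypothetical, chosen only so that the numbers are easy to read, and zeta is derived from the substitution used in (B.3).

# Hypothetical values in the paper's notation, not taken from the paper.
M, AvT, C, A_eq, T_BC = 1.0, 0.9, 10e9, 0.5, 1e-3
T_R = 7e9                         # sample target rate, M*T_R > (1 - A_eq*AvT)*C
zeta = 1.0 / AvT - A_eq           # the substitution used in (B.3)

y, dt = 0.0, 1e-5                 # initial state and Euler step (seconds)
for _ in range(2000):             # simulate 20 ms, i.e. ten time constants
    dy = M*T_R/(2*AvT*C*T_BC) - zeta/(2*T_BC) - y/(2*T_BC)   # right side of (B.1)
    y += dy * dt

# Setting y' = 0 in (B.1) gives the fixed point y* = M*T_R/(AvT*C) - zeta.
print(f"y(20 ms) = {y:.4f}, fixed point = {M*T_R/(AvT*C) - zeta:.4f}")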
The phase trajectories of (B.2) can be drawn using the Isoclinal method [38].

[Figure B.23: Phase trajectories in the self-increase subsystem. (a) $K > 0$; (b) $K < 0$.]

Fig. B.23a and B.23b show the phase trajectories of the self-increase subsystem while $K > 0$ and $K < 0$, respectively. In general, ECCP reaches the self-increase subsystem coming from the rate decrease subsystem when $M \cdot T_R \geq (1 - A_{eq}\,AvT)\,C$. Consequently, $K > 0$ and the system follows the trajectory of Fig. B.23a. If ECCP enters the self-increase subsystem phase trajectory with $K < 0$, it follows the trajectory in Fig. B.23b for five cycles. Then, it increases $T_R$ as in equation (8), which increases $K$. Finally, ECCP follows the phase trajectory shown in Fig. B.23a.
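The role of the sign of K can be checked numerically. The sketch below (same hypothetical values as the previous sketch) evaluates K from (B.3) for a target rate below and above the threshold M · T_R = (1 − A_eq · AvT) · C:

def K(M, T_R, AvT, C, A_eq, T_BC):
    # (B.3): K > 0 exactly when M*T_R >= (1 - A_eq*AvT)*C
    return (M * T_R - (1 - A_eq * AvT) * C) / (2 * AvT * C * T_BC)

M, AvT, C, A_eq, T_BC = 1.0, 0.9, 10e9, 0.5, 1e-3
for T_R in (4e9, 7e9):            # below and above the threshold of 5.5e9
    k = K(M, T_R, AvT, C, A_eq, T_BC)
    trajectory = "Fig. B.23a" if k > 0 else "Fig. B.23b (until T_R is increased)"
    print(f"T_R = {T_R:.1e}: K = {k:.3g} -> {trajectory}")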

[Figure B.24: ECCP phase trajectories. The rate decrease and rate increase regions are separated by the asymptotic line $F_b = 0$; trajectories $l_1, \dots, l_5$ are discussed below.]

Combining Fig. A.21b, B.23a and B.23b for the rate decrease and self-increase subsystems, we get Fig. B.24. In this figure, one can notice that if the system starts in the self-increase subsystem, it follows line $l_1$ ($K > 0$) toward the asymptotic line ($F_b = 0$), or it follows line $l_3$ ($K \leq 0$) for five cycles until ECCP enters the AI stage and $T_R$ is increased; it then follows $l_4$ toward the asymptotic line. Afterward, the system follows either line $l_2$, coming from the FR stage to the rate decrease subsystem, or $l_5$, from the AI stage to the rate decrease subsystem. Both trajectories lead ECCP toward the equilibrium point, as shown in Fig. B.24.

Therefore, the ECCP rate increase subsystem is not stable, and the stability of the ECCP system mainly depends on the sliding mode motion [7] from the self-increase subsystem into the rate decrease subsystem when the system crosses the asymptotic line ($F_b = 0$).
Appendix C. Proof of lemma 2 (boundary limitations for the ECCP)

Proof. To avoid data accumulation in the queue, the integral of the self-increase function from $t$ to $t + (T + 2\tau)$ must be less than the available bandwidth margin, as depicted by (C.1):

$$ \int_{t}^{t + (T + 2\tau)} M\,R'(t)\,dt < AvT\,A_{eq}\,C. \quad \text{(C.1)} $$

Since ECCP is a discrete system and R(t) is constant within a control cycle, (C.1) can be approximated within one control cycle as:

$$ M\,R'(t)\,(T + 2\tau) < AvT\,A_{eq}\,C $$
$$ M\,\frac{(T_R - R)}{2\,T_{BC}}\,(T + 2\tau) < AvT\,A_{eq}\,C. $$

At the equilibrium point, $R = (1 - AvT\,A_{eq})\,C/M$ and $T_R > C/M$. Hence:

$$ M\,\frac{\left(C/M - (1 - AvT\,A_{eq})\,C/M\right)}{2\,T_{BC}}\,(T + 2\tau) < AvT\,A_{eq}\,C $$
$$ \frac{(C - C + AvT\,A_{eq}\,C)}{2\,T_{BC}}\,(T + 2\tau) < AvT\,A_{eq}\,C $$
$$ (T + 2\tau) < 2\,T_{BC} $$
$$ (T + 2\tau) < \frac{2\,BC}{C/M} $$
$$ BC > \frac{C\,(T + 2\tau)}{2M}. \quad \text{(C.2)} $$
In summary, ECCP keeps the queue length close to zero if condition (C.2) is satisfied.
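For intuition, the short sketch below plugs purely illustrative numbers into (C.2); the variable names mirror the paper's symbols, but every value (10 Gb/s link, 100 µs control cycle, 50 µs delay, M = 8) is invented for the example.

C_link = 10e9     # link capacity C (bits/s), illustrative
T      = 100e-6   # control cycle duration T (s), illustrative
tau    = 50e-6    # delay tau (s), illustrative
M      = 8        # M as used in (C.1)-(C.2), illustrative value

BC_min = C_link * (T + 2 * tau) / (2 * M)   # right-hand side of (C.2)
print(f"BC must exceed {BC_min:.3e}")       # 1.250e+05 for these values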
References

[1] A. Bachmutsky, System design for telecommunication gateways, John Wiley & Sons, 2011.

[2] G. Lee, Cloud Networking: Understanding Cloud-based Data Center Networks, Morgan Kaufmann, 2014.

[3] IEEE 802.1, The Data Center Bridging (DCB) Task Group (2013). URL http://www.ieee802.org/1/pages/dcbridges.html

[4] M. Snir, The future of supercomputing, in: Proceedings of the 28th ACM International Conference on Supercomputing, ICS '14, ACM, New York, NY, USA, 2014, pp. 261–262. doi:10.1145/2597652.2616585.

[5] S. Bailey, T. Talpey, The architecture of direct data placement (DDP) and remote direct memory access (RDMA) on internet protocols. URL https://tools.ietf.org/html/rfc4296

[6] P. Kale, A. Tumma, H. Kshirsagar, P. Ramrakhyani, T. Vinode, Fibre channel over ethernet: A beginners perspective, in: 2011 International Conference on Recent Trends in Information Technology (ICRTIT), 2011, pp. 438–443. doi:10.1109/ICRTIT.2011.5972328.

[7] V. Utkin, Variable structure systems with sliding modes, IEEE Transactions on Automatic Control 22 (2) (1977) 212–222. doi:10.1109/TAC.1977.1101446.

[8] A. Varga, R. Hornig, An overview of the OMNeT++ simulation environment, in: Proceedings of the 1st International Conference on Simulation Tools and Techniques for Communications, Networks and Systems & Workshops, ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2008, p. 60.

[9] IEEE standard for local and metropolitan area networks–Media access control (MAC) bridges and virtual bridged local area networks–Amendment 17: Priority-based flow control, IEEE Std 802.1Qbb-2011 (Amendment to IEEE Std 802.1Q-2011 as amended by IEEE Std 802.1Qbe-2011 and IEEE Std 802.1Qbc-2011) (2011) 1–40. doi:10.1109/IEEESTD.2011.6032693.

[10] B. Stephens, A. L. Cox, A. Singla, J. Carter, C. Dixon, W. Felter, Practical DCB for improved data center networks, in: INFOCOM, 2014 Proceedings IEEE, IEEE, 2014, pp. 1824–1832.

[11] IEEE standard for local and metropolitan area networks–Virtual bridged local area networks Amendment 13: Congestion notification, IEEE Std 802.1Qau-2010 (Amendment to IEEE Std 802.1Q-2005) (2010) c1–119. doi:10.1109/IEEESTD.2010.5454063.

[12] M. Alizadeh, B. Atikoglu, A. Kabbani, A. Lakshmikantha, R. Pan, B. Prabhakar, M. Seaman, Data center transport mechanisms: Congestion control theory and IEEE standardization, in: 2008 46th Annual Allerton Conference on Communication, Control, and Computing, 2008, pp. 1270–1277. doi:10.1109/ALLERTON.2008.4797706.

[13] A. Kabbani, M. Alizadeh, M. Yasuda, R. Pan, B. Prabhakar, AF-QCN: Approximate fairness with quantized congestion notification for multi-tenanted data centers, in: 2010 18th IEEE Symposium on High Performance Interconnects, 2010, pp. 58–65. doi:10.1109/HOTI.2010.26.

[14] Y. Zhang, N. Ansari, Fair quantized congestion notification in data center networks, IEEE Transactions on Communications 61 (11) (2013) 4690–4699. doi:10.1109/TCOMM.2013.102313.120809.

[15] Y. Tanisawa, M. Yamamoto, QCN with delay-based congestion detection for limited queue fluctuation in data center networks, in: 2013 IEEE 2nd International Conference on Cloud Networking (CloudNet), 2013, pp. 42–49. doi:10.1109/CloudNet.2013.6710556.

[16] R. Adams, Active queue management: A survey, IEEE Communications Surveys Tutorials 15 (3) (2013) 1425–1476. doi:10.1109/SURV.2012.082212.00018.

[17] M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, M. Sridharan, Data center TCP (DCTCP), SIGCOMM Comput. Commun. Rev. 40 (4) (2010) 63–74. doi:10.1145/1851275.1851192.

[18] Y. Zhu, H. Eran, D. Firestone, C. Guo, M. Lipshteyn, Y. Liron, J. Padhye, S. Raindel, M. H. Yahia, M. Zhang, Congestion control for large-scale RDMA deployments, SIGCOMM Comput. Commun. Rev. 45 (4) (2015) 523–536. doi:10.1145/2829988.2787484.

[19] M. Alizadeh, A. Kabbani, T. Edsall, B. Prabhakar, A. Vahdat, M. Yasuda, Less is more: Trading a little bandwidth for ultra-low latency in the data center, in: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, NSDI'12, USENIX Association, Berkeley, CA, USA, 2012, pp. 19–19. URL http://dl.acm.org/citation.cfm?id=2228298.2228324

[20] R. Mittal, V. T. Lam, N. Dukkipati, E. Blem, H. Wassel, M. Ghobadi, A. Vahdat, Y. Wang, D. Wetherall, D. Zats, TIMELY: RTT-based congestion control for the datacenter, SIGCOMM Comput. Commun. Rev. 45 (4) (2015) 537–550. doi:10.1145/2829988.2787510.

[21] C. So-In, R. Jain, J. Jiang, Enhanced forward explicit congestion notification (E-FECN) scheme for datacenter ethernet networks, in: 2008 International Symposium on Performance Evaluation of Computer and Telecommunication Systems, 2008, pp. 542–546.

[22] L. Jose, L. Yan, M. Alizadeh, G. Varghese, N. McKeown, S. Katti, High speed networks need proactive congestion control, in: Proceedings of the 14th ACM Workshop on Hot Topics in Networks, HotNets-XIV, ACM, New York, NY, USA, 2015, pp. 14:1–14:7. doi:10.1145/2834050.2834096.

[23] J. Perry, A. Ousterhout, H. Balakrishnan, D. Shah, H. Fugal, Fastpass: A centralized "zero-queue" datacenter network, SIGCOMM Comput. Commun. Rev. 44 (4) (2014) 307–318. doi:10.1145/2740070.2626309.

[24] Z. Wang, X. Zeng, X. Liu, M. Xu, Y. Wen, L. Chen, TCP congestion control algorithm for heterogeneous Internet, Journal of Network and Computer Applications 68 (Supplement C) (2016) 56–64. doi:10.1016/j.jnca.2016.03.018. URL http://www.sciencedirect.com/science/article/pii/S1084804516300327

[25] J. A., K. R. S.V., A. U. R., Congestion avoidance algorithm using ARIMA(2,1,1) model-based RTT estimation and RSS in heterogeneous wired-wireless networks, Journal of Network and Computer Applications 93 (Supplement C) (2017) 91–109. doi:10.1016/j.jnca.2017.05.008. URL http://www.sciencedirect.com/science/article/pii/S1084804517302060

[26] L. Tassiulas, A. Ephremides, Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks, IEEE Transactions on Automatic Control 37 (12) (1992) 1936–1948. doi:10.1109/9.182479.

[27] J. Liu, N. B. Shroff, C. H. Xia, H. D. Sherali, Joint congestion control and routing optimization: An efficient second-order distributed approach, IEEE/ACM Transactions on Networking 24 (3) (2016) 1404–1420. doi:10.1109/TNET.2015.2415734.

[28] S. Ekelin, M. Nilsson, E. Hartikainen, A. Johnsson, J. E. Mangs, B. Melander, M. Bjorkman, Real-time measurement of end-to-end available bandwidth using Kalman filtering, in: 2006 IEEE/IFIP Network Operations and Management Symposium NOMS 2006, 2006, pp. 73–84. doi:10.1109/NOMS.2006.1687540.

[29] A. Charny, D. D. Clark, R. Jain, Congestion control with explicit rate indication, in: Communications, 1995. ICC '95 Seattle, 'Gateway to Globalization', 1995 IEEE International Conference on, Vol. 3, 1995, pp. 1954–1963. doi:10.1109/ICC.1995.524537.

[30] G. Raina, D. Towsley, D. Wischik, Part II: Control theory for buffer sizing, SIGCOMM Comput. Commun. Rev. 35 (3) (2005) 79–82. doi:10.1145/1070873.1070885.

[31] F. Kelly, G. Raina, T. Voice, Stability and fairness of explicit congestion control with small buffers, SIGCOMM Comput. Commun. Rev. 38 (3) (2008) 51–62. doi:10.1145/1384609.1384615.

[32] M. M. Bahnasy, A. Beliveau, B. Alleyne, B. Boughzala, C. Padala, K. Idoudi, H. Elbiaze, Using ethernet commodity switches to build a switch fabric in routers, in: 2015 24th International Conference on Computer Communication and Networks (ICCCN), 2015, pp. 1–8. doi:10.1109/ICCCN.2015.7288483.

[33] W. Jiang, F. Ren, C. Lin, Phase plane analysis of quantized congestion notification for data center ethernet, IEEE/ACM Transactions on Networking 23 (1) (2015) 1–14. doi:10.1109/TNET.2013.2292851.

[34] B. Hubert, et al., Linux Advanced Routing & Traffic Control HOWTO.

[35] M. Devera, Hierarchical token bucket theory (2002). URL http://luxik.cdi.cz/~devik/qos/htb/manual/theory.htm

[36] V. Arnold, Supplementary chapters to the theory of ordinary differential equations, Nauka, Moscow.

[37] R. D. Driver, Ordinary and Delay Differential Equations, Vol. 20, Springer Science & Business Media, 2012.

[38] D. P. Atherton, G. M. Siouris, Nonlinear control engineering, IEEE Transactions on Systems, Man, and Cybernetics 7 (7) (1977) 567–568. doi:10.1109/TSMC.1977.4309773.
Author biographies

Mahmoud Bahnasy received the M.Eng. degree from Université du Québec à Montréal, Canada, in 2014. He is currently pursuing the Ph.D. degree in computer science and engineering at École de Technologie Supérieure, Montréal, Canada.

Halima Elbiaze received the Master degree from the University of Versailles, France, in 1998, and the Ph.D. degree from the University of Versailles in March 2002. She has been a professor at Université du Québec à Montréal since June 2003. In 2005, Dr. Elbiaze received the Canada Foundation for Innovation Award to build her IP-over-DWDM network laboratory. She is the author or coauthor of many journal and conference papers. Her research interests include intelligent optical networks, network performance evaluation, traffic engineering, quality of service management in optical and wireless networks, and next generation IP networks. She is a member of IEEE and OSA.

Bochra Boughzala received her engineering national diploma from INSAT, Tunisia, in 2011 and her Master degree in Computer Science from UQAM in 2013. In 2013, she joined the Ericsson Research group in Montreal, where she now works as an experienced researcher in the networking technologies research area. Her research interests include high performance data planes and execution environments, data plane abstractions and domain specific languages, network programmability and software defined networking (SDN), and congestion control and traffic management, with new interest in 5G Ethernet fronthauling and information centric networking (ICN) for mobile backhaul.
Highlights

- Using Ethernet commodity switches as a switch fabric for routers.

- Proposing ECCP, which controls the transmission rate using the estimated available bandwidth.

- Presenting a mathematical model of ECCP using Delay Differential Equations (DDEs).

- Deducing the stability conditions of ECCP using the phase plane method.

- Validating the ECCP performance using simulation and a testbed implementation.
