Energy Efficient Traffic Engineering in Software Defined Networks

Radu Cârpa

Doctoral specialty: Computer Science
Abstract

Energy consumption has become a limiting factor for the deployment of large-scale distributed infrastructures. This work aims to improve the energy efficiency of backbone networks by switching off a subset of links through a Software Defined Network (SDN) approach. We differ from the numerous works in this domain through an increased reactivity to changes in network conditions. This was made possible by a reduced computational complexity and by particular attention to the overhead induced by data exchanges. To validate the proposed solutions, we tested them on a platform built specifically for this purpose.

In the first part of this thesis, we present the "SegmenT Routing based Energy Efficient Traffic Engineering" (STREETE) software architecture. The core of the solution relies on dynamically re-routing traffic according to the network load in order to switch off lightly used links. This solution uses dynamic graph algorithms to reduce the computational complexity and reach computation times on the order of milliseconds on a 50-node network. Our solutions were also validated on a test platform comprising the ONOS SDN controller and OpenFlow switches. We compare our algorithms against optimal solutions obtained through integer linear programming and show that the number of active links can be effectively reduced to lower the electrical consumption while avoiding network overload.

In the second part of this thesis, we seek to improve the performance of STREETE under high loads, which cannot be carried by the network if shortest-path routing algorithms are used. We analyze load balancing methods to obtain a near-optimal placement of flows in the network.

In the last part, we evaluate the combination of the two previously proposed techniques: STREETE with load balancing. We then use our test platform to analyze the impact of frequent re-routing on TCP flows. This allows us to give guidance on improvements to be considered in order to avoid instabilities caused by uncontrolled switching of network flows between alternative paths.

We believe in the importance of providing reproducible results to the scientific community. Thus, a large part of the results presented in this thesis can be easily reproduced using the provided instructions and software.
Contents

1 Introduction

3 Theoretical background
  3.1 Graphs 101: model of a communication network
  3.2 Linear Programming 101
  3.3 The maximum concurrent flow problem
    3.3.1 Constructing the edge-path LP formulation
    3.3.2 The node-arc LP formulation
    3.3.3 Properties of the edge-path LP formulation
    3.3.4 The dual of the edge-path LP and its properties
  3.4 Conclusion

  5.2.2 Step 1: maximum concurrent flow computed by the SDN controller
    Notations
    Algorithm
    Proof of the algorithm
    Discussions on choosing the precision
  5.2.3 Step 2: transmitting the constraints to the SDN switches
  5.2.4 Step 3: computing paths by SDN switches
    Transform the flow into a topologically sorted DAG
    Computing the routes for flows
    Avoiding long routes
  5.2.5 Evaluation
    Maximum link utilization
    Computational time
    Path length increase
  5.3 Conclusion

Publications

Appendices
Chapter 1
Introduction
Motivations
Advances in network and computing technologies have enabled a multitude of services —
e.g. those used for big-data analysis, stream processing, video streaming, and Internet
of Things (IoT) — hosted at multiple data centers often interconnected with high-speed
optical networks. Many of these services follow business models such as cloud computing,
which allow a customer to rent resources from a cloud and pay only for what she consumes.
Although these models are flexible and benefit from economies of scale, the amount of data
transferred over the network increases continuously. Network operators, under continuous pressure to differentiate and deliver quality of service, often tackle this challenge by expanding network capacity: continually adding new equipment or upgrading existing links to higher rates. Existing work argues that, if traffic continues to grow at the current pace, in less than 20 years network operators may reach an energy capacity crunch where the amount of electricity consumed by network infrastructure becomes a bottleneck and further limits Internet growth [1].
Major organizations have attempted to curb the energy consumption of their infras-
tructure by reducing the number of required network resources and maximizing their
utilization. Google, for instance, created custom Software Defined Network (SDN) de-
vices and updated their applications to co-operate with the network devices. In this
way, it was possible to achieve near-100% utilization of intra-domain links [2]. With this
smart traffic orchestration, Google was able to increase the energy efficiency while pro-
viding a high quality of service. However, such traffic management requires co-operation between the communicating applications and the network management, a technique that is out of reach for ordinary network operators.
As the network operators do not have control over the applications that use the net-
work, they over-provision the network capacity to handle the peak load. Furthermore,
to prevent failures due to vendor-specific errors, it is not exceptional for a big network
provider to implement vendor redundancy in its core network. Vendor redundancy con-
sists of using devices from multiple vendors in parallel. The Orange France mobile network blackout in 2012¹ is an example of a vendor-specific error, where a software update introduced a bug both on the primary and on the backup device.

1. http://www.parismatch.com/Actu/Economie/Orange-revelations-sur-la-panne-geante-157766

Figure 1.1: Traffic passing through the 200 Gbps link between Paris and Geneva. Geant network [3]

To avoid these situations, some network operators double the capacity by using backup devices from multiple ven-
dors. All these approaches contribute to highly over-provisioned networks. For example,
evidence shows that the utilization of production 200Gbps links in the Geant network can
be as low as 15% even during the periods of highest load (Fig. 1.1). Moreover, during
off-peak times at night and on weekends, the network utilization is even lower, dropping to almost 5% of the link capacity.
As a result of all this over-provisioning, the energy efficiency of operator networks is
low: devices are fully powered all the time while being used only at a fraction of their
capacity. This over-provisioning leaves a lot of potential for innovation and for reducing
the energy that a network consumes by switching off unused devices and optimizing their
usage.
This work is done as part of the GreenTouch consortium, which intends to reduce the
net energy consumption in communication networks by up to 98% by 2020 [4].
The first idea is to switch off the links that do not need to remain active during off-peak periods. In particular, we intend to turn off transponders and
port cards on the extremities of a link. The status of these components is set to sleep
mode whenever a link is idle, and they are brought back to the operational state when
required. We also term this process of activating and deactivating network links as
switching links on and off respectively. To free certain resources, the network traffic is
dynamically consolidated on a limited number of links, thus enabling the remaining links
to enter the low power consumption modes. Conversely, when the network requires more capacity, the corresponding links are brought back online to avoid congestion. In other words, we transparently change the paths taken by the application data in the network to adapt the network capacity to the load.
The second idea relies on the observation that a well-utilized network does not consume resources in vain. A proper load balancing technique allows transferring more data with the same resources, avoiding premature upgrades to higher data rates and thus avoiding the associated increase in energy consumption. As detailed later, we conclude
that both these methods are necessary and must be used in parallel to achieve a good
energy efficiency of operator networks.
We position our work in the context of Software-Defined Network (SDN)-based back-
bone networks and work within a single operator network domain. Although different
methods have already been proposed for reducing the energy consumption of communi-
cation networks, our contribution is an end-to-end validation of the proposed solutions.
These solutions were both validated through simulations and implemented in a real net-
work testbed comprising Software-Defined Network (SDN) switches and the Open Net-
work Operating System (ONOS) SDN controller. Moreover, we also used the testbed
to evaluate the impact of frequent path changes on data flows and conclude that such
techniques can be used with minimal impact on application flows.
Chapter 2 sets the context of our work. We provide a general overview of the ar-
chitectures of backbone networks and explore the panorama of methods used to reduce
their energy consumption. Most of these methods propose structural modifications of
the network. For example, this includes completely re-designing the physical topology,
the network devices, or the applications using the network. In contrast, we follow a less
intrusive approach that relies on traffic engineering, a mature technology widely deployed by operators for other goals, such as differentiated services.
We continue the chapter with an extended introduction to the existing traffic engi-
neering techniques and, in particular, to the concept of Software-Defined Network (SDN).
In parallel, we present the related work that uses traffic engineering for increasing the
energy efficiency of networks. This introduction allows us to conclude that there is still
a need for an online solution, capable of reacting fast to changes in network traffic and,
in particular, to the unexpected growth of network demand.
The chapter ends with a discussion on the particularities that must be considered
when designing an SDN-based solution for high-speed backbone networks. The presented
challenges underlie the design of our solutions in the rest of this work. We also introduce
the Source Packet Routing In NetworkinG (SPRING)/“Segment Routing” protocol, in-
tended to meet the requirements of traffic engineering in backbone networks, and hence
circumvent the challenges above.
Chapter 3 provides the theoretical background needed to understand the results from
this thesis. We give an introduction to linear programming and the classical network
flow problem known as “maximum concurrent flow”. We illustrate the two usual ways
to model this problem as a linear program, namely the “node-arc” and the “edge-path”
formulations. The first model is used throughout the work to provide a comparison baseline for the quality of our algorithms, while the edge-path formulation is the basis of the efficient approximation algorithms for computing the maximum concurrent flow in a network.
The chapter ends with some mathematical properties of the edge-path formulation
and its dual, which is used in the later chapters of the thesis.
Chapter 5 presents two load balancing methods proposed for the STREETE frame-
work to enable better utilization of active network resources.
The first method provides a consistent improvement over shortest path routing, but its output quality is not very stable and did not satisfy our expectations. We present the reasons behind this behavior and continue with a second, more elegant, load balancing method based on the state of the art for approximately solving the maximum concurrent flow problem. It builds on the work of George Karakostas [5] to produce a complete solution for near-optimal Software-Defined Network (SDN)-based online load balancing in backbone networks. This solution also leverages the SPRING source routing protocol to reduce the complexity and cost of centralized management.
Chapter 6 starts by unifying the results from the two previous chapters and raises the problem that combining STREETE with load balancing techniques may produce many route changes for network flows, which may affect network performance.
In particular, the change in end-to-end delay can be interpreted by the TCP protocol as
a sign to slow down its transmission. Taking into consideration that TCP is, by far, the
most used transport protocol, this detail can have a severe impact on the overall network
performance.
In the second part of the chapter, we use our network testbed to evaluate if frequent
route changes can negatively impact the network performance. We conclude that the
drop in the bandwidth of network flows can be significant if routes change too frequently.
Moreover, network instabilities may happen as a result of bad inter-operation between
the STREETE protocol and TCP if no special care is taken to avoid them.
We quantify the tolerable frequency based on data from our analysis and conclude
the chapter with a discussion on the tuning to be done in the STREETE framework to
achieve a stable, near-optimal, energy efficient traffic engineering solution.
Chapter 7 concludes the thesis with a summary of main findings, discussion of future
research directions, and final remarks.
Fast mode
To make it easy to reproduce the simulations, we provide an archive containing all the
necessary files except CPLEX. Reproducing the results becomes as easy as executing the
following steps:
3. https://www.docker.com
• Install dependencies (docker, wget) using the package manager of your distribution.
• Download and extract the archive: wget https://radu.carpa.me/these/streeteCode.tar && tar -xaf streeteCode.tar
• Build the docker image and run the simulations: cd streeteCode && ./buildAndRun.sh
In particular, on an Ubuntu host, you can just copy/paste the following lines into the
terminal to follow these steps and run all the simulations except the CPLEX optimiza-
tions:
sudo apt-get install docker.io wget
wget https://radu.carpa.me/these/streeteCode.tar
tar -xaf streeteCode.tar
cd streeteCode
sudo ./buildAndRun.sh
echo "Finished!"
The raw simulation outputs will be placed into the streeteCode/results folder. An
image is also generated for each simulated time point, thus providing a visual represen-
tation of the state of the network. An example of such a figure is given in Fig. 1.2.

Figure 1.2: Image generated by the simulations, illustrating the state of the links in the Italian network.

The color and the percentages displayed on the links represent their utilization. Two percentages are
shown (one per direction). The color is a helper for easier visualization and corresponds to the bigger of the two percentages, represented using the "jet" color scheme (blue = low utilization; red = high utilization⁴). A dotted line means that the link is used only in one direction. A slim black line means that neither of the two directions is used.
The figures presented in chapters 4 and 5 will be generated in the sub-folders of the streeteCode/figures folder.
Fallback mode
Alternatively, a much smaller archive, which does not contain the third party installation
files, is included with this PDF as an attachment on the first page of the manuscript.
The same archive is also available via a couple of reliable cloud services:
• https://s3-eu-west-1.amazonaws.com/me.carpa.archive/Thesis/streeteCode.light.tar.xz
• https://drive.google.com/open?id=0B-57pIpvqZLgUThqbTh3ZnNleFk
4. https://en.wikipedia.org/wiki/Heat_map
Chapter 2

Positioning and related works on energy efficiency and traffic engineering in backbone networks
The distance between the optical amplifiers is approximately 80km. By using multiple
amplifiers in series, manufacturers advertise that their devices can transmit a 100Gbps
signal at a 5000km distance without the need for OEO conversion [6]. As a result, OEO
conversion is usually not required between the points of presence except for submarine
communication cables.
The energy consumption of the amplifiers is small. For example, the authors of [7] af-
firm that the most common type of amplifier, the Erbium Doped Fibre Amplifier (EDFA),
consumes approximately 110W to amplify all the wavelengths in the fiber. Nevertheless,
the number of amplifiers may be considerable. Fig. 2.2 illustrates a small part of the
physical topology of the Geant network. The little dots correspond to the various optical
amplifiers along the fibers while the bigger squares correspond to WDM devices presented
in the next section. As a result of their large number, the total power consumption of
the optical amplifiers in a backbone network is not negligible.
It is important to notice that the wavelength λ3 is directly switched from one fiber
to another without any additional treatment. Jumping forward, we can see that this
wavelength is perceived by the upper, TDM, layer as a direct connection between the
nodes B and D; it "bypasses" the upper layers of node A. That is why the literature refers to this kind of connection as an "optical bypass".
The WDM technology enables the transmission of multiple terabits per second over a single optical fiber. Unfortunately, the number of wavelengths in a fiber is limited, and fine-grained traffic separation is difficult to achieve. This task is handled by the higher levels of the core network hierarchy, which multiplex/demultiplex multiple low-speed communications into/from an optical lambda.
We rely on the model of energy consumption presented in [7]. After consulting a large
number of commercial product datasheets, the authors conclude that the consumption
of an Add Drop Multiplexer (ADM) device (the big box in Fig. 2.1) can be modeled as $d \cdot 85\,\mathrm{W} + a \cdot 50\,\mathrm{W} + 150\,\mathrm{W}$, where $d$ is the number of optical fibers entering the node and $a$ is the add/drop degree, i.e. the number of northbound interfaces connected to transponders.
which are represented with small black rectangles in Fig. 2.1. They perform complex electrical processing to generate the optical signal adapted for transmission over a long distance.
The consumption of transponders, given by the same source [7], is summarized in Tab.
2.1.
Note that transponders correspond to cards (Fig. 2.3b) plugged into another network device; they are not independent devices, as can be seen in Fig. 2.3a.
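To make the model concrete, here is a small sketch (illustrative Python, not code from the thesis) that evaluates the formula above; the per-transponder power is a hypothetical placeholder, since the actual figures are those of Tab. 2.1.

# Sketch of the WDM-node power model from [7]: ADM power = d*85 W + a*50 W + 150 W.
# The per-transponder power below is a placeholder, not a value from Tab. 2.1.
def node_power_watts(d, a, transponder_power_w=100.0):
    adm = d * 85.0 + a * 50.0 + 150.0        # Add Drop Multiplexer chassis and ports
    transponders = a * transponder_power_w   # one transponder per add/drop interface
    return adm + transponders

print(node_power_watts(d=3, a=4))            # example: 3 fibers, add/drop degree 4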
In our example, what seem to be direct connections between the nodes A-C and B-D are in fact virtual tunnels provided by the TDM and WDM layers, respectively.
A particularity of the packet layer in operator networks is that it frequently relies
on MultiProtocol Label Switching (MPLS) forwarding rather than IP routing. This
is because the “best effort” service provided by IP is insufficient in today’s convergent
networks. Intelligent traffic management is needed to enable a network to be shared
between delay-critical audio flows and best-effort traffic.
In this layer, a single device can consume as much as 10 kilowatts, 75% of which is
due to the slot and port cards regardless of their utilization [7].
leveraging the advantages of each layer while reducing the management overhead. The
IP/MPLS packets are still processed by separate, specialized routers and are transmitted
to the Optical Transport Network (OTN) devices via short-reach optical or copper links.
The latter TDM-multiplexes multiple low-speed connections from the packet router into
high-speed flows. Integrated transponders are afterward used to generate an optical sig-
nal, and an integrated Reconfigurable Optical Add Drop Multiplexers (ROADM) injects
the lambda into the optical fiber.
For historical reasons, IPoOTN is the preferred choice of large transit operators. The
OTN devices are designed to be backward compatible with legacy TDM protocols (SDH
and SONET) which were deployed in times when most network traffic consisted of au-
dio flows. OTN devices allow gradual transitions to higher speeds without having to
completely re-design the existing network.
Another advantage of OTN solutions is that transit traffic is kept at the TDM layer instead of passing through multiple IP routers, and that the existing protocols are very mature and allow end-to-end management across multi-vendor networks. Nevertheless, there are no technical limitations to implementing these services, i.e. multi-tenancy
and agile management of transit traffic, in the packet layer. Moreover, IP/MPLS packet
forwarding can be seen as a Time-Division Multiplexing (TDM) technique. The services
provided by the packet and the TDM layer are, to some extent, redundant. That is why
in the modern era, with most network traffic being packet-based, IPoWDM architectures
that we present in the next section are becoming increasingly popular.
fibers by a separate Reconfigurable Optical Add Drop Multiplexers (ROADM). This
solution moves the cost of transponder/optics from optical layer equipment to the router.
The IP/MPLS packets are directly mapped into optical fibers without the need for the
intermediate TDM layer. The aggregation of multiple small, low speed, flows into a high-
speed flow is thus done at packet layer. The multi-tenancy and Quality of Service (QoS)
are also implemented using the MPLS and RSVP-TE protocols at the packet level.
This layer allows the most flexibility in traffic management. The disadvantage of this solution is the need to process all transit packets in the IP layer. Packet processing per gigabit is, unfortunately, more costly than TDM processing: the most
power-efficient devices consume 7W per Gbps for IP/MPLS processing, which is slightly
worse than TDM processing shown in Tab. 2.2. On the other hand, an optical bypass
can be used when it is justified to avoid transit processing.
the best combination of IP and optical devices at each point of presence. Such works
usually use linear or nonlinear models because the computational time is not crucial in
the network design phase.
Another approach to energy efficient network design relies on the following idea: the
best way to reduce the energy consumption of the networks is to completely avoid using
the network. A very attractive solution relies on caching the content as close to the end
user as possible. In particular, [1] proposes to use Content Delivery Networks (CDNs)
and fog computing to avoid the growth of the network energy consumption beyond what power grids can sustain. Note that CDNs are already widely used by content providers such as Netflix to both unload the network core and improve the user experience [14]. There are thus no technical limitations to implementing this kind of solution.
An alternative way to reduce the amount of network traffic, applicable when CDNs cannot easily cache the content, is redundancy elimination. Individual devices are placed in the path of network flows, and compression mechanisms are afterward used to reduce the amount of data sent through the system. Unfortunately, when it comes
the research community diverges. [15], for instance, affirms that the increase of energy
consumption due to compression may be higher than the reduction due to transmitting
fewer data into the network. Other works are more optimistic [16] [17]. For example,
the authors of [17] propose a solution which combines compression with energy efficient
traffic engineering which we present in a later section. The authors take special care to
avoid the unfavorable case shown in [15].
Figure 2.6: Energy consumption of a network switch (electrical power in watts versus time in seconds), measured at idle during 300 s and at full load afterwards. Source: [18]
If a fully loaded 10 Gbps interface receives uniformly distributed 1000-byte packets, it receives a packet every 0.8 microseconds. With existing technologies, the time needed for an interface to transition to a low power state or to be awakened from sleep is much longer. Evidence shows that modern commercial transponders need tens of seconds, or even minutes, to become operational after a power cycle² [23], [24]. Fortunately, the research community affirms that it is possible to reduce this time and that transitions to/from a low power state are possible in milliseconds [25] by keeping part of the interface active, as in the 802.3az protocol.
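The mismatch between packet time scales and wake-up times can be made explicit with a tiny sketch (illustrative Python; the link rates and packet size are only examples):

# Inter-arrival time of back-to-back packets on a fully loaded link.
def inter_arrival_us(packet_bytes, link_gbps):
    return packet_bytes * 8 / (link_gbps * 1e9) * 1e6

print(inter_arrival_us(1000, 10))    # 0.8 microseconds on a 10 Gbps link
print(inter_arrival_us(1000, 100))   # 0.08 microseconds on a 100 Gbps link
# Both are many orders of magnitude below the seconds-to-minutes wake-up times
# reported for commercial transponders, so per-packet sleeping is impractical.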
Alternatively, Adaptive Link Rate (ALR) technologies propose to reduce the energy consumption of network devices and achieve better energy proportionality by dynamically adapting the maximum rate of the network links. The port modulation is changed when lower data rates are needed, hence adjusting the rate and reducing the power consumption. For example, at low demands, a Gigabit Ethernet port will be switched to a 100 Mbit mode. [26] analyses the possibility of doing modulation scaling in optical backbone networks.
New technologies may emerge soon to reduce the power consumption of network devices even further. The STAR project [27], for example, promises to shortly present a prototype of an all-optical switch which can switch terabits worth of wavelengths while consuming only 8 watts.
We believe that the solutions which rely on energy efficient network design, CDNs,
redundancy elimination, and energy-efficient device design are indispensable in a gen-
uinely eco-responsible solution. However, these are long-term solutions. On a shorter
time scale, it is possible to increase the energy efficiency of existing networks by using
intelligent provisioning and traffic management, which we present in the next subsections.
2. Juniper PTX Series Router Interface Module Reference
works rely on aggregating traffic flows over a subset of network resources, allowing a subset of network components to be switched off.
In an ideal world with energy proportional network devices, the energy consumption
would scale linearly with the utilization. As a result, there would be no way to improve
the energy efficiency by extending the paths of network flows. However, as we already presented in a previous section (2.2.2), the consumption of modern devices is not linear, and maximizing a device's utilization increases its energy efficiency.
In this thesis, we follow the trend of optimizing the energy consumption in the IP
layer. First of all, the energy consumption of the IP layer was shown to dominate the one
in the optical layer [34]. Secondly, the optical fibers are frequently shared among different
operators, and re-optimizations in the optical layer without turning amplifiers off have a marginal impact on the consumption of the network [32] [33]. It is also important to notice that, while transponders can be quickly shut down, switching amplifier sites on and off is a time-consuming task [35]. Lighting up an intercontinental fiber may take several hours.
We also limit ourselves to turning off only links, i.e. router ports and transponders, without touching the network devices themselves. The main reason is that there is no clear separation between core and aggregation devices in production networks. Any core router is, at the same time, used to aggregate client traffic. Turning the device off would cut the connectivity of all connected clients. Note that some works have proposed to virtualize the network devices and transparently forward optical access/aggregation ports to an active node using the optical infrastructure [36][37][38]. We believe that this solution has significant shortcomings. One of them is the fact that access ports are not adapted to long-distance transmission. We thus prefer to avoid switching off network nodes.
To switch off a subset of network resources, traffic engineering techniques are needed
to wisely optimize the paths for traffic flows while satisfying a set of performance objec-
tives. In the following section, we provide a detailed analysis of what traffic engineering
is and of the associated related work in the field of applied energy efficient traffic engineering via turning links off.
resource.
Some of the most common traffic engineering optimization objectives are: load balancing to avoid congestion; path protection in case of a link or device failure; and differentiated service for distinct classes of network flows.
In this work we are particularly interested in two traffic engineering objectives: i)
minimizing the energy consumption of the network by releasing links and turning them
off ; ii) increasing the efficiency of resources by increasing their utilization while avoiding
congestion.
A significant challenge of traffic engineering solutions is to react to changes in the
network automatically. To detect these changes, traffic engineering has to gather infor-
mation about the state of the network. Then, once a change in network conditions is detected, techniques are required to configure the behavior of network elements to steer traffic flows accordingly. Such functions, embedded into data and control planes,
were traditionally performed in a decentralized manner, but more recently many traffic
engineering schemes have re-considered the centralization of control functions.
In the remaining part of this section, we present the most popular traffic engineering
techniques. We start with the decentralized and reactive solutions relying on the Inte-
rior Gateway Protocol (IGP). Afterward, we introduce the centralized techniques relying
on the IGP weight assignment and those using MPLS with Resource reSerVation Pro-
tocol Traffic Engineering (RSVP-TE) or Path Computation Element (PCE). We finish
with an introduction to SDN and the related works in the field of energy efficient traffic
engineering.
Figure 2.7: (a) Uniform costs; (b) routes of a (uniform); (c) routes of b (uniform); (d) assigned costs; (e) routes of a (assigned); (f) routes of b (assigned).
Fig. 2.7 illustrates this method. The first three figures show the result of a shortest path computation done by nodes a (Fig. 2.7b) and b (Fig. 2.7c) with the uniform IGP metric from Fig. 2.7a. It so happens that both nodes a and b use the link ad to route traffic towards node d, which may be sub-optimal if both a and b send a lot of data towards d. By reducing the metric (cost) of the link cd in Fig. 2.7d-2.7f, we ensure that b will prefer this link for reaching d.
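The effect of this metric change can be reproduced with a short sketch (illustrative Python; the exact edge set of the four-node example is an assumption, as only the links ad and cd are named explicitly in the text):

import networkx as nx

def path_cost(g, path):
    # Sum of the IGP costs of the edges along a given path.
    return sum(g[u][v]["cost"] for u, v in zip(path, path[1:]))

def compare(cd_cost):
    g = nx.Graph()
    for u, v in [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]:
        g.add_edge(u, v, cost=1)
    g.add_edge("c", "d", cost=cd_cost)
    return path_cost(g, ["b", "a", "d"]), path_cost(g, ["b", "c", "d"])

print(compare(1.0))    # (2, 2.0): a tie, so b may end up routing towards d over link ad
print(compare(0.99))   # (2, 1.99): b now strictly prefers the path b-c-d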
The disadvantage of solutions relying on the IGP metric is that they are limited by the use of shortest paths to compute the routes. However, a theoretical property affirms that an optimal routing is possible in the network by using only D+M paths [43], where D is the number of demand pairs and M the number of links. As a result, such solutions can actually come very close to the optimum.
[45] protocol is used between nodes to reserve resources along a path in the network.
MPLS is then used to create a virtual tunnel which bypasses the shortest paths. It is also
possible to establish multiple disjoint tunnels from a source to a destination. Equal-cost
Multi-path (ECMP) is afterward used to split the traffic across multiple tunnels towards
an egress device. Even if, by using resource reservation, the convergence speed increases and the risk of oscillations decreases, this solution still leads to sub-optimality due to the greedy local selection of network paths by each device.
The centralized MPLS-based traffic engineering is the precursor of SDN. In this case,
a Path Computation Element (PCE) is used. A PCE is a special device that monitors
the networks and programs the paths into the network devices. We do not spend time on
this case because it is fundamentally equivalent to SDN, which is explained in the next
section.
• At the bottom, we represent the network devices, which are radically simplified. Instead of running complex distributed protocols, the network devices are only capable of forwarding packets that arrive through an incoming port towards the correct outgoing port. Henceforth, data plane devices will be referred to as SDN switches.
• In the middle is the SDN controller, which carries out many of the traffic engineering tasks: gathering traffic information and performing management and control. SDN controllers enable a lot of flexibility when it comes to traffic engineering. They maintain and exploit network-wide knowledge to execute all required computations. In Fig. 2.8, the controller is illustrated running a topology service, which discovers the network topology; a statistics service, which keeps track of the network status; and a flow rule service, which applies routes onto the network devices.
Figure 2.8: The application layer (custom applications), the controller layer with its topology and statistics services, and the infrastructure layer, connected via the northbound and southbound APIs.
• At the top, we show the custom business applications, written in the network administrator's language of choice. An example application is a cloud infrastructure utilizing the network to provision cloud services for customers.
The communication between the layers is done via abstraction APIs. This comes in contrast to traditional networks, where the control and data planes are tightly coupled. An effort has been made towards standardizing the interface between the controller and the data plane, generally termed the southbound API, and the manner in which the controller exposes network programmability features to applications, commonly called the northbound API. The best-known southbound API is OpenFlow. The northbound API is frequently exposed as an HTTP-based RESTful API.
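As an illustration of a northbound API, the sketch below queries the device list of an SDN controller over REST; the URL, port, and credentials are placeholders that depend on the deployment (they follow common ONOS defaults but are given only as an assumption).

import requests

# Hypothetical northbound REST call: list the devices known to the controller.
resp = requests.get("http://controller:8181/onos/v1/devices",
                    auth=("onos", "rocks"), timeout=5)
for device in resp.json().get("devices", []):
    print(device.get("id"), device.get("available"))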
In a short period, Software Defined Networks have emerged from a concept into a product which is expected to revolutionize network management. Some SDN pioneers, including Microsoft [47] and Google [2], have deployed software-controlled networks in production environments, increasing the efficiency of their systems.
In the next section, we present the challenges which appear when implementing SDN-
based traffic engineering in backbone networks.
Distributed traffic engineering
One of the few works which rely only on the IGP for energy-efficient traffic engineering
is [48]. In this work, every node monitors the utilization of adjacent links and decides
whether to switch off a link by using a utility function that depends on the energy
consumption and on a penalty function which we describe later. If a link is chosen to be switched off, the node floods the network with its intention to turn it off. All nodes update their IP routing tables via the IGP. As a result, the data flows will avoid the links marked for sleep. Machine learning mechanisms are used to avoid choosing links whose switch-off caused congestion in the past. A per-link penalty is used for this purpose: each time
a wrong decision caused a degradation of the network performance, the penalty increases,
ensuring that the link will not be turned off in the near future.
Unfortunately, the authors drastically underestimate the message overhead of their solution. In a network with 30 nodes, they estimate that OSPF-TE would flood the system once every 10 seconds. However, a detailed analysis of OSPF-TE [42] shows that this happens six times per second, i.e. 60 times more often. If we recalculate the results from [48] with this rate, the announced overhead of 0.52% becomes roughly a 30% increase in the number of OSPF-TE floodings. This is not negligible, because OSPF-TE flooding is costly both in the number of messages and in the required processing power.
Another work which relies on the IGP to do energy efficient traffic engineering is [49]. The authors propose to use OSPF-TE to announce the link loads in the network. Afterward, if no link has a load greater than a certain threshold, the least loaded link is locally turned off by the corresponding node. If the threshold is violated, the last turned-off link is awakened. This solution has the weakness of potentially turning on links even if this action does not help to solve the congestion.
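The behaviour described for [49] can be summarized by the following schematic sketch (an illustration of the described heuristic, not the authors' implementation; the threshold value is an assumption):

def step(link_loads, sleeping_stack, threshold=0.8):
    """link_loads: utilization in [0, 1] of the links currently on;
    sleeping_stack: links turned off so far, most recent last."""
    if link_loads and max(link_loads.values()) > threshold:
        # Threshold violated: wake up the last turned-off link, even if it does
        # not help the congested spot (the weakness mentioned above).
        if sleeping_stack:
            link_loads[sleeping_stack.pop()] = 0.0
    elif link_loads:
        # No link above the threshold: locally switch off the least loaded link.
        least = min(link_loads, key=link_loads.get)
        sleeping_stack.append(least)
        del link_loads[least]
    return link_loads, sleeping_stack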
[50] presents an original solution where a control node is elected in the network. This
node then guides the construction of the solution to obtain better overall energy savings
while still mostly relying on the distributed operation of the IGP. A subset of this
algorithm was implemented using the Quagga software by the authors of [51]. Sadly, the
paper provides little insight into the lessons learned from this implementation.
All previously mentioned papers suffer from the biggest problem of solutions relying
entirely on the IGP: slow convergence. That is why centralized solutions, relying on
metric assignment or precise routes via MPLS or SDN, are more frequently proposed in
the literature.
links, allowing them to be turned off effectively. Nevertheless, this approach has a major drawback: temporary network loops can be created during the re-configuration of the IGP as a result of weight changes.
Alternatively, using explicit paths enables more flexibility in traffic management, such as easier unequal-cost multipath forwarding and the avoidance of network loops. Pioneering works in this field proposed to use MPLS for data forwarding [53]. Recently, the trend has shifted towards using SDN for solutions relying on explicit paths.
The panorama of research in these two fields is vast. For example, a recent survey [54] cites more than 50 works which propose to reduce the energy consumption of wired networks via centralized traffic engineering. There is a trade-off to be found between the
energy efficiency, network performance, and computational complexity. For instance, the
optimal solution to the problem of energy efficient traffic engineering can be found us-
ing linear and non-linear models. Moreover, these models can easily be augmented to
incorporate additional constraints like, for example, path protection [55]; a limited number of network reconfigurations per day [56] [57]; redundancy elimination [17]; maximum propagation delay [58]; or Quality of Service (QoS) [59].
A constraint which is frequently encountered in SDN-based work is the limited size of flow tables [60] [61]. These works target OpenFlow networks and, as a result, flow
coalescing can overflow the flow table which is implemented in a size-limited ternary
content addressable memory (TCAM). We believe that this costly constraint can be
avoided by using the SPRING protocol instead of OpenFlow [62]. This point of view also
seems to be shared by the authors of [63], who are among the first to propose a solution
for energy efficient traffic engineering built with the SPRING protocol in mind.
None of the previously mentioned linear and nonlinear models can be solved to optimality on anything but small networks. That is why the authors usually also propose heuristics intended for use in real networks. Unfortunately, few works report the computational complexity of their algorithms. Such works include [57], reporting up to one hour of computation; [53], which artificially limits the execution time to 300 s; [61], which reports 10 seconds on a 15-node network; and [59], which announces between tens of seconds and multiple hours depending on the desired precision.
In our work, we seek to design a solution which can react fast to network condition
changes at the expense of trading off the additional optimization constraints. A similar
work [64] reports 10 seconds of computation on a 400-node network. However, their algorithm heavily depends on the number of requests, and the authors evaluated the solution with fewer than 2000 requests, which is extremely small for such a big network.
the protocol traditionally used in SDN networks, OpenFlow, relies on the centralized
management to the point that the network becomes nonoperational without a controller.
Such a single critical point of failure is unacceptable in backbone networks.
In this section, we discuss the challenges and requirements in designing an SDN-based traffic engineering solution for backbone networks. This will allow us to introduce the
constraints which will define the design of our algorithms.
This reordering may influence the speed of the flow. The Transmission Control Pro-
tocol (TCP), for instance, may assume network congestion and reduce the sending rate. Due to that, networking folklore holds that changing the route of TCP flows will
have a severe negative impact on their throughput. The research community has ex-
tensively estimated the impact of random packet reordering on the performance of TCP
flows. However, rerouting the flows is different from random packet reordering in the
sense that reordered packets arrive in bursts at the receiver and may result in a differ-
ent behavior compared to randomly reordered packets. Moreover, the joint evolution of the congestion window sizes of multiple TCP flows sharing a single router or link exhibits fluctuations due to the network as a whole, which come on top of the standard behavior of TCP [65].
To the best of our knowledge, previous work has not specifically evaluated the impact
of frequent flow rerouting on multiple aggregated transported flows. Some works which
estimated the impact of reordering on TCP include Bennett et al. [66], who were among
the first to highlight the frequency of packet reordering and its impact on the throughput
of TCP flows. Their results showed that packet reordering has a powerful effect on a
network’s performance. As a result, the authors of [67], for example, proposed a traffic
engineering algorithm which tries to optimize the congestion control and routing jointly.
In the meantime, Laor and Gendel [68] have experimentally measured the impact of
reordering on a testbed and also concluded that even a small rate of packet reordering
could significantly affect the performance of a high bandwidth network. A lot of work
was afterwards performed to increase the tolerance of TCP to packet reordering [69] [70]
[71] [72].
Finally, a solution to detect unneeded retransmissions caused by packet reordering and undo the congestion window reduction was proposed and implemented. The problem of packet
reordering was considered solved until the introduction of Multi-Path forwarding, where
packets of a single flow are split among different paths [73] [74]. An RFC was even created
to prevent splitting TCP flows over multiple paths [75].
It is worth noting that different authors arrive at opposite conclusions concerning the
impact of multi-path forwarding on TCP. Karlsson et al. [73], for example, concluded
that multi-path forwarding reduces the throughput of TCP flows and that mitigation
techniques implemented at the transport layer in the Linux kernel are not effective in
reducing the impact of packet reordering. On the other hand, the authors of [74] affirm
that multi-path forwarding in fat tree data-center networks has little impact on the
throughput. In any case, modern multi-path forwarding techniques try to act on a flowlet
level and avoid reordering [76].
We performed an extensive evaluation on a testbed with the goal of answering the question of whether route changes due to traffic engineering have an adverse impact on TCP flows and severely reduce the network performance. The detailed evaluation is presented in Chapter 6. However, jumping forward, we can affirm that it is safe to reroute the flows as long as this action is not performed "too frequently".
Gathering information about the state of the network
We are aware of the following approaches when it comes to gathering information about
the network state:
• Traffic agnostic: the network traffic is not taken into consideration at all.
• Based on link utilization: the solution relies on knowing the utilization of each link.
• Relying on the traffic matrix (origin-destination flows): only the information about
the accumulated bandwidth flowing from an ingress device towards each egress
device is transmitted.
Given these results, we can affirm that, in a reasonably sized backbone network with 100 nodes, it is possible to effortlessly maintain a fresh view of the network traffic matrix with any currently available SDN protocol. Moreover, in real-world networks, traffic entering the network does not usually change unexpectedly. A vast improvement is possible by assuming temporal stability of the flows. Asynchronous notifications can
be used to inform the controller only about changes in flow sizes at the cost of having a
slightly less accurate view of the network traffic [82].
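A back-of-the-envelope sketch (illustrative Python; the per-entry size and polling period are assumptions) shows why polling a full traffic matrix is cheap at this scale:

def polling_rate_mbps(nodes, bytes_per_entry=64, period_s=1.0):
    entries = nodes * (nodes - 1)              # one entry per origin-destination pair
    return entries * bytes_per_entry * 8 / period_s / 1e6

print(polling_rate_mbps(100))   # about 5 Mbit/s for a 100-node network polled every second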
Figure 2.10: (a) Segment Identifiers; (b) force data over a link; (c) force data over a node.
global scope. In the present work, we are interested in two types of SIDs, namely “nodal”
and “adjacency” as shown in Figure 2.10a.
After network discovery, sending a packet to node a through the shortest path requires
encapsulating it into a packet with destination a. Unlike in IP, much more flexible traffic
engineering is possible:
• If node h wants to send a packet to node a while forcing it over link L1, it adds the header [b,L1] (Figure 2.10b).
• If h wants to send a packet to a via f (Figure 2.10c), it uses the header [f,a].
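A minimal sketch of how such headers behave (illustrative Python, not an implementation of the protocol): the ingress pushes an ordered segment list, and a midpoint merely pops the top segment when it is reached and keeps forwarding towards the new top segment.

def build_header(segments):
    # A SPRING policy is just an ordered list of segment identifiers.
    return list(segments)

header_via_L1 = build_header(["b", "L1"])  # force traffic over link L1 (Fig. 2.10b)
header_via_f = build_header(["f", "a"])    # force traffic through node f (Fig. 2.10c)

def process_at(node, header):
    """Pop the top segment once the node it designates is reached; the packet is
    then forwarded along the shortest path towards the new top segment."""
    if header and header[0] == node:
        header.pop(0)
    return header[0] if header else None

print(process_at("b", header_via_L1))  # at node b, the next segment is the adjacency L1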
Being a source routing protocol, SPRING enables fast flow setup and easy reconfiguration of virtual circuits with minimum overhead, since changes must be applied only to the ingress devices. No time and signalling are lost re-configuring the midpoint devices. The policy state is in the packet header and is completely virtualized away from the midpoints along the path. This means that a new flow may be created in the network by contacting only one network device: the ingress router.
This agility and atomicity of flow path updates is essential in solutions that intend to perform link switch-off and improve energy efficiency. This comes in contrast with OpenFlow, where the forwarding tables of all the devices along the path must be reconfigured. As already mentioned earlier, the need to synchronize flow rule updates created a lot of problems for the research community and was even one of the causes of an outage in Google's SDN backbone network.
Following this analysis, we decided to rely on the SPRING protocol for our traffic engineering, as it better meets the requirements, presented in this section, for designing an SDN-based solution.
2.6 Conclusion
In this chapter, we have introduced the context of backbone networks. We gave a quick overview of the architectures which are currently deployed in operator infrastructures.
In particular, we compared the IPoOTN and IPoWDM architectures and motivated our
choice to concentrate our research on the latter: we believe that it is better suited to
route modern traffic, which is predominantly packet-based.
The number of research papers that try to reduce the energy consumption of IPoWDM networks has recently exploded. We sought to present the panorama of works in this field, both at the network design phase and at the network operation phase. Evidence shows that the best way to reduce the energy consumption of such networks is to manage the traffic intelligently in the packet layer, a technique known in the literature as "energy efficient traffic engineering".
We continued by presenting the existing traffic engineering techniques and the state-of-the-art works which apply these techniques to the problem of reducing the energy consumption of computer networks. This allowed us to conclude that there is a need for an online, reactive framework with low computational overhead.
We finished the chapter with a discussion on the challenges which may define the
design of such a framework. We concluded that a good SDN-based traffic engineering
solution must avoid sending explicit paths to the SDN switches. Combining a centralized
optimization with a local update of network paths can drastically reduce the amount of
control traffic passing through the network. As a way to achieve this goal, we proposed
to use the Source Packet Routing In NetworkinG (SPRING) protocol, which allows the paths of flows to be changed by atomically updating a single forwarding entry on the ingress network device.
Chapter 3
Theoretical background
In the previous chapter, we introduced the context of IPoWDM backbone networks and
presented the multiple ways used to reduce their energy consumption. In particular, we
are interested in solving this problem via intelligent traffic engineering in the IP layer.
In this chapter, we provide the necessary theoretical background on which we base
our work. We first introduce the classical way to model an IP layer of a communication
network via graph theory. After that, we provide a brief introduction to linear programming. Armed with these basics, we dig deeper into the theory of the maximum
concurrent flow problem and introduce theoretical results that enable efficient algorithms
for centralized traffic management.
3.1 Graphs 101: model of a communication network

Figure 3.1: Example network with nodes a, b, and c. Every edge e has capacity c(e) = 10 Gbps; the edge lengths are l(ab) = 4 and l(ac) = l(cb) = 1. A demand d_{a,b} flows from a to b.
• A function $c : E \to \mathbb{R}^+$ represents the capacities of the edges, i.e. the link speed.

• The traffic entering the network at the source node $i \in V$ and flowing toward the egress destination node $j \in V$ is defined as the demand $d_{i,j} \ge 0$. We assume that there is no traffic from a node to itself, i.e. $\forall v \in V, d_{v,v} = 0$.

• A path $P = (e_1, e_2, \ldots)$ is a sequence of edges such that ($e_i = ab$ and $e_{i+1} = xy$) $\Rightarrow b = x$. The length of a path $P = (e_1, e_2, \ldots)$ is the sum of the lengths of the edges in the path: $l(P) = \sum_{e \in P} l(e)$ (in Fig. 3.1b, the two dashed lines show two paths between $a$ and $b$).

• A shortest path $SP_{i,j}$ from node $i$ to node $j$ is a path such that there is no other path $P$ between $i$ and $j$ with $l(P) < l(SP_{i,j})$. In Fig. 3.1, the path $P_1 = (ac, cb)$ is the shortest path between $a$ and $b$ under the given length function $l$, because $l(P_1) = l(ac) + l(cb) = 2 < 4 = l(ab)$.

• We introduce the notation $\mathcal{P}_{i,j}$: the set of all the paths between nodes $i$ and $j$ in $G$. Let $\mathcal{P} = \bigcup_{(i,j) \in V^2 : i \ne j} \mathcal{P}_{i,j}$ be the set of all the paths in the network. Note that, by definition, $\mathcal{P}$ also contains the paths with cycles. To simplify the examples in the forthcoming sections, we do not consider these paths, as they do not influence the result and are avoided during the execution of the algorithms based on shortest path computations.
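To make the definitions concrete, the following sketch (illustrative Python, not code from the thesis) builds the example of Fig. 3.1 with the values quoted above and recovers the shortest path from a to b:

import networkx as nx

g = nx.DiGraph()
g.add_edge("a", "b", length=4, capacity=10)   # l(ab) = 4, c(e) = 10 Gbps
g.add_edge("a", "c", length=1, capacity=10)   # l(ac) = 1
g.add_edge("c", "b", length=1, capacity=10)   # l(cb) = 1

print(nx.shortest_path(g, "a", "b", weight="length"))  # ['a', 'c', 'b']: l(P1) = 2 < 4 = l(ab)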
3.2 Linear Programming 101
Maximize $2 x_1 + 1 \cdot x_2$ under the constraints:
\begin{align*}
3 x_1 - 4 x_2 &\le 12 \\
-1 x_1 - 2 x_2 &\le -7 \\
-2 x_1 + 7 x_2 &\le 0 \\
6 x_1 + 7 x_2 &\le 42 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{align*}

[Accompanying figure: the feasible region delimited by the constraints and the objective function, with a valid solution at (5.15, 1.2), an invalid solution at (4.5, 1.6), and the optimal solution at (5.6, 1.2).]
Any point that is not within the feasible region violates at least one of the constraints of the problem. For example, this is the case for the point represented by a triangle: it respects all the inequalities except the third one, since $-2 \cdot 4.5 + 7 \cdot 1.6 = 2.2 \not\le 0$.
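For readers who want to check the numbers, the following sketch (illustrative Python, not part of the thesis code) solves this small program with SciPy; linprog minimizes, so the objective is negated.

from scipy.optimize import linprog

c = [-2, -1]                                  # negated objective: maximize 2*x1 + x2
A = [[3, -4], [-1, -2], [-2, 7], [6, 7]]
b = [12, -7, 0, 42]
res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)], method="highs")
print(res.x, -res.fun)                        # optimum (5.6, 1.2) with objective value 12.4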
The previous linear program can also be represented in the matrix form $\max\{c^T x \mid Ax \le b,\ x \ge 0\}$:

$$\max\left\{ \begin{pmatrix} 2 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \;\middle|\; \begin{pmatrix} 3 & -4 \\ -1 & -2 \\ -2 & 7 \\ 6 & 7 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \le \begin{pmatrix} 12 \\ -7 \\ 0 \\ 42 \end{pmatrix},\quad \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \ge \begin{pmatrix} 0 \\ 0 \end{pmatrix} \right\}$$
An important property of Linear Programming is duality. For any maximization linear program presented in the form $\max\{c^T x \mid Ax \le b,\ x \ge 0\}$, we can formulate a corresponding minimization problem $\min\{b^T y \mid A^T y \ge c,\ y \ge 0\}$, called the dual problem. The initial problem is called the primal.
In matrix form, the dual problem is represented as follows:

$$\min\left\{ \begin{pmatrix} 12 & -7 & 0 & 42 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix} \;\middle|\; \begin{pmatrix} 3 & -1 & -2 & 6 \\ -4 & -2 & 7 & 7 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix} \ge \begin{pmatrix} 2 \\ 1 \end{pmatrix},\quad \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix} \ge \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \right\}$$
This notation makes it easier to observe that each variable of the dual corresponds to
a constraint in the primal.
One of the fundamental duality properties is referred to as “weak duality” and
provides a bound on the optimal value of the objective function of either the primal
or the dual. The value of the objective function for any feasible solution to the primal
maximization problem is bounded from above by the value of the objective function for
any feasible solution to its dual minimization problem. Similarly, the objective function
of the dual problem is always greater than or equal to the value of the objective of the primal.
Furthermore, when the linear program is feasible and has an optimal solution, we have:

Strong Duality Property. If the primal (dual) problem has a finite optimal solution, so does the dual (primal) problem, and the two optimal values are equal. That is, if we denote by $x^*$ the optimal primal solution and by $y^*$ the optimal dual solution, we have:
$$c^T x^* = b^T y^*$$
The importance of this property is that it indicates that we may, in fact, solve the dual
problem in place of (or in conjunction with) the primal problem. A solution which we
show in a future chapter does indeed use this property to construct the primal problem
jointly with the dual.
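To make the property concrete, here is a short worked check on the toy program above (the optimal values below were computed for this example and are not quoted from the thesis):
\[
x^* = (5.6,\ 1.2), \qquad c^T x^* = 2 \cdot 5.6 + 1 \cdot 1.2 = 12.4
\]
\[
y^* = \left(\tfrac{8}{45},\ 0,\ 0,\ \tfrac{11}{45}\right), \qquad b^T y^* = 12 \cdot \tfrac{8}{45} - 7 \cdot 0 + 0 \cdot 0 + 42 \cdot \tfrac{11}{45} = \tfrac{558}{45} = 12.4
\]
One can verify that y* is dual-feasible: 3·(8/45) − 0 − 0 + 6·(11/45) = 90/45 = 2 ≥ 2 and −4·(8/45) − 0 + 0 + 7·(11/45) = 45/45 = 1 ≥ 1, so the two objective values indeed coincide at 12.4.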
A linear program in which some variables are restricted to be integers is called a Mixed-Integer Linear Program (MILP). The integrality constraints allow the model to capture the discrete nature of the decisions. For example, a variable whose values are restricted to 0 or 1, called a binary variable, can be used to decide whether or not some action is taken. Unfortunately, the introduction of integer variables makes the problem NP-hard.
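As an illustration of how a binary variable couples a discrete decision with a continuous one (a generic example, anticipating the link on/off model used later in this thesis), consider a single link e of capacity c(e) that can be switched off:
\[
x_e \in \{0, 1\}, \qquad \sum_{(i,j)} f_e^{i,j} \leq c(e) \cdot x_e
\]
If x_e = 0 the link carries no traffic and can be powered down; if x_e = 1 the usual capacity constraint applies. Minimizing Σ_e x_e then minimizes the number of links kept active.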
In the next section, we study a real problem modeled in a linear program. In par-
ticular, we provide the classical edge-path modeling of the maximum concurrent flow
problem.
3.3 The maximum concurrent flow problem
To introduce the problem, consider a small network with two demands, d_{a,b} = 9 Gbps and d_{a,c} = 1 Gbps (Fig. 3.3).
[Figure: network with nodes a, b and c; demands d_{a,b} = 9 Gbps and d_{a,c} = 1 Gbps; candidate paths P_1, P_2, P_3, P_4.]
Figure 3.3: Paths which can be taken by the packets of the demands d_{a,b} and d_{a,c}
The operator wants to find how much more traffic can theoretically be absorbed by the network before having to replace the network devices with faster, more expensive and more power-hungry ones. It assumes that all demands will increase at a comparable rate. For example: if d_{a,b} doubles to 18 Gbps, d_{a,c} will also double to 2 Gbps.
To find the solution, we use the following objective: maximize λ · 9 Gbps + λ · 1 Gbps = maximize λ · 10 Gbps = maximize λ [2]. λ thus represents the traffic growth rate.
One can see that the only two possible ways to transport the data from node a to node b are through the paths P_1 = {ac, cb} and P_2 = {ab}. We ignore the paths containing cycles, such as {ac, ca, ab}.
Let f_{a,b}(P_1) be the fraction of the demand λ · 9 Gbps passing through the path P_1. By construction: f_{a,b}(P_1) + f_{a,b}(P_2) = λ · 9 Gbps. The same for the demand from a to c: f_{a,c}(P_3) + f_{a,c}(P_4) = λ · 1 Gbps.
The notation f_{a,b}(P_1) is redundant, as the path P_1 is known to start at node a and to end at node b. We therefore use the condensed notation f(P_1). The previous equations can be rewritten as Demand constraints:
\[
f(P_1) + f(P_2) = \lambda \cdot 9\,\text{Gbps}, \qquad f(P_3) + f(P_4) = \lambda \cdot 1\,\text{Gbps} \tag{3.1}
\]
Also, the flow passing through a link must be smaller than the capacity of this link. As a result, we have the following Capacity constraints:
\[
\sum_{P \in \mathcal{P} : e \in P} f(P) \leq c(e) \qquad \forall e \in E \tag{3.2}
\]
Finally, the amount of data passing through a path cannot be negative. Flow non-negativity constraints:
[2] “10 Gbps” is a constant. As a result, maximizing λ · 10 Gbps is equivalent to maximizing λ.
\[
f(P_1) \geq 0, \quad f(P_2) \geq 0, \quad f(P_3) \geq 0, \quad f(P_4) \geq 0 \tag{3.3}
\]
These equations can be written in a condensed form:
\[
\begin{aligned}
\text{Maximize: } & \lambda \\
\text{Subject to constraints:} & \\
\text{from 3.1:}\quad & \sum_{P \in \mathcal{P}_{i,j}} f(P) = \lambda \cdot d_{i,j} && \forall (i,j) \in V^2 : d_{i,j} > 0 \\
\text{from 3.2:}\quad & \sum_{P \in \mathcal{P} : e \in P} f(P) \leq c(e) && \forall e \in E \\
\text{from 3.3:}\quad & f(P) \geq 0 && \forall P \in \mathcal{P}
\end{aligned} \tag{3.4}
\]
The problem solved by the Linear Program 3.4 is known as the Maximum Concurrent Flow Problem (MCFP). In the following, we refer to any valid solution λ as a “concurrent flow of throughput λ”. We use the notation λ* to designate the optimal solution to this problem.
The node-arc formulation of the same problem uses the following notation:
V — set of nodes
E — set of edges
c(e) — capacity of the edge e ∈ E
f_e^{i,j} ≥ 0 — flow from node i ∈ V to node j ∈ V on link e ∈ E
w_v⁺ — set of outgoing links from the node v ∈ V
w_v⁻ — set of inbound links towards the node v ∈ V
i) Flow conservation constraints: at every node, the sum of the flow through arcs directed towards that node must equal the sum of the flow through arcs directed away from that node plus that node's demand, if any.
\[
\sum_{e \in w_v^-} f_e^{i,j} - \sum_{e \in w_v^+} f_e^{i,j} =
\begin{cases}
-\lambda \cdot d_{i,j}, & \text{if } v = i \\
\lambda \cdot d_{i,j}, & \text{if } v = j \\
0, & \text{otherwise}
\end{cases}
\qquad \forall v \in V,\ \forall (i,j) \in V^2 : d_{i,j} > 0 \tag{3.5}
\]
ii) Edge capacity constraints. The total flow passing through a link must be smaller than its capacity.
\[
\sum_{(i,j) \in V^2 : d_{i,j} > 0} f_e^{i,j} \leq c(e) \qquad \forall e \in E \tag{3.6}
\]
All the models used for validating our heuristics are based on this node-arc LP for-
mulation.
We now show that, for any length function l for which Σ_{(i,j)∈V²: d_{i,j}>0} l(SP_{i,j}) d_{i,j} ≠ 0, the ratio Σ_{e∈E} l(e)c(e) / Σ_{(i,j)} l(SP_{i,j}) d_{i,j} is an upper bound on the throughput of any feasible concurrent flow.
Proof. As per link capacity constraints (3.2) of the primal LP, if a concurrent flow of throughput λ is feasible, the amount of flow which passes through a link never exceeds the capacity of this link: c(e) ≥ Σ_{P∈𝒫: e∈P} f(P). As a result, we have:
\[
\sum_{e \in E} l(e)c(e) \;\geq\; \sum_{e \in E} \Big( l(e) \sum_{P \in \mathcal{P} : e \in P} f(P) \Big)
\;=\; \sum_{e \in E} \sum_{P \in \mathcal{P} : e \in P} l(e) f(P)
\;=\; \sum_{P \in \mathcal{P}} \Big( \sum_{e \in P} l(e) \Big) f(P)
\;=\; \sum_{P \in \mathcal{P}} l(P) f(P) \tag{3.7}
\]
Furthermore, the length of a shortest path between a source node i and a destination node j is, by definition, shorter than or equal to the length of any other path between these nodes, i.e. ∀P ∈ 𝒫_{i,j}, l(P) ≥ l(SP_{i,j}). As a result:
\[
\sum_{P \in \mathcal{P}} l(P) f(P)
\;\geq\; \sum_{(i,j) \in V^2 : d_{i,j} > 0} \Big( l(SP_{i,j}) \sum_{P \in \mathcal{P}_{i,j}} f(P) \Big)
\;=\; \sum_{(i,j) \in V^2 : d_{i,j} > 0} l(SP_{i,j})\, \lambda\, d_{i,j}
\;=\; \lambda \sum_{(i,j) \in V^2 : d_{i,j} > 0} l(SP_{i,j})\, d_{i,j} \tag{3.8}
\]
From Equations 3.7 and 3.8, and under the assumption that Σ_{(i,j)∈V²: d_{i,j}>0} l(SP_{i,j}) d_{i,j} ≠ 0, we prove the claim.
The previous result can be applied to the particular case of an optimal maximum concurrent flow λ*:
\[
\frac{\sum_{e \in E} l(e)\, c(e)}{\sum_{(i,j) \in V^2 : d_{i,j} > 0} l(SP_{i,j})\, d_{i,j}} \;\geq\; \lambda^* \tag{3.9}
\]
The dual of the edge-path LP 3.4 associates a length l(e) with every edge and a variable z(i,j) with every demand:
\[
\begin{aligned}
\text{Minimize: } & \sum_{e \in E} l(e)\, c(e) \\
\text{Subject to constraints:} & \\
& \sum_{e \in P} l(e) \geq z(i,j) && \forall (i,j) \in V^2 : d_{i,j} > 0;\ \forall P \in \mathcal{P}_{i,j} \\
& \sum_{(i,j) \in V^2 : d_{i,j} > 0} z(i,j)\, d_{i,j} \geq 1 \\
& l(e) \geq 0 && \forall e \in E
\end{aligned} \tag{3.10}
\]
Lemma 3.3.2. Setting z(i,j) equal to the length of the shortest path between i and j under the length function l, i.e. z(i,j) ← l(SP_{i,j}), neither changes the cost nor impacts the feasibility of the LP.
Proof. The first group of constraints forces each z(i,j) to be at most equal to the length of the shortest path between i and j. In any valid solution to the LP 3.10, if some z(i,j) is strictly smaller than l(SP_{i,j}), we can safely increase it without invalidating the constraint Σ_{(i,j)∈V²: d_{i,j}>0} z(i,j) d_{i,j} ≥ 1 or changing the objective value.
The dual can thus be rewritten equivalently as:
\[
\begin{aligned}
\text{Minimize: } & \sum_{e \in E} l(e)\, c(e) \\
\text{Subject to constraints:} & \\
& \sum_{e \in P} l(e) \geq l(SP_{i,j}) && \forall (i,j) \in V^2 : d_{i,j} > 0;\ \forall P \in \mathcal{P}_{i,j} \\
& \sum_{(i,j) \in V^2 : d_{i,j} > 0} l(SP_{i,j})\, d_{i,j} \geq 1 \\
& l(e) \geq 0 && \forall e \in E
\end{aligned} \tag{3.11}
\]
Lemma 3.3.3. Let D(l) := Σ_{e∈E} l(e)c(e) be the quantity minimized by the dual. For any valid solution of the LP 3.11,
\[
D(l) \;\geq\; \frac{\sum_{e \in E} l(e)\, c(e)}{\sum_{(i,j) \in V^2 : d_{i,j} > 0} l(SP_{i,j})\, d_{i,j}} \;\geq\; D(l)^* \tag{3.12}
\]
Proof. The first inequality is a direct application of the constraint Σ_{(i,j)∈V²: d_{i,j}>0} l(SP_{i,j}) d_{i,j} ≥ 1 of the LP 3.11. The second inequality is a direct application of the strong duality theorem to Equation 3.9 from the previous section.
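To illustrate how the bound of Equation 3.12 can be evaluated for a given length function, here is a minimal C++ sketch written for this section (not the thesis code): it runs Dijkstra from every source and computes Σ l(e)c(e) / Σ l(SP_{i,j}) d_{i,j}. The 10 Gbps link capacities in the toy example are an illustrative assumption.

```cpp
#include <cstdio>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Edge { int to; double length, capacity; };
using Graph = std::vector<std::vector<Edge>>;   // adjacency list (directed)

// Standard Dijkstra: shortest-path distances from src under the length function.
std::vector<double> dijkstra(const Graph& g, int src) {
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> dist(g.size(), INF);
    using Item = std::pair<double, int>;                    // (distance, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    dist[src] = 0.0; pq.push({0.0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;
        for (const Edge& e : g[u])
            if (dist[u] + e.length < dist[e.to]) {
                dist[e.to] = dist[u] + e.length;
                pq.push({dist[e.to], e.to});
            }
    }
    return dist;
}

// Upper bound of Equations 3.9 / 3.12 for a given length function l:
//   sum_e l(e)c(e)  /  sum_{i,j} l(SP_ij) d_ij   >=   lambda*
double concurrentFlowBound(const Graph& g,
                           const std::vector<std::vector<double>>& demand) {
    double num = 0.0, den = 0.0;
    for (const auto& adj : g)
        for (const Edge& e : adj) num += e.length * e.capacity;
    for (int i = 0; i < (int)g.size(); ++i) {
        std::vector<double> dist = dijkstra(g, i);
        for (int j = 0; j < (int)g.size(); ++j)
            if (demand[i][j] > 0.0) den += dist[j] * demand[i][j];
    }
    return num / den;   // assumes den != 0, as in the lemma
}

int main() {
    // Toy 3-node example (a=0, b=1, c=2) with the lengths of Fig. 3.1 and
    // illustrative capacities of 10 Gbps per link (an assumption).
    Graph g(3);
    g[0].push_back({1, 4.0, 10.0});   // a -> b
    g[0].push_back({2, 1.0, 10.0});   // a -> c
    g[2].push_back({1, 1.0, 10.0});   // c -> b
    std::vector<std::vector<double>> d(3, std::vector<double>(3, 0.0));
    d[0][1] = 9.0;  d[0][2] = 1.0;    // d_{a,b} = 9 Gbps, d_{a,c} = 1 Gbps
    std::printf("upper bound on lambda* = %.3f\n", concurrentFlowBound(g, d));
    return 0;
}
```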
3.4 Conclusion
This chapter briefly introduced the theoretical background of this thesis. Our primary goal was to present the “maximum concurrent flow problem”, a problem which has received extensive attention from the academic research community. One of these works [5] presented an efficient algorithm, which is adapted to our needs in Chapter 5.
We also highlight the importance of Equation 3.12 which provides an upper bound to
the dual optimization objective. It is at the core of various algorithms for approximately
solving the maximum concurrent flow problem, including our work.
Chapter 4
Consume less: the STREETE framework for reducing the network over-provisioning
In this chapter, we present the SegmenT Routing based Energy Efficient Traffic Engi-
neering (STREETE) framework, an online SDN-based method for switching links off/on
dynamically according to the network load. To react quickly to traffic fluctuations in
operators’ core networks, we trade off optimality for speed and create a solution that
relies entirely on dynamic shortest paths to achieve very fast computational times.
One of the main contributions of this chapter is the ability to react very fast in order to wake up sleeping links. The literature has thus far mostly overlooked the problem of turning links back on, assuming that traffic varies slowly in backbone networks due to the high aggregation level. Nevertheless, unexpected bursts still happen and must be rapidly absorbed by the network. For example, the real traffic matrices which we use for validation on the Germany50 network contain such a sudden burst of traffic volume.
This chapter starts by giving a high-level overview of our framework, and then details the implemented algorithms. It then presents the methodology and the techniques used to evaluate the proposed solution. In particular, we provide a quick overview of the network topologies, the software used for evaluation, the analyzed metrics, and the results obtained by simulation. We highlight the observed strengths and limits of STREETE, thus making a transition towards the next chapter, which presents a means to make STREETE work under extremely high network load at the expense of increased complexity.
A large part of this chapter also presents the Software-Defined Network (SDN) testbed built for experimental evaluations and discusses the problems revealed by the experimental evaluation.
• All ingress devices inform the network controller of their demand towards each egress node. That way, the controller maintains an up-to-date network-wide traffic matrix.
• The controller, which has an up-to-date and global view of the network topology
and the traffic matrix, executes an algorithm to determine the network links that
must be turned on or off.
• The controller sends the previously computed information to the SDN switches.
• The switches locally recompute the new routes to avoid the links which are off and
update their forwarding tables.
The algorithm that the controller runs continuously simulates the routing of the latest
network traffic matrix while trying to detect network under-utilization and/or highly
utilized links. Under-utilization offers a chance to reduce the energy consumption by
turning links off. Hence, the algorithm performs an in-memory simulation of turning the candidate links off and estimates the impact of such a decision, without actually turning them off in the physical network. Whenever the result of the simulation shows that it is possible to turn a link off without creating congestion, it is scheduled for switch-off. Afterwards, the algorithm re-iterates to, potentially, turn off other links. On the other hand, under high link utilization an imminent congestion is assumed and the algorithm acts in the same way to turn on network links and solve the congestion if possible.
While conceptually simple, corresponding to a greedy “generate and test” approach, the solution includes a couple of important optimizations that improve the performance and the quality of the final solution. The rest of this section presents the optimizations included in STREETE.
Provide an order for the greedy algorithm. The algorithm keeps two different
views of the network: i) the current network view, with some links being off ; and ii) the
full network topology, where all the links are assumed active. The second view allows
guiding the construction of the solution by providing an order for the greedy algorithm.
Using a fully active view of the network provides useful metrics for the algorithm.
This is especially important for turning links on. Compared to the switch-off phase, it is much more difficult to correctly select a good subset of links to turn on that can solve congestion. It may occur that the best candidate link is far from the place of congestion. It is also usual to have to turn on more than one link for this purpose.
For the phase of turning off, providing an order to the execution is less important. Simple local heuristics generally lead to good results [79]. Nevertheless, our double view of the network is still beneficial by avoiding chains of link extinctions. This case will be detailed later in the current section.
The intuition behind the method that we propose is depicted in Figure 4.1. Starting from the network view with some links turned off (Figure 4.1a), the algorithm uses the traffic matrix to find the highly loaded links (Figure 4.1d). In the meantime, the demands from the traffic matrix are also simulated as being routed into the full network topology (Figure 4.1b). This allows sorting the links in decreasing order of their estimated utilization in the fully active network. Figure 4.1e illustrates the result of this computation, where the thickness of the links represents their relatively high utilization compared to other links. The previous action defines the order in which the links will be turned on until, if possible, the congestion is resolved.
[Figure 4.1 panels: a) network with links off (G′); b) full network (G); c) traffic matrix TM; d) highly loaded links (CL_{G′}); e) (ordered) candidate links; f) link turned on.]
Figure 4.1: STREETE-ON: simulating the all-on network view to select the best link(s)
to turn on.
For example, in Figure 4.1f, the link ab is selected as the best candidate to turn on because it is off, but it would transfer a lot of data in the all-on network.
Reduce the control traffic overhead. The algorithm executed by the SDN controller relies entirely on shortest path computations. This particularity allows keeping the communication overhead as low as possible because the SDN controller does not need to send explicit routes to the SDN switches. After receiving instructions from the controller, each network device locally computes the routes by using the same constrained shortest path algorithm as in the controller and naturally avoids disabled links. The only information which has to be sent from the SDN controller to the switches is the state of all the links in the network: on/off. Moreover, there is no need to coordinate the updates between devices to avoid transient network loops. This problem is avoided by relying on the Source Packet Routing In NetworkinG (SPRING) protocol presented in a previous chapter.
Reduce the impact of historical decisions. As mentioned earlier, the two views of the network also reduce the influence of historical decisions on the execution of the algorithm during the extinction phase. This allows avoiding chains of turned-off links.
To provide an example of why a historical decision may have a negative impact on the final solution, we consider an algorithm from a related work [49], which works as follows: “while there are no highly utilized links in the network, turn off the least loaded link”. Figure 4.2 zooms in on a subset of 4 nodes from a hypothetical network. Because the link AB was turned off in the first figure, the flow through the links BC and CD is reduced. That action made BC the best candidate for switch-off. As a result, after
turning it off in the second figure, the flow on CD is reduced even further, making it the
best candidate.
After doing some preliminary tests, we observed that this chain of events generates
unnecessarily long routes in the network. In contrast, STREETE computes the order of
extinction based on the all-on topology. This way, we ensure that the algorithm is not
impacted by the actual state of links in the network.
4.2 Algorithms
In this section, we present the details of the algorithms executed by the SDN controller.
4.2.1 Notations
To facilitate the formal description, we use a few custom notations besides the classical graph notations introduced in the previous chapter. For instance, we use the notation G for the initial network topology with all the links being on, and G′ for referring to the modified topology, with a subset of links switched off: V′ = V and E′ ⊂ E. An example of G′ and G can be seen in Figure 4.1a and 4.1b respectively.
We use the symbols α and β for referring to the two thresholds which trigger turning links on and, respectively, off.
Based on these thresholds, for a given network G and a traffic matrix TM, we define the set of links at critical load CL_{G,TM} = {e ∈ G | u_{G,TM}(e) > α}. It contains the links with utilisation higher than α in G. For example, if α = 0.8, CL_{G′} will contain the links
with utilization greater than 80%. Hereafter we refer to links in CL as “links close to congestion”. In our example, Figure 4.1d, CL_{G′,TM} = {ih, hf}.
Respectively, we use ML_{G,TM} = {e ∈ G | u_{G,TM}(e) > β} for referring to links at load higher than β. The turn-off phase will avoid creating such links. Note that ML_{G,TM} ⊇ CL_{G,TM}.
The choice to never turn off the links located on the minimum spanning tree may be criticized, because some links can remain enabled without being used to route network traffic. However, in the context of backbone networks, with aggregated network traffic, the probability of this happening is low. Fixing an always-active spanning tree allows keeping the network connected for management purposes. Moreover, it may even be interesting to use graph spanners [89] instead of spanning trees to guarantee that no unnecessarily long paths are created in the network as a result of turning links off. A t-spanner of a graph G is a subgraph in which the distance between any two nodes is at most t times longer than the distance between the same nodes in G. Such a solution for grid networks was proposed in a related work [90].
The main execution loop is self-explanatory (Alg. 1). It consists of an infinite loop which periodically calls STREETE-ON to find links which must be turned on and applies the change to the in-memory graph structure G′, followed by STREETE-OFF to find the links which must be turned off. The state of the links is afterwards sent to the network devices. Note that the algorithm can, in fact, shut down some links while turning on others. This happens when the load increases in one part of the network while being reduced in another.
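Algorithm 1 itself is not reproduced here; the following C++-style sketch (written for this section, with stand-in types and hypothetical function names such as streeteOn, streeteOff, collectTrafficMatrix and pushLinkStates) only illustrates the structure of the controller's main loop described above:

```cpp
#include <chrono>
#include <map>
#include <set>
#include <thread>
#include <utility>

// Minimal stand-in types, for illustration only (not the thesis code).
using Link = std::pair<int, int>;                             // (endpoint, endpoint)
using TrafficMatrix = std::map<std::pair<int, int>, double>;  // (src, dst) -> Gbps
struct NetworkView { std::set<Link> allLinks, activeLinks; };

// Stubs standing in for Algorithm 2 (STREETE-ON), its turn-off counterpart,
// the traffic-matrix collection and the southbound push of link states.
std::set<Link> streeteOn(const NetworkView&, const TrafficMatrix&)  { return {}; }
std::set<Link> streeteOff(const NetworkView&, const TrafficMatrix&) { return {}; }
TrafficMatrix collectTrafficMatrix()                                 { return {}; }
void pushLinkStates(const NetworkView&)                              {}

void controllerMainLoop(NetworkView& view) {
    using namespace std::chrono_literals;
    while (true) {
        TrafficMatrix tm = collectTrafficMatrix();

        // Phase 1: wake up links needed to remove (imminent) congestion.
        for (const Link& e : streeteOn(view, tm)) view.activeLinks.insert(e);

        // Phase 2: switch off links that are no longer needed.
        for (const Link& e : streeteOff(view, tm)) view.activeLinks.erase(e);

        // Only the on/off state of the links is sent to the devices; each
        // switch recomputes its shortest paths locally thanks to SPRING.
        pushLinkStates(view);

        std::this_thread::sleep_for(1s);   // evaluation period (illustrative value)
    }
}
```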
The following section presents the details of STREETE-ON. We do not enter into the
details of STREETE-OFF, as its behavior is exactly the same as the turn-on, with the
difference that links will be sorted in the ascending order of their utilization.
4.2.3 STREETE-ON
Algorithm 2 STREETE-ON
1: procedure STREETE-ON(G, G′, TM)
2:   result ← ∅
3:   candidateLinks ← E \ E′
4:   congested ← CL_{G′,TM}
5:   changed ← true
6:   while congested ≠ ∅ and changed = true do
7:     changed ← false
8:     for all e ∈ sorted(candidateLinks, u_{G,TM}) do
9:       ▷ sorted in descending order of utilisation in G
10:      if congested ≠ ∅ then
11:        E′ ← E′ ∪ {e}                    ▷ Turn on the link e
12:        congestedAfter ← CL_{G′,TM}
13:        if congestion decreased then
14:          changed ← true
15:          candidateLinks ← candidateLinks \ {e}
16:          congested ← congestedAfter
17:          result ← result ∪ {e}
18:        else
19:          E′ ← E′ \ {e}                  ▷ Do not keep e on
20:        end if
21:      end if
22:    end for
23:  end while
24:  return result
25: end procedure
We mentioned in a previous section that our greedy heuristic first tries to turn on the links which are off in the energy-optimised network G′ but would have a high utilisation in G. This action is performed by Algorithm 2 at lines 3 and 8. We prioritise turning on the links that would create shortcuts for large flows.
At lines 11-12 the algorithm simulates turning the link on and estimates the congestion
after this operation. At lines 13-20 it tests if congestion decreases. If so, the link is
scheduled to be turned on in the physical network; otherwise the link is disregarded.
“congestion decreased” at line 13 is defined as:
• solving all the congestion
• or decreasing the utilization of any congested link while avoiding both i) to create
new congested links and ii) to increase the utilization of any congested link.
Sometimes a set of links must be turned on to avoid congestion. The while loop at
line 6 ensures that the algorithm is repeated until the congestion is resolved or it becomes
impossible to solve without creating another congestion.
The previously presented algorithm hides a complex operation when accessing CL at lines 4 and 12, namely estimating the load of the links to find the congested ones. To estimate the load of a link, we must simulate the routing in the network by first finding the shortest paths and, afterwards, simulating the demands flowing through the network. The next section presents how we use dynamic shortest path algorithms to reduce the complexity of the first step. For the second step, it is also possible to achieve an O(|V|) complexity for routing all the |V|−1 flows from a common source. As this is not trivially intuitive, Algorithm 3 presents a way to achieve such a running time.
Algorithm 3 Compute how much data passes through the links of the spanning tree T rooted at node v_root if the demands d_{v_root,*}, * ∈ V are routed into the tree
1: function RouteInTree(T(V, E_T), v_root ∈ V, d_{v_root,*})
2:   sortedV ← SortTreeNodes(T, v_root)
3:   initialize f_v ← 0, ∀v ∈ V
4:   for i ← |V| − 1; i > 0; i ← i − 1 do
5:     v ← sortedV[i]
6:     find (u,v) ∈ E_T                    ▷ Find the unique parent node u of v
7:     f_v ← f_v + d_{v_root,v}
8:     f_u ← f_u + f_v
9:     f_{(u,v)} ← f_v
10:  end for
11:  return f_{(*,*)} = {f_{(u,v)} : (u,v) ∈ E_T}
12: end function
The main idea relies on the fact that, in a directed tree, the traffic crossing a link (u,v) can be recursively defined as the sum of i) the traffic going to the destination endpoint v, plus ii) the traffic flowing through the links {(v,*) | * ∈ V} ⊂ E_T outgoing from v. The algorithm can be imagined as starting with the leaf nodes, which have no outgoing traffic. Afterwards, the algorithm climbs up the tree and computes the flow on each link using the previously mentioned recursive relation.
Algorithm 4 Topologically sort the nodes of the (directed) tree T starting from the root node v_root. Returns the ordered list of nodes.
1: function SortTreeNodes(T(V, E_T), v_root ∈ V)
2:   sorted ← [v_root]                     ▷ List of nodes with one element (v_root)
3:   i ← 0                                 ▷ Index in sorted of the next node to analyze
4:   while sorted.size() < |V| do
5:     u ← sorted[i]
6:     for all v ∈ V s.t. (u, v) ∈ E_T do
7:       sorted.append(v)
8:     end for
9:     i ← i + 1
10:  end while
11:  return sorted
12: end function
To achieve that, the algorithm proceeds in two phases. First, the tree is topologically
sorted, i.e. the nodes are ordered such that there is no link going from a node towards any
of its predecessors. This is easy to achieve in a tree and consists in traversing it starting
from the root node, as shown in Algorithm 4. Algorithm 3 afterwards traverses the sorted nodes in decreasing order (lines 4-10). This order ensures that, when a node is encountered, the flow passing through all its children has already been computed.
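The following is a direct C++ transcription of Algorithms 3 and 4, written for illustration (the Tree structure and the variable names are ours, not the thesis code):

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// A directed tree stored as children lists plus a parent pointer per node.
struct Tree {
    std::vector<std::vector<int>> children;  // children[u] = nodes reached from u
    std::vector<int> parent;                 // parent[v]; parent[root] = -1
};

// Algorithm 4: BFS order from the root; every node appears after its parent.
std::vector<int> sortTreeNodes(const Tree& t, int root) {
    std::vector<int> sorted = {root};
    for (std::size_t i = 0; i < sorted.size(); ++i)
        for (int child : t.children[sorted[i]])
            sorted.push_back(child);
    return sorted;
}

// Algorithm 3: accumulate, in O(|V|), the flow on every tree link when the
// demands demandFromRoot[*] are routed along the tree rooted at `root`.
std::map<std::pair<int, int>, double>
routeInTree(const Tree& t, int root, const std::vector<double>& demandFromRoot) {
    std::vector<int> sorted = sortTreeNodes(t, root);
    std::vector<double> f(t.children.size(), 0.0);   // flow entering node v
    std::map<std::pair<int, int>, double> linkFlow;  // flow on tree link (u, v)
    // Traverse from the leaves towards the root: when node v is processed,
    // the flow of all its children has already been added to f[v].
    for (std::size_t i = sorted.size() - 1; i > 0; --i) {
        int v = sorted[i];
        int u = t.parent[v];                         // unique parent of v
        f[v] += demandFromRoot[v];                   // demand terminating at v
        f[u] += f[v];                                // propagate towards the root
        linkFlow[{u, v}] = f[v];
    }
    return linkFlow;
}
```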
Algorithm 5 Update the shortest path tree T to maintain the shortest paths from the root node v_root towards all the other nodes after the insertion of a new edge e into the graph G′
1: procedure D-RRL-addEdge(G′(V, E_G′), T(V, E_T), v_root ∈ V, e ∈ E_G′)
2:   (u, v) ← e                            ▷ Get the extremities of the edge in G′
3:   if v = v_root then
4:     return
5:   end if
6:   if dist(v) ≤ dist(u) + l(e) then
7:     return
8:   end if
9:   initialize dist(w) ← l(v_root, w), ∀w ∈ V
10:  initialize pq ← PriorityQueue()
11:  dist(v) ← dist(u) + l(e)
12:  E_T ← (E_T \ {(_, v)}) ∪ {e}          ▷ Delete the unique edge going to v in T and add the new edge e
13:  pq.add(v, dist(v))
14:  while pq.notEmpty() do
15:    v ← pq.extractTop()
16:    for all w ∈ V s.t. (v, w) ∈ E_G′ do
17:      if dist(w) > dist(v) + l((v,w)) then
18:        dist(w) ← dist(v) + l((v,w))
19:        E_T ← (E_T \ {(_, w)}) ∪ {(v, w)}
20:        pq.addOrUpdate(w, dist(w))
21:      end if
22:    end for
23:  end while
24: end procedure
• Partial update of shortest paths each time an edge is inserted or deleted. Algorithm 5 details this step for a partial update following an edge insertion into G′.
The algorithm tests at line 6 whether the insertion of the edge reduces the distance from the root node to the destination end of the edge. If the distance is not decreased, there is no need to make any changes to the spanning tree.
If the insertion of the edge does decrease the distance towards the node v, the algorithm executes a Dijkstra-like construct (lines 14-22) to update the impacted branch of the shortest path tree while leaving the rest of the tree intact. The actual reconstruction of the shortest path tree is done at lines 12 and 19: we delete from the shortest path tree the edge connecting the affected node and its predecessor on the path from the root. It is safe to assume that exactly one such edge exists, because every node except the root has exactly one inbound edge in a directed tree. The root node would never be examined due to the condition at the beginning of the algorithm.
The use of the dynamic shortest path algorithm has two advantages. First of all, it only recomputes the solution for the impacted subtree of each of the |V| shortest path trees. Secondly, the biggest improvement comes from the fact that only a fraction of these trees will be impacted at all. Most of them will not pass the condition at line 3.
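For concreteness, here is a compact C++ sketch of the edge-insertion update of Algorithm 5 (illustrative code written for this section; the graph representation and names are ours, not the thesis implementation):

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

struct Arc { int to; double length; };

// Shortest-path tree state for one root: dist[v] and parent[v] in the tree.
struct SpTree { std::vector<double> dist; std::vector<int> parent; };

// Sketch of Algorithm 5: repair the tree after inserting arc (u -> v) of
// length l into the graph. Only the branch below v can change; the rest of
// the tree is left untouched.
void addEdgeUpdate(const std::vector<std::vector<Arc>>& graph,
                   SpTree& t, int root, int u, int v, double l) {
    if (v == root) return;                       // the root never gets a new parent
    if (t.dist[v] <= t.dist[u] + l) return;      // insertion does not improve anything

    using Item = std::pair<double, int>;         // (distance, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;

    t.dist[v] = t.dist[u] + l;                   // re-attach v below u
    t.parent[v] = u;
    pq.push({t.dist[v], v});

    // Dijkstra-like relaxation restricted to the affected branch.
    while (!pq.empty()) {
        auto [d, x] = pq.top(); pq.pop();
        if (d > t.dist[x]) continue;             // stale queue entry
        for (const Arc& a : graph[x]) {
            if (t.dist[a.to] > t.dist[x] + a.length) {
                t.dist[a.to] = t.dist[x] + a.length;
                t.parent[a.to] = x;              // replace the unique inbound tree edge
                pq.push({t.dist[a.to], a.to});
            }
        }
    }
}
```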
[Figure 4.3: the evaluated network topologies, including NSFNet, USNET, AT&T, Geant and Italian.]
• The traffic matrices for the Geant and Germany50 networks are taken at 15-minute and, respectively, 5-minute intervals. We have no insight into the evolution of network flows between these sampling intervals. Moreover, the matrices date from 2007 and may not reflect the state of traffic in modern networks. Over the past years, the cloud has been extensively adopted by numerous companies and access to video streaming on demand has become commonplace in households around the world.
• The traffic matrices for the NSFNet, USNET, AT&T and Italian networks come from the GreenTouch project [93][94] and represent the average business Internet traffic expected between the nodes of the networks in the year 2020. For each network, we have 12 traffic matrices which reflect a day of operation with a 2-hour sampling interval.
• Most of these networks are small and may not correspond to the size of modern networks in terms of the number of devices. Unfortunately, operators keep the information about their network topology confidential. The only backbone network whose topology is not secret is Geant [1]. The size of this network did not change much compared to the one provided by SNDLib in 2007. For this reason, we follow the same path as the rest of the research community and assume that the sizes of the networks presented in Fig. 4.3 are representative of reality.
Nevertheless, we believe that the continuous growth of network traffic may lead to a growth in the number of devices. To evaluate the computational time on bigger networks, we also executed the algorithms with uniform all-to-all traffic on the networks from Figure 4.4. The first one, CORONET, was taken from the Internet [2]. We also artificially generated a 200-node network using the igen [3] toolbox.
[Figure 4.4: the CORONET and Generated200 topologies used for the scalability evaluation.]
• We do not know the real capacities of the network links. We empirically adapted
their capacity to obtain a maximum link utilization around 80% at the moment of
the highest utilization.
[1] https://www.geant.org/Networks/Pages/Home.aspx
[2] http://www.monarchna.com/topology.html
[3] http://igen.sourceforge.net
To obtain optimal baselines, the link switch-off problem is modeled as the MILP below, using the following notation:
V — set of nodes
E — set of links
c(e) — capacity of the link e
d_{i,j} — demand from source node i ∈ V towards the destination node j ∈ V
f_e^{i,j} — flow from node i ∈ V to node j ∈ V passing through link e ∈ E
x_e — boolean: 1 if the link e ∈ E is ON; 0 if it is OFF
w_v⁺ — set of outgoing links from the node v ∈ V
w_v⁻ — set of inbound links towards the node v ∈ V
The model objective is to minimize the number of links being kept active.
\[
\text{minimize: } \sum_{e \in E} x_e
\]
Subject to:
Flow conservation constraint:
\[
\sum_{e \in w_v^-} f_e^{s,d} - \sum_{e \in w_v^+} f_e^{s,d} =
\begin{cases}
-d_{s,d}, & \text{if } v = s \\
d_{s,d}, & \text{if } v = d \\
0, & \text{otherwise}
\end{cases}
\qquad \forall v \in V,\ \forall (s,d) \in V^2 : d_{s,d} > 0 \tag{4.1}
\]
Link capacity constraint:
\[
\sum_{(s,d) \in V^2 : d_{s,d} > 0} f_e^{s,d} \leq c(e) \cdot x_e \qquad \forall e \in E \tag{4.2}
\]
The MILP model has a small advantage over STREETE: flows are allowed to follow multiple paths in the network. This was done to reduce the computational complexity; without this simplification, the MILP solver was sometimes unable to find the exact result within 2 hours even for the 24-node USNET network.
To produce the optimal baselines for comparison, we use the CPLEX 12.7.1 C++ API
to build the linear programs directly from our C++ code and to solve them. The biggest
network for which CPLEX executed without running out of RAM was Germany50.
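The exact model-building code is not reproduced in this thesis; as an illustration, a heavily simplified sketch using the CPLEX Concert C++ API could look as follows (the function, the data structures and the demand handling are ours, only the skeleton of Equations 4.1-4.2 is hinted at, and API details may differ between CPLEX versions):

```cpp
#include <ilcplex/ilocplex.h>
#include <vector>

// Sketch: build the link on/off MILP for nLinks links and nDemands demands,
// then solve it with CPLEX. Error handling and the flow conservation
// constraints (4.1) are only indicated by comments.
void solveLinkOnOffModel(int nLinks, const std::vector<double>& capacity,
                         int nDemands) {
    IloEnv env;
    try {
        IloModel model(env);

        // x_e in {0,1}: link e is kept on or switched off.
        IloBoolVarArray x(env, nLinks);

        // f[k][e] >= 0: flow of demand k on link e.
        std::vector<IloNumVarArray> f;
        for (int k = 0; k < nDemands; ++k)
            f.emplace_back(env, nLinks, 0.0, IloInfinity);

        // Objective: minimize the number of active links.
        IloExpr nbActive(env);
        for (int e = 0; e < nLinks; ++e) nbActive += x[e];
        model.add(IloMinimize(env, nbActive));

        // Capacity constraints (4.2): total flow on e at most c(e) * x_e.
        for (int e = 0; e < nLinks; ++e) {
            IloExpr load(env);
            for (int k = 0; k < nDemands; ++k) load += f[k][e];
            model.add(load <= capacity[e] * x[e]);
        }

        // Flow conservation constraints (4.1) would be added here, one per
        // (node, demand) pair, using the w+ / w- incidence sets.

        IloCplex cplex(model);
        cplex.solve();
    } catch (IloException&) {
        // report the error
    }
    env.end();
}
```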
Distribution of network load. The utilization of the links is another parameter that illustrates the quality of the solutions based on switching links on and off. This includes not only the utilization of the most loaded link, but also the distribution of the load across all links. The latter is especially interesting for the load balancing introduced in the following chapter of this thesis. We believe that having a single highly utilized link is worse than having two links at medium utilization. In the latter case, the network is less likely to become congested if the traffic increases.
Figure 4.5: The heat-map of the link utilization during a day in the NSFNet network.
Also illustrating the links which are off (shown in black)
Hereafter we present figures (e.g. Figure 4.5) that condense a lot of useful information:
they illustrate the proportion of links that are active, their utilization, and the distribution
of the utilization across the whole network. In particular, Figure 4.5 illustrates a day of
the NSFNet network with two different routing techniques: i) classical shortest path
routing; and ii) a technique that tries to turn links off.
For the NSFNet network, we have 12 traffic matrices at 2-hour intervals. For example, traffic matrix 0 represents the network at 00:00, while traffic matrix 1 represents it at 02:00. Each column on the x axis represents the distribution of the load on the links for a fixed traffic matrix. The black area corresponds to the lightly utilized links, while white points show a utilization close to 1 (100%). The black links in the rightmost figure are off. The bold white line shows the border between the active and inactive links. The y axis allows a rapid evaluation of the number of inactive links.
The figure makes it easy to see the distribution of the load in the network. At 8AM (x = 4), not only were we able to reduce the energy consumption of the network, but we also avoided creating new white (highly utilized) links. With traffic matrix 4,
the utilization of most active links in the rightmost figure is between 20% and 40%. To
generate this figure, for each traffic matrix, the links were sorted by their utilization. As
a result, the most utilized link is always shown at the bottom of each column and does
not necessarily correspond to the same link in the adjacent columns.
Computational time. Another important parameter is the time needed by the algorithms to generate a solution. The solution proposed in this chapter has a high theoretical complexity but behaves well in practice. For this reason, we measure the actual time the algorithm takes on the previously presented network topologies.
All our algorithms were implemented in C++ within the OMNeT++ [95] discrete event simulator. Even though we no longer make use of packet-based simulation, we rely on the powerful tools included with the simulator to facilitate the description of network topologies, the generation of simulation traces, and their analysis. All simulations were performed on servers with two Intel Xeon E5-2620 v2 processors and 64 GB of RAM. The RAM was never a limiting factor for the execution of our algorithms.
Path stretch. The fact that we put links to sleep to save energy certainly impacts the network performance. First of all, both turning links off and load balancing the traffic come at the cost of probably taking longer routes and thus increasing the end-to-end delay of the network traffic. Longer network paths are particularly problematic in the context of backbone networks, where a detour path can, potentially, go around a whole continent. To evaluate this effect, we introduce the path-stretch parameter, which defines how much the path length is increased compared to shortest path routing. Taking into consideration that the sub-flows of an origin-destination flow can take multiple routes in some of our solutions, we consider the mean path length. Formally,
\[
\text{path-stretch}_{i,j} \equiv \frac{\operatorname{avg}\{\, l(P) : P \in \mathcal{P}_{i,j} \text{ and } f(P) > 0 \,\}}{l(SP_{i,j})}, \qquad \text{with } l(e) = 1.
\]
A path-stretch value of 1 means that the longest used path and the shortest path have equal length. Respectively, a path-stretch of 2 means that the used paths are, on average, twice as long as the shortest path.
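As a small illustration of this metric (written for this section, not the thesis code), the per-pair computation is a one-liner once the hop counts of the used paths are known:

```cpp
#include <vector>

// Path-stretch for one origin-destination pair, as defined above: the mean hop
// count of the paths actually carrying flow, divided by the hop count of the
// shortest path (l(e) = 1).
double pathStretch(const std::vector<int>& usedPathHopCounts, int shortestPathHops) {
    if (usedPathHopCounts.empty() || shortestPathHops == 0) return 1.0;
    double sum = 0.0;
    for (int hops : usedPathHopCounts) sum += hops;   // only paths with f(P) > 0
    double meanHops = sum / usedPathHopCounts.size();
    return meanHops / shortestPathHops;
}
```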
[Figure: network load (Gbps) and number of lost packets over time (0-100 s), comparing STREETE against the initial network; legend: initial network load, packet loss with STREETE, packet loss in the initial network.]
However, even with a safe margin, the network drops packets. There is a noticeable increase in the number of lost packets, especially at t = 40, when the initial network was already saturated. The packet loss on the most congested link at this moment is as high as 7.6%.
State of network links
Figure 4.8 shows the state of network links in each of the 6 network topologies evaluated
with realistic traffic traces. For each network, the leftmost figure shows the utilization of
the links when classical shortest path routing is used. The rightmost figure corresponds
to the state of the network when our STREETE framework is used. For each figure,
the black area represents the links with utilization close to 0% and the white points
correspond to utilization close to 100%. The bold white line shows the border between
the active and inactive links. The black points above the white line are switched off. More
information about how this figure was generated was presented in the previous section
(4.3.3).
[Figure 4.8 panels: (a) Italian, (b) USNET, (c) NSFNet, (d) AT&T, (e) Germany50, (f) Geant; for each network, link-utilization heat-maps per traffic matrix index, with shortest path routing on the left and STREETE on the right.]
Figure 4.8: The heat-map of the link utilization. Also illustrating the links which are off
(shown in black)
Each figure corresponds to a day of operation of the network evaluated at fixed time points. For example, the first figures (4.8a-4.8d) are generated using traffic matrices captured every two hours, producing 12 timestamps on the x axis. The last two networks (4.8e, 4.8f) were evaluated at 5-minute and, respectively, 15-minute intervals.
Fig. 4.9 shows a more link-centric way to illustrate a part of the results condensed in the heat-maps. It allows visualizing the utilization of each link separately and, in particular, seeing how the ignition of some sleeping links reduces the utilization of the most loaded ones. This is particularly visible for the USNET network: at traffic matrix #7, the ignition of links reduced the utilization of other links to keep it below 80%.
[Figure 4.9 panels: (a) Italian, (b) USNET, (c) NSFNet, (d) AT&T, (e) Germany50, (f) Geant; per-link utilization as a function of the traffic matrix index, with shortest path routing on the left and STREETE on the right.]
The figures illustrate very well that shortest path routing does not suit these networks. Very few elephant flows consume most of the bandwidth. These flows rapidly congest some of the links while most links have very low utilization. Note that there are fewer highly utilized links when the STREETE framework is active. This result is particularly visible on the USNET network (Fig. 4.8b and 4.9b) and may be counter-intuitive.
An illustration of the reason is given in Figure 4.10 where two flows are routed in the
network and, when the link hd is down, the flows take a detour over longer paths. As a
result, none of the links is at critical load. However, if the link is on, the flows are routed
over hd and its utilisation rises above the threshold α.
[Figure 4.10: two flows f_{a,d} = f_{g,d} = 0.45·c; with the link hd off, the flows detour and every traversed link stays at utilization 0.45, whereas with hd on the two flows share a common link whose utilization reaches 0.9.]
We can conclude that STREETE behaves very well in all these scenarios. A large number of network links is turned off, enabling effective energy savings of up to 56% of the power consumed by the network links. We only consider uniform link speeds in this work. During the hours of low utilization, most networks are reduced to spanning trees. Moreover, STREETE is able to adapt the network to rises in traffic demand by turning links on when capacity is needed. Nevertheless, these results also indicate that STREETE was tested under “comfortable” conditions, because it never faces a highly utilized network. In the next section we test the limits of the algorithm.
[Figure 4.11 panels: Italian, USNET, NSFNet, AT&T, Germany50 and Geant; number of active links (y axis) as a function of normalized time (x axis, 0 to 5).]
Figure 4.11: The number of links kept active in the network under continuous growth of network traffic. The x axis is normalized to the duration until congestion occurs with shortest path routing.
exceptionally high in this network, creating a high overall network load. It is perfectly
normal that the algorithm turned on a lot of links to avoid congestion.
Something to be noted in Fig. 4.11 is the fact that STREETE is sometimes able to keep the network out of congestion longer than shortest path routing. For example, this is the case in the USNET network, where STREETE is capable of routing more than twice the amount of traffic of classical shortest path routing. The reason for this behavior was explained in the previous section, Fig. 4.10.
CPLEX is able to successfully route the demands in the network well beyond the point of congestion in the full network topology G with shortest path routing. Fig. 4.12 and 4.13 illustrate the reason behind this case on the example of the Generated200 network.
In the same scenario of constant traffic growth, both classical shortest path routing and STREETE congested the Generated200 network just after tm = 4. From the results provided for shortest path routing (left part of Fig. 4.12b), it becomes clear that the congestion occurs on a single link. Most of the links are not even close to having a high utilization. In this context, STREETE is even able to keep a lot of links off and reduce the energy consumption of the network (Fig. 4.12a, right).
Fig. 4.13 shows the distribution of the load in the network topology when STREETE is used.
[Figure 4.12 panels: (a) Generated200, link state heat-map; (b) Generated200, per-link utilization; both for traffic matrix indices 0 to 4.]
Figure 4.12: The heat-map (left) and per-link utilization (right) in the Generated200
network with increasing uniform all-to-all traffic
Figure 4.13: STREETE in the Generated200 network with the highest traffic before congestion (tm = 4 in Fig. 4.12)
The highly loaded links from 4.12b can be easily identified by their orange/red color at the top-middle and on the bottom-left of the figure. The routing techniques based on shortest paths tend to use these links frequently because they provide shortcuts for otherwise longer routes.
To make STREETE behave better under high network utilization, in the next chapter
we propose a solution that allows it to reach the quality of CPLEX at the cost of increased
computational complexity.
Computational time
One of our main goals was to create an online solution, capable of reacting fast to network changes. The computational time is very important, especially when it is necessary to turn links on to avoid an imminent congestion.
Table 4.2 indicates the maximum time taken by the algorithm on each of the analyzed networks. The most interesting results are provided by the Germany50, CORONET and Generated200 networks, because those are the biggest. The implementation with dynamic graph algorithms and a binary heap for Dijkstra computations allowed scaling
Table 4.2: Maximum time taken by STREETE and CPLEX on each of the analyzed
networks
Path stretch
Fig. 4.14 indicates how much the paths in the network increase due to consolidating the flows on a subset of network links. It is obvious that this action drastically extends some network paths. A minority are almost 15 times longer than the shortest path. A solution to this problem is to force some additional links to remain active. This decision has to be made at a topological level. We propose to use graph spanners [89] instead of spanning trees to guarantee that no unnecessarily long paths are created in the network as a result of turning links off. This case is left for future work.
Open Network Operating System (ONOS) is an initiative to build an SDN [96] con-
troller that relies on open-source software components. This SDN controller facilitates
the work of developing network applications by providing northbound abstractions and
southbound interfaces that transparently handle the forwarding devices, such as Open-
Flow switches and other legacy network devices. In addition to a distributed core that
enables control functions to be executed by a cluster of servers, ONOS provides two inter-
esting northbound abstractions, namely the Intent Framework and the Global Network
View.
[Figure: ONOS architecture — SDN applications (App 1, App 2, App 3) on top of the Distributed Core; intents are translated into instructions for network devices and pushed through southbound interfaces such as OpenFlow and NetConf.]
Figure 4.16 illustrates our platform and its main components, depicting the deployment of switches, an SDN controller and applications. At a smaller scale, the platform comprises components that are common to other infrastructures set up for networking research [97, 98, 99]. Moreover, we attempt to employ software used at the Grid5000 testbed [100] [4], with which we intend to integrate the platform.
[Figure 4.16: the experimental platform — cluster nodes connected to managed ePDUs that provide power consumption information.]
To use the platform, a user requests: a slice or a set of cluster nodes to be used by an
application as virtual switches or serving as traffic sources and sinks, an OS image to be
deployed and a network topology to be used (step 1). We crafted several OS images so that
nodes can be configured as SDN controllers and OpenFlow software switches, as discussed
later. A bare-metal deployment system is used to copy the OS images to the respective
nodes and configure them accordingly [101], whereas a Python application configures
VLANs and interfaces of the virtual switches emulating point-to-point interconnects to
create the user-specified network topology.
Once the nodes and the network topology are configured, the user deploys his or her application (step 2 in Figure 4.16). All cluster nodes are connected to enclosure Power Distribution Units (ePDUs) [5] that monitor the power consumption of individual sockets [102]. This information on power consumption may be used to evaluate the efficiency of an SDN technique (step 3).
The data plane comprises two types of OpenFlow switches, namely software-based and hardware-assisted. The former consists of a vanilla Open vSwitch (OVS) [103], whereas in the latter OVS offloads certain OpenFlow functionalities to NetFPGA cards [104] [6]. We use a custom OpenFlow implementation for NetFPGAs, initially provided
by the Universität Paderborn (UPB) [105], that performs certain OpenFlow functions in
the card, e.g. flow tables, packet matching against tables, and forwarding. Although the
NetFPGA cards are by default programmed as custom OpenFlow switches, a user can
reprogram them for different purposes by copying a bitstream file to their flash memories
and rebooting the system.
The current testbed [7], depicted in Figure 4.17, comprises eight servers: five Dell R720 servers equipped with a 10 Gbps Ethernet card with 2 SFP+ ports each and three
HP Z800 servers with NetFPGA cards with 4 SFP+ ports each. All servers also have multiple 1 Gbps Ethernet ports. The SFP+ ports have optical transceivers and are all interconnected by a Dell N4032F L3 switch, whereas two 1 Gbps Ethernet ports of each server are connected to a Dell N2024 Ethernet switch. This configuration enables testing multiple network topologies.
[4] https://www.grid5000.fr
[5] http://www.eaton.com/Eaton/index.htm
[6] http://netfpga.org/site/#/systems/3netfpga-10g/details/
[7] The testbed was financially supported by the chist-era SwiTching And tRansmission (STAR) project
The infrastructure and the use of ONOS satisfy some requirements of energy-aware traffic engineering, namely providing actual hardware, allowing traffic information to be gathered, using actual network protocols, enabling the overhead of control and management to be measured, and monitoring the power consumption of the equipment. Some energy-optimisation mechanisms, however, are still emulated, such as switching off/on individual switch ports. Although the IP cores of the Ethernet hardware used in the NetFPGA cards enable changing the state of certain components, such as switching off transceivers, that would require a complete redesign of the employed OpenFlow implementation. It has therefore been left for future work.
Figure 4.17: Photo of the experimental platform
We implemented the algorithms as an ONOS OSGi module. The base of our solution was the “Segment Routing” module which is available in the ONOS distribution. As a result, we leverage features of Segment Routing/SPRING for traffic engineering, such as the possibility to atomically change routes only on ingress devices without having to synchronize updates between devices. This flexibility comes at the cost of a small increase of the mean packet header size.
changes in topology as well as link utilisation, and periodically evaluates whether there are links to switch off or on. If changes in link availability are required, the energy-aware module requests a flow-rule update from the Flow-Rule Population module.
The experimental validation of our solutions allowed us to discover some elements to
be taken into consideration, which are presented in the next section.
information about link utilisation from the switches every second and that a decision to power a given link back on may be taken and enforced quickly.
Figure 4.19: Interfaces of the simulator and the testbed showing a data flow avoiding the
shortest path
We performed a simple test and measured the time needed for a controller to decide
whether a link should be switched on. A small network topology was considered, as
depicted in Figure 4.19. The Figure also shows the ONOS graphical interface and a data
flow (green lines). The network starts with a minimal number of links turned on, forming
a spanning tree, and with a TCP flow that nearly exceeds the utilisation threshold, above
which the controller decides to turn on more links to handle congestion. A second flow is
then injected, thus exceeding the threshold and forcing the controller to switch links on;
we measure the time from flow injection to a switch-on decision. In the simulation, the
decision takes on average 1.075 seconds, with most of the time spent gathering information
on link utilisation. In the testbed, the time is on average 20% higher than in simulation.
We notice that the differences in results between simulation and real testbeds are generally due to simulations assuming zero delay at multiple parts of the processing pipeline and to the manner in which network events are handled. While a single delay simplification would have a marginal impact on the results, multiple delays along the packet processing pipeline can account for up to 30% difference in the time to react to changes. Examples of
delay simplifications during discrete-event simulations include: instantaneous insertion of
forwarding rules into the data path, immediate update of routing tables, fast propagation
of flow counters from the simulated hardware ports to the software of the SDN switch, and
instantaneous processing of IP UDP/TCP packets. Generally, the only delay properly
handled by a simulator is packet queueing time.
Existing work has already shown that updating the data path forwarding rules is slow in current commercial SDN switches [77]. Google employees report [78] that their SDN-based WAN had an outage due to this issue of slow propagation of forwarding rules. Improvements can be made in the simulation software to account for some delay, and in the hardware design itself to reduce the time to propagate rules.
Other issues that we investigated concern the stability of the algorithms and the im-
pact of traffic re-routing on TCP flows. Unlike traditional networks where changes in link
availability are sporadic, under energy efficient traffic engineering, frequent changes may
be the rule. Re-routing TCP flows, however, may lead to serious performance degradation
due to segments arriving out of order, which in turn result in multiple duplicate ACKs and hence trigger the TCP congestion algorithms at the source. Even though the algorithms in the simulator mimic the behaviour of their corresponding theoretical models, they differ from the actual network software implementations provided by certain operating systems. In our in-depth analysis, which we present in chapter 6 of this thesis, we observed that the Linux kernel, for instance, includes several non-standard optimisations [107]. While simulations highlighted that re-routing TCP flows severely impacts the throughput of the transported TCP flows, empirical evaluation on the testbed demonstrated almost no impact under the same conditions. We believe that existing work that wraps real network software stacks into simulators [8] may help minimise this issue.
4.5 Conclusion
Solutions for improving the energy efficiency of wired computer networks propose to
turn the links off during periods of low network utilization. This chapter follows this
trend and proposes a reactive Software-Defined Network (SDN)-based framework which
dynamically monitors the network status and turns the links on or off to provide the
capacity needed for routing the network flows. Compared to the related work in this
field, we focus on an online solution capable of instantly reacting to unpredictable network events and keeping the network out of congestion while maintaining a low number of active links, thus reducing the energy consumption of the network.
To maintain a low computational overhead, our algorithm relies on an innovative idea which maintains two different views of the network topology in parallel. This allows improving the quality of the results, especially for finding the right links to turn on, a problem which is mostly overlooked by the research community. Moreover, it allows reducing the impact of historical decisions on the execution of the turn-off algorithm.
We implemented our solution using state-of-the-art dynamic shortest path algorithms to further increase the execution speed. Moreover, relying entirely on shortest path computations, together with the use of the SPRING source routing protocol, allows keeping the control traffic overhead at a minimum.
We evaluate the proposed algorithms using real backbone network topologies with real traffic matrices and conclude that good results are obtained at low traffic load, both in the number of turned-off links and in the ability to avoid network congestion. We also implemented a proof of concept on a testbed using the ONOS SDN controller in order to evaluate the framework on real hardware.
This extensive validation allowed us to detect that there is room for improving our solution at high load, leading to the work which we will present in the next chapter. Moreover, the validation on the testbed revealed some instabilities due to the poor interaction between STREETE and TCP congestion control, leading to the analysis performed in chapter 6 of this thesis.
[8] http://www.wand.net.nz/~stj2/nsc
Chapter 5
Consume better: traffic engineering for optimizing network usage
In the previous chapter, we presented a framework for reducing the energy consumption of
IPoWDM networks by turning links on and off depending on the network load. However,
despite the energy saving potential and small footprint of the presented solution, the
results highlighted a potential shortcoming: a lot of resources can remain underutilized
due to the minimum hop shortest path routing. In this chapter, we intend to address this
weakness by including a load balancing technique in the STREETE framework to enable
better utilization of the active links.
We start the chapter with a description of our first attempt to design such a load balancing technique and justify why we did not explore this direction further. In the second part of the chapter, we present another, much more elegant and efficient, attempt to design and integrate into STREETE a Software-Defined Network (SDN) based online traffic engineering solution built on a state-of-the-art approximate multi-commodity flow algorithm, in order to keep the computational complexity as small as possible. This solution also leverages the power of SPRING source routing to reduce the complexity and cost of centralized management.
[Figure: link cost as a function of link utilization; the cost increases sharply as the utilization grows.]
shortest paths with this cost function will naturally divert the traffic from the highly utilized links by choosing paths that are longer in terms of hop count but have a lower cost.
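The exact cost function used here is the one shown in the figure above; as a purely illustrative stand-in (an assumption, not the function evaluated in this chapter), a convex piecewise-linear cost in the spirit of Fortz-Thorup OSPF weight setting behaves in the way just described:

```cpp
// Illustrative convex piecewise-linear link cost (Fortz-Thorup style); NOT the
// exact function of this chapter. The cost per unit of traffic rises sharply as
// utilization grows, so shortest paths computed on these costs avoid heavily
// loaded links.
double linkCost(double utilization) {
    if (utilization < 1.0 / 3.0) return 1.0;
    if (utilization < 2.0 / 3.0) return 3.0;
    if (utilization < 0.9)       return 10.0;
    return 70.0;                 // close to congestion: strongly discouraged
}
```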
The computational complexity of such a solution may be quite significant. A trivial algorithm will compute the shortest path for each origin-destination pair, resulting in |V|² shortest path computations. To reduce the complexity, we leverage a particularity of the sizes of network flows. More precisely, different authors [108] [109] [110] analyzed real traffic traces and concluded that most network traffic is transported by a low number of elephant network flows. This result was also confirmed in the case of aggregated origin-destination demands [111]. Our solution starts by breaking the demands from the traffic matrix into two groups. The first one contains the demands of the mice flows, while the second one contains the demands of the elephant flows. For this purpose, we use a classification algorithm designed for heavy-tailed distributions [112].
The algorithm routes all the elephant demands one by one, using a dedicated
shortest path computation per demand. However, to keep the computational overhead low,
the mice demands are routed per source: all the demands from the same source are
routed using our custom modified shortest path algorithm, which greedily adapts the
link costs while computing the shortest paths.
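As a rough illustration of this split and of the resulting routing order, the sketch below uses a simple volume threshold in place of the heavy-tail classifier of [112]; the data layout and names are ours, not those of the actual implementation.

# Illustrative sketch: split the traffic matrix into elephant and mice demands,
# then route elephants individually and mice per source node.
def split_demands(demands, threshold):
    """demands: dict mapping (src, dst) -> volume. A simple volume threshold
    stands in here for the heavy-tail classifier of [112]."""
    elephants = {k: v for k, v in demands.items() if v >= threshold}
    mice = {k: v for k, v in demands.items() if v < threshold}
    return elephants, mice

def group_mice_by_source(mice):
    """Mice demands are routed per source: one tree computation per source node."""
    by_source = {}
    for (src, dst), volume in mice.items():
        by_source.setdefault(src, {})[dst] = volume
    return by_source

# Example: 4-node toy traffic matrix, threshold chosen arbitrarily.
demands = {("a", "b"): 90.0, ("a", "c"): 2.0, ("b", "d"): 1.5, ("c", "d"): 70.0}
elephants, mice = split_demands(demands, threshold=10.0)
print(sorted(elephants))            # [('a', 'b'), ('c', 'd')] -> routed one by one
print(group_mice_by_source(mice))   # {'a': {'c': 2.0}, 'b': {'d': 1.5}}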
5.1.2 Algorithms
The only non-classical algorithm used by our solution is a modified Dijkstra created for
routing the mice flows. It is a greedy algorithm which computes a tree for routing the
flows originating from a given source node $v_{root} \in V$ while dynamically updating the link
costs according to the cost function $l(e)$. In particular, during
the execution of the algorithm, the costs $l(e)$ are not static: they change as a consequence of
previous routing decisions and influence the selection of network paths for the subsequent
flows. Nevertheless, it remains a greedy algorithm: at each iteration, the previous choices
are never re-evaluated and only influence the subsequent ones.
Algorithm 6 Constructs the tree T which gives the routes for the demands originating
from the root node v_root towards all other nodes in the network while greedily respecting
the link costs defined by the function l(e)
1: function RouteInTree(T(V, E_T), v_root ∈ V, d_{v_root,*})
2:     for all v ∈ V do
3:         distance[v] ← ∞
4:         parent[v] ← undefined
5:     end for
6:     treatedList ← []                        ▷ empty list of nodes
7:     visitedList ← [v_root]                  ▷ list of nodes containing one element (v_root)
8:     distance[v_root] ← 0
9:     while visitedList.notEmpty() do
10:        a ← visitedList[0]
11:        lowestDist ← ∞
12:        for all v ∈ visitedList[1:] do      ▷ all elements in the list except the first one
13:            distance[v] ← ∞
14:            for all u ∈ V s.t. uv ∈ E_G and u ∈ treatedList do
15:                if distance[v] > distance[u] + l(uv) then
16:                    distance[v] ← distance[u] + l(uv)
17:                    parent[v] ← u
18:                end if
19:            end for
20:            if lowestDist > distance[v] then
21:                lowestDist ← distance[v]
22:                a ← v
23:            end if
24:        end for
25:
26:        visitedList.delete(a)
27:        treatedList.append(a)
28:        if a ≠ v_root then
29:            E_T ← E_T ∪ {(parent[a], a)}
30:        end if
31:
32:        v ← a
33:        while v ≠ v_root do
34:            f(parent[v], v) ← f(parent[v], v) + d_{v_root,v}
35:            v ← parent[v]
36:        end while
37:        for v ∈ treatedList[1:] do          ▷ all elements in the list except the first one (v_root)
38:            distance[v] ← distance[parent[v]] + l((parent[v], v))
39:        end for
40:
41:        for all w ∈ V s.t. aw ∈ E_G and w ∉ treatedList ∪ visitedList do
42:            visitedList.append(w)           ▷ non-visited neighbors of a
43:            parent[w] ← a
44:        end for
45:    end while
46: end function
[Figure: the candidate cost functions explored in our tests, each plotted as link cost versus link utilization]
Considering the variables $f_e = \sum_{(i,j) \in V^2 : d_{i,j} > 0} f_e^{i,j}$, the model objective is to minimize
the utilization of the most loaded link:

Minimize: $\max_{e \in E} \left( \frac{f_e}{c(e)} \right)$

Subject to the flow conservation and edge capacity constraints:

$$\sum_{e \in \omega_v^-} f_e^{i,j} - \sum_{e \in \omega_v^+} f_e^{i,j} = \begin{cases} -d_{i,j}, & \text{if } v = i \\ d_{i,j}, & \text{if } v = j \\ 0, & \text{otherwise} \end{cases} \qquad \forall v \in V,\ \forall (i,j) \in V^2 : d_{i,j} > 0 \quad (5.1)$$

$$f_e \le c_e \qquad \forall e \in E \quad (5.2)$$
[Figure: maximum link utilization versus normalized time for the USNET, Italian, NSFNet, Germany50, and Geant networks]
Figure 5.3: Utilization of the most loaded link under continuous growth of network traffic.
x axes are normalized to the moment of congestion with classical shortest path routing.
The results are shown in Fig. 5.3. We slightly improved the distribution of the load
on the network and kept it out of congestion for up to 3 times longer than with shortest
path routing. However, the quality of the result is volatile: the maximum link load varies
a lot. Our tests showed that even a tiny variation in the shape of the cost function could
lead to extreme changes in the quality of the final solution.
Moreover, the additive nature of the cost of network paths does not suit the problem.
An unfavorable case that frequently occurs in practice is depicted in Fig. 5.4. Two paths
are possible between nodes a and c: i) (ad, dc); and ii) (ab, bc). The first path contains
two links with a utilization of 50%. The second path passes through two links, with
utilizations of 90% and 5%. At the same time, due to the additive nature of the path
lengths, both paths have the same length of 4 and are considered equally good for
routing the flow ac. This behavior is not desirable in a load balancing technique: the
first path should probably be preferred, to avoid increasing the utilization of the link ab
even further.
[Figure: a four-node example; path (ad, dc) has both links at 50% utilization with costs 2 + 2, while path (ab, bc) has links at 90% and 5% utilization with costs 3.5 + 0.5]
Figure 5.4: Two paths of equal cost between nodes a and c
Our preliminary tests showed particularly bad results with heterogeneous links: the
flows were not discouraged from passing through low-speed links, which quickly led to
their congestion.
We believe that techniques may exist to improve the stability of the presented
approach, but we abandoned this direction in favor of the solution proposed in the next
section.
As a consequence of iteratively routing small fractions of the demands at each iteration
of the algorithm, the integral demands will be split across multiple paths. The originality
of our contribution consists in proposing a way to efficiently achieve this multipath for-
warding by combining centralized and distributed optimization. This also reduces
the control traffic overhead, because the SDN controller does not have to explicitly orchestrate the
data planes of the SDN switches. The SDN controller transmits only a set of con-
straints to each switch. These constraints are used to re-compute the network paths and
update the forwarding tables locally. We rely on the SPRING source routing protocol
to atomically change the paths of the flows by updating only the forwarding
database of the ingress network devices.
The solution that we propose combines i) the advantage of making global routing
decisions in the SDN controller; with ii) the offloading of some computations and of the
multipath forwarding rule construction to the SDN switches.
[Figure: (a) the total flow in a small four-node network, and (b)-(e) the decomposition of this flow into the flows starting from each of the nodes a, b, c, and d]
Figure 5.5: The total flow in the network (5.5a), and the decomposition of this flow per
source node
Fig. 5.5 shows an example of the output computed by this algorithm on a small network
with hypothetical all-to-all communications. In particular, we are interested in the fol-
lowing output: i) the best possible routing of the flow in the network, which minimizes
the utilization of the most loaded link (Fig. 5.5a); ii) the decomposition of this multi-
commodity flow per source node (Fig. 5.5b - 5.5e).
The second step of our solution corresponds to sending the output of the previous
algorithm to the corresponding SDN switches. For example, it sends the following list to
node a: ((ab, 1.43), (cd, 19.03), (ad, 37.00), (ac, 20.85)). In particular, the SDN controller
informs each SDN switch of how the data it originates must be transmitted through the
network, which provides a way to achieve a global optimization of the network traffic. On
the other hand, the data transmitted to the SDN switches does not contain any explicit
information about the routes to be taken by each origin-destination flow. The communi-
cation overhead decreases, but the ingress SDN switches must now locally compute the
forwarding paths in a way that respects the constraints given by the centralized controller.
For example, switch a will try to locally route the traffic to match the flow from
Fig. 5.5b.
The last step of our solution is to locally compute the forwarding rules on the SDN
switches while matching the flow transmitted by the SDN controller. We insist on the
fact that the flow sent by the SDN controller is, by construction, a valid flow, but not
necessarily a minimum-cost one. As a consequence, it may contain cycles, which we
would like to remove to simplify the task of building the forwarding rules. An example
of cycle is shown in Fig. 5.5d: a → b → a. The SDN switch starts by applying the
cycle-canceling algorithm [113] on the received flow. This operation has the advantage of
outputting a Directed Acyclic Graph (DAG). The SDN switch can now easily compute the
forwarding paths in the network by leveraging the properties of DAGs, in particular
the fact that a DAG can be topologically sorted. Moreover, the switch has visibility into
the composition of the flow at a granularity which cannot be transmitted to the SDN
controller. It can leverage this knowledge to split the origin-destination flows into sub-
flows and send each sub-flow over a separate path to correctly match the constraints imposed by
the controller.
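To make the cycle-canceling step concrete, a minimal sketch is given below. It assumes that the per-source flow is represented as a dictionary of edge amounts (an assumption of ours, not the data structure of the actual implementation) and simply subtracts the bottleneck flow along each directed cycle until none remains.

# Illustrative sketch: cancel cycles in a per-source flow so that the
# remaining positive-flow edges form a DAG.
def find_cycle(flow):
    """Return a directed cycle (list of edges) among edges with positive flow,
    or None if the positive-flow graph is already acyclic."""
    graph = {}
    for (u, v), amount in flow.items():
        if amount > 1e-9:
            graph.setdefault(u, []).append(v)
    visiting, done, stack = set(), set(), []        # stack = current DFS path

    def dfs(u):
        visiting.add(u)
        stack.append(u)
        for v in graph.get(u, []):
            if v in visiting:                       # back edge closes a cycle
                i = stack.index(v)
                nodes = stack[i:] + [v]
                return list(zip(nodes, nodes[1:]))
            if v not in done:
                cycle = dfs(v)
                if cycle:
                    return cycle
        stack.pop()
        visiting.discard(u)
        done.add(u)
        return None

    for node in list(graph):
        if node not in done:
            cycle = dfs(node)
            if cycle:
                return cycle
    return None

def cancel_cycles(flow):
    """flow: dict {(u, v): amount}. Subtract the bottleneck flow along each
    cycle until the positive-flow edges form a DAG."""
    cycle = find_cycle(flow)
    while cycle is not None:
        bottleneck = min(flow[e] for e in cycle)
        for e in cycle:
            flow[e] -= bottleneck
        cycle = find_cycle(flow)
    return flow

# Example: an a -> b -> a cycle similar to the one of Fig. 5.5d (values illustrative).
f = {("a", "b"): 1.0, ("b", "a"): 0.4, ("a", "c"): 2.0}
print(cancel_cycles(f))   # {('a','b'): 0.6, ('b','a'): 0.0, ('a','c'): 2.0}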
Notations

(a) Primal: Maximize $\lambda$ under the constraints:
$$\sum_{P \in \mathcal{P}_{i,j}} f(P) = \lambda \cdot d_{i,j} \qquad \forall (i,j) \in V^2 : d_{i,j} > 0$$
$$\sum_{P \in \mathcal{P} : e \in P} f(P) \le c(e) \qquad \forall e \in E$$
$$f(P) \ge 0 \qquad \forall P \in \mathcal{P}$$

(b) Dual: Minimize $\sum_{e \in E} l(e)c(e)$ under the constraints:
$$\sum_{e \in P} l(e) \ge l(SP_{i,j}) \qquad \forall (i,j) \in V^2 : d_{i,j} > 0,\ \forall P \in \mathcal{P}_{i,j}$$
$$\sum_{(i,j) \in V^2 : d_{i,j} > 0} l(SP_{i,j})\, d_{i,j} \ge 1$$
$$l(e) \ge 0 \qquad \forall e \in E$$

Figure 5.6: The primal and dual of the maximum concurrent flow problem in a path-flow
formulation
For convenience, we summarize the notations. Fig. 5.6 shows the edge-path formula-
tion of the maximum concurrent flow problem that was introduced in a previous chapter.
We recall that $P$ is a path in the network, $\mathcal{P}$ represents the set of all paths,
and $\mathcal{P}_{i,j}$ is the set of all paths starting at node $i \in V$ and finishing at node $j \in V$.
Moreover, $f(P)$ is the amount of flow passing through the path $P$ and $d_{i,j}$ is the demand
from node $i$ to node $j$. The functions $l : E \to \mathbb{R}^+$ and $c : E \to \mathbb{R}^+$ give, respectively,
the cost of passing through a link $e \in E$ and the capacity of this link. Finally, $SP_{i,j}$ is
the shortest path among all the paths from $i$ to $j$, implicitly with respect to the length
function $l$. To simplify the notations, let $D(l) = \sum_{e \in E} l(e)c(e)$ be the value of the dual
objective function; let $D(l^*)$ be the dual optimum; and define $a(l) = \sum_{(i,j) \in V^2 : d_{i,j} > 0} l(SP_{i,j})\, d_{i,j}$
to be the expression from the last constraint of the dual.
Algorithm
Alg. 7 presents the algorithm as implemented and used by our solution. It closely
follows the original algorithm [5]. Nevertheless, the structure was reorganized for easier
implementation and for the possibility of computing the per-source flow. Moreover, we incorpo-
rated the details of some non-trivial building blocks used as primitives by Karakostas.
The algorithm starts with the trivial valid solution $f = 0$ for the primal LP and a
constant length function $l(e) = \delta/c(e)$, which represents an infeasible solution for the dual LP.
After that, the algorithm proceeds in multiple phases to construct the optimum solution
for the dual LP. Each phase corresponds to an iteration of the loop at line 4. $\delta$ is a very
small constant which is crucial to the proof of the algorithm and is specifically chosen for
this purpose. More details on this constant are provided in the next section.
In each phase, for each source node $src$, the algorithm uses the shortest path tree
$T_{src}$ rooted at this source, computed by Dijkstra's algorithm. This allows the demands
towards all the destinations to be routed simultaneously (line 10). The details of
the RouteInTree procedure were provided in the previous chapter (Section 4.2.3). The
algorithm then increases the length/cost of the links that have just been used, expo-
nentially in the amount of data that transited through each link (line 17). From an LP
perspective, this action corresponds to adjusting the dual variables associated with the most
violated constraints, while keeping a relationship between the variables
of the primal and those of the dual.
A more intuitive interpretation can also be given: the algorithm repeatedly routes the
demands over the least loaded links and increases the cost of these links exponentially in
the added load. This way, the most loaded links become more costly and are less
likely to be used in the next steps of the algorithm.
By construction, the computed flow respects the flow conservation and
non-negativity constraints but may violate the link capacity constraints. At the end, the
flow must be scaled down to obtain the minimum-utilization routing used in the next
steps of our solution.
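To give a rough, self-contained illustration of this core mechanism, the simplified sketch below routes each demand on its current shortest path once per phase and applies the exponential length update l(e) ← l(e)·(1 + ε·f/c(e)). It deliberately omits the capacity-bounded routing steps of the full algorithm, so it is only a sketch of the idea, not a faithful reimplementation of Alg. 7; links are assumed to form a connected directed graph given as a dict {(u, v): capacity}.

import heapq

def shortest_path(links, lengths, src, dst):
    """Dijkstra on directed links {(u, v): capacity} with costs {(u, v): l}.
    Assumes dst is reachable from src. Returns the list of edges of the path."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for (a, b) in links:
            if a != u:
                continue
            nd = d + lengths[(a, b)]
            if nd < dist.get(b, float("inf")):
                dist[b], prev[b] = nd, a
                heapq.heappush(heap, (nd, b))
    path, node = [], dst
    while node != src:
        path.append((prev[node], node))
        node = prev[node]
    return list(reversed(path))

def concurrent_flow(links, demands, eps=0.1):
    """links: {(u, v): capacity}; demands: {(src, dst): volume}.
    Returns an (unscaled) flow per link; it must be scaled down afterwards."""
    m = len(links)
    delta = (1 + eps) ** (-(1 - eps) / eps) * ((1 - eps) / m) ** (1 / eps)
    length = {e: delta / c for e, c in links.items()}
    flow = {e: 0.0 for e in links}
    while sum(length[e] * links[e] for e in links) < 1.0:    # while D(l) < 1
        for (src, dst), d in demands.items():                # one phase
            for e in shortest_path(links, length, src, dst):
                flow[e] += d
                length[e] *= 1 + eps * d / links[e]          # exponential update
    return flow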
The proof that the algorithm efficiently computes a solution to the maximum concur-
rent flow with a precision $\epsilon$ is done in three steps:
• provide an upper bound for the dual optimum $D(l^*)$ at the end of the algorithm;
• provide a lower bound for the primal optimum $\lambda^*$ at the end of the algorithm;
• show that the dual to primal ratio $\frac{D(l^*)}{\lambda^*} \le 1 + \epsilon$ (in an optimal solution, the ratio
$\frac{D(l^*)}{\lambda^*} = 1$; moreover, by the weak duality property, $D(l) \ge \lambda$ for any $D(l)$ and $\lambda$.
This way, we prove that the algorithm computes an $\epsilon$-approximation of the optimal
solution.)
i) To provide an upper bound on the resulting dual solution, we assume that $D(l^*) \ge 1$.
If $D(l^*) < 1$, it is possible to scale down all the demands $d_{i,j}$ to ensure that $D(l^*) \ge 1$.
In fact, having $D(l^*) < 1$ corresponds to the case when the network capacity is not
sufficient to route all the demands. A possible scaling procedure consists in routing the
demands into the network while ignoring the link capacity constraints. This allows us to
detect the maximum violation of a link capacity: $r = \max\left(\frac{f_{uv}}{c(uv)} \mid uv \in E\right)$. Scaling all the
flows down by $r$ will ensure that $D(l^*) \ge 1$.
Let $l_i$ be the length function $l$ at the end of phase $i$ (i.e., of iteration $i$ of the
loop at line 4). The operation at line 16 increases $D(l)$ by at most $\epsilon \cdot a(l_i)$ during the
phase $i$. In fact, during the iterations of the outer loop, the increment added by each
execution of line 16 ($\epsilon \cdot l(uv) \cdot f(u,v)$) gradually computes the same sum as $\epsilon \cdot a(l_i)$
(i.e., $\epsilon \sum_{(i,j) \in V^2 : d_{i,j} > 0} l(SP_{i,j}) d_{i,j}$), but with lower link lengths. Considering that the lengths
monotonically increase at line 17, at the end of phase $i$, $a(l_i)$ will be bigger than the
same sum computed with the lower link lengths.
As a result, at the end of each phase $i$,
$$D(l_i) \le \frac{D(l_{i-1})}{1 - \epsilon/D(l^*)}$$
The previous recursive relation, together with the fact that $D(l_0) = m\delta$ at line 3,
implies
$$D(l_i) \le \frac{m\delta}{(1 - \epsilon/D(l^*))^i}
= \frac{m\delta}{1 - \epsilon/D(l^*)} \left(1 + \frac{\epsilon}{D(l^*) - \epsilon}\right)^{i-1}
\le \frac{m\delta}{1 - \epsilon/D(l^*)}\, e^{\frac{\epsilon (i-1)}{D(l^*) - \epsilon}}$$
Under the assumption that $D(l^*) \ge 1$:
$$D(l_i) \le \frac{m\delta}{1 - \epsilon}\, e^{\frac{\epsilon (i-1)}{(1-\epsilon) D(l^*)}}$$
Taking into consideration that the algorithm finishes at the first iteration $t$ in which
$D(l) \ge 1$, we obtain $\frac{m\delta}{1-\epsilon} e^{\frac{\epsilon (t-1)}{(1-\epsilon) D(l^*)}} \ge 1$. As a result, at the end of the algorithm:
$$D(l^*) \le \frac{\epsilon (t-1)}{(1-\epsilon) \ln \frac{1-\epsilon}{m\delta}} \qquad (5.3)$$
ii) To provide a lower bound on the value of the primal solution $\lambda^*$, we observe that
each phase of the algorithm (except the last one) routes exactly $d_{src,dst}$ units of flow from
each source node $src$ to each destination node $dst$. As a result, if the algorithm finishes
during iteration $t$ of the main loop, it has routed at least $(t-1) \cdot d_{src,dst}$
units of each flow.
Another observation is that, at the end of any phase $i$ except the last one, the length
$l_i(e)$ of each edge $e$ is smaller than $1/c(e)$ (otherwise $D(l) = \sum_{e \in E} l(e)c(e) \ge 1$, which terminates
the algorithm).
the algorithm). Furthermore, this property is also maintained during the last phase by
the condition Dplq ă 1 in the loop 8. As a result, when the algorithm ends, this property
is violated at most once per edge by the line 17. The length of any edge e at the end of
the algorithm is thus at most lpeq ă p1 ` ǫq{cpeq. Knowing that the lengths are initialized
to δ{cpeq at the start of the algorithm (line 2), each edge finishes with a length which is
at most p1`ǫq{cpeq
δ{cpeq
“ 1`ǫ
δ
times its initial length.
Moreover, the operation at line 13 ensures that routing $c(e)$ units of flow through
an edge $e$ increases its length by at least a factor of $1+\epsilon$. As a result, the capacity of any edge is
violated by at most a factor of $\log_{1+\epsilon} \frac{1+\epsilon}{\delta}$.
Finally, all these observations imply that the algorithm routes at least $t-1$ units of
each commodity, while violating the capacity of any edge by at most a factor of $\log_{1+\epsilon} \frac{1+\epsilon}{\delta}$.
Scaling the flow down by $\log_{1+\epsilon} \frac{1+\epsilon}{\delta}$ gives us a feasible flow for the primal LP.
As a result, at the end of the algorithm, a lower bound for $\lambda^*$ is given by:
$$\lambda^* \ge \frac{t-1}{\log_{1+\epsilon}\frac{1+\epsilon}{\delta}} \qquad (5.4)$$

$$\frac{D(l)^*}{\lambda^*}
\le \frac{\frac{\epsilon(t-1)}{(1-\epsilon)\ln\frac{1-\epsilon}{m\delta}}}{\frac{t-1}{\log_{1+\epsilon}\frac{1+\epsilon}{\delta}}}
= \frac{\epsilon \cdot \log_{1+\epsilon}\frac{1+\epsilon}{\delta}}{(1-\epsilon)\ln\frac{1-\epsilon}{m\delta}}
= \frac{1}{1-\epsilon} \cdot \frac{\epsilon}{\ln(1+\epsilon)} \cdot \frac{\ln\frac{1+\epsilon}{\delta}}{\ln\frac{1-\epsilon}{m\delta}}
\le \frac{1}{1-\epsilon} \cdot \frac{\epsilon}{\epsilon - \epsilon^2/2} \cdot \frac{\ln\frac{1+\epsilon}{\delta}}{\ln\frac{1-\epsilon}{m\delta}}
\le \frac{1}{(1-\epsilon)^2} \cdot \frac{\ln\frac{1+\epsilon}{\delta}}{\ln\frac{1-\epsilon}{m\delta}} \qquad (5.5)$$
By setting
$$\delta = \frac{1}{(1+\epsilon)^{(1-\epsilon)/\epsilon}} \cdot \left(\frac{1-\epsilon}{m}\right)^{1/\epsilon}$$
the term $\frac{\ln\frac{1+\epsilon}{\delta}}{\ln\frac{1-\epsilon}{m\delta}}$ becomes equal to $\frac{1}{1-\epsilon}$, and $\frac{D(l)^*}{\lambda^*}$ becomes smaller than $(1-\epsilon)^{-3}$.
It is possible to pick $\epsilon$ such that this ratio is less than $1 + w$ for any $w > 0$.
As a result, the presented algorithm computes a $(1+\epsilon)$-approximation of the maximum
concurrent flow problem.
The asymptotic complexity bound proven by Karakostas [5] is $\tilde{O}(\epsilon^{-2}|E|^2)$, where
the notation $\tilde{O}$ hides the logarithmic factor $\log(|E|)\log(|V|)$. In practice, the algorithm
is very dependent on the chosen precision $\epsilon$ for more than one reason.
Evidently, $\epsilon$ determines the execution time. However, the quality of the solutions is affected
even more than the speed. For example, setting a precision of $\epsilon = 0.3$ may be tempting,
as the algorithm will provide a result almost ten times faster than with a precision of
$\epsilon = 0.1$. However, the resulting dual to primal ratio $\frac{D(l^*)}{\lambda^*} = (1 - 0.3)^{-3} = 2.91545$ led
to almost useless outputs during our preliminary tests.
Another, less obvious, dependence on $\epsilon$ is hidden in the initialization of the algorithm
at line 2: $\delta$ is a very small constant which depends on $\epsilon$ and on the number of links. Setting
$\epsilon$ too low (e.g., 0.01) not only drastically increases the execution time, but also
induces numerical issues due to the limited precision of floating point numbers.
To avoid these two issues, we empirically chose the precision $\epsilon = 0.1$, which
provides a dual to primal ratio of $(1-0.1)^{-3} = 1.37174$. Even in this case, we had to use the "long
double" type for our variables to avoid numerical issues on the biggest of our evaluated
networks: Generated200.
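The trade-off can be checked numerically from the expressions above; in the snippet below, m = 1148 corresponds to the link count of the Generated200 network mentioned in the text, and the other values of epsilon are only examples.

# Quick check of how the approximation guarantee and the constant delta
# depend on the chosen precision epsilon (m is the number of links).
def guarantee(eps):
    return (1 - eps) ** -3          # bound on the dual-to-primal ratio

def delta(eps, m):
    return (1 + eps) ** (-(1 - eps) / eps) * ((1 - eps) / m) ** (1 / eps)

for eps in (0.3, 0.1, 0.01):
    print(f"eps={eps}: ratio <= {guarantee(eps):.5f}, delta = {delta(eps, 1148):.3g}")
# eps=0.3 gives a ratio bound of 2.91545; eps=0.1 gives 1.37174; eps=0.01
# gives 1.03061 but makes delta so small that floating-point precision
# becomes a concern, as discussed above.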
[Figure: the flow originating from node a: 37.00 on link ad, 19.03 on link cd, 20.85 on link ac, and 1.43 on link ab]
Figure 5.7: Flow originating from the node a
Fig. 5.7 revisits the example that was presented in the previous section (5.2.1)
and illustrates the flow originating from node a. In this case, the information transmit-
ted by the network controller to node a consists of the list ((ab, 1.43), (cd, 19.03),
(ad, 37.00), (ac, 20.85)).
We evaluate the control traffic overhead in the Generated200 network. Even if we
assume the worst-case scenario, in which each of the nodes uses all the links, the total
network traffic overhead is $O(|V| \cdot |E|)$. We also consider an efficient encoding of the data
transferred through the network, where only 32 bytes are used to uniquely identify a
link and represent the amount of flow passing through it. In this case, the SDN controller
has to send into the network a total of 200 · 1148 · 32 bytes ≈ 8 MB of control
traffic. We consider this to be a reasonably small control traffic overhead for a network
with 200 nodes.
In practice, nodes do not use all the links for forwarding. For example, in Fig. 5.7, the
node uses fewer than half of the network links. The communication overhead will therefore
be smaller than this estimate.
Each ingress switch then installs the computed paths in the forwarding database of the
network device. For example, the node a from Fig. 5.7 will construct the following routing:
• The demand from a to b will be integrally sent over the path (ab)
• The demand from a to c will be integrally sent over the path (ac)
This section presents the algorithms proposed for this purpose. These algorithms rely
on two assumptions: i) the use of a source routing forwarding protocol, and ii) a high
aggregation level of the origin-destination demands.
The first assumption is satisfied by the use of the SPRING protocol, which enables the
ingress nodes to locally change the network routes without having to update the for-
warding tables of midpoint devices. The second holds in backbone networks, making
it possible to split the demands at a TCP/UDP flow granularity: out of the thousands of
TCP/UDP connections which make up the aggregated origin-destination flows, some connec-
tions will be forwarded over one path, while the others will take another path. Without a
large aggregation of flows, the efficiency of the proposed solution may decrease, because
it will be difficult to split the demands across multiple paths.
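As an illustration of this flow-granularity splitting, the sketch below hashes the 5-tuple of each TCP/UDP connection and maps it to one of the candidate paths in proportion to the per-path ratios computed by the ingress switch. The hashing scheme and data layout are illustrative assumptions, not the exact mechanism of our implementation; the key property is that the same connection always maps to the same path, so packets of a given connection are not reordered by the split itself.

import hashlib
from bisect import bisect_right
from itertools import accumulate

def pick_path(five_tuple, paths, ratios):
    """paths: list of candidate paths (e.g., SPRING segment lists); ratios:
    matching list of weights summing (approximately) to 1."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64       # uniform in [0, 1)
    boundaries = list(accumulate(ratios))
    return paths[min(bisect_right(boundaries, point), len(paths) - 1)]

# Example: 70% of the connections of the a->c demand over (a, c), 30% over (a, b, c).
paths = [["a", "c"], ["a", "b", "c"]]
ratios = [0.7, 0.3]
conn = ("10.0.0.1", 45231, "10.0.1.9", 443, "tcp")
print(pick_path(conn, paths, ratios))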
Algorithm 8 Compute the explicit routes for the flow from v_root towards all nodes of V
1: procedure ComputeRoutesInDAG(D(V, E_D), v_root ∈ V, d_{v_root,*}, sortedV)
2:     initialize ratio_{uv} ← 0, ∀uv ∈ E_D
3:     for all v ∈ V do
4:         totalInTraf ← 0
5:         for all (u, v) ∈ E_D do
6:             totalInTraf ← totalInTraf + f_{uv}
7:         end for
8:         for all (u, v) ∈ E_D do
9:             ratio_{uv} ← f_{uv} / totalInTraf
10:        end for
11:    end for
12:    R_{v_root, v_root} ← {((), 1)}
13:    for all v ∈ sortedV[1:] do
14:        R_{v_root, v} ← ∅
15:        for all (u, v) ∈ E_D do
16:            X ← {(R.P + uv, R.r · ratio_{uv}) | ∀R ∈ R_{v_root, u}}
17:            R_{v_root, v} ← R_{v_root, v} ∪ X
18:        end for
19:    end for
20: end procedure
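A compact runnable transcription of Algorithm 8 is given below, under the assumption that the per-source DAG flow is provided as a dictionary of edge values and that the nodes are already topologically sorted; it is a sketch of the same logic, not the switch-side implementation.

# Sketch of Algorithm 8: enumerate the explicit paths from v_root together
# with the fraction of the origin-destination demand carried by each path.
def compute_routes_in_dag(flow, v_root, sorted_v):
    """flow: dict {(u, v): amount} describing an acyclic per-source flow;
    sorted_v: nodes in topological order, starting with v_root.
    Returns routes[v] = list of (path, ratio) pairs from v_root to v."""
    # Fraction of the traffic entering v that arrives over each edge (u, v).
    total_in = {}
    for (u, v), amount in flow.items():
        total_in[v] = total_in.get(v, 0.0) + amount
    ratio = {(u, v): amount / total_in[v]
             for (u, v), amount in flow.items() if total_in[v] > 0}

    routes = {v_root: [((), 1.0)]}
    for v in sorted_v[1:]:
        routes[v] = []
        for (u, w), r_uv in ratio.items():
            if w != v or u not in routes:
                continue
            for path, r in routes[u]:
                routes[v].append((path + ((u, v),), r * r_uv))
    return routes

# Example with the flow of Fig. 5.7 (flow originating from node a).
flow_a = {("a", "b"): 1.43, ("a", "c"): 20.85, ("a", "d"): 37.00, ("c", "d"): 19.03}
routes = compute_routes_in_dag(flow_a, "a", ["a", "b", "c", "d"])
for path, r in routes["d"]:
    print(path, round(r, 3))
# (('a', 'd'),) 0.66 and (('a', 'c'), ('c', 'd')) 0.34: about two thirds of the
# a->d demand goes directly and one third via c.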
5.2.5 Evaluation
The algorithms were implemented in C++ and evaluated using the same methodology
as presented in the previous chapter (section 4.3).
[Figure: maximum link utilization versus normalized time for the USNET, Italian, NSFNet, ATT, Germany50, and Geant networks]
Figure 5.8: Utilization of the most loaded link under continuous growth of network traffic.
x axes are normalized to the moment of congestion with classical shortest path routing.
[Figure: link-load heat maps of the Italian network (color scale from 0.1 to 0.9)]
Figure 5.9: Italian network with the highest traffic before congestion. At the left: shortest
path routing at t = 1 on figure 5.8. At the right: LB with 4 times more traffic than the
shortest path routing (t = 4 on figure 5.8).
[Figure: link utilization versus normalized time for the (a) Italian, (b) USNET, (c) NSFNet, (d) AT&T, (e) Germany50, (f) Geant, (g) CORONET, and (h) Generated200 networks]
Fig. 5.11 shows the heatmap and the link load in the NSFNet and Geant networks
with daily traffic, i.e. the real traffic matrices introduced in section 4.3.1. As previously,
the load is effectively and uniformly distributed over the links of the network. The results
on other networks do not provide any additional useful information.
[Figure: (a) heatmap of the NSFNet links, (b) heatmap of the Geant links, (c) link utilization in NSFNet, (d) link utilization in Geant, all plotted against the index of the traffic matrix]
Figure 5.11: The heat-map and the link utilization in the NSFNet and Geant Networks
with the daily traffic
All these results, and others, can be obtained by repeating the simulations; please follow
the instructions from the end of the first chapter of this thesis.
Computational time
Tab. 5.1 summarizes the computational time taken by the algorithms. In the case of LB,
the numbers include both the time taken by the centralized optimization phase presented
in Section 5.2.2 and the time taken by the explicit computation of the forwarding paths presented in
Section 5.2.4. The precision used for Karakostas' algorithm is $\epsilon = 0.1$.
For completeness, we also provide the computational time taken by our first load
balancing attempt (CF), presented in Section 5.1. We conclude that LB delivers better results
while being, in most cases, faster than CF. This is particularly the case on big networks
(Generated200, CORONET).
On small networks, CPLEX may seem to be an excellent alternative to the proposed
heuristics, because it provides a proven optimal solution and its computational time is
comparable to that of the heuristics. However, the output of the LP optimization is not a routing
which can be directly applied to the network; additional post-processing must be done to
build the routes for the flows. Alternatively, the linear programming (LP) model solved
by the CPLEX solver could be updated to generate a viable routing directly.
Table 5.1: Computational time taken by the algorithms.

Network        CF                  LB                  CPLEX
NSFNet         225 ms              166 ms              458 ms
Italian        383 ms              546 ms              1126 ms (~1 s)
Geant          829 ms              248 ms              948 ms
USNET          544 ms              737 ms              2539 ms (~2 s)
ATT            823 ms              794 ms              2830 ms (~2 s)
Germany50      2385 ms (~2 s)      2024 ms (~2 s)      8333 ms (~8 s)
CORONET        24358 ms (~24 s)    3688 ms (~3 s)      (*)
Generated200   537820 ms (~8 min)  107526 ms (~1 min)  (*)

(*) CPLEX was not able to initialize the model.
However, this change would lead to a further increase in the computational time, making this
approach less appealing than LB.
Fig. 5.12 summarizes the results for all the network topologies. To construct the
boxplots we proceeded as follows: for an evaluated traffic matrix we recorded the path
stretch of every origin-destination flow. That operation gave us $NB := |V| \cdot (|V|-1)$
path-stretch values. The procedure was repeated for each evaluated traffic matrix. For
instance, if 12 traffic matrices are evaluated, the boxplot shows the distribution of all the
$12 \cdot NB$ recorded values.
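For reference, the path stretch of a flow is simply the ratio between the length of the route it actually takes and the length of its shortest path; a minimal sketch of this bookkeeping is given below (names are ours, and it ignores the case of flows split over several paths).

# Path stretch of one origin-destination flow: hop length of the route used
# by the load balancer divided by the hop length of the shortest path.
def path_stretch(used_path, shortest_path):
    return (len(used_path) - 1) / (len(shortest_path) - 1)   # paths given as node lists

# Example: a flow routed over 5 hops whose shortest path has 3 hops.
print(path_stretch(["a", "x", "y", "z", "w", "b"], ["a", "m", "n", "b"]))  # ~1.67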
The results show that only a few flows are routed over long routes. The path length
of each origin-destination flow rarely exceeds twice the length of the shortest path (SP) route.
The median, represented by the red line in the middle of the box-plots, is close to one.
This allows us to conclude that the majority of paths remain almost as short as with SP
routing.
5.3 Conclusion
In this chapter, we proposed two solutions for balancing the network load. The first
one relies on a function that changes the cost of a link depending on its utilization. The
preliminary validation revealed that minimal variations in the shape of the cost function
could lead to poor performance of the algorithm. In the search for a solution to this
problem, we proposed an alternative load balancing technique.
Our second attempt provides a complete solution which keeps the overhead
of centralized management low. For this purpose, the construction of the solution is split into
two parts. First, a centralized optimization phase uses a state-of-the-art approximate
maximum concurrent flow algorithm to find a near-optimal distribution of the flow from
each source. Similarly to the first load balancing technique, the central optimization phase
of our second solution relies on assigning costs to network links. During the execution
of the algorithm, the cost of each link increases whenever demand is routed through it.
Moreover, the algorithm also uses shortest path computations and relies on these costs
to prioritize the paths passing through the less loaded links. However, thanks to making
tiny steps during the execution of the algorithm, it is possible to achieve a provably near-
optimal solution.
Unfortunately, the algorithm delivers a solution which splits the traffic into multiple
paths. Our contribution is in providing a way to effectively apply the computed
routing to the network switches and change the paths of network flows. We avoid using
explicit routes and keep the communication overhead reasonably low even on the biggest
of the evaluated networks. This is achieved by offloading part of the computation to the
network devices. The second part of our solution consists in locally computing the explicit
paths for the network flows on the ingress devices, which also allows updating the forwarding
database locally. This solution would not be possible without the use of a source routing
protocol for data forwarding. We propose to use SPRING for this purpose: it enables
the ingress nodes to locally change the network routes without having to update the
forwarding tables of midpoint devices.
The presented technique achieves a global, network-wide traffic optimization.
It was evaluated against a linear programming model and was shown to compute a near-
optimal routing in all the evaluated networks. The utilization of the most loaded link
is always kept close to the value provided by the LP model. Moreover, the lengths of
network paths are kept close to the lengths obtained with shortest path routing. At the same time,
the computational times do not exceed a couple of seconds in any of the real backbone
networks used for evaluation.
Chapter 6
Consume Less and Better: Evaluation and Impact on Transported Protocols
In the previous two chapters, we presented two methods to increase the energy efficiency
of backbone networks. The first one, which we called SegmenT Routing based Energy
Efficient Traffic Engineering (STREETE), acts by putting unused resources to sleep when-
ever the network is at low utilization. In the second, we proposed a means to achieve a
near-optimal distribution of the load in the network while keeping a low computational
and communication overhead. The latter solution can be used separately, to push the
limits of the network and avoid premature updates to higher data-rates that consume
more power. Furthermore, this load balancing technique can also be incorporated into
our STREETE framework to obtain a solution that acts both by minimizing the number
of active resources and by optimizing the traffic within these active resources.
We start this chapter by evaluating the combination of the previously proposed tech-
niques: STREETE combined with load balancing (LB). The proposed solution, which
we call STREETE-LB, can compete with linear programming models both regarding
turned-off links and in its ability to keep the network out of congestion.
Unfortunately, achieving such performance comes at the cost of frequent changes of
network paths. Not only does the algorithm have to reroute the flows each time a link is turned
on or off, but even a small variation in network conditions may induce a reorganization
of the paths due to the inclusion of the load balancing technique. To estimate the impact
of such frequent route changes, we conduct a detailed evaluation on our network testbed.
In particular, we evaluate how the change in the network delay due to moving a flow
between two paths impacts the congestion control mechanism of TCP.
To reduce the influence of this computation, we use two different precisions for
Karakostas' algorithm. A low precision of $\epsilon = 0.2$ is used during each iteration of the
STREETE framework, i.e., when a link extinction/ignition is simulated. A higher pre-
cision of $\epsilon = 0.1$ is used to compute the final routing, which is sent to the SDN
switches. Performing less accurate computations during the iterations of STREETE
drastically improves the computational time because the complexity of the algorithm
grows quadratically in $1/\epsilon$.
We use the same MILP model as in Chapter 4 (Sec. 4.3.2) for this evaluation.
In our opinion, STREETE-LB is not fast enough on the big, 200-node network: in the
worst case, it took up to 20 minutes to compute a solution. Nevertheless, we
believe that the 15 seconds on the 75-node CORONET topology are promising.
In fact, anticipating some conclusions detailed later, very frequent rerout-
ings may be dangerous for network stability; more details on this matter are given
in the second part of this chapter.
STREETE-LB is thus capable of computing a solution on any of the evaluated real
backbone networks in adequate time.
[Figure: number of active links versus normalized time for the USNET, Italian, NSFNet, ATT, Germany50, and Geant networks]
Figure 6.1: Maximum link utilization in the analyzed networks using the STREETE-LB
mixed solution
We analyzed these scenarios in detail and concluded that this is due to the approximate nature of
Karakostas' algorithm and to the low precision of 0.2 used by STREETE. Increasing the
precision of the algorithm leads to better results at the cost of longer computations. It is
a trade-off to be considered: it may be worth increasing the accuracy on small networks
if sub-second computational time is not needed.
[Figure: number of active links in the USNET network]
Figure 6.2: Influence of the precision ε on the number of active links and on the congestion
avoidance. USNET network
For example, Fig. 6.2 illustrates the performance on the USNET network with a
precision of $\epsilon = 0.1$. The additional precision improves the quality of the
results at the cost of a drastic increase in computational time. The maximum time taken
by STREETE-LB in this case was approximately 18 seconds. For comparison, in the
previous section, we showed that STREETE-LB takes approximately 0.7 s with a precision
of $\epsilon = 0.2$.
From Fig. 6.1 and 6.2, it may look like STREETE-LB behaves much worse than
CPLEX in terms of active links. We observe links being switched on in bursts. For
example, the biggest such burst is seen on the Germany50 network at around t = 3.5.
Analyzing this behavior revealed that it is due to the much more conservative nature
of STREETE compared to the CPLEX model used for validation. In particular, our
solution uses a double threshold (60% and 80%) to decide whether it is worth turning links
off/on and to avoid link flapping.
[Figure: number of active links in the Germany50 network]
Figure 6.3: Influence of the thresholds α and β on the number of active links in the
Germany50 network
Fig. 6.3 shows that setting the two thresholds close to the link congestion point makes
STREETE-LB behave similarly to CPLEX for a longer period. However, using a double
threshold improves the stability of the framework and avoids frequent changes in
the paths of network flows.
[Figure: heat maps of the network links over the index of the traffic matrix for (a) NSFNet, (b) Geant, and (c) the Italian network]
Figure 6.4: The heat-map of network links with real traffic matrices
However, this is not necessarily a bad behavior. The additional links were ignited to reduce the
load on a couple of highly utilized links which were close to congestion. Shortest
path routing was unable to handle this case and avoid network congestion efficiently.
The last case, illustrated on the Italian network, shows a distinct advantage of
STREETE-LB. In this case, the algorithm was able both to slightly reduce the number
of active links and to keep the network at a relatively low utilization.
6.2.1 Background
Rerouting and congestion control
TCP's congestion control algorithm, defined in RFC 5681, is an important, and
one of the most complex, features of modern TCP implementations. This algorithm
tries to split the network capacity fairly among all the flows traversing it. Under such an
algorithm, the sender keeps a congestion window that is dynamically modified depending
on the network conditions. The source cannot send more data than what fits in this
window during a Round-Trip Time (RTT). The maximum instantaneous bandwidth $B$
of a TCP connection is thus limited by the size of the congestion window:
$B = W \cdot MSS / RTT$, where $W$ is the size of the congestion window in segments and $MSS$
is the maximum segment size in bytes.
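For a quick sanity check of this relation, the window needed to sustain a given rate can be computed directly; the numbers below use the 100 Mbps bottleneck and 1448-byte segments of our later experiments, and the RTT values are only examples.

# B = W * MSS / RTT  ->  W = B * RTT / MSS
def window_for_rate(rate_bps, rtt_s, mss_bytes=1448):
    return rate_bps * rtt_s / (mss_bytes * 8)

print(round(window_for_rate(100e6, 0.050)))   # ~432 segments for RTT = 50 ms
print(round(window_for_rate(100e6, 0.170)))   # ~1468 segments for RTT = 170 ms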
TCP was initially designed for standard IP routing assuming that all packets follow the
same route towards a destination. Packet reordering and route changes were considered
rare. However, in an SDN network, the controller may frequently shift a TCP flow to
an alternative path to optimize the overall network throughput. The route change can
impact the throughput of a TCP flow in two different ways:
• When the new route has higher RTT, the sender increases the size of the congestion
window to maintain the same sending rate. For example, if the RTT doubles, the
bandwidth will be halved and will gradually increase with the growth of the TCP
congestion window; a direct application of the equation $B = W \cdot MSS / RTT$.
• The receiver sees packets arriving out of order if the new route has a lower RTT.
TCP’s congestion control algorithms assume the worst-case scenario and view this
as an indication of packet loss due to congestion. Figure 6.5 illustrates the prob-
lem considering a sample backbone network. The link ab becomes available for
transmission and hence packets 2 and 3 take a route shorter than packet 1. The
packets arrive out of order at the receiver, which uses duplicate ACKs to notify the
sender of a problem. The sender then reduces the size of the sending window to
avoid congestion, hence decreasing the transmission rate.
[Figure 6.5: a sample backbone network in which packets 2 and 3 take the newly available, shorter route over link ab and overtake packet 1, which follows the longer route through c and d]
Given that a large part of network traffic is often carried by a small number of large,
long-lived flows [118], rerouting these flows may significantly reduce the overall network
throughput.
[Figure 6.6: congestion window (in segments) over time (s) for a single flow under the Reno and Cubic congestion control algorithms]
Figure 6.6 shows how the congestion window of a single flow evolves when using the
Reno and Cubic congestion control algorithms.
RENO The standard TCP congestion control mechanism is Reno, which is still the
fallback algorithm in the Linux kernel. This algorithm is straightforward and comprises
two phases: 1) the slow start phase, where the window size doubles each RTT; and 2) the
congestion avoidance phase, where the window size increases by one each RTT, or is halved
upon packet loss.
The slow start phase allows for a fast increase in the window size at the beginning
of the communication. It corresponds to the spike at time 0 in Figure 6.6. In this work, we
ignore the slow start.
Although TCP Reno behaves well under a small Bandwidth x Delay Product (BDP),
it severely underutilizes the channel on long fat networks, i.e., networks with high data
rate and high delay, because it grows the window linearly, by one segment every RTT, when
recovering from a packet loss. In Figure 6.6, Reno takes more than a minute to grow the
congestion window and fill a 100 Mbps link with an RTT of 50 ms.
CUBIC The Cubic TCP algorithm [119] aims to avoid the shortcomings of Reno and
achieve high data rates in networks with a large BDP. At the same time, when Cubic
detects a network with a small BDP, it tries to mimic Reno's behavior, emulated by a
mathematical model. This happens particularly in local high-speed, low-delay networks.
As its name suggests, the algorithm grows the congestion window using a cubic
function of the time elapsed since the last loss event, $y = (\Delta t)^3$; more precisely, a
shifted and scaled version $y = 0.4 \cdot (\Delta t - RTT - K)^3 + W_{max}$. In this equation, $\Delta t$ is
the time passed since the last loss event and $W_{max}$ is the size of the congestion window just
before the loss event. $K = (W_{max}/2)^{1/3}$ is a parameter that depends on $W_{max}$.
Cubic is also less aggressive in reducing the congestion window at a loss event: com-
pared to Reno, which halves the window, Cubic reduces it by 20%.
[Figure 6.7: congestion window of a Cubic flow over time (s), together with the fitted curve y = 0.4 * ((t - 29.43 - RTT) - K)^3 + 455]
Figure 6.7 illustrates the cubic function when applied to part of the trace from Figure
6.6. At time t = 29.43, a loss is detected when reaching the bottleneck capacity of a
network path. Cubic hence reduces the congestion window to 80% of $W_{max}$.
The growth function $y$ used to increase the congestion window depends mainly on
$W_{max}$, which makes it easy to estimate the time taken by TCP Cubic to restore its
congestion window after a loss. To do that, we ignore the insignificant dependency on
RTT in the Cubic function and solve $W_{max} = y = 0.4 \cdot (\Delta t - K)^3 + W_{max}$. We obtain
$\Delta t = K = (W_{max}/2)^{1/3}$.
Moreover, the RTT and the bottleneck bandwidth determine the maximum
window size. Table 6.2 estimates the size of the congestion window needed to transmit
at 100 Mbps with 1448-byte segments. It also uses the equation $\Delta t = K$ to assess the
time taken by the Cubic algorithm to grow the window back to $W_{max}$ after a loss.
Table 6.2: Estimated Wmax and time needed to recover after a loss, 100Mbps bottleneck
link.
This estimation is valid only if no further loss is detected. Figure 6.7 contains a
counter-example where TCP would reach $W_{max}$ at approximately t = 36 s, but a loss
happened at about t = 34 s and forced the congestion control algorithm to reduce the size of the
congestion window.
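The estimates of Table 6.2 can be recomputed directly from the two relations above ($W_{max} = B \cdot RTT / MSS$ and recovery time $\approx K = (W_{max}/2)^{1/3}$); the snippet below does exactly that for a 100 Mbps bottleneck and 1448-byte segments, the RTT values being examples of ours.

# Estimate the congestion window needed to fill a 100 Mbps bottleneck and the
# time Cubic needs to grow back to that window after a single loss event.
def w_max(rate_bps, rtt_s, mss_bytes=1448):
    return rate_bps * rtt_s / (mss_bytes * 8)

def cubic_recovery_time(w_max_segments):
    return (w_max_segments / 2) ** (1 / 3)     # delta_t = K, in seconds

for rtt_ms in (10, 50, 170, 480):
    w = w_max(100e6, rtt_ms / 1000)
    print(f"RTT {rtt_ms:3d} ms: Wmax ~ {w:6.0f} segments, recovery ~ {cubic_recovery_time(w):4.1f} s")
# e.g. for RTT = 50 ms, Wmax is about 432 segments and recovery takes about 6 s.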
While we are particularly interested in the time needed to recover from a loss, we note that
route changes may generate false loss events when the TCP flow shifts towards a shorter
route. At such a time, the size of the congestion window decreases, and Cubic tries to grow
it back. Intuitively, the bandwidth of a TCP flow may drop significantly if the SDN
controller attempts a second re-optimization in the meantime.
The TCP implementation in the Linux kernel incorporates a lot of standard and non-
standard optimizations [107]. One of the non-standard features, which is very impor-
tant for our work, is the possibility to undo an adjustment to the congestion window.
This implementation tries to distinguish between packet reordering and loss by using the
“Timestamp” TCP option. When the sender detects that a past loss event was a false
positive due to packet reordering, the algorithm reverts the window size to the value used
before the reduction. As a result, packet reordering may have much less impact compared
to standard TCP.
Multiple TCP Flows on a Bottleneck Link
As mentioned earlier, TCP cannot send more data than the size of the congestion window
per RTT. As a result, when the sender’s window W is smaller than Wmax , TCP spends
part of the time waiting for acknowledgments without sending any data.
Figure 6.8 illustrates this case, summarizing data collected using the tcpdump ap-
plication to capture the packets of a TCP flow over a second. The figure shows the
times when packets were captured. Approximately 2/3 of the time, there was no packet
passing through the network because the congestion window is too small (only 150 seg-
ments). Under such conditions, a congestion window of 431 segments was needed to fill
the bottleneck link.
If a route change happens when a flow is waiting for acknowledgments, there is no re-
ordering problem since no packet is sent. Hence, the probability that a TCP destination
will receive packets out of order increases as $W$ approaches $W_{max}$. Conversely, when
TCP backs off and lowers its sending rate by reducing $W$, it also reduces the probability
of being impacted by a rerouting.
If a large number of TCP flows share the same bottleneck link, the flows spend a lot
of time waiting for ACKs. Hence, rerouting a large group of TCP flows at the same
time may have less impact on the overall network bandwidth than rerouting a single TCP
flow.
This TCP behavior may change in the future. Smoothing the trans-
mission over time by employing traffic pacing was shown to have a beneficial effect on
network performance [120], and state-of-the-art congestion control algorithms, like BBR [121],
use this technique to avoid the bufferbloat problem.
[Figure 6.9: testbed topology; 20 groups of sources reach a sink through two 10 Gbps backbone paths, one with no added delay and one with X ms of added delay, each group being limited to 100 Mbps at its own bottleneck point]
All the flows between the sources and the sink are carried by these two links. The paths have
different delays: one of the backbone paths has no additional delay (0 ms), while the second path
induces an additional delay of X ms to simulate a longer route. We use tc-netem for this purpose.
Each of the two paths (links) has sufficient capacity to transfer the flows; the bottleneck capacity
is outside the backbone network, and congestion never happens on these two links.
The sources are aggregated in groups. The flows from two different groups never share
a bottleneck point. However, two flows of the same group compete for the bandwidth
allocated to the group. Each group has a different path RTT, uniformly distributed in
the interval [0 ms, 120 ms], which adds to the delay of the backbone paths. The goal is to
recreate a scenario where flows with different end-to-end delays coexist in the backbone
network.
Concerning the physical infrastructure: the nodes in the topology correspond to Ubuntu
14.04 Linux servers running Open vSwitch² v2.4 (manually compiled for Ubuntu 14.04).
The network is controlled by the ONOS SDN controller (v1.7) via an out-of-band control
connection. We wrote an ONOS SDN application to reroute the flows between the two
backbone links.
The backbone links use 10G optical Ethernet interconnects. The connections to the
TCP sources use 1G Ethernet ports. In the physical topology, both the 10G and 1G network
interconnects pass through Dell Ethernet switches. The point-to-point links are emu-
lated using VLANs on these hardware switches, which permits reconfiguring the network
topology remotely. Unfortunately, the queuing disciplines on these switches are opaque
and cannot be fine-tuned. To avoid perturbations, we use tc-tbf (Token Bucket Filter)
to limit the speed of each group to 100 Mbps. In this way, the data passing through a
hardware switch always stays well below the link capacity. Moreover, this avoids
perturbations due to the Ethernet flow control mechanism (PAUSE frames). We use a
drop-tail bottleneck queue of size 0.1 * BDP.
2: http://openvswitch.org
Rerouting Independent Flows
This section analyses the impact of rerouting multiple independent TCP flows and how
the total backbone throughput changes as a consequence of rerouting.
We generate 20 unsynchronised flows, one flow per group, that start at random times
within a 30-second interval. We run the transfer for 400 seconds to allow TCP flows
to converge to a steady state. After that, we reroute the traffic every 200 seconds and
measure how the aggregated throughput changes during rerouting.
[Figure 6.10: normalized aggregated bandwidth around the rerouting events (t ≈ 400 s and t ≈ 600 s) for additional delays of 10 ms, 30 ms, 50 ms, 70 ms, and 90 ms]
Figure 6.10 shows how the aggregated bandwidth varies when the second backbone
path induces an additional delay of 10 ms, 30 ms, 50 ms, 70 ms or 90 ms.
At t = 600, the flows are rerouted towards the path with the lower delay. Conversely,
at t = 400 and t = 800, the flows are rerouted towards the longer path. This experiment
confirms our expectation that rerouting the flows impacts their throughput both when
moving towards a longer route and when moving towards a shorter one.
Nevertheless, packet reordering, which happens at t = 600, has a less pronounced
impact than expected. Theoretically, Cubic would reduce the transmission window to
80% of its size when a loss is detected, and would then gradually increase the window
back to the size before the loss. However, Linux TCP can detect packet reordering in
some cases; in particular, this happens when the delay difference between the two routes
is small compared to the delay of the fastest route. Hence, a slight variation of delay,
like 10 ms, has a marginal impact on the total throughput of the flows. The next section
gives more insight into this optimization.
Rerouting towards the link with the higher RTT substantially reduces the aggregate band-
width of the 20 flows (t = 400 and t = 800). The larger the delay difference between the
two routes, the bigger the drop. In this case, the rerouting is transparent to the congestion
control algorithm: the drop comes from the unexpected increase of the delay. As a con-
sequence, the size of the transmission window must be increased to allow transmission at
the previous rate over the new, longer path.
[Figure 6.11: normalized aggregated bandwidth around the rerouting events for 1, 3, 5, and 9 competing flows per group]
Figure 6.11 shows how competing flows impact the recovery speed after rerouting.
The bandwidth drop caused by rerouting towards a shorter route quickly becomes in-
significant (t = 600 s), in contrast to rerouting towards a longer path (t = 400 s). As
the number of flows sharing a bottleneck link increases, each of these flows gets a
smaller proportion of the total bottleneck bandwidth. The flows hence spend more time
waiting for acknowledgments than actually sending data. At rerouting, fewer flows see
their packets arrive out of order. Moreover, the flows that do not experience out-of-order
arrivals can increase their sending rate by using the capacity released by the flows that have
just reduced their sending window. As a consequence, the total aggregated throughput on
the backbone links is barely impacted.
When rerouting flows towards a longer path, the aggregate bandwidth recovers
slightly more quickly as the number of competing flows per group increases. This is due
to the nonlinear growth function used by Cubic: having more flows increases the probability
that some of them will be in the "aggressive growth" phase of the cubic function.
Frequent rerouting can reorder enough packets to perturb the congestion control algorithm even if the delay
difference between the two considered routes is small compared to the RTT.
To focus the analysis on a single TCP flow, we use iperf to transfer 2 GB of
data and measure its mean bandwidth (on the testbed from Figure 6.9). A large enough trans-
fer size was chosen to reduce the impact of the slow start phase on the mean throughput.
Considering a maximum bottleneck throughput of 100 Mbps, each transfer takes approx-
imately 3 minutes when transmitted at full speed. The flow is frequently rerouted between
the two paths. We tested rerouting periods ranging from once every 15 seconds down to
once every 0.1 s.
[Figure: mean bandwidth (Mbps) versus rerouting period (s) for the 45<->55 ms and 90<->105 ms routes, with the corresponding no-reroute baselines at 45 ms and 90 ms]
Figure 6.12: Throughput of a TCP flow. Periodic rerouting between 2 routes with dif-
ferent RTT; Low RTT.
Figure 6.12 summarises the results. Each data point in the graph corresponds to the
mean throughput over five experiment runs, i.e., the amount of transferred data divided
by the total transmission time. The error intervals show the absolute minimum and
maximum values recorded. The baseline without rerouting shows the average throughput
of a TCP flow when it always passes through the same path; this baseline is the mean
of 20 experiments. The inner plot zooms in on the region with frequent route changes.
Frequent route changes have a large impact on the performance of the congestion
control algorithm. When the traffic is rerouted once every 0.1 s, the throughput is as low
as 35% of the throughput without rerouting. We can also observe that with a route change
every 2 seconds, the throughput is around 85% of the throughput without
rerouting. Although from a practical point of view we believe that these rerouting frequencies
are extreme (and not very realistic), they were included here to evaluate
the limits of traffic rerouting. To re-optimise the network at such frequencies, a lot
of control messages must be transmitted between the SDN controller and the switches,
creating a flood of control traffic. Moreover, the network optimization problems are
usually computationally intensive and take time to find a good solution.
It is worth noting that the biggest drop in throughput is observed when a second
rerouting happens before Cubic recovers from the first rerouting. Previously, we gave an
estimation of the recovery time in Table 6.2.
Figure 6.13 shows two interesting results for higher RTT:
[Figure: mean bandwidth (Mbps) versus rerouting period (s) for the high-RTT routes]
Figure 6.13: Throughput of a TCP flow. Periodic rerouting between 2 routes with dif-
ferent RTT; High RTT.
1. Frequent rerouting has a beneficial effect under an RTT of 170 ms, observed at a period of
0.6 s (the first ellipse), where the mean bandwidth of a flow rerouted every 0.6 s is
higher than the average bandwidth of a flow that is always transmitted over the
same path.
2. Under an RTT of 480 ms, the bandwidth of the flows is also less impacted when they are
rerouted frequently.
These results are consistent across the multiple transfers. The error intervals are very
tight and difficult to see in the figure. To explain this behavior, we analyze in detail the
evolution of the congestion window and of the packets traversing the network. We
concentrate on the points marked with ellipses in Figure 6.13.
Inspecting the beneficial effect at RTT = 170 ms Figure 6.14 gives more insights into this case. The dashed line shows the evolution of the congestion window without rerouting. At times t = 23 s, t = 43 s, etc., the congestion control algorithm does not behave as expected. Instead of reducing the congestion window by 20% as dictated by the Cubic algorithm, the drop is much larger. This is because Cubic claims bandwidth too aggressively, so a huge number of packets are lost when the bottleneck capacity is reached. Moreover, Linux TCP does not apply the multiplicative decrease in a single step; instead, it reduces the congestion window additively on every second duplicate ACK received. Together, these two causes lead to poor performance at RTT = 170 ms.
Rerouting the flow in the meantime perturbs the Cubic algorithm and avoids this unwanted behavior. This is due to an optimization in the Linux kernel that detects out-of-order deliveries, thus reducing the impact of the rerouting. The evolution of the congestion window when the route changes every 0.6 s is shown by the continuous blue line in Figure 6.14. For example, at t = 8 s, a duplicate ACK is detected as being due to reordering rather than to a loss, and the congestion window is restored to its size before the reduction.
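The mechanism can be illustrated with a toy model. The sketch below is not the Linux implementation, which relies on its own bookkeeping and DSACK/Eifel-style detection of spurious retransmissions; it only captures the idea that a window reduction triggered by duplicate ACKs is undone once the event is recognized as reordering:

class TcpSenderModel:
    # Simplified illustration of the "undo" behavior described above.
    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.prior_cwnd = None          # window saved before the reduction

    def on_duplicate_acks(self):
        # Duplicate ACKs: assume a loss, remember the old window and shrink.
        self.prior_cwnd = self.cwnd
        self.cwnd = int(self.cwnd * 0.8)    # Cubic-style 20% reduction

    def on_reordering_detected(self):
        # The "lost" packet was merely reordered: revert the reduction.
        if self.prior_cwnd is not None:
            self.cwnd = self.prior_cwnd
            self.prior_cwnd = None

sender = TcpSenderModel(cwnd=1000)
sender.on_duplicate_acks()          # cwnd drops to 800
sender.on_reordering_detected()     # cwnd restored to 1000
print(sender.cwnd)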
Figure 6.14: Evolution of the congestion window (segments) over time (seconds) for three cases: no rerouting at 170 ms, rerouting between 170 ms and 190 ms every 0.6 s, and rerouting every 15 s. The annotation marks a point where an out-of-order delivery is detected and the reduction of the congestion window is reverted.
We mentioned in an earlier section that the TCP sender might "miss" a rerouting because it alternates between sending packets and waiting for ACKs. If the route change happens while the sender is waiting for ACKs, no packets arrive out of order. The blue line illustrates this case. At small window sizes, between t = 0 and t = 30 s, the sender spends a lot of time waiting; as a result, the probability of being impacted by a rerouting is small. The larger the congestion window, the higher the probability of being impacted by a rerouting. Starting at t = 120 s, an equilibrium is reached: the sender struggles to grow the window any further because any rerouting has a high probability of reordering packets.
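This intuition can be quantified with a back-of-the-envelope estimate (our own simplification, not a measured quantity): during one RTT the sender is actively transmitting for roughly cwnd * MSS / rate seconds, so a route change occurring at a uniformly random instant catches packets on the wire, and can therefore reorder them, with roughly that fraction of probability:

def p_impacted(cwnd_segments, mss_bytes=1460, rate_bps=100e6, rtt_s=0.170):
    # Fraction of the RTT during which the sender is actually transmitting.
    send_time = cwnd_segments * mss_bytes * 8 / rate_bps
    return min(1.0, send_time / rtt_s)

for cwnd in (100, 400, 1455):
    print(cwnd, round(p_impacted(cwnd), 2))
# Small windows are rarely hit (~0.07); near the bandwidth-delay product
# (~1455 segments) almost every reroute reorders packets.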
[Figure: evolution of the congestion window (segments, cwnd) over time for the flow rerouted once every 15 s.]
However, rerouting once every 15 seconds allows the flow to converge towards the maximum window size. Any longer rerouting period (20 s, 30 s, etc.) has even less impact on the average throughput. The worst case occurs at a period of 14 s.
6.2.4 Conclusion
In this chapter, we started by evaluating the combination of the techniques presented in the previous chapters, which we call STREETE-LB. We showed that it can achieve near-optimal solutions, both in the number of active links and in its capacity to keep the network out of congestion. We avoided a drastic increase in computational complexity by using a variable precision during the execution of the algorithms.
After that, we analyzed the impact of frequent route changes on Cubic TCP flows. We performed an extensive analysis of the behavior of Cubic TCP on a real testbed in order to find a safe rerouting frequency. Our study confirmed that at a low aggregation level, when a limited number of TCP flows share a path, rerouting reduces the speed of the TCP flows. Nevertheless, the reason for this behavior is the opposite of the one usually evoked by the networking community.
It is commonly assumed that packet reordering makes TCP falsely infer congestion in the network and reduce its speed. In reality, the Linux kernel engineers implemented a set of non-standard optimizations that detect this case and reduce the impact of reordering to a minimum. Moreover, the impact of reordering decreases as the aggregation level increases, because the probability of affecting a particular TCP flow decreases. At the same time, we recorded a bandwidth drop when the flows are rerouted towards a longer path. The increase in delay artificially limits the speed of TCP until the congestion window "catches up" with the new network delay. The time needed by TCP to converge to a steady state depends on the difference in network delay between the two routes and on the congestion control mechanism used by TCP. For example, with the Cubic congestion control, approximately 20 seconds are enough, even with a very large difference in network delay, thanks to its fast congestion window growth.
In any case, traffic engineering SDN applications must be made aware of the short drop in network throughput that follows a rerouting and must limit the frequency of network re-optimisations. Otherwise, this throughput drop, caused by the reaction of the TCP congestion control mechanisms, may be incorrectly interpreted as a need for further optimization of network flows, which can cause uncontrolled oscillations of traffic between network paths.
Chapter 7
Conclusion and perspectives
7.1 Conclusion
Information and communication technologies (ICT) are at the core of digital economies and an increasingly important part of our personal and professional lives. With this drastic rise in communicating applications, network operators need to deploy additional capacity in their networks to support predicted and unpredicted traffic growth. The resulting increase in energy consumption has not only negative environmental consequences but also an economic impact. Major ICT players, including Google, Microsoft, and Facebook, are already deploying solutions to increase the efficiency of their networks. Nevertheless, these solutions rely on full control of the communicating applications and are limited to private deployments. They are not suitable for cases where the communicating applications are not under the network operator's control. This thesis investigated the problem of improving the energy efficiency of operator backbone networks by changing the paths of network flows transparently to the communicating applications.
The first contribution of this thesis is the SegmenT Routing based Energy Efficient Traffic Engineering (STREETE) framework, which aggregates the flows over a subset of links and turns the remaining links off. It is an online solution that uses two different views of the network to speed up the execution. This is especially handy when a rapid reaction to an unexpected burst of network traffic is needed, a case that is mostly overlooked by the research community. To further increase the speed of our algorithms, we implemented them using state-of-the-art dynamic shortest path algorithms. Moreover, relying on shortest path computations, together with the SPRING source routing protocol, keeps the control traffic overhead at a minimum. The evaluation of STREETE on real backbone network topologies with real traffic matrices showed that good results are obtained at low traffic load, both in the number of turned-off links and in the ability to avoid network congestion.
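To illustrate why source routing keeps this overhead small, the sketch below (assuming networkx and per-node segment identifiers; it is not the ONOS implementation) derives the segment list encoding the current shortest path. When the path changes, only the list pushed at the ingress switch has to be updated:

import networkx as nx

def segment_list(g, src, dst):
    # Node SIDs encoding the current shortest path; with SPRING, only the
    # ingress device needs this list, so a route change touches one device.
    path = nx.shortest_path(g, src, dst, weight="weight")
    return [g.nodes[n]["sid"] for n in path[1:]]

# Toy topology: the path A-B-C (cost 2) is shorter than the direct link A-C (cost 3).
g = nx.Graph()
g.add_weighted_edges_from([("A", "B", 1), ("B", "C", 1), ("A", "C", 3)])
for i, n in enumerate(g.nodes, start=100):
    g.nodes[n]["sid"] = i
print(segment_list(g, "A", "C"))    # [101, 102]: go via B, then to C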
To improve the quality of STREETE under heavy load, we searched for an SDN-based traffic engineering technique for optimizing the network utilization. We first presented an unsuccessful attempt to build such a solution using a link cost function. We then presented a second solution, which we name "LB". It enables near-optimal load balancing of traffic in the network thanks to a careful combination of centralized and distributed computations.
The centralized computation performed by LB at the SDN controller consists of a state-of-the-art algorithm for approximately solving the maximum concurrent flow problem, which can achieve near-optimal load balancing of traffic in the network. Our contribution is a means of applying the result of this centralized optimization to the network devices while keeping a low communication overhead between the SDN controller and the SDN switches. To achieve this goal, part of the optimization is performed on the network devices, which allows the paths of the flows to be computed locally and the forwarding databases to be updated without centralized orchestration. Similarly to STREETE, we rely on the source routing provided by the SPRING protocol to update the network paths atomically on a single device: the ingress SDN switch.
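The split between centralized and distributed computation can be sketched as follows: the controller ships per-destination path weights (the output of the approximate maximum concurrent flow computation), and the ingress switch maps each new flow onto one of the pre-installed segment-routing paths in proportion to those weights, without further controller involvement. The code below is a minimal sketch under these assumptions, not the actual implementation:

import hashlib

def pick_path(flow_id, paths, weights):
    # Deterministically map a flow to one of the candidate paths so that,
    # over many flows, traffic is split according to the computed weights.
    total = sum(weights)
    h = int(hashlib.sha256(flow_id.encode()).hexdigest(), 16) % 10**6
    point = h / 10**6 * total
    acc = 0.0
    for path, weight in zip(paths, weights):
        acc += weight
        if point < acc:
            return path
    return paths[-1]

paths = [["B", "D"], ["C", "D"]]    # two segment lists towards destination D
weights = [0.7, 0.3]                # traffic fractions computed by the controller
print(pick_path("10.0.0.1:5000->10.0.1.1:80", paths, weights))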
As a next step, we combined STREETE with the LB load balancing technique and proposed the STREETE-LB framework. Overall, this combination showed excellent results, both in the number of turned-off links and in the load distribution in the network. Nevertheless, to avoid an explosion of the computational complexity, we had to reduce the precision of the algorithms used by the load balancing technique, leading to sometimes sub-optimal, but still promising, results.
We took on the challenge of implementing and evaluating the STREETE framework on a network testbed using real SDN hardware and the ONOS SDN controller. This allowed us to discover an oscillation due to the elasticity of TCP flows, which did not behave well under the frequent route changes made by STREETE. To find a safe rerouting frequency, we performed an in-depth evaluation of the behavior of TCP flows under frequent route changes. Contrary to popular belief, the experiments showed that the packet reordering incurred by rerouting towards a path with lower RTT has a marginal impact on the bandwidth of TCP flows. This result is partially due to non-standard optimizations introduced by the Linux kernel developers. A degradation of throughput was only observed at very high rerouting frequencies, and even this effect decreases as the flow aggregation level increases. In contrast, shifting the traffic towards a path with longer RTT has an adverse influence on the bandwidth of TCP flows. We conclude that SDN-based traffic engineering techniques must be constrained to avoid significant variations in the end-to-end delay when rerouting flows. Alternatively, if the increase in delay is unavoidable, the traffic engineering SDN applications must be made aware of the short, temporary drop in network throughput that follows the rerouting. Otherwise, this short drop in throughput may be wrongly interpreted as a sign to perform a further network re-optimization.
In addition to the scientific contributions, we took special care to enable the reproducibility of the results presented in Chapters 4 and 5. All the results, except the ones based on the network testbed, can be reproduced by following a couple of simple steps on any Linux computer. Moreover, an Android application was developed to allow interactive exploration of the proposed algorithms.
7.2 Perspectives
At critical points in the course of this thesis, some research directions were preferred over others, leaving entire territories unexplored. Here we describe some of the areas that could be explored in the future.
Extending the STREETE framework Medium-term perspectives include extending the STREETE framework to handle additional constraints.
We believe that the most important improvement is the inclusion of heterogeneous link speeds. At this point, the decision of which link to shut down or turn on does not take into consideration that it may be better to switch off multiple slow links instead of a single high data-rate one. The load balancing technique presented in Chapter 5 is already capable of working with multiple link speeds. As a result, the main objective would be to find a metric for turning links off that is more energy-aware than merely counting them.
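A first step in that direction would be to weight the objective by an estimate of each link's power draw instead of counting links. The sketch below uses purely illustrative (assumed) per-rate power figures; real values depend on the hardware:

# Illustrative per-port power figures in watts, indexed by link rate (bit/s).
POWER_W = {1e9: 1.0, 10e9: 5.0, 100e9: 20.0}

def power_saved(links_off):
    # Objective to maximize: total power of the links turned off,
    # instead of merely their number.
    return sum(POWER_W[rate] for _, rate in links_off)

# Three 1 Gbps links save less power than a single 10 Gbps link:
print(power_saved([("e1", 1e9), ("e2", 1e9), ("e3", 1e9)]))   # 3.0
print(power_saved([("e4", 10e9)]))                            # 5.0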
Another relevant problem of the STREETE framework is the excessive lengthening of network paths caused by rerouting towards longer routes. Future research may consider this aspect. We already mentioned that "graph spanners" could be used for this purpose, but we have not yet validated this solution.
Link failures are another aspect that we did not take into consideration. In particular, due to the reduced number of active links, a fiber cut can disconnect the network. To avoid this problem, protection links may be computed in the network. We believe that link failures, similarly to the issue of path extension, are better treated at a topological level, by forcing some links to remain always active. This would slightly increase the energy consumption but provide much better resiliency. Moreover, we believe that these two problems may be solved together: the graph spanner computed for the previous solution may already provide a satisfactory level of redundancy.
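One candidate building block is the classic greedy t-spanner construction: an edge is kept only if the spanner built so far cannot already connect its endpoints within t times the edge's weight. The sketch below assumes networkx; how such a spanner would interact with STREETE's on/off decisions remains to be studied:

import networkx as nx

def greedy_spanner(g, t=2.0):
    # Greedy t-spanner: the kept links guarantee that every pair of nodes
    # stays connected with a detour of at most t times its distance in the
    # full graph, giving a principled set of "always-on" links.
    spanner = nx.Graph()
    spanner.add_nodes_from(g.nodes)
    for u, v, w in sorted(g.edges(data="weight"), key=lambda e: e[2]):
        try:
            d = nx.dijkstra_path_length(spanner, u, v)
        except nx.NetworkXNoPath:
            d = float("inf")
        if d > t * w:
            spanner.add_edge(u, v, weight=w)
    return spanner

g = nx.Graph()
g.add_weighted_edges_from([("A", "B", 1), ("B", "C", 1), ("C", "A", 1.5), ("C", "D", 2)])
print(sorted(greedy_spanner(g, t=2.0).edges()))   # [('A', 'B'), ('B', 'C'), ('C', 'D')]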
STREETE is not suitable for access networks, because it relies on the availability of multiple alternative paths between the network devices, which is usually not the case in access networks. That is why protocols similar to IEEE 802.3az reign at this level. However, between the access and the core lie the aggregation and metro networks, where STREETE may be a viable solution. Due to the relatively large number of network devices at this level, it may nevertheless be necessary to partition these systems virtually into smaller subsets to reduce the computational complexity.
Publications
International Journals:
[J1] M. D. Assunção, R. Carpa, L. Lefevre, O. Glück, P. Borylo, A. Lason, A. Szymanski and M. Rzepka. "Designing and Building SDN Testbeds for Energy-Aware Traffic Engineering Services". Photonic Network Communications (PNET), 2017, to appear.
[J2] R. Carpa, O. Glück, L. Lefevre and J.-C. Mignot. "Improving the Energy Efficiency of Software Defined Backbone Networks". Photonic Network Communications (PNET), 2015, vol. 30, nr. 3, pp. 337-347.
International Conferences:
[C1] R. Carpa, M. D. Assunção, O. Glück, L. Lefevre, and J.-C. Mignot. "Responsive Algorithms for Handling Load Surges and Switching Links On in Green Networks." 2016 IEEE International Conference on Communications (ICC16), Kuala Lumpur, Malaysia, May 2016.
[C2] M. D. Assunção, R. Carpa, L. Lefèvre and O. Glück. "On Designing SDN Services for Energy-Aware Traffic Engineering." 11th EAI International Conference on Testbeds and Research Infrastructures for the Development of Networks & Communities (TRIDENTCOM 2016), Hangzhou, China, June 2016.
[C3] R. Carpa, O. Glück and L. Lefevre. "Segment routing based traffic engineering for energy efficient backbone networks." 2014 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS), New Delhi, India, Dec 2014, pp. 1-6.
National Conferences:
[NC1] R. Carpa, O. Glück, L. Lefevre and J.-C. Mignot. "STREETE : Une ingénierie de trafic pour des réseaux de cœur énergétiquement efficaces." Conférence d'informatique en Parallélisme, Architecture et Système (COMPAS 2015), Lille, France, Jun 2015.
Ongoing work:
[UC1] (submitted to CNSM 2017) R. Carpa, M. D. Assunção, O. Glück, L. Lefevre, and J.-C. Mignot. "Evaluating the Impact of Frequent Route Changes on the Performance of Cubic TCP Flows".
[UJ1] (to be submitted) R. Carpa et al. "A Congestion Avoidance Solution to Reprovision Link Capacity in Green Networks".
Appendices
Using the Android application
The Android application can be installed by scanning the QR code in Fig. 1. It allows the user to interactively visualize the execution of the algorithms proposed in Chapters 4 and 5 of this thesis.
At launch, a network view is opened (Fig. 2). Its main components are:
• (1) Network selection. Sliding left or right reveals more available network topologies. To select a network, click on it.
• (2) Algorithm selection. By default, shortest path routing based on Dijkstra's algorithm is used. Checking "Consume Less", "Consume Better", or both enables STREETE, LB, or STREETE-LB, respectively.
• (3) Network state. It shows the link utilization when the algorithm selected by the previous check-boxes is applied to the network. On start, a uniform all-to-all traffic is injected into the network.
• (4) The slider increases or reduces the traffic in the network by a multiplicative factor.
Figure 2: Main interface of the application
It is possible to click on a node (Fig. 3a) to select it. Doing so filters on the total flow originating from the selected node towards all other nodes in the network. Moreover, a new slider appears in the top-left corner, which allows this flow to be increased or reduced. However, this functionality is at an early stage of development: the changes are applied to the network, but the slider does not keep its state.
To deselect the node, simply click on it a second time.
While a node is selected, it is possible to click on a second node to visualize the paths taken by the flows between these two nodes and the associated network flow (Fig. 3b).