A Survey On Bitrate Adaptation Schemes For Streaming Media Over HTTP

This document surveys bitrate adaptation schemes for streaming media over HTTP. It discusses how HTTP adaptive streaming works and became popular by treating media like regular web content. It also outlines factors that bitrate adaptation algorithms may consider to provide high quality of experience for viewers.


This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/COMST.2018.2862938, IEEE Communications Surveys & Tutorials
IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL.XX, NO.X, MONTH 201X

A Survey on Bitrate Adaptation Schemes for Streaming Media over HTTP
Abdelhak Bentaleb, Member, IEEE, Bayan Taani, Member, IEEE, Ali C. Begen, Senior Member, IEEE,
Christian Timmerer, Senior Member, IEEE, and Roger Zimmermann, Senior Member, IEEE

Abstract—In this survey, we present state-of-the-art bitrate adaptation algorithms for HTTP adaptive streaming (HAS). As a key distinction from other streaming approaches, the bitrate adaptation algorithms in HAS are chiefly executed at each client, i.e., in a distributed manner. The objective of these algorithms is to ensure a high Quality of Experience (QoE) for viewers in the presence of bandwidth fluctuations due to factors like signal strength, network congestion, network reconvergence events, etc. While such fluctuations are common in public Internet, they can also occur in home networks or even managed networks where there is often admission control and QoS tools. Bitrate adaptation algorithms may take factors like bandwidth estimations, playback buffer fullness, device features, viewer preferences, and content features into account, albeit with different weights. Since the viewer’s QoE needs to be determined in real-time during playback, objective metrics are generally used including number of buffer stalls, duration of startup delay, frequency and amount of quality oscillations, and video instability. By design, the standards for HAS do not mandate any particular adaptation algorithm, leaving it to system builders to innovate and implement their own method. This survey provides an overview of the different methods proposed over the last several years.

Index Terms—Bitrate adaptation; HAS; DASH; adaptive video streaming; ABR schemes.

A. Bentaleb, B. Taani and R. Zimmermann are with the School of Computing, National University of Singapore, Singapore (e-mail: ben-[email protected]; [email protected]; [email protected]).
Ali C. Begen is with Ozyegin University and Networked Media, Istanbul, Turkey (e-mail: [email protected]).
C. Timmerer is with the Institute of Information Technology, Alpen-Adria Universität Klagenfurt, 9020 Klagenfurt, Austria, and also with Bitmovin Inc., San Francisco, CA 94105 USA (e-mail: [email protected] klu.ac.at).

I. INTRODUCTION

Video delivery has evolved to constitute a major fraction of today’s Internet traffic in the last decade thanks to advancements in network technologies, device capabilities, and audio-video compression schemes. Cisco reported in their annual Visual Networking Index that in 2016, 67% of the global Internet traffic was video, with a projection that it will reach 80% by 2021 [1]. This trend poses challenges in delivering videos with the best Quality of Experience (QoE) over today’s Internet, which was originally designed for best-effort, non-real-time data transmission. Around 2005, an elegant yet simple video delivery paradigm was introduced by Move Networks, which quickly became popular due to its better features and cheaper deployment costs over progressive download and other proprietary streaming methods. This new paradigm, which we refer to as HTTP adaptive streaming (HAS), treated the media content like regular Web content and delivered it in small pieces over HTTP protocol. HAS quickly became the dominant approach for video streaming due to its adoption by leading service and content providers. Video delivery over the public Internet is also referred to as over-the-top (OTT) video streaming, since the content or the streaming service provider usually differs from the network provider. The emergence of HAS and new, mostly mobile end-user devices with high processing and rendering capabilities played a key role in the growth of streaming video traffic.

In traditional non-HAS IP-based streaming, the client receives media that is typically pushed by a media server using either connection-oriented protocols such as the Real-time Messaging Protocol (RTMP/TCP) [2] or connectionless protocols such as the Real-time Transport Protocol (RTP/UDP) [3]. A common protocol to control the media servers in traditional streaming systems (as shown in Fig. 1a) is the Real-time Streaming Protocol (RTSP) [4]. RTSP is responsible for setting up a streaming session and keeping the state information during this session, but is not responsible for actual media delivery, which is the task for a protocol such as RTP. Based on the RTP Control Protocol (RTCP) reports sent by the client, the media server may perform rate adaptation and data delivery scheduling. These characteristics result in complex and expensive servers. Additional protocols or configurations are needed during the session establishment in case network address translation (NAT) devices and firewalls block the control or media traffic [5]. Despite implementing the same baseline protocol(s), media servers from different vendors may behave differently due to optional features or differences in implementation. Failovers due to a server fault often cause presentation glitches and are rarely seamless unless certain redundancy schemes are in place. These scalability and vendor dependency issues as well as high maintenance costs have resulted in deployment challenges for protocols like RTSP.

HAS uses HTTP as the application and TCP as the transport-layer protocol, as illustrated in Fig. 1b, and clients pull the data from a standard HTTP server, which simply hosts the media content. HAS solutions employ dynamic adaptation with respect to varying network conditions to provide a seamless (or at least smoother) streaming experience. Once a media file (or stream) is ready from a source, it is prepared for streaming before it is published to a standard, off-the-shelf HTTP server. The original file/stream is partitioned into segments (also called chunks) of equi-length playback time. Multiple versions (also called representations) of each segment are generated that vary in bitrate/resolution/quality using an

1553-877X (c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

TABLE I: Differences between the traditional streaming and HAS systems.

                                    | Push-based Delivery (Traditional) | Pull-based Delivery (HAS)
Communication protocols             | RTSP, RTP, UDP                    | HTTP, RTMPx, FTP
Adaptation logic runs at            | Server side                       | Client side
Transmission data units             | Packets                           | Media segments/chunks
Video monitoring and user tracking  | RTCP for RTP transport            | Currently proprietary, standardization is underway
Multicast support                   | Yes                               | No
Caching support                     | Protocol specific                 | Web caches as used for HTTP
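
As a rough illustration of the pull-based column of Table I, the following minimal sketch shows a client-side choice among representations followed by the construction of the segment URL that the client would request. The `Representation` class, the `safety` margin of 0.8, the bitrate values, and the URL layout are assumptions made for this example only; they are not taken from any HAS standard or from a particular player.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One encoded version of the content (hypothetical fields)."""
    rep_id: str
    bitrate_kbps: int


def select_representation(reps, throughput_kbps, safety=0.8):
    """Return the highest-bitrate representation that fits within
    safety * throughput_kbps, or the lowest one if none fits."""
    ordered = sorted(reps, key=lambda r: r.bitrate_kbps)
    budget = safety * throughput_kbps
    chosen = ordered[0]  # fallback: lowest bitrate
    for rep in ordered:
        if rep.bitrate_kbps <= budget:
            chosen = rep
    return chosen


def segment_url(base_url, rep, index):
    """Build the URL the client would fetch with HTTP GET (hypothetical layout)."""
    return f"{base_url}/{rep.rep_id}/segment-{index}.m4s"


# Hypothetical bitrate ladder and throughput measurement.
reps = [
    Representation("low", 500),
    Representation("mid", 1500),
    Representation("high", 3000),
]
choice = select_representation(reps, throughput_kbps=2000)
print(choice.rep_id, segment_url("https://example.com/video", choice, 7))
```

The point of the sketch is that the decision runs entirely at the client and the server only has to answer ordinary HTTP requests. A real client would also weigh buffer level and other inputs.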

Fig. 1: Communication in traditional streaming and HAS systems. (a) Traditional streaming with RTSP: the media streaming server and the client exchange RTSP (TCP), RTP (UDP), and RTCP reports (UDP). (b) HTTP adaptive streaming (HAS): the adaptive streaming client, which holds a media buffer, sends HTTP GET requests to the HTTP server and receives responses.

encoder or a transcoder (See Section II-A). Moreover, the server generates an index file, which is a manifest that lists the available representations including HTTP uniform resource locators (URLs) to identify the segments along with their availability times. During a typical HAS session, the client first receives the manifest that contains the metadata for video, audio, subtitles, and other features, then constantly measures certain parameters: available network bandwidth, buffer status, battery and CPU levels, etc. According to these parameters, the HAS client repeatedly fetches the most suitable next segment among the available representations from the server. Table I compares the main characteristics of the traditional streaming and HAS systems.

HAS is addressing several aspects that were major concerns in traditional streaming protocols [2]–[4]: (1) it uses HTTP to deliver video segments, which simplifies the traversal through NATs and firewalls [6]; (2) at the server side, it uses conventional Web servers or caches available within the networks of Internet Service Providers (ISPs) and Content Distribution Networks (CDNs); (3) a client requests and fetches each segment independently from others and maintains the playback session state, whereas the server is not required to maintain any state, hence, the client may download segments from different servers without impacting system scalability [7]; and (4) it does not require a persistent connection between the client and the server, which improves system scalability and reduces implementation and deployment costs.

Today, HAS accounts for the majority of Internet video traffic. It has reached mainstream due to commercial solutions such as Microsoft’s Smooth Streaming [8], Apple’s HTTP Live Streaming (HLS) [9], Adobe’s HTTP Dynamic Streaming (HDS) [10], Akamai’s HD [11] and several open-source solutions. To avoid fragmentation in the market, the Moving Picture Experts Group (MPEG) together with the 3rd Generation Partnership Project (3GPP) started working on HTTP streaming of MPEG media and HAS, respectively. These efforts eventually resulted in the standardization of Dynamic Adaptive Streaming over HTTP (DASH) [12]. Unlike proprietary solutions, DASH provides an open specification for adaptive streaming over HTTP and leaves the implementation of the adaptation logic to third parties as shown in Fig. 2a, where blue components are specified in the DASH standard, while red components are left unspecified or specified in other standards. The DASH server is essentially an HTTP server that hosts the media segments, which are typically two to ten seconds each, or could be as long as hours for the entire content duration in presentation time. Each segment is encoded at multiple bitrate levels and listed in the manifest termed Media Presentation Description (MPD, see Fig. 2b). The MPD is an XML document that provides an index for the available media segments at the server. At the client side, DASH implements the bitrate adaptation logic, which issues timed requests and downloads segments that are described in the MPD from the server using HTTP (partial) GET messages. During download, the DASH client estimates the available bandwidth in the network and uses information from the playback buffer to select a suitable bitrate for the next segment to be fetched. This behavior is called bitrate switching, where the client’s goal is to fetch the highest-bitrate segments it can, while keeping sufficient data in the playback buffer to avoid video stalls and thus achieve a good QoE trade-off.

There are various implementations of DASH players¹. For example, dash.js [13] is a JavaScript-based DASH client, which is the reference client from the DASH Industry Forum. Another JavaScript-based client is DASH-JS [14], which proposes a simple rate adaptation logic.

¹ In this survey, the terms player and client are used interchangeably.

A recent survey [15] describes a range of bitrate adaptation (called also Adaptive BitRate (ABR)) schemes and techniques for DASH. The authors classified the schemes into three main categories: client-side, server-side and in-network approaches. They provided a general review of video traffic measurement methods and a set of characterization studies for well-known commercial streaming providers like Netflix, YouTube, and Akamai, and outlined several open research problems in the DASH streaming field. Our survey differs in terms of two key aspects: (1) a scheme classification is provided that is


structured based on the unique features of the adaptation logics and (2) more schemes are examined and a detailed comparison table is provided.

Most state-of-the-art HAS solutions solely integrate the bitrate adaptation logic inside the HAS client, since it allows the client to select a bitrate level independently and avoids the requirement of having intelligent devices inside the network infrastructure. This represents a key reason why HAS solutions are used in OTT scenarios. Nevertheless, both industry and academia recommend using HAS systems in managed networks as well [16], [17]. For instance, a client may use feedback reported by a server or the network in bitrate adaptation to improve the overall QoE, or by using IP multicasting to simplify the video distribution in the context of connected TVs. In this survey, we present a classification of state-of-the-art bitrate adaptation schemes including features, pros, and cons. We classify the schemes into four main categories: client-based, server-based, network-assisted, and hybrid (See Fig. 3). The classification is based on the location of the bitrate adaptation logic within the system and which entities are involved.

The rest of this survey is organized as follows: Section II describes background information and definitions. Section III surveys the bitrate adaptation schemes. Comparisons between different schemes and a discussion are presented in Sections IV and V, respectively. Finally, Section VI provides concluding remarks.

Fig. 2: Dynamic Adaptive Streaming over HTTP (DASH). (a) DASH architecture components: a DASH server hosting the Media Presentation Description (MPD) and the segments of Representations 1 to N; a DASH client comprising the MPD parser, adaptation logic (download decisions), HTTP downloader, segment parser, and media decoders. (b) Media Presentation Description (MPD) structure: the MPD contains Periods, each Period contains Adaptation Sets, each Adaptation Set contains Representations, and each Representation contains an initialization segment followed by media segments.

Fig. 3: HAS adaptation scheme classification.
• Client-based Adaptation (Sec. III-A): 1) Bandwidth-based, 2) Buffer-based, 3) Proprietary solutions, 4) Mixed adaptation, 5) MDP-based
• Server-based Adaptation (Sec. III-B)
• Network-assisted Adaptation (Sec. III-C)
• Hybrid Adaptation (Sec. III-D): 1) SDN-based, 2) Server and network-assisted

II. BACKGROUND AND DEFINITIONS

A. Video Coding Standards

In an HAS system, a media file (or in the case of live video, a stream comprising chunks of audiovisual data) is encoded or transcoded into multiple representations. The most widely used video coding format is currently H.264, also known as MPEG-4 Advanced Video Coding (AVC) [18]. This video coding standard was introduced by MPEG in collaboration with the ITU-T Video Coding Experts Group (VCEG). A client requesting and downloading segments from possibly different representations (encoded at different bitrates) seamlessly concatenates these segments in its playback buffer. This results in a conforming bitstream that can be processed using a standard decoder. A common assumption is that each segment starts with an intra/key frame (i.e., IDR-frame in AVC), in order for the decoder to process segments independently from each other. This may lead to coding inefficiencies for short segment durations [19].

Scalable Video Coding (SVC) has been introduced as an extension to AVC [20]. SVC enables splitting a video stream into multiple bitstreams or layers, where each one of them consists of subsets of video data. It recombines these bitstream subsets in order to additively increase the video quality. Typically, SVC allows the video stream to be split into three different dimensions of quality: temporal, spatial, and quality/Signal-to-Noise Ratio (SNR). In the temporal-based technique, the video is encoded at multiple frame rates for a given resolution. The base layer has the lowest frame rate, while enhancement layers increase the frame rate, which gradually improves quality. In the spatial-based technique, the video is encoded at multiple spatial resolutions for a given frame rate. In case of the SNR-based technique, the video is encoded at a single spatial resolution, and the enhancement layers improve quality, keeping the resolution constant.

The H.265 video codec (also known as High Efficiency Video Coding (HEVC)) was developed to provide approximately twice the encoding efficiency of AVC [21]. Similarly, as an extension to HEVC, Scalable High-efficiency Video Coding (SHVC) [22] was developed to support scalability. Conceptually similar to SVC, it adds extra scalability features such as bit-depth, color gamut, and hybrid scalability. In addition, it enhances coding-specific functionalities like Inter-Layer Prediction (ILP) (optionally encoding the base layer in AVC instead of HEVC), and the use of motion-constrained tiles. In both SVC and SHVC, the base layer is always backwards compatible with the non-scalable version of the encoder (AVC and HEVC, respectively), thus, only an AVC/HEVC decoder is needed.

Recently, MPEG and VCEG teamed up to work on Versatile Video Coding (VVC), aiming to provide almost twice the encoding efficiency of HEVC. VVC specifically targets applications and services using immersive and high-dynamic-range (HDR) videos. The new standard is expected to become available in 2020 [23].

Additionally, royalty free encoding formats such as VP9 and AV1 are increasingly used for HAS, and subject to various evaluations. For example, open-source implementations of AVC, HEVC and VP9 have been evaluated in large-scale video-on-demand environments [24].
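
To make the additive layering of SVC described in this subsection more concrete, the sketch below counts how many layers (base layer first) fit within a bandwidth budget. It is only an illustration under stated assumptions: the function name, the per-layer bitrates, and the rule that the base layer is always kept are hypothetical and are not taken from the SVC specification.

```python
def decodable_layers(layer_kbps, budget_kbps):
    """Return (number_of_layers, total_kbps) for the layers that fit the budget.

    layer_kbps lists the incremental bitrate of each layer, base layer first.
    Assumption: the base layer is always kept, even if it exceeds the budget,
    because enhancement layers cannot be decoded without it.
    """
    total = 0
    count = 0
    for i, rate in enumerate(layer_kbps):
        if i > 0 and total + rate > budget_kbps:
            break  # the next enhancement layer would exceed the budget
        total += rate
        count += 1
    return count, total


# Hypothetical stream: base layer plus two enhancement layers.
layers = [400, 600, 1000]
print(decodable_layers(layers, 1200))
```

Each additional enhancement layer raises the total bitrate and, per the description above, the quality, so a client can trade quality for bandwidth by dropping layers from the top.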
B. Common Problems in HTTP Adaptive Streaming

While moving from a server-push to a client-pull model has clear benefits, HAS still faces challenges. Known issues relate to the heterogeneous nature of networks, the increasing number of users, and the growing demand of high-quality content. We describe four main problems that can affect HAS systems: (1) multi-client competition and stability issues, (2) consistent-quality streaming, (3) QoE optimization and measurement, and (4) inter-destination multimedia synchronization.

1) Multi-Client Competition / Stability Issues: Seufert et al. [25] have shown that using a centralized management controller can enhance the overall video quality, while improving the viewer QoE. In that regard, a robust HAS scheme should achieve three main objectives:
• Stability: HAS clients should avoid frequent bitrate switching, which leads to quality oscillations and video stalls, which in turn can negatively affect QoE.
• Fairness: Multiple HAS clients competing for available bandwidth should equally share network resources based on viewer-, content-, and device characteristics. The fairness desired here does not often result in bandwidth-fairness.
• High Utilization: While the clients attempt to be stable and fair, network resources should be used as efficiently as possible.

A streaming session in general consists of two states, the buffer-filling state and the steady state [26]. The buffer-filling state aims to fill the playback buffer and reach a certain threshold where the playback can be initiated or resumed. In this state, the client requests the next segment as soon as the previous chunk is fully downloaded (See Fig. 4). After the playback buffer level reaches a target threshold (e.g., 30 seconds, however, note that this threshold varies among different bitrate adaptation schemes or could be increased or decreased based on the expected conditions), the client enters the steady state. The objective during the steady state is to keep the buffer level above a minimum threshold despite bandwidth fluctuations or interruptions, in order to avoid buffer underrun or stall events. The steady state consists of two activity periods referred to as ON and OFF. Fundamentally, an HAS client requests a segment every T_s time units, where T_s represents the content time duration of each segment, and the sum of ON and OFF period durations equals T_s. During the ON period, the HAS client downloads the current segment and notes the achieved throughput value that will be later used in selecting the appropriate bitrate for future segments. After that, the client temporarily becomes idle in the OFF period (See Fig. 4).

Fig. 4: HAS video streaming session states (buffer-filling state and steady state), showing request/response exchanges between the HAS client and the HAS server and the client's alternating ON and OFF periods.

As shown in Fig. 5, when a set of HAS clients competes for the available bandwidth, the per-segment activity periods (ON, OFF) of the steady state differ from client to client. Depending on the amount of overlap of the ON periods, the clients may at times considerably overestimate the available bandwidth. This potentially causes video instability, quality oscillations, bitrate switches, buffer underruns, unfairness and underutilization, which are collectively referred to as HAS stability issues.

Consider, for example, three HAS clients that share a bottleneck link. Suppose that these three clients have reached the steady state and they request a new segment every T_s time units. As illustrated in Fig. 5a, if the ON periods of these clients do not overlap during the current segment download, each client will overestimate the available bandwidth. This would not be the case if the ON periods were partially (Fig. 5b) or fully (Fig. 5c) overlapping. Many HAS bandwidth estimation algorithms use the current segment download speed as an input. Non-overlapping ON periods lead to overestimating the fair share of the bandwidth, and thus, clients incorrectly select a higher encoding bitrate for the next segment. Downloading the next segment, which has a higher encoding bitrate, will take longer, which will cause the initially non-overlapping ON periods to eventually start overlapping. As the amount of overlap increases, the clients will have lower bandwidth


Fig. 5: Illustration of the main cause of HAS stability issues because of different segment download patterns, showing the ON periods of HAS Clients 1, 2, and 3. (a) Unsynchronized downloads. (b) Overlapping downloads. (c) Synchronized downloads.
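
The effect shown in Fig. 5 can be reproduced with a toy calculation. The sketch below assumes an idealized bottleneck whose capacity is split equally among the clients that are downloading at any instant; real TCP sharing is not this exact, and the function name, the intervals, and the capacity value are hypothetical choices for this example.

```python
def observed_throughput(on_periods, capacity_kbps):
    """Mean throughput each client measures over its own ON period.

    on_periods is a list of (start, end) times, one per client.
    Assumption: capacity is shared equally among all clients that are
    active (inside their ON period) at the same time.
    """
    # Split time at every start/end point so the set of active clients
    # is constant within each elementary interval.
    points = sorted({t for start, end in on_periods for t in (start, end)})
    results = []
    for start, end in on_periods:
        delivered = 0.0
        for a, b in zip(points, points[1:]):
            if a >= start and b <= end:
                active = sum(1 for s, e in on_periods if s <= a and b <= e)
                delivered += capacity_kbps / active * (b - a)
        results.append(delivered / (end - start))
    return results


capacity = 6000  # hypothetical bottleneck capacity in Kbps
print(observed_throughput([(0, 1), (1, 2), (2, 3)], capacity))  # no overlap
print(observed_throughput([(0, 2), (1, 3)], capacity))          # partial overlap
print(observed_throughput([(0, 1), (0, 1), (0, 1)], capacity))  # full overlap
```

With three clients the fair share is one third of the capacity. In this model, only the fully overlapping case yields that value as the measured throughput, while non-overlapping ON periods let every client observe the whole link.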

estimations and start selecting segments that have a lower encoding bitrate. These segments will take less time to download, causing the amount of overlap among the ON periods to procedurally shorten, until the process reverts to its initial situation. This cycle repeats itself, causing periodic up- and downshifts in the selected bitrates, leading to unstable video quality, unfairness, and underutilization [26]–[28].

2) Consistent-Quality Streaming: Research studies in the field of video quality analysis [29], [30] confirm that the correlation between video bitrate and its perceptual quality is non-linear. Additionally, different video content types have unique characteristics, e.g., high and low-motion scenes, which result in different qualities.

In the context of HAS, even if the available bandwidth stays constant, the delivered video quality may still vary, as illustrated in Fig. 6, due to unequal video scene complexity across content: inter-stream and intra-stream differences. Fig. 7 depicts the non-linear relationship between bitrate and the Structural SIMilarity plus (SSIMplus) [31] perceptual quality. Generally speaking, it is preferred to stream video with a consistent quality rather than at a consistent bitrate [32], [33], leading to a reduction in perceptual quality oscillations.

3) QoE Optimization and Measurement: The changing conditions of best-effort networks introduce numerous problems in delivering multimedia content to viewers. In traditional non-adaptive streaming, the client streams a video that is typically available in one bitrate at the server side. If the network conditions worsen, the download rate may fall below the playback rate, which leads to buffer depletion and discontinuous playback. With HAS, streamed videos show less buffering and higher bandwidth utilization compared to traditional streaming, since the video segments are transcoded into different bitrate levels, and segments are downloaded based on the current network conditions and the playout buffer level. Fig. 8 illustrates the application control loop of a typical HAS client. This survey focuses on reviewing the adaptation algorithms, i.e., the part responsible for selecting the next segment(s) to download. The application control loop also interacts with a lower-layer control loop (in this case TCP congestion control), which can play a key role in determining the viewer QoE.

In a recent survey by Seufert et al. [25], factors influencing QoE are categorized as (a) perceptual, directly perceived by the viewer, and (b) technical, indirectly affecting the QoE. Perceptual factors include the video image quality, initial delay, stalling duration and frequency, as well as quality switching amplitude and frequency. The impact of each of these factors differs depending on the user's subjectivity. Several studies have shown that most users consider initial delays less critical than stalling [34], [35], that longer stalling periods decrease the perceived quality [36], and that frequent changes in video quality have a negative impact on the QoE [37]–[39]. The technical factors that influence QoE are the algorithms, parameters, and hardware/software used in the video streaming system. Specifically, such factors include encoding parameters, video qualities and segment sizes at the server side, the adaptation logic, device capabilities and content type at the client side, as well as the adaptation parameters and the type of environment that the client resides in. All of these factors are challenges to be taken into account for the best trade-off between conflicting goals (e.g., less stalling vs. high encoding bitrate) in order to achieve viewer satisfaction.

One major challenge regarding video streaming is the lack of a unified quantitative approach to measure the QoE. Existing HAS solutions in industry and academia assess their QoE based on three different metrics: (1) Objective metrics, such as Peak Signal-to-Noise Ratio (PSNR) [40], [41], Structural SIMilarity (SSIM and SSIMplus) [31], [42], Perceived Video Quality (PVQ) [43], and Statistically Indifferent Quality Variation (SIQV) [44]; (2) Subjective metrics, such as Mean Opinion Score (MOS); or (3) Quality-of-Service (QoS)-derived metrics such as the startup delay, average video bitrate, quality switches and rebuffering events. Achieving high QoE is difficult because trying to optimize each metric may result in conflicts. The complex relationship between these measures and the interplay between the adaptation logic with other application and network-layer decisions can significantly affect the QoE. Balachandran et al. [45] address these issues and propose a data-driven approach that uses machine-learning to build a QoE prediction model. They showed that it could enhance the user engagement when applied in a CDN.

4) Inter-Destination Multimedia Synchronization: The ever-growing development of social multimedia sites is changing the way people share content. Apart from online gaming, photo sharing, and instant messaging, online communities are drifting towards watching online videos together in a synchronized manner. Having multiple streaming clients distributed in different geographical locations poses challenges in delivering video content simultaneously, while keeping the playback state of each client the same (playing, paused).
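
The QoS-derived metrics named in the QoE discussion above (startup delay, average video bitrate, quality switches, and rebuffering events) could be tallied from a simple playback log as sketched below. The function name, the log layout, and the sample values are assumptions for illustration. The sketch does not combine the metrics into a single QoE score, since the text notes that no unified quantitative approach exists.

```python
def qos_metrics(startup_delay_s, segment_bitrates_kbps, stall_durations_s):
    """Summarize one playback session (hypothetical log format).

    segment_bitrates_kbps: bitrate of each played segment, in playback order.
    stall_durations_s: duration of each rebuffering event, in seconds.
    """
    # A quality switch is counted whenever consecutive segments differ in bitrate.
    switches = sum(
        1
        for prev, cur in zip(segment_bitrates_kbps, segment_bitrates_kbps[1:])
        if prev != cur
    )
    return {
        "startup_delay_s": startup_delay_s,
        "avg_bitrate_kbps": sum(segment_bitrates_kbps) / len(segment_bitrates_kbps),
        "quality_switches": switches,
        "rebuffering_events": len(stall_durations_s),
        "total_stall_s": sum(stall_durations_s),
    }


# Hypothetical session: four segments, two stalls.
session = qos_metrics(1.2, [1000, 1000, 2000, 1000], [0.5, 1.5])
print(session)
```

Keeping the metrics separate makes the conflict mentioned above visible: raising the average bitrate in such a log often comes with more switches or stalls.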


Fig. 6: Different inter-stream (on the left) and intra-stream (on the right) scene complexities lead to different display qualities
at the same encoding bitrate or vice versa.
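
The observation in Fig. 6 suggests one way to think about consistent-quality streaming: for each segment, pick the lowest bitrate that still reaches a target quality. The sketch below assumes that a per-segment quality score is available for every bitrate; the function name, the scores, and the target value are hypothetical and only loosely modeled on an SSIMplus-like scale.

```python
def lowest_bitrate_meeting_quality(quality_by_bitrate, target_quality):
    """Return the lowest bitrate (Kbps) whose quality score reaches the target.

    quality_by_bitrate maps bitrate in Kbps to a quality score for one segment.
    If no bitrate reaches the target, the highest available bitrate is returned.
    """
    candidates = [b for b, q in quality_by_bitrate.items() if q >= target_quality]
    return min(candidates) if candidates else max(quality_by_bitrate)


# Hypothetical scores: a simple scene reaches the target at a low bitrate,
# while a complex scene needs a much higher one.
simple_scene = {500: 0.90, 1500: 0.95, 3000: 0.98}
complex_scene = {500: 0.70, 1500: 0.85, 3000: 0.93}
print(lowest_bitrate_meeting_quality(simple_scene, 0.9))
print(lowest_bitrate_meeting_quality(complex_scene, 0.9))
```

Under these assumed scores the selected bitrate varies from segment to segment while the delivered quality stays near the target, which is the opposite of holding the bitrate constant.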

Fig. 7: Illustration of quality versus bitrate trade-off. (Plot of Quality (SSIMplus), from 0.6 to 1, versus Bitrate (Kbps), from 0 to 6000, for five content types: Animation, Documentary, Movie, News, and Sport.)

Fig. 8: The application control loop in a typical HAS client. (Application layer: measurements (throughput/buffer level), adaptation algorithm, and segment download; transport layer: TCP congestion control.)

Moreover, it becomes more challenging for HAS streams to synchronize, since each client adaptively streams depending on their current network conditions. This problem is called Inter-Destination Multimedia Synchronization (IDMS). Typically,

compared to the synchronous reference case. Synchronization in IDMS systems is crucial to the QoE. Dedicated QoE models have to be developed that take the visual quality, user engagement and the synchronization accuracy into account.

After describing the various factors that affect Internet video streaming systems, we will now continue with a survey of the existing bitrate adaptation schemes.

III. BITRATE ADAPTATION SCHEMES

We classify bitrate adaptation schemes based on the entity of the system where the logic is implemented:
• Client-based adaptation (Section III-A),
• Server-based adaptation (Section III-B),
• Network-assisted adaptation (Section III-C), taking into account explicit information from within the network, and
• Hybrid adaptation (Section III-D), using information from any combination of the client, server(s), and network.
The taxonomy graph in Fig. 3 illustrates our classification of bitrate adaptation schemes.

A. Client-Based Adaptation

In relevant literature, most of the proposed bitrate adaptation schemes reside at the client side, according to the specifications in the DASH standard [50]. These schemes try to adapt to bandwidth variations by switching to an appropriate video bitrate according to one or more metrics such as the available bandwidth, playback buffer size, etc. Fig. 9 shows a simple model of a client-based adaptation. The client uses one or more metrics as input for its bitrate selection algorithm in order to choose the appropriate bitrate level for the next segment
IDMS solutions involve a master node (either a dedicated to be downloaded. These algorithms try to avoid streaming
master or a peer among the streaming clients in a session) to problems like video instability, quality oscillations, and buffer
which clients synchronize their playout to. One of the earliest starvation, while improving viewer QoE. They strive to achieve
papers in this field was published by Montagud et al. [46], (i) minimal rebuffering events when the playback buffer de-
in which the authors discuss use cases where IDMS and its pletes, (ii) minimal startup delay especially in case of live
schemes is essential. Rainer et al. [47], [48] proposed an IDMS video streaming, (iii) a high overall playback bitrate level with
architecture for DASH by using a distributed control scheme respect to network resources, and (iv) minimal video quality
(DCS) where peers can communicate and negotiate a refer- oscillations, which occur due to frequent switching.
ence playback timestamp in each session. The MPD file was We further organize the client-based bitrate adaption into
altered to include IDMS session objects that enabled session five classes: (1) available bandwidth-based (Section III-A1),
management. In another work [49], the same authors provided (2) playback buffer-based (Section III-A2), (3) proprietary
a crowdsourced subjective evaluation to find an asynchronism solutions (Section III-A3), (4) mixed (Section III-A4), and
threshold at which QoE was not significantly affected. They (5) Markov Decision Process (MDP)-based (Section III-A5).
found that an asynchronism level of 400 ms was acceptable
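
The four objectives (i) to (iv) that client-based schemes strive for can be folded into a single per-session score when comparing algorithms. The following is a minimal sketch of one hypothetical way to do so, assuming a linear combination of average bitrate, bitrate switching, rebuffering time, and startup delay. The function name `qoe_score` and all weights are illustrative assumptions, not values taken from the surveyed papers.

```python
from typing import Sequence


def qoe_score(bitrates_kbps: Sequence[float],
              rebuffer_s: float,
              startup_s: float,
              w_rebuffer: float = 3000.0,   # assumed penalty per stalled second
              w_startup: float = 1000.0,    # assumed penalty per startup second
              w_switch: float = 1.0) -> float:
    """Toy linear QoE score: reward bitrate, penalize switches, stalls, startup."""
    if not bitrates_kbps:
        raise ValueError("at least one segment bitrate is required")
    n = len(bitrates_kbps)
    avg_bitrate = sum(bitrates_kbps) / n
    # Mean absolute bitrate change between consecutive segments (oscillation).
    changes = [abs(b - a) for a, b in zip(bitrates_kbps, bitrates_kbps[1:])]
    avg_change = sum(changes) / len(changes) if changes else 0.0
    # Stall and startup penalties are normalized per segment.
    return (avg_bitrate
            - w_switch * avg_change
            - w_rebuffer * rebuffer_s / n
            - w_startup * startup_s / n)


# Two sessions with similar average bitrate but different stability.
steady = qoe_score([800, 800, 800, 800], rebuffer_s=0.0, startup_s=1.0)
oscillating = qoe_score([400, 1100, 400, 1100], rebuffer_s=0.0, startup_s=1.0)
```

With these default weights the steady session scores 550 and the oscillating one scores -200, which reflects objective (iv): frequent switching is penalized even when the average bitrate is comparable.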


Fig. 9: Client-based bitrate adaptation. [Diagram labels: Bandwidth Estimator, Bitrate Selection Algorithm, Segment Downloader, HTTP, Internet, Buffer, Media Playback.]

1) Available Bandwidth-Based Adaptation: In this intuitive type of scheme, the client makes its representation decisions based on the measured available network bandwidth, which is usually calculated as the size of the fetched segment(s) divided by the transfer time. Liu et al. [51] proposed a bitrate adaptation algorithm that tries to detect bandwidth fluctuations and congestion using a smoothed network throughput based on the segment fetch time (SFT), which measures the time starting from sending the HTTP GET request to receiving the last byte of the segment. Later, the authors extended their work in [52] to include both sequential and parallel segment fetching methods in CDNs, by using a metric that compares the expected segment fetch time (ESFT) with the measured SFT to determine if the selected segment bitrate matches the network capacity. A similar approach was employed by Rainer et al. [14] where the bandwidth estimated for the next segment was calculated based on the bitrate observed for the last segment downloaded and the estimated throughput that was calculated during the previous estimation. The initialization was based on the bandwidth measured when downloading the MPD.
Probe AND Adapt (PANDA) [53] estimates the available bandwidth accurately and tries to eliminate the ON-OFF steady state issue as well as reduce bitrate oscillations when multiple clients share the same bottleneck link. The video adaptation framework for DASH clients in LTE networks, piStream [54], enables clients to estimate the available bandwidth based on a resource monitor module that acts as a physical-layer daemon. Andelin et al. [55] integrated SVC with DASH by proposing an algorithm that prefetches base layers of future segments or downloads enhancement layers for existing segments using a bandwidth-sloping-based heuristic.
In live video streaming, the nature of the live experience puts stringent constraints on the delay. DASH to Mobile (DASH2M) [56] by Xiao et al. is a strategy designed for mobile streaming clients using HTTP/2 server push and stream termination properties with the goal of enhancing the QoE as well as reducing the battery consumption of the client. An extension of the authors’ previous work [57], the adaptive k-push scheme proposes to increase/decrease k according to a bandwidth increase/decrease while keeping in mind the overall power consumption in a push cycle. In the same context, Miller et al. [58] proposed a low-latency prediction-based bitrate adaptation scheme over wireless access links termed LOw-LatencY Prediction-based adaPtation (LOLYPOP), which leverages TCP throughput predictions on multiple time scales (i.e., 1 to 10 seconds) to achieve low latency and improve viewer QoE.
For the specific case of mobile clients that are in motion, the network conditions fluctuate more with respect to location and time. Several studies deploy a bandwidth lookup service in a real-life mobile network in order to guide the bandwidth estimation among the mobile clients [59]–[63]. However, these frameworks take a spatial point of view of bandwidth fluctuations and pay little attention to the temporal factor. GeoStream [64] addresses this issue and introduces the use of geostatistics to estimate future bandwidth in unknown locations.
In general, available bandwidth-based adaptation suffers from poor QoE due to the lack of reliable bandwidth estimation methods, which results in frequent buffer underruns.

2) Playback Buffer-Based Adaptation: In this type of scheme, the client uses the playout buffer occupancy as a criterion to select the next segment bitrate during video playback.
Mueller et al. [65] were motivated by the limitation of bandwidth-based adaptation when multiple clients competed for the available bandwidth, specifically in the presence of cache servers. Therefore, the authors proposed a buffer-based bitrate adaptation scheme that combines the buffer size with a tool-set of client metrics for accurate rate selection and smooth switching. Huang et al. [66] proposed a set of buffer-based rate selection algorithms, named BBA, that aim to maximize the average video quality and avoid unnecessary rebuffering events. However, BBA suffers from QoE degradation during long-term bandwidth fluctuations. Buffer Occupancy based Lyapunov Algorithm (BOLA) [67], on the other hand, is an online control algorithm that treats bitrate adaptation as a utility maximization problem. This utility is associated with the average bitrate and rebuffering time, while adapting to network changes to account for better QoE. The authors provide strong theoretical proof that it is near optimal, design a QoE model that incorporates both the average playback quality and the rebuffering time, and empirically show its efficiency using various network traces. BOLA is the buffer-based algorithm that is implemented and available in the dash.js player.
Sieber et al. [68] introduced an SVC-based adaptation algorithm called Bandwidth Independent Efficient Buffering (BIEB). BIEB maximizes video quality based on SVC priority while reducing the number of quality oscillations and avoiding stalls and frequent bitrate switching. BIEB maintains a stable buffer occupancy before increasing the quality (enhancement layers). However, BIEB does not account for bitrate switches or stalls in its QoE model during peak times, when dynamic cross traffic occurs in the network.
Which bitrate these algorithms select largely depends on factors such as estimated network throughput, buffer occupancy, and buffer capacity. Yet, these algorithms are not informed by a fundamental relationship between these factors and the chosen bitrate. Thus, they do not work consistently in all scenarios. To address this issue, Yadav et al. [69] modeled a DASH client as an M/D/1/K queue referred to as a QUEuing Theory approach to DASH


Rate Adaptation (QUETRA), which allowed them to calculate the expected buffer occupancy given a bitrate choice, network throughput, and buffer capacity. Using this model, the authors proposed a simple rate adaptation algorithm and evaluated QUETRA under a diverse set of scenarios. They found that despite its simplicity, QUETRA led to better QoE than the existing algorithms.
In general, buffer-based adaptation schemes suffer from many limitations including low overall QoE and instability issues, especially in the case of long-term bandwidth fluctuations. SVC-based approaches also have limitations related to the complexity of SVC encoding and decoding, processing resources and overhead. Some alternative solutions have tried to tackle these issues using multiple SVC streams, hierarchical encoding with a small number of enhancement layers, and encoding overhead [70].

3) Proprietary Solutions: In the past, we witnessed many proprietary adaptive streaming solutions and player implementations such as Microsoft’s Smooth Streaming (MSS) [8], Apple’s HTTP Live Streaming (HLS) [9], Adobe’s HTTP Dynamic Streaming (HDS), and Open Source Media Framework (OSMF) [10]. These solutions use different metrics in their bitrate adaptation process and are designed to satisfy various business requirements.
Microsoft Smooth Streaming (MSS) [8]: In 2008, Microsoft launched the IIS Media Services extension with a new adaptive video streaming over HTTP feature called Smooth Streaming. It was designed to deliver HD videos to viewers. MSS periodically probes network conditions to cope with bandwidth fluctuations. It uses the available bandwidth, playback window resolution, and CPU load at the client side as the metrics for bitrate adaptation. During each streaming session, MSS opens two TCP connections with the server. The first one is used to deliver video segments, while the second one is used for audio, though the two TCP connections could interchange depending on the conditions. MSS showed its efficiency in many sports events like the Beijing Summer Olympic Games 2008, where TV broadcasters used MSS to provide live streaming to 16 million clients [71].
Apple HTTP Live Streaming (HLS) [9], [72]: Due to the popularity of Apple’s mobile devices, HLS is the most widely used adaptive video streaming system. Apple Inc. implemented it as part of QuickTime [73] and on iOS devices such as the iPhone and the iPad. It is designed to support both live and on-demand streaming but specifically targets mobile environments. The HLS client makes its bitrate decisions based on network throughput and device capabilities (e.g., CPU, resolution, memory, etc.). In an attempt to better utilize the available bandwidth, an HLS client can request many segments at the same time. Furthermore, HLS provides a flexible framework for media encryption. Currently, HLS is natively supported in the Safari Web browser in both iOS and macOS devices, Windows 10 Edge browser [74], and Android 3.0+ devices [75].
Adobe Open Source Media Framework (OSMF) [10]: OSMF is a free, open source software framework for robust adaptive video streaming over HTTP. It was implemented using ActionScript [76] by Adobe Systems with the following objectives: (1) simplify player development where developers could focus on improving the overall viewer experience, (2) offer a set of features for third-party services like rendering, advertising, and reporting, and (3) simplify third-party developments by enabling ecosystem partners to focus on delivering best-in-class services instead of player integration. OSMF supports both live and on-demand video streaming, progressive download, sequential and parallel compositions of video, and it adapts to the network variations based on the available bandwidth and device processing capabilities.
The three proprietary streaming solutions described above show efficiency in terms of bitrate adaptation behavior of a single client in response to bandwidth fluctuations. However, several studies [51], [77]–[80] have shown instability issues when multiple clients competed for a bottleneck link in a shared network. From these experiments the following insights were deduced:
• The bitrate adaptation heuristics provide suboptimal bitrate decisions as they fail to adapt quickly to rapid bandwidth variations. Thus, clients suffer from buffer underruns, video instability, quality oscillations, and unnecessary bitrate switches.
• They are not able to ensure a fair viewer experience under some circumstances, resulting in low efficiency and poor per-viewer QoE.
• The MSS client outperforms the others, since it achieves the highest playback bitrate and a low number of bitrate switches during mobile video streaming sessions.
• Based on standard capabilities and features [80], DASH offers nearly everything that these proprietary formats provide.

4) Mixed Adaptation: In this type of scheme, the client makes its bitrate selection based on a combination of metrics including available bandwidth, buffer occupancy, segment size and/or duration.
Other studies have looked at both the available bandwidth and buffer occupancy in order to determine the bitrate of the next segment. Yin et al. [81] developed a control-theoretic framework that allows the understanding and exploration of the trade-offs between bandwidth-based and buffer-based adaptation algorithms under different network bandwidth variations. The authors designed a practical model-predictive controller, FastMPC, that optimally combines both bandwidth and buffer size predictions in order to find an appropriate bitrate for the next segment and maximize QoE. A similar approach was also studied in [82]. Li et al. [32] formulated the bitrate selection decision as an optimization problem, where at each segment downloading step, the proposed scheme finds an appropriate bitrate that ensures a high and consistent quality subject to bandwidth fluctuations and without risking a buffer depletion. Similarly, Sobhani et al. [83] predicted available bandwidth and buffer level using a fuzzy logic mechanism, which is used to select a suitable bitrate. However, these algorithms only ensure a consistent quality at each client without taking the fairness and content type/properties into account when many clients compete for the available bandwidth.
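
As a rough illustration of how a mixed scheme might combine the two signals, the sketch below estimates throughput as segment size divided by transfer time, smooths it with a harmonic mean, and scales it by a buffer-dependent safety factor before picking a bitrate. It is not FastMPC or any other surveyed algorithm. The buffer thresholds, the safety factors, and the names `harmonic_mean_kbps` and `select_bitrate` are assumptions made for this example.

```python
from typing import Sequence


def harmonic_mean_kbps(sizes_bits: Sequence[float],
                       times_s: Sequence[float]) -> float:
    """Harmonic mean of per-segment throughputs (size / transfer time), in Kbps.

    Assumes every transfer time is strictly positive.
    """
    if not sizes_bits or len(sizes_bits) != len(times_s):
        raise ValueError("need matching, non-empty size and time samples")
    samples = [size / t / 1000.0 for size, t in zip(sizes_bits, times_s)]
    return len(samples) / sum(1.0 / s for s in samples)


def select_bitrate(ladder_kbps: Sequence[float],
                   throughput_kbps: float,
                   buffer_s: float,
                   low_s: float = 5.0,     # assumed "buffer nearly empty" threshold
                   high_s: float = 15.0) -> float:
    """Pick the highest ladder bitrate under a buffer-scaled throughput budget."""
    if buffer_s < low_s:
        safety = 0.5    # be conservative when a stall is close
    elif buffer_s < high_s:
        safety = 0.8
    else:
        safety = 1.0    # a full buffer allows using the whole estimate
    budget = safety * throughput_kbps
    feasible = [b for b in sorted(ladder_kbps) if b <= budget]
    # Fall back to the lowest representation if nothing fits the budget.
    return feasible[-1] if feasible else min(ladder_kbps)


ladder = [400, 800, 1100]
# Two 2-Mbit segments fetched in 2 s and 1 s: samples of 1000 and 2000 Kbps.
estimate = harmonic_mean_kbps([2_000_000, 2_000_000], [2.0, 1.0])
```

With these two samples the estimate is about 1333 Kbps, so `select_bitrate(ladder, estimate, b)` yields 400, 800, and 1100 Kbps for buffer levels `b` of 2, 10, and 20 seconds. The harmonic mean is used here because it damps the effect of a single unusually fast download.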


ELASTIC [84] is a fEedback Linearization Adaptive STreamIng Controller, based on feedback control theory [85], that generates a long-lived TCP flow and avoids the ON-OFF steady state behavior, which leads to bandwidth overestimations. ELASTIC was introduced to ensure bandwidth fairness between competing clients based on network feedback assistance, but without taking the viewer QoE into consideration. In addition, it ignores quality oscillations in its bitrate decisions. Thus, both during bandwidth fluctuations and in fixed bandwidth environments, ELASTIC may produce a high number of bitrate switches resulting in poor QoE.
Miller et al. [86] presented a bitrate adaptation algorithm that uses the current buffer occupancy level, estimated available bandwidth, and average bitrate of the different bitrate levels from the MPD as metrics in its bitrate selection. It aims to (i) accurately estimate the available bandwidth and avoid bandwidth overestimation, and (ii) maximize the bitrate while minimizing startup delay, number of stalls, quality oscillations, and playback interruptions. The algorithm changes its behavior based on the current buffer level. It can improve the fairness between competing clients, but it does not take any metric of viewer satisfaction into account. Furthermore, in a shared network environment, clients can suffer from video instability, stalls and quality oscillations even when clients reach the highest quality level. This is due to the lack of bitrate decisions that consider viewer QoE.
Jiang et al. [87] studied the limitations of video players when a large number of clients shared the same network by providing an experimental study that identified the main factors in bitrate selection. The authors introduced FESTIVE (Fair, Efficient and Stable adapTIVE algorithm), a bitrate adaptation algorithm that aims to improve efficiency, fairness and stability. FESTIVE contains (a) a bandwidth estimator module, (b) a bitrate selection and update method that tries to avoid unfairness of stateless bitrate selection² by making the player stateful, and (c) a randomized scheduler that incorporates the buffer size to schedule the download of the next segment. For the same purpose, Throughput-Friendly DASH (TFDASH) [88] uses a logarithmic-increase-multiplicative-decrease (LIMD)-based bandwidth probing algorithm to estimate the available bandwidth and a dual-threshold buffer for the bitrate adaptation.
² Stateless bitrate selection refers to selecting the highest bitrate lower than the available bandwidth.
Tian et al. [89] offered algorithms that aim to balance bandwidth utilization and smoothness in DASH in both single- and multi-CDN scenarios. Using the buffered video time as a feedback signal, the client is able to adapt the video rate according to the available bandwidth, which is estimated using the support vector regression (SVR) [90] algorithm. Spectrum-based QUality ADaptation (SQUAD) [91] is a lightweight bitrate adaptation algorithm that uses the available bandwidth and buffer information to increase the average bitrate of a video, while minimizing the number of quality switches. For startup, SQUAD follows a conservative approach of fetching more low-quality segments in order to alleviate any inaccuracies in future bandwidth estimations which could result from a single low-quality segment estimation. Later, the algorithm uses the spectrum, which is the variation of the average segment bitrates, and the buffer level to choose the next segment bitrate. Havey et al. [92] designed a multi-path solution for rate adaptation in wireless networks. The authors avoided the problems of TCP congestion control by implementing a similar logic at the application layer. Parallel TCP streams have been proven to increase the throughput compared to single TCP streaming. However, this incurs extra request/response overhead and imposes changes on the application stack.
Other studies incorporate more metrics for bitrate selection like the current segment quality, size and download time. SARA [93] is a Segment-Aware Rate Adaptation algorithm that is based on the segment size variation, the available bandwidth estimate, and the buffer occupancy. Since HTTP uses TCP, the throughput of a segment is dependent on the file size, and thus, the authors propose to enhance the typical MPD file to include the size of every segment. For each new segment download, the client decides the new segment quality based on the estimated bandwidth (which is assessed using the segment size) and the current status of the buffer.
ABMA+ [94] is a lightweight adaptation algorithm that selects the highest segment representation based on the estimated probability of video rebuffering. It makes use of buffer maps, which define the playout buffer capacity that is required under certain conditions to satisfy a rebuffering threshold and to avoid heavy online calculations. The authors defined five QoE metrics to evaluate ABMA+ and compared it with BBA and Rate-Based Algorithm (RBA), which are explained in detail in [94]. The authors showed that ABMA+ can efficiently adapt the video representations to the network conditions, while minimizing frequent quality switches. Bentaleb et al. [95] discussed the shortcomings of the existing client-based schemes. To sidestep these drawbacks, the authors leveraged a game theory [96] framework and developed the GTA (Game Theory Adaptive bitrate) scheme. GTA uses a cooperative game in coalition form and then formulates the bitrate selection problem as a bargaining process and consensus mechanism. Thus, the DASH clients can create an agreement among themselves and achieve their QoE objectives. GTA improves the viewer QoE and video stability without increasing the stall rate or startup delay.

5) MDP-Based Adaptation: In Markov Decision Process (MDP)-based adaptation, the video streaming process is formulated as a finite MDP to be able to make adaptation decisions under fluctuating network conditions. Xing et al. [97] proposed a real-time best-action search algorithm over multiple access networks that aims to produce smooth and high-quality video playback. The authors used both Bluetooth and WiFi links to simultaneously download video segments. In each state, the MDP was formulated so the rate adaptation agent takes the buffer level, SVC layer index, Bluetooth traffic, available bandwidth, and the index of each segment fetched as inputs. The reward function is designed to consider the average playback quality, interruption rate, and playback smoothness. However, this scheme shows limitations during user mobility which negatively affect the viewer QoE. The mobility problem


was addressed by Bokani et al. [98] who modeled the bitrate adaptation logic as an MDP problem in vehicular environments. Three variants of Reinforcement Learning (RL)-based algorithms were introduced. These algorithms take advantage of the historical bandwidth samples to build an accurate bandwidth estimation model.
Another noteworthy work is Petrangeli et al. [99]. The authors tackled the problem of QoE and fairness when multiple clients compete at a bottleneck link and they proposed a multi-agent RL-based bitrate adaptation scheme that uses a central manager in charge of collecting QoE statistics (segment bitrate) and coordination between the competing clients. The developed algorithm ensures a fair QoE distribution among the competing clients and improves viewer QoE, while avoiding suboptimal decisions. However, this model does not consider stalls and quality switches, which can lead to rebuffering events. Unlike [99], Chiariotti et al. [100] developed an MDP-based online bitrate adaptation algorithm for DASH clients that aimed to select the optimal representation, maximizing the long-term expected reward (QoE). This reward function was calculated from a combination of quality oscillations, segment quality, and stalls experienced by the client. The authors used RL to gather information on the network environment through experience to approach an optimal solution. To avoid slow convergence and suboptimal solutions caused during the RL process, the authors exploited a parallel learning technique.
Zhou et al. [101] tackled a similar problem by proposing mDASH to improve viewer satisfaction during long-term bandwidth variations. The authors first formulated the bitrate adaptation logic as an MDP optimization problem where the buffer size, bandwidth conditions, and bitrate stability were taken as Markov state variables. They subsequently solved this problem by proposing a low-complexity greedy suboptimal algorithm. Compared to previous MDP-based studies, Pensieve [102] and Deep Q-Learning DASH (D-DASH) [103] were proposed to improve accuracy and speed of bitrate decision estimations using Deep Reinforcement Learning (Deep RL) [104]. Pensieve [102]³ is a framework that is built based on observations collected by DASH clients (i.e., throughput estimation and buffer occupancy) across large video streaming experiments. It does not rely on pre-programmed models or assumptions about the environment, but, in fact, gradually learns the best policy for bitrate decisions through observation and experience. D-DASH [103] combines deep learning and reinforcement learning mechanisms to improve the QoE for DASH, and achieves a good trade-off between policy optimality and convergence speed during the decision process. In particular, it uses mixed learning architectures including feedforward and recurrent deep neural networks with advanced strategies. Both solutions [102], [103] perform adequately and present the benefits of incorporating Deep RL with ABR heuristics in the bitrate decision process. The proposed MDP-based schemes yield a significant improvement in the overall performance in terms of stalls, and thus, ensure an acceptable level of viewer QoE. However, they may suffer from instability, unfairness, and underutilization when the number of clients increases, probably because such factors are not taken into account in the MDP models and due to clients’ decentralized ON-OFF patterns.
³ A pensieve is a device used in Harry Potter [105] to review memories.

B. Server-Based Adaptation
Server-based schemes use a bitrate shaping method at the server side and do not require any cooperation from the client (see Fig. 10). Thus, the switching between the bitrates is implicitly controlled by the bitrate shaper. The client still makes its own decisions, but the decisions are more or less determined by the shaping method on the server.

Fig. 10: Server-based bitrate adaptation. [Diagram labels: HAS Server, Traffic Shaper, 400 Kbps, 800 Kbps, 1100 Kbps.]

Traffic shaping methods have been deployed in [106], [107] where the authors analyzed instability and unfairness issues in the presence of multiple HAS players competing for the available bandwidth. These studies proposed a traffic shaping method that can be deployed at a home gateway to improve fairness, stability and convergence delay [107], and to eliminate the OFF periods during the steady states (the root cause of the instability problem) [106].
To improve the live experience, Detti et al. [108] proposed a tracker-assisted adaptation strategy in the presence of network caches. The proposed architecture consists of clients communicating with a server through a shared proxy and a server having a tracker functionality that manages the clients’ statuses and helps them share knowledge about their statuses. De Cicco et al. [109] proposed a feedback control theory-based algorithm called Quality Adaptation Controller (QAC). QAC aims to control the size of the server sending buffer in order to adjust and select the most appropriate bitrate level for each DASH player. It aims to maintain the playback buffer occupancy of each player as stable as possible and to match bitrate level decisions with the available bandwidth. Bruneau et al. [110] developed the MS-Stream system, a multiple-source adaptive streaming solution to improve viewer QoE, where the client fetches the segments (divided into a set of subsegments and stored in the servers) from multiple MS-Stream servers.
The server-based bitrate adaptation schemes produce high overhead on the server side with a high complexity⁴, especially when the number of clients increases. These schemes also need modifications to the MPD [108] or custom server software to implement the bitrate adaptation logic [106]–[109]. This may be perceived as a violation of the DASH standard design principles, namely that the server should be a

⁴ The server needs to store and maintain the information for each client to perform bitrate adaptation.


standard HTTP server, and that the bitrate adaptation algorithm should, consequently, run at the client side. The server and network-assistance approach [16], [111] can be an alternative solution, where in-network entities and servers aid the client in its bitrate decisions. This approach is discussed in detail in Section III-D2.

C. Network-Assisted Adaptation

The network-assisted approach depicted in Fig. 11 allows the HAS clients to take in-network decisions into consideration during the bitrate adaptation process. This happens by collecting measurements about the network conditions while informing the clients on the suitable bitrates to be selected. The in-network process needs a special component (e.g., agent/proxy deployed in the network) to monitor the network status and conditions. It offers network-level information that allows the HAS clients to efficiently use network resources.

Fig. 11: Network-assisted bitrate adaptation. (Figure labels: HAS Server, Proxies.)

QoE-aware DASH (QDASH) [112] is a proxy between the clients and the streaming server that aims to avoid video oscillations by ensuring a gradual change in bitrate levels using integrated intermediate levels, which can lead to a better QoE. QDASH consists of a QDASH-abw module to measure the bandwidth and a QDASH-qoe module that assists the client in choosing a suitable bitrate that can support the current network conditions and buffer occupancy. However, it generates significant overhead in the network, especially with increasing client numbers. This overhead may itself eventually lead to network congestion, resulting in a low QoE. Similarly, Bouten et al. [113] tackled the problem of multiple DASH clients competing for the available bandwidth by proposing a QoE-driven in-network optimization system for adaptive video streaming. The proposed system consists of a set of agents deployed along the path between the clients and streaming server, where they play the role of proxies. These network agents periodically measure and monitor the available bandwidth along the path using packet sampling techniques and solve an optimization problem to determine the optimal bitrate for the next segments to be downloaded. This information is then sent to the clients. However, similar to [112], it can generate significant overhead and is not resilient to agent failures. To reduce buffer underrun events and improve the client’s viewing experience, Krishnamoorthi et al. [114] presented BUFFEST, a classification framework for real-time prediction of the client’s buffer conditions from both HTTP and HTTPS traffic. It consists of an event-based buffer emulator module and an automated training online classifier that are responsible for accurately tracking/predicting the client’s buffer conditions and TCP/IP packet-level traffic classification, respectively. For the same aim and inspired by the Network Utility Maximization (NUM) [115] framework, Stefano et al. [116] proposed a distributed price-based, network-assisted HAS system for multiple concurrent HAS clients sharing a common bottleneck. The proposed solution introduces the definition of a price (i.e., a function of the segment download times that are captured by the HAS clients), which is inspired by a congestion control algorithm. Then, using the price information, a central coordinator assists the clients in their decisions to maximize overall user satisfaction and QoE fairness.

To alleviate overhead-caused network performance degradation, Petrangeli et al. [117] tried to avoid fairness issues when multiple HAS clients consume video at the same time and compete for shared network resources by proposing a QoE-driven in-network bitrate adaptation algorithm named FINEAS (Fair In-Network Enhanced Adaptive Streaming). To achieve fairness, FINEAS uses in-network components such as proxies that offer information about network conditions like currently available bandwidth and suggestions about the best bitrate. Each client may use these suggestions as a criterion for bitrate selection. FINEAS shows good performance in homogeneous systems, but in the real world, heterogeneous devices with different characteristics exist. Thus, sharing the bandwidth equally among competing clients may result in high QoE on some devices but low QoE on others. In [118], Network Optimization for Video Adaptation (NOVA) was proposed to fairly maximize viewer QoE while avoiding unnecessary bitrate switching in a heterogeneous environment. The authors formulated the multi-client competition issue as an optimization problem subject to buffer occupancy, network conditions and delivery cost. Thereafter, NOVA tries to find the optimal bitrate for each client. NOVA consists of two main elements: bandwidth allocation and quality adaptation. While NOVA achieves good QoE compared to traditional DASH systems, the efficiency of the proposed architecture relies on strong statistical assumptions such as stationary ergodicity, which may negatively impact the convergence time during the search for optimal decisions [119].

Many studies [120]–[127] have proposed bitrate adaptation schemes to improve viewer QoE in cellular networks. AVIS [120] is a network-based radio resource allocation framework designed for adaptive video flows in cellular networks. It can optimally allocate resources for each client (separating DASH flows from others) and ensure fairness and stability between them while maintaining high resource utilization. Similarly, Kleinrouweler et al. [122] installed HTTP proxies at the network gateways that evenly allocated the available bandwidth between the streaming clients. The proxy rewrites client requests that demand a bitrate higher than the one designated by the proxy, and also adds an HTTP header to the response informing the client of the change. The streaming process was modeled as an MDP, where each state represents the number of active clients and the transitions between the states are linked to starting and stopping the players. To account for stability, the number of switches relates to the frequency of transitions between the MDP states. In contrast, El Essaili et al. [121] developed a QoE optimizer


and resource manager framework that can dynamically find the optimal bitrate for a client, subject to wireless channel conditions, buffer levels, and achievable QoE. It allocates the required bandwidth for each client based on its QoE, unlike [120], [122], where all clients receive an equal share of the allocated bandwidth, which does not necessarily mean that all the clients enjoy a good experience due to intrinsic differences across the device capabilities.

In the same context, the Rebuffering Aware Gradient Algorithm (RAGA) [123] is a cross-layer buffer-aware wireless resource allocation algorithm that considers only the playback buffer size during the bitrate selection process. It makes use of DASH’s standardized user feedback from the buffer, both its level and rate of level changes. The same authors later proposed a new architecture to enhance the QoE in LTE networks [124]. The architecture consists of a Video Aware Controller (VAC) at the network core that acts as a central intelligence unit for translating the video qualities and buffer levels into QoS parameters. The authors also proposed a new algorithm that computes the dynamic Maximum Bit Rate (dynamic-MBR) for each client based on its buffer level obtained from the feedback. Han et al. [125] proposed Multi-path DASH (MP-DASH), a multi-path framework with awareness of the network interface preferences of the clients. It aims to improve multi-path TCP (MPTCP) to support DASH considering the user network interface preferences, thus enhancing the efficiency of video delivery without sacrificing viewer QoE. MP-DASH consists of two main components: the MP-DASH scheduler and the video adapter. The scheduler takes user interface preferences, segment size and its delivery time from the DASH client into consideration. Based on this, it decides the best way to fetch the segment over multiple paths. The video adapter is a lightweight add-on that makes existing client-based adaptation schemes more multi-path friendly and is responsible for handling the interaction between the bitrate adaptation scheme and the MP-DASH scheduler.

To reduce video instability, QoE unfairness and stalls in cellular networks, Yan et al. [126] designed Prius as a framework that consists of a hybrid edge cloud and a client-based adaptation scheme. Similarly, Zahran et al. [127] proposed a Stall-aware Pacing (SAP) traffic management solution for DASH clients. It aims to reduce video stalls while maintaining a consistent QoE when multiple DASH clients with diverse channel conditions compete for resources. SAP leverages both network and client state information to optimize the per-player QoE. Leveraging Machine Learning (ML) [128] mechanisms, De Grazia et al. [129] developed a multi-stage ML cognitive approach for DASH when multiple clients compete for the available bandwidth in a shared channel. The proposed solution incorporates unsupervised and supervised ML to comprehend the Quality-Rate (Q-R) relationship. The authors deployed a cognitive HTTP proxy (CHP) that was responsible for controlling the video traffic towards the clients, performing traffic classification, helping clients in their decisions, and applying resource allocation according to the Q-R function. Motivated by the fact that TCP connections are well-modeled as traversing a piecewise-stationary sequence of network states [130], Zahaib et al. [131] designed Oboe, which allows the automatic tuning of configuration parameters to different network conditions for an ABR scheme. Consequently, these configuration parameters are applied at run-time to match the current network state. The proposed system significantly improves the bitrate decision of client-based adaptation schemes like BOLA [67] and FastMPC [81], and it offers, on average, 24% better viewer QoE compared to Pensieve [102].

Other approaches incorporate OpenFlow-enabled solutions with HAS. Georgopoulos et al. [132] proposed an OpenFlow-based in-network caching service, named OpenCache, that leverages software defined networking (SDN) to optimize video-on-demand DASH streams. OpenCache uses SDN to provide cache-as-a-service (CaaS) for media content and aims to alleviate last mile scalability issues by pushing the DASH segments as close to the client as possible without requiring any modifications in the delivery method, and to improve network resource utilization and QoE for the viewers. Additionally, it can provide network and DASH clients’ measurements that help CDN providers to enhance content placement and delivery mechanisms. Cofano et al. [133] investigated video quality fairness (VQF) for cases in which multiple heterogeneous adaptive streaming players share the same bottleneck link. The authors proposed a Video Control Plane (VCP) that enforces a video quality management policy to ensure fairness. VCP was implemented on top of an SDN controller as a network controller application and consists of three network-assisted streaming approaches: bandwidth reservation, bitrate guidance, and a hybrid of bitrate guidance and bandwidth reservation. Bhat et al. [134] designed an SDN-assisted architecture for HAS systems, termed SABR. This method leverages SDN capabilities to assist and manage HAS players, and it collects various information such as available bandwidth and client states to guide player bitrate decisions. Seema et al. [135] developed a DASH-based video platform for miniaturized devices including sensors, called Wireless Video Sensor Node Platform DASH (WVSNP-DASH). The proposed platform uses an alternative approach to segment the video to be convenient for miniaturized wireless devices and sensors. It utilizes a specific naming syntax (based on a simplified Backus-Naur Form [136]) for video segments such that each segment is an independently playable file that embeds essential metadata required for video playout in its name. In this way, the client can play the segment without having to download the manifest file and initial segments. WVSNP-DASH is designed based on core elements of HTML5 (e.g., HTML5 File System). Also, it can encapsulate any container, codec and DRM that are supported by a Web browser. However, this paper does not analyze the overhead introduced by WVSNP-DASH, i.e., the new data embedded in each segment, which may significantly impact the network efficiency and lifetime. For bitrate adaptation schemes over Information-Centric Networking (ICN), Lederer et al. [137] investigated the possibilities of integrating HAS over ICN. The authors highlighted use cases and scenarios, namely Netflix-like video streaming, peer-to-peer (P2P) uses, video sharing, and IPTV. Additionally, the authors presented available tools


and testbeds to evaluate HAS over ICN, and highlighted several challenges and open issues. Further details of the HAS over ICN architecture can be found in RFC 7933 [138]. The performance of DASH over ICN is examined by Rainer et al. [139]. The authors analyzed the performance gap between different ICN-based forwarding strategies with their theoretical optimum at the network level and various client-based adaptation schemes at the application level. They derived the theoretical optimum bound by formulating the concurrent streaming clients in ICN as a fractional Multi-Commodity Flow Problem (MCFP) with and without caching, showing that HAS performance can be improved by benefiting from ICN multi-path and caching capabilities. Petrangeli et al. [140] focused on combining HAS and SVC over ICN networks. They used SVC mainly for the following reasons: (i) SVC allows the benefits of ICN to be fully exploited while avoiding suboptimal bitrate selections, (ii) it helps the clients to mitigate bandwidth overestimation, and (iii) the layered structure of SVC makes it possible to benefit from ICN’s multi-path capabilities. Xu et al. [141] proposed EcoMD, an ICN-based cost-efficient multimedia content delivery solution for vehicular ad hoc networks (VANETs) to reduce the cost of video delivery in highly dynamic VANETs. The authors first analyzed two essential factors, namely content mobility and supply-demand balance. Then they formulated the cost associated with video delivery as a Mixed Integer Programming (MIP) optimization problem. Finally, they proposed three adaptive heuristic solutions to solve the optimization problem: (1) priority-based path selection, (2) least-required sources maintaining, and (3) on-demand in-path caching enhancement. Similarly, Detti et al. [142] proposed an ICN-based P2P streaming application for live HAS systems over cellular networks. The main insight of this work is to show the possibility of exploiting ICN capabilities to provide a good HAS service and achieve a simplified deployment process. In the application, the HAS clients (or peers) construct a P2P one-hop mesh network that enables cooperative downloading of the same live video. These clients use their cellular network interfaces to connect to the HAS server and are connected to each other through proximity WiFi channels.

In general, the presented ICN-based solutions use heuristic information (collected from the requested content) to perform the caching decision by a special node. Some of these solutions produce a large number of redundant copies, and thus, impact storage resources. Providing efficient content management, ensuring high cache performance, and designing a robust HAS delivery system over ICN are still open issues.

D. Hybrid Adaptation

In hybrid bitrate adaptation, many networking entities collaborate and collect useful information about network conditions that can help HAS clients in their bitrate selection. This type of technique consists of SDN-based and server-and-network-assisted adaptations.

1) SDN-Based Adaptation: Two key insights of integrating SDN [143], [144] within an adaptive video streaming system are as follows:

• SDN provides network resource control and monitoring capabilities, thus simplifying network resource programming and deployment.
• Pure client-driven bitrate adaptation algorithms show their limitations when a set of DASH clients compete in a shared network environment and when the network size grows, resulting in issues such as video instability, quality oscillations, buffer underruns, unfairness, and underutilization. These issues are largely due to a lack of coordination among the clients, which could be ensured by a central mechanism that has the global network view in a manageable network environment (e.g., a last-mile network such as a campus network). With a central coordinator and the integration of such coordination information, these issues can be avoided and viewer satisfaction can be improved.

Fig. 12 depicts SDN-based bitrate adaptation, where the network resources and competing clients are controlled and monitored by a central component in the control plane, more precisely the SDN controller.

Fig. 12: Architecture for SDN-based bitrate adaptation. (Figure labels: HTTP REST API; HAS Server; HAS Clients; Application Layer; Application Controllers; Policies; SDN Network Management; SDN Controller; OpenFlow; Network Layer and Forwarding Devices.)

Georgopoulos et al. [145] proposed an SDN-assisted QoE Fairness Framework (QFF), which sought to optimize QoE by ensuring video quality fairness among multiple competing DASH clients in the last mile. The proposed framework leverages OpenFlow to monitor the quality of the video streams and allocate/manage resources in the network.

The same authors later proposed an improvement of QFF by introducing the SDN-based in-network QoE measurement framework IQMF [146], which acts as a proxy and aims to provide per-client transparent monitoring of QoE during the video session, and subsequently offers its feedback to network and content providers through a well-defined API. IQMF was proposed due to the fact that traditional network-level metrics like bandwidth, packet loss, jitter, and end-to-end delay could not provide an estimation of video quality. Both QFF and IQMF take only two metrics into account, device resolution and available bandwidth, without considering the buffer level. Thus, clients may be subject to buffer starvation.

Nam et al. [147] proposed an SDN-based application that aims to manage network resources while monitoring network conditions and client feedback (QoE metrics), when multiple clients compete for a shared capacity. The SDN application


dynamically reroutes the video flows using the Multiprotocol Label Switching (MPLS) traffic engineering mechanism over SDN when QoE requirements are violated (during buffer underrun events, for instance). Such an approach can improve the overall QoE by selecting the best path to the server. However, the authors do not describe the time effect of dynamic path changes during the streaming session. This problem was addressed by Wang et al. [148] through the development of GENI Cinema (GC), an SDN-assisted service for live video streaming. GC aims to provide online live educational video streaming among many campuses using the GENI SDN-based network resource infrastructure. Streaming clients can upload and/or watch online videos via a public shared Web portal, and the GC service is able to monitor and manage the video flows and resources over one or multiple routes dynamically using SDN features. The GC service has been shown to provide scalable, stable, and fair live video streaming.

Petrangeli et al. [149] proposed an SDN-based framework that aimed to reduce video freezes caused by sudden bandwidth fluctuation by applying a prioritization technique during the segment delivery process. The SDN controller represents the main component of the proposed framework, where it is responsible for collecting the network status information such as bandwidth changes, latency, and statuses of the HAS clients. Based on this information, the controller decides whether a segment has to be prioritized or not in order to alleviate video freezing at the client. In the same context, Kleinrouweler et al. [150] described an SDN-based network architecture for DASH that aims to ensure stable and high quality video delivery, while avoiding the mismatch between the TCP mechanism and the dynamic bursty nature of DASH traffic. The proposed architecture consists of three layers: SDN network application controllers, SDN network management, and programmable network infrastructure. The SDN network application helps the set of competing DASH clients in their bitrate selection, while the SDN network management uses a dynamic queue-based mechanism for QoS provisioning. However, the proposed architecture does not consider device heterogeneity, which is important for determining the fair share of available bandwidth and QoE fairness. To address these issues, Bentaleb et al. [151] proposed a new end-to-end SDN-based resource allocation and management architecture for HAS called SDNDASH. The proposed architecture leverages SDN capabilities to manage and allocate network resources for each client based on its QoE. It consists of the three layers application, control, and network, as well as six core entities within those layers: DASH server, DASH clients, SDN-based external application, SDN controller, SDN-based internal application, and forwarding devices. SDNDASH formulates the QoE maximization and optimal decision for both bitrate and network resource allocation as a maximization problem, leading to significant improvements in per-client QoE while avoiding HAS stability issues. In the same context, SDNHAS [152] and ORL-SDN [153] were developed. The former was proposed to resolve three limitations that were not addressed in SDNDASH and could affect the ABR decisions, namely: (i) the difficulty of supporting large-scale deployments of HAS players, (ii) non-trivial communication overhead, and (iii) limited support for system heterogeneity. The latter is an online reinforcement learning (RL) QoE optimization framework for SDN-enabled HAS systems. The proposed framework consists of three phases. First, it groups the HAS players into a set of disjoint clusters based on a perceptual quality index. Second, it formulates the bitrate selection as a Partially Observable Markov Decision Process. Third, it implements an online Q-learning algorithm to solve the QoE optimization problem and find in parallel the optimal bitrate decision for each cluster.

To improve the viewer QoE in the context of HAS in hybrid fiber coax (HFC) network environments, an SDN-based bandwidth broker solution [154] termed BMS (Bandwidth Management Solution) was developed. BMS formulates the bitrate decisions as a convex optimization problem, which relies on a concave network utility maximization (NUM) function. BMS is proposed to meet per-session and per-group QoE objectives. Thus, BMS is able to avoid common HAS issues like video instability, unfair and unequal quality distribution, and network resource underutilization.

Lai et al. [155] proposed an SDN-based manager in 5G OpenFlow-enabled wireless networks for HLS services. The manager aims to allocate a suitable on-demand network resource (e.g., bandwidth) that improves the QoE, taking into consideration the media segment perceptual quality and client buffer size during bitrate selection. However, the authors consider neither the radio characteristics that exhibit sudden bandwidth fluctuations nor the handover situations due to user mobility.

All these studies, as well as C3 [156], CFA [157], CS2P [158], and Pytheas [159], share a common characteristic, which is that there exists a central controller to control, manage and monitor HAS traffic. However, these solutions neither scale well nor support system heterogeneity. They also generate additional overhead that can affect the network performance.

2) Server and Network-Assisted Adaptation: Thomas et al. [16], [111], [160] were motivated by the fact that the client-driven approach of DASH left less control to the network and service providers, which introduced new challenges for them in service differentiation, and proposed the Server and Network-assisted DASH (SAND) architecture. SAND is a control plane that offers asynchronous client-to-network, network-to-client, and network-to-network communications. SAND allows collecting metrics and status information from different entities in the system including the clients, and sending feedback to the clients and DASH-aware Network Elements (DANE) including the servers, caches and other network entities along the media path. This feedback is used by the clients to assist in the bitrate adaptation and by the DANEs to improve media delivery. To enable the communication between the clients and DANEs, SAND defines the following interfaces to carry various types of messages:

• Client-to-Metrics-Server and Client-to-DANE Interfaces carry the metrics and status messages, respectively.
• DANE-to-DANE Interface carries the parameters enhancing delivery messages.
• DANE-to-Client Interface carries the parameters enhancing reception messages.

The SAND architecture is primarily based on feedback from the clients (e.g., QoE metrics) and the network (e.g., available bandwidth). This kind of architecture is not easy to implement, and hence, only a few works have tackled this problem so far. Unsurprisingly, SDN is one of the main enablers for the SAND architecture [151], [161]. Further details on the SAND architecture and messages can be found in ISO/IEC 23009-5, which was published by MPEG in early 2017.

IV. COMPARISON BETWEEN BITRATE ADAPTATION SCHEMES

Each adaptation scheme proposes distinct criteria for bitrate decisions, works only under indirect or implicit assumptions and in specific scenarios, and focuses on a specific deployment or on particular network characteristics. Currently, there is a lack of general consistent frameworks that can formally evaluate and compare different bitrate adaptation schemes, and test and verify the efficiency of their components. Only a few algorithms formally describe what objective they want to optimize, making an effective comparison nigh impossible. In this part, we provide a feature comparison between various state-of-the-art bitrate adaptation schemes in each category from the taxonomy in Fig. 3. Table II summarizes this comparison for each surveyed paper in terms of the following aspects:

• Heuristic(s): The measurements and values that the algorithm bases its download decision on {BW: available bandwidth, Buffer: buffer occupancy, SDT: segment download time, DC: device capabilities, CPU: CPU load, QT: perceptual quality, PA: proxy assistance, CA: central entity assistance, SDN or SDN-app: SDN assistance, Seg-size: segment size, Seg-quality: segment quality, Seg-schedule: segment scheduling}.
• Fairness: Describes the algorithm’s fairness between multiple clients that share the network. Some algorithms equally share the bandwidth among the clients, indicated by BW, others share the bandwidth based on either perceptual quality or QoE, indicated by QT and QoE, respectively.
• QoE: Does the algorithm support and integrate one of the objective QoE models?
• Number of clients: Single indicates one client only, multiple(few) indicates fewer than 10 clients, and multiple(many) indicates more than 10 clients.
• QoE optimization: Does the algorithm propose a QoE model and aim to optimize it?
• Content type: Live or video-on-demand (VoD).
• Heterogeneity: Does the algorithm take heterogeneous devices into account in its experimental testing?
• SVC support: Does the adaptation algorithm support the streaming of SVC-encoded video?
• BG traffic: Does the paper include background traffic in its experimental tests?

From Table II, we can deduce the following outcomes:

• The client-based adaptation schemes show a good performance given certain environments and circumstances. They are suited for large-scale deployments and they require modifications only on the client side. However, most of these schemes suffer under network-bottleneck conditions (i.e., they are not globally optimal). The reason for this is the lack of a central element that guides the players in their bitrate decisions.
• The server-based adaptation schemes provide the advantage of central control. However, these schemes may introduce a high complexity on the server and produce additional overhead, which may harm the network efficiency. Additionally, these schemes need modifications in the manifests and/or the server side.
• The network-assisted adaptation schemes aim to have a general view of the network, which helps the clients in their bitrate decisions. These schemes are suitable for small-to-large networks and show a high performance in improving the viewer QoE. A similar observation can be made for hybrid schemes. However, the real-world deployment of both scheme classes remains challenging, as they introduce some overhead that may harm the network performance and require additional entities in the network.

It might be of interest to note that Table II provides only a feature comparison between the different schemes, such as the heuristics, experimentation parameters and collected metrics. A performance comparison is difficult for mainly three reasons: (i) the unavailability of source code, (ii) the lack of a unified QoE framework and metrics to evaluate these schemes, and (iii) the fact that every scheme has its own parameters and assumptions, and may have been designed for a specific environment and settings.

V. DISCUSSION

In this section, we discuss emerging HAS trends, namely (A) HAS and scalable video coding (SVC), (B) advanced transport options such as HTTP/2 and Quick UDP Internet Connections (QUIC), (C) immersive media streaming, specifically 360-degree video streaming, and (D) HAS datasets.

A. HAS and Scalable Video Coding (SVC)

In most state-of-the-art adaptive streaming systems, non-scalable video coding (i.e., AVC, HEVC, VP9/AV1) is widely relied upon due to its coding efficiency, ease of implementation, and widespread adoption. However, scalable video coding has multiple benefits such as resiliency to packet losses and better adaptability to device capabilities (e.g., if a device is not capable of decoding high-quality videos, it can choose to decode lower layers only). Many studies [55], [68], [162], [163] have shown benefits of using SVC in HAS rather than AVC [30]: (1) it allows HAS to support heterogeneous clients, (2) it reduces storage and networking costs, and (3) it enables CDNs and caches to be used more efficiently, e.g., by prioritizing the base layer and providing enhancement layers only when network resources are available. Fig. 13 depicts SVC-based HAS where each segment can be split into a subset of bitstreams instead of different bitrate levels, and thus, the video segments can be encoded at different SVC qualities

1553-877X (c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
TABLE II: Feature comparison between the surveyed bitrate adaptation schemes.
Adaptation Heuristic(s) Fairness QoE Number of QoE Content Heterogeneity SVC BG
Scheme Clients Optimization Type Support Traffic
[51] BW 7 7 Single 7 VoD 7 7 7
[52] BW 3(BW) 7 Multiple(few) 7 VoD 7 7 3
[53] BW 3(BW) 7 Multiple(few) 7 VoD/Live 7 7 3
[54] BW 3(BW) 3 Multiple(few) 7 VoD 7 7 3
[108] BW 3(BW) 7 Multiple(few) 7 Live 3 7 3
[58] BW 7 3 Single 7 VoD/Live 7 7 7
[98] BW 7 3 Single 7 VoD 7 7 3
[55] BW 7 3 Single 7 VoD 7 3 7
[56] BW 7 3 Single 3 VoD 7 7 7
[64] BW 7 3 Single 7 VoD 7 7 3
[65] Buffer 7 3 Multiple(few) 7 VoD/Live 7 7 7
[66] Buffer 7 7 Multiple(few) 7 VoD 7 7 7
[67] Buffer 7 3 Single 3 VoD 3 7 3
[68] Buffer Single VoD

III-A
7 3 7 3 3 7
[8], [9] BW, DC 7 3 Multiple(few) 7 VoD/Live 3 7 7
[10] BW, CPU 7 3 Multiple(few) 7 VoD/Live 3 7 7
[81], [102] BW, Buffer 7 3 Single 3 VoD 7 7 7
[32] BW, Buffer, QT 3(QT) 3 Multiple(few) 3 VoD/Live 7 7 3
[82], [84] BW, Buffer 3(BW) 3 Multiple(few) 7 VoD 7 7 3
[86] BW, Buffer 7 7 Single 7 VoD 7 7 3
[87] Buffer, Seg-schedule 3(BW) 3 Multiple(few) 7 VoD 7 7 7
[88] BW, Buffer 3(BW) 3 Multiple(few) 7 VoD 7 7 7
[93] BW, Buffer, Seg-size 7 3 Single 7 VoD 7 7 7
[89] BW, Buffer 7 3 Single 7 VoD 7 7 3
[91] BW, Buffer 7 3 Single 7 VoD/Live 7 7 3
[94] SDT, Buffer 3(BW) 3 Multiple(few) 7 VoD 7 7 3
[99] BW, Buffer 3(QoE) 3 Multiple(few) 3 VoD 7 7 3
[101] BW, Buffer Single VoD
IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL.XX, NO.X, MONTH 201X

7 3 3 7 7 3
[92] BW, Buffer 3(BW) 7 Multiple(few) 7 VoD 3 7 3
[100] BW, Buffer, Seg-quality 7 3 Single 3 VoD/Live 7 7 7
[97] BW, Buffer 7 3 Single 7 VoD 7 3 3
[83] BW, Buffer 3(BW) 3 Multiple(few) 3 VoD 7 7 3
[69] Buffer 3(BW) 3 Multiple(few) 3 VoD 7 7 3
Communications Surveys & Tutorials

[106], [107] BW, Server traffic shaper 3(BW) 7 Multiple(many) 7 VoD/Live 7 7 7


[109] BW, Buffer 3(BW) 7 Multiple(many) 7 Live 7 7 3

III-B
[110] BW, Multiple servers 3(BW) 3 Multiple(many) 7 VoD/Live 3 7 3
[112] BW, Buffer, PA 7 3 Multiple(few) 3 VoD 7 7 3
[113] BW, Buffer, PA 7 3 Multiple(many) 3 VoD 7 7 3
[117] BW, Buffer, PA 3(BW) 3 Multiple(many) 3 VoD 7 7 3
[118] BW, Buffer, PA 3(QoE) 3 Multiple(few) 3 VoD 7 7 7
[116], [121] BW, Buffer, PA 7 3 Multiple(few) 3 VoD/Live 3 7 7
[133] BW, Buffer, SDN Multiple(few) VoD

III-C
3(QoE) 3 7 3 3 7

http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


[123] Buffer 3(QT) 3 Multiple(few) 7 VoD 3 7 3
[124] CA 3(QT) 3 Multiple(many) 7 VoD 7 7 7
[120] BW 3(BW) 3 Multiple(few) 7 VoD 3 7 3
[122] BW, PA 3(BW) 3 Multiple(many) 7 VoD 3 7 7
[132] BW 3(QoE) 3 Multiple(few) 7 VoD 7 7 3
[134] BW, SDN 3(BW) 3 Multiple(many) 3 VoD/Live 3 7 3
[126], [127] BW, CA 3(QoE) 3 Multiple(few) 3 VoD/Live 3 7 3
[125] BW, Seg-size, Seg-schedule, CA 3(BW) 3 Multiple(few) 3 VoD/Live 3 7 3
[145] BW, Buffer, SDN-app 3(QoE) 3 Multiple(few) 3 VoD/Live 3 7 7
[146] BW, Buffer, SDN-app 3(QoE) 3 Multiple(few) 7 VoD/Live 3 7 7
[147] BW, Buffer, SDN-app 7 3 Multiple(many) 3 VoD 7 7 3
[148] BW, SDN 7 7 Multiple(many) 7 VoD/Live 3 7 7

III-D
[151], [152], [154] Hybrid, SDN 3(QoE) 3 Multiple(many) 3 VoD/Live 3 7 3
[150] BW, DC, SDN 3(BW) 3 Multiple(few) 7 VoD 7 7 3
[149] Buffer, QoE, SDN 3(BW) 3 Multiple(few) 7 VoD 3 7 3
[155] BW, Buffer, SDN 7 7 Single 7 Live 7 7 7
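
Since Table II is a feature matrix, one convenient way to work with it is to treat each row as a record and filter on the listed aspects. The following Python sketch is only an illustration: the class name, field names, and the `select` helper are hypothetical, only five rows are transcribed ([51], [55], [32], [97], [110]), and it assumes that the two marks used in the table denote yes and no.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SchemeFeatures:
    """One row of the feature comparison (field names are illustrative)."""
    refs: str                     # citation key(s) as printed in the table
    heuristics: Tuple[str, ...]   # e.g. ("BW", "Buffer")
    fairness: Optional[str]       # None = not supported, else "BW", "QT" or "QoE"
    qoe: bool                     # integrates an objective QoE model
    clients: str                  # "Single", "Multiple(few)" or "Multiple(many)"
    qoe_optimization: bool
    content_type: str             # "VoD", "Live" or "VoD/Live"
    heterogeneity: bool
    svc_support: bool
    bg_traffic: bool


# A small subset of the table, transcribed under the yes/no assumption above.
ROWS: List[SchemeFeatures] = [
    SchemeFeatures("[51]", ("BW",), None, False, "Single", False, "VoD", False, False, False),
    SchemeFeatures("[55]", ("BW",), None, True, "Single", False, "VoD", False, True, False),
    SchemeFeatures("[32]", ("BW", "Buffer", "QT"), "QT", True, "Multiple(few)", True, "VoD/Live", False, False, True),
    SchemeFeatures("[97]", ("BW", "Buffer"), None, True, "Single", False, "VoD", False, True, True),
    SchemeFeatures("[110]", ("BW", "Multiple servers"), "BW", True, "Multiple(many)", False, "VoD/Live", True, False, True),
]


def select(rows: List[SchemeFeatures], **criteria) -> List[SchemeFeatures]:
    """Return the rows whose fields equal every given criterion."""
    return [r for r in rows if all(getattr(r, k) == v for k, v in criteria.items())]


if __name__ == "__main__":
    # Schemes in the subset that support SVC-encoded video.
    print([r.refs for r in select(ROWS, svc_support=True)])
    # Schemes in the subset evaluated with background traffic and a QoE model.
    print([r.refs for r in select(ROWS, bg_traffic=True, qoe=True)])
```

Keeping fairness as an optional string rather than a boolean preserves the distinction the table makes between bandwidth-based, quality-based, and QoE-based sharing.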

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/COMST.2018.2862938, IEEE
Communications Surveys & Tutorials
IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL.XX, NO.X, MONTH 201X 17

(temporal, spatial, SNR). Using this mechanism, a HAS client can incrementally improve the quality of a segment by fetching additional bitstreams or layers depending on the dynamics of the available bandwidth. One key difference when using SVC with DASH is that a client may have to download multiple segments (i.e., base and enhancement layers) for one playback epoch, unlike in the case of non-scalable video. Dayananda et al. [164] investigated the gain of SHVC in HAS, and they found that it could result in bitrate savings but at the price of increased encoding overhead due to scalability. Interestingly, MPEG’s exploration towards a future video coding format with capabilities beyond HEVC initially suggested having scalability as a built-in feature, but that has been withdrawn from the final call for proposals [165], and thus, is not considered in Versatile Video Coding (VVC) [23].

Fig. 13: AVC-based vs. SVC-based HAS. (a) AVC-based HAS (representations at 400, 800, and 1100 Kbps between HAS server and HAS client). (b) SVC-based HAS (base layer and enhancement layers 1 and 2 between HAS server and HAS client).

B. HTTP/2 and QUIC-Based Streaming
Google initially developed SPDY [166], which eventually led to the specification of HTTP/2 [167], and also developed QUIC [168], which, together with HTTP/2, addresses the latency and head-of-line (HOL) blocking issues that were inherent to HTTP/1.1 over TCP. Both HTTP/2 and QUIC may have an impact on the HAS performance [79], [169]–[171].
1) HTTP/2: HTTP/2 is used over a single persistent TCP connection (with pipelining support) between the client and server comprising multiple streams in a full duplex model with advanced features such as frame exchange, request prioritization, header field compression, and server push. A first evaluation has been conducted by Mueller et al. [169], which shows that HTTP/2 can achieve a similar performance compared to HTTP/1.1 (with pipelined persistent connections enabled). Wei et al. [57] used the HTTP/2 server push feature and introduced k-push to reduce both live latency and the number of segment requests. In k-push, the client sends one request to the server every k segments indicating the number of segments (k) to be pushed to the client. The server responds by pushing each segment consecutively as soon as it becomes available, but all at the same requested bitrate level (cf. Fig. 14). Xiao et al. [172] further evaluated the k-push scheme and showed that it deteriorated network adaptability, since its gains diminish as k increases and it led to the “over-push” problem where network resources are wasted due to video abandonment by the viewers. Thus, the authors proposed adaptive-push to overcome k-push’s limitations, which uses the same principle as its predecessor but selects k adaptively. In both k-push and adaptive-push, the client can implement various rate adaptation algorithms. Cherif et al. [173] also used HTTP/2 server push to implement a fast startup where segments were initially pushed to the client upon receiving a request for the MPD. As the client would typically be unaware of the initial bandwidth conditions, the authors suggested using a WebSocket connection over HTTP/2 to exchange various status messages including bandwidth estimation information. Finally, an overview of HTTP/2-based methods to improve the live experience of HAS has been presented in [174], which includes (1) stream termination, (2) request/response multiplexing and stream prioritization, and (3) server push. This overview provides a detailed analysis of HTTP/2-based QoE-improvement methods including a comprehensive evaluation.
2) QUIC: QUIC is a UDP-based secure transport layer protocol that aims to speed up the connections, reduce latencies, enable congestion and flow control, allow multiple (multiplexed/pipelined) data connections (e.g., HTTP request/response) over the same UDP connection without HOL blocking, and support UDP connection migration with Forward Error Correction (FEC). QUIC has been evaluated in the context of HAS by Timmerer et al. [170] and the results show a similar adaptation performance for HTTP/2 over TCP, HTTP/2 over SSL, HTTP/1.1 over QUIC and SPDY over QUIC. The experimental results reported that QUIC introduces around 10% more overhead than TCP at low bitrates. Also, the bandwidth utilization decreases when the round-trip time (RTT) increases, but it remains high and stable around 87%. A similar evaluation was conducted by Bhat et al. [171], which revealed that bitrate adaptation schemes deployed on top of QUIC do not show a performance increase unless the existing schemes are properly adjusted to be used in conjunction with QUIC. Other evaluations of QUIC have focused on generic traffic patterns such as regular Web sites [175], [176] without providing details on HAS.

C. Immersive Media Streaming
Immersive media streaming and specifically virtual reality (VR)/360-degree video streaming is nowadays gaining significant attention from both academia and industry due to


the increasing availability of 360-degree cameras and head-mounted displays (HMD). VR applications range from 3D video gaming to 360-degree video streaming and teleimmersion. In this survey, we highlight the use of HAS in 360-degree video streaming.

Fig. 14: HAS using HTTP/1.1 versus the k-push method using HTTP/2. (a) HAS using HTTP/1.1 (one request and one RTT per segment). (b) The k-push method using HTTP/2 [57] (one request per k segments, i.e., Seg. 1 to k, then Seg. k+1 to 2k).

1) Characteristics of 360-Degree Videos: 360-degree videos are recorded using multiple specialized high-resolution cameras that capture a sphere around the user. The resulting video is typically stitched and mapped onto a 2D plane using various projection formats due to a lack of coding tools for the spherical domain. At the client side, the 2D plane is mapped back on a surface mesh and rendered based on the device capabilities. Characteristically, they allow users to freely navigate within the media presentation but only a fraction of the actual content is presented to the user at any given point in time. This is referred to as the viewport, or field of view. Considering the high-resolution nature of the full spherical content, the amount of data to be streamed may be significantly higher than that for conventional, non-360-degree videos.
2) Adaptive Streaming Challenges: Most adaptive streaming schemes for 360-degree videos merely adopt traditional non-360-degree video delivery schemes. The entire 360-degree scene is adaptively delivered without taking the user’s viewport into account. For example, the content outside the user’s current viewport is delivered at the same quality as the content within the user’s viewport, wasting bandwidth, and thus, network resources. Viewport-adaptive [177] and tile-based adaptive streaming techniques [178], [179] are currently suggested in the literature to overcome this disadvantage. The former provides pre-encoded versions of a given viewport based on the user’s device orientation, which requires additional content versions to be prepared, stored, and distributed within the delivery network. The latter uses the tiling feature available in modern video codecs (e.g., in HEVC, VP9, and AV1) that enables spatial segmentation of videos. Each tile can be projected in different representations to allow for quality adaptation. However, requesting each tile individually may increase the number of requests tremendously, which could be addressed by the server push feature of HTTP/2 as suggested in [180]. A number of open issues are also discussed by Graf et al. [179] ranging from encoding/streaming issues to QoE.
3) Standardization: Several standardization bodies and industry forums have started working towards achieving interoperability between different VR systems. An overview is provided in [181]. MPEG’s efforts to standardize the storage and delivery formats for 360-degree video content are specified in the Omnidirectional Media Format (OMAF) standard [182]. OMAF describes the content processing architecture, projection and packaging formats, streaming approaches and DASH integration of 360-degree videos [183].

D. HAS Datasets
In the past, a great number of DASH datasets have emerged. The first DASH dataset was released by Lederer et al. [19] and comprises various genres (i.e., animation, sport, movie), encoded using up to 20 representations (up to 1080p resolution), and different segment lengths (i.e., 1, 2, 4, 6, 10, and 15 seconds). Additionally, for some representations per-frame PSNR values are provided. Initial evaluations of the dataset provide recommendations for an optimal segment length based on the coding efficiency (i.e., 4 s) and the influence of enabled versus disabled persistent connections.
A distributed DASH dataset has been released by Lederer et al. [184], which distributes the dataset across multiple locations and utilizes multiple BaseURL elements within the media presentation description (MPD). It can be used to simulate different content distribution network (CDN) locations and bitstream switching across multiple CDNs.
Le Feuvre et al. [185] provide an ultra high definition (UHD) HEVC DASH dataset targeting UHD services (i.e., resolutions up to 3840x2160, framerate up to 60 fps, and up to 10 bpp) using HEVC, which is the major difference compared to previously proposed datasets. Kreuzberger et al. [30] provide a DASH dataset focusing on scalable video coding (SVC) and experimenting with in-network adaptation in named data networks and information-centric networking, respectively. Unfortunately, support for SVC in end user devices is still limited. Quinlan et al. [186] propose a dataset comprising AVC and HEVC for the evaluation of DASH systems.
Finally, Zabrovskiy et al. [187] provide a multi-codec DASH dataset comprising multiple state-of-the-art as well as emerging video codecs, i.e., AVC, HEVC, VP9, and AV1 to enable interoperability testing and allow for experimenting with adaptation strategies of DASH clients supporting multiple video codecs. A similar dataset is provided by Quinlan et al. [188] focusing on AVC and HEVC for UHD (4K) resolutions.

VI. CONCLUSIONS
Since the emergence of HTTP adaptive streaming (HAS), many bitrate adaptation schemes have been proposed. Each is trying to address certain HAS-related problems and striving to achieve a set of goals. In fact, most state-of-the-art schemes share a common main objective, which is to improve viewer


QoE. In this survey, we examined a set of well-known schemes and heuristics for their applicability.
Firstly, we classified the bitrate adaptation schemes into four main categories, namely, client-based, server-based, network-assisted and hybrid. In a client-based scheme, the client strives to optimize the viewer QoE individually and considers one of the many heuristics based on the available bandwidth, playback buffer size, segment size, and duration. Server-based schemes, in contrast, do not require any cooperation from the clients, and they use a server traffic shaping mechanism. In network-assisted schemes, the clients use information coming from in-network devices, like proxies, together with their own observations for bitrate adaptation. Finally, the hybrid solutions consist of many entities like clients, central managers, servers, and network devices that are involved in the bitrate decision process.
Secondly, we offered a description of each scheme by presenting the problems they are trying to solve, their goals, findings, main components and critical acclaims. Although the described schemes in each category provide noteworthy benefits and efficiency in some specific network characteristics, many shared challenges exist in every category, especially when multiple clients compete for the shared bandwidth:
• Client-based schemes likely suffer from HAS stability issues and QoE variations due to the ON-OFF pattern of HAS. These issues are aggravated when the number of geographically-distributed clients keeps growing.
• Server-based schemes introduce overhead and complexity, limiting the system scalability with the increasing number of clients.
• Network-assisted and hybrid adaptation schemes use centralized entities to assist the clients in their decisions, improve the viewer QoE, and avoid HAS scalability issues. However, they are difficult to deploy over the fully decentralized nature of real-world network infrastructures and they do not support large-scale deployments where many HAS players are geographically distributed.
Thirdly, we provided a comparison between the surveyed schemes in terms of a set of QoE and networking aspects. Our comparison may help researchers in the area of adaptive streaming, as it offers a general consistent framework that can formally evaluate and compare different bitrate adaptation logic categories, and test the efficiency of their components.
Finally, we concluded the survey with a general discussion on the recent developments in HAS systems, such as the use of HTTP/2 and QUIC as well as HAS of VR content.
In general, certain limitations still exist when conducting a comprehensive survey. The lack of standardized benchmarks and frameworks (i.e., datasets, test conditions and QoE metrics) makes any performance comparison a difficult task. For example, a fair comparison between client-based adaptation schemes in terms of performance (i.e., resource utilization) and QoE (i.e., video stalls, stabilization, quality oscillations) requires that they undergo similar experimentation configuration, including a unified bandwidth trace, certain networking setups and similar device capabilities. The surveyed schemes may perform well under certain conditions, but they all use various heuristics that broadly relate to specific settings, operating regimes (i.e., different network environments, chunk sizes, content types, etc.), and may require parameter tuning. A common set of test conditions might reveal significantly different results than the ones reported in the original papers.
In the broad area of adaptive streaming, there are many open challenges and issues that need more attention:
• Understanding the main factors that degrade the viewer QoE through subjective and objective tests; then, designing a standardized QoE function.
• Designing placement algorithms for CDN, proxies and SDN controllers.
• Understanding the trade-off between content-aware encoding versus content-aware streaming (generating variable bitrate encoded segments is easy, but streaming them is not).
• Designing a robust solution that achieves fair resource sharing among concurrent HAS clients when they compete in a bottleneck network.
• Understanding multi-path benefits and adding multi-path capabilities to HAS delivery systems.
• Studying the interaction between HAS and non-HAS traffic, and its impact on the QoE.
• Mixing client-based and hybrid solutions without introducing extra overhead.
• Providing a solution to deliver 360-degree videos that reduces bandwidth consumption while not hampering the QoE.
• Leveraging machine learning and deep learning techniques to analyze and classify encrypted HAS traffic, which can help monitor and mitigate QoE impairments.

REFERENCES
[1] Cisco Systems, Inc., “Cisco Visual Networking Index: Forecast and Methodology, 2016-2021,” White Paper, 2017.
[2] Adobe, “Real-Time Messaging Protocol (RTMP),” http://www.adobe.com/devnet/rtmp.html, 2014, online; accessed on Nov. 21, 2017.
[3] V. Jacobson, R. Frederick, S. Casner, and H. Schulzrinne, “Real-Time Transport Protocol (RTP),” https://www.ietf.org/rfc/rfc3550.txt, 2014, online; accessed on Nov. 21, 2017.
[4] H. Schulzrinne, “Real Time Streaming Protocol version 2.0,” https://tools.ietf.org/html/rfc7826, 2016, online; accessed on Nov. 21, 2017.
[5] J. Goldberg, M. Westerlund, and T. Zeng, “A Network Address Translator (NAT) Traversal Mechanism for Media Controlled by the Real-Time Streaming Protocol (RTSP),” https://tools.ietf.org/html/rfc7825, 2014, online; accessed on Nov. 21, 2017.
[6] L. Popa, A. Ghodsi, and I. Stoica, “HTTP As the Narrow Waist of the Future Internet,” in Proceedings of the 9th ACM SIGCOMM Workshop on Hot Topics in Networks, ser. Hotnets-IX. New York, NY, USA: ACM, 2010, pp. 6:1–6:6. [Online]. Available: http://doi.acm.org/10.1145/1868447.1868453
[7] X. Liu, F. Dobrian, H. Milner, J. Jiang, V. Sekar, I. Stoica, and H. Zhang, “A Case for a Coordinated Internet Video Control Plane,” in Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, ser. SIGCOMM ’12. New York, NY, USA: ACM, 2012, pp. 359–370. [Online]. Available: http://doi.acm.org/10.1145/2342356.2342431
[8] Microsoft, “Microsoft Smooth Streaming,” http://www.iis.net/downloads/microsoft/smooth-streaming, 2015, online; accessed on Nov. 21, 2017.
[9] Apple, “Apple HTTP Live Streaming,” https://developer.apple.com/streaming/, 2015, online; accessed on Nov. 21, 2017.
[10] Adobe, “Adobe HTTP Dynamic Streaming,” http://www.adobe.com/products/hds-dynamic-streaming.html, 2015, online; accessed on Nov. 21, 2017.


[11] Akamai, “Akamai HD,” https://www.akamai.com/us/en/resources/ over HTTP,” in Proceedings of the 6th ACM Multimedia Systems
live-video-streaming.jsp, 2015, online; accessed on Nov. 21, 2017. Conference, ser. MMSys ’15. New York, NY, USA: ACM, 2015,
[12] Timmerer, Christian, “HTTP Streaming of MPEG Media,” pp. 213–218. [Online]. Available: http://doi.acm.org/10.1145/2713168.
https://multimediacommunication.blogspot.co.at/2010/05/ 2713193
http-streaming-of-mpeg-media.html, 2012, online; accessed on [31] Z. Duanmu, K. Zeng, K. Ma, A. Rehman, and Z. Wang, “A Quality-
Nov. 21, 2017. of-Experience Index for Streaming Video,” IEEE Journal of Selected
[13] Dash Industry Forum, “DASH-264 JavaScript Reference Client,” http: Topics in Signal Processing, vol. 11, no. 1, pp. 154–166, Feb 2017.
//dashif.org/reference/players/javascript/index.html, 2017, online; ac- [32] Z. Li, A. C. Begen, J. Gahm, Y. Shan, B. Osler, and D. Oran,
cessed on Nov. 21, 2017. “Streaming Video over HTTP with Consistent Quality,” in Proceedings
[14] B. Rainer, S. Lederer, C. Müller, and C. Timmerer, “A Seamless Web of the 5th ACM Multimedia Systems Conference, ser. MMSys ’14.
Integration of Adaptive HTTP Streaming,” in 2012 Proceedings of the New York, NY, USA: ACM, 2014, pp. 248–258. [Online]. Available:
20th European Signal Processing Conference (EUSIPCO), Aug 2012, http://doi.acm.org/10.1145/2557642.2557658
pp. 1519–1523. [33] L. Yu, T. Tillo, and J. Xiao, “QoE-driven Dynamic Adaptive Video
[15] J. Kua, G. Armitage, and P. Branch, “A Survey of Rate Adapta- Streaming Strategy with Future Information,” IEEE Transactions on
tion Techniques for Dynamic Adaptive Streaming over HTTP,” IEEE Broadcasting, vol. 63, no. 3, pp. 523–534, Sept 2017.
Communications Surveys Tutorials, vol. 19, no. 3, pp. 1842–1866, [34] T. Hoßfeld, S. Egger, R. Schatz, M. Fiedler, K. Masuch, and
thirdquarter 2017. C. Lorentzen, “Initial Delay vs. Interruptions: Between the Devil and
[16] E. Thomas, M. van Deventer, T. Stockhammer, A. C. Begen, and the Deep Blue Sea,” in 2012 Fourth International Workshop on Quality
J. Famaey, “Enhancing MPEG DASH Performance via Server and of Multimedia Experience, July 2012, pp. 1–6.
Network Assistance,” SMPTE Motion Imaging Journal, vol. 126, no. 1, [35] T. De Pessemier, K. De Moor, W. Joseph, L. De Marez, and L. Martens,
pp. 22–27, 2017. “Quantifying the Influence of Rebuffering Interruptions on the User’s
[17] X. Wang, “Network-Assistance and Server Management in Adaptive Quality of Experience During Mobile Video Watching,” IEEE Trans-
Streaming on the Internet,” in Proceedings of W3C Web and TV actions on Broadcasting, vol. 59, no. 1, pp. 47–61, March 2013.
Workshop, Munich, Germany, 2014. [36] Y. Qi and M. Dai, “The Effect of Frame Freezing and Frame Skipping
[18] T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, “Overview on Video Quality,” in 2006 International Conference on Intelligent
of the H.264/AVC Video Coding Standard,” IEEE Transactions on Information Hiding and Multimedia, Dec 2006, pp. 423–426.
Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560–576, [37] D. C. Robinson, Y. Jutras, and V. Craciun, “Subjective Video Quality
July 2003. Assessment of HTTP Adaptive Streaming Technologies,” Bell Lab.
[19] S. Lederer, C. Müller, and C. Timmerer, “Dynamic Adaptive Streaming Tech. J., vol. 16, no. 4, pp. 5–23, Mar. 2012. [Online]. Available:
over HTTP Dataset,” in Proceedings of the 3rd Multimedia Systems http://dx.doi.org/10.1002/bltj.20531
Conference, ser. MMSys ’12. New York, NY, USA: ACM, 2012,
[38] R. Hamberg and H. de Ridder, “Time-varying Image Quality: Modeling
pp. 89–94. [Online]. Available: http://doi.acm.org/10.1145/2155555.
the Relation between Instantaneous and Overall Quality,” SMPTE
2155570
Journal, vol. 108, no. 11, pp. 802–811, Nov 1999.
[20] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the Scalable
[39] N. Cranley, P. Perry, and L. Murphy, “User Perception of Adapting
Video Coding Extension of the H.264/AVC Standard,” IEEE Transac-
Video Quality,” International Journal of Human-Computer Studies,
tions on Circuits and Systems for Video Technology, vol. 17, no. 9, pp.
vol. 64, no. 8, pp. 637 – 647, 2006. [Online]. Available:
1103–1120, Sept 2007.
http://www.sciencedirect.com/science/article/pii/S1071581905002028
[21] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, “Overview of the
High Efficiency Video Coding (HEVC) Standard,” IEEE Transactions [40] A. Hore and D. Ziou, “Image Quality Metrics: PSNR vs. SSIM,” in
2010 20th International Conference on Pattern Recognition, Aug 2010,
on Circuits and Systems for Video Technology, vol. 22, no. 12, pp.
1649–1668, Dec 2012. pp. 2366–2369.
[22] J. M. Boyce, Y. Ye, J. Chen, and A. K. Ramasubramonian, “Overview [41] Q. Huynh-Thu and M. Ghanbari, “Scope of Validity of PSNR in
of SHVC: Scalable Extensions of the High Efficiency Video Coding Image/Video Quality Assessment,” Electronics Letters, vol. 44, no. 13,
Standard,” IEEE Transactions on Circuits and Systems for Video pp. 800–801, June 2008.
Technology, vol. 26, no. 1, pp. 20–34, 2016. [42] A. Rehman, K. Zeng, and Z. Wang, “Display Device-Adapted Video
[23] C. Timmerer, “MPEG column: 122nd MPEG meeting in San Diego, Quality-of-Experience Assessment,” in Proceedings SPIE 9394, Human
CA, USA,” SIGMultimedia Records, vol. 10, no. 2, pp. 6:6–6:6, Jun. Vision and Electronic Imaging XX. Int. Society for Optics and
2018. Photonics, 2015, pp. 939 406–939 406.
[24] J. De Cock, A. Mavlankar, A. Moorthy, and A. Aaron, “A Large-Scale [43] O. Issa, F. Speranza, T. H. Falk et al., “Quality-of-Experience Per-
Video Codec Comparison of x264, x265 and libvpx for Practical VoD ception for Video Streaming Services: Preliminary Subjective and
Applications,” in SPIE Proceedings Vol. 9971: Applications of Digital Objective Results,” in Proceedings of The 2012 Asia Pacific Signal and
Image Processing XXXIX. San Diego, California, USA: Int. Society Information Processing Association Annual Summit and Conference,
for Optics and Photonics, 2016. Dec 2012, pp. 1–9.
[25] M. Seufert, S. Egger, M. Slanina, T. Zinner, T. Hoßfeld, and P. Tran- [44] B. Rainer, S. Petscharnig, C. Timmerer, and H. Hellwagner, “Sta-
Gia, “A Survey on Quality of Experience of HTTP Adaptive Stream- tistically Indifferent Quality Variation: An Approach for Reducing
ing,” IEEE Communications Surveys Tutorials, vol. 17, no. 1, pp. 469– Multimedia Distribution Cost for Adaptive Video Streaming Services,”
492, Firstquarter 2015. IEEE Transactions on Multimedia, vol. 19, no. 4, pp. 849–860, April
[26] S. Akhshabi, S. Narayanaswamy, A. C. Begen, and C. Dovrolis, “An 2017.
Experimental Evaluation of Rate-Adaptive Video Players over HTTP,” [45] A. Balachandran, V. Sekar, A. Akella, S. Seshan, I. Stoica,
Signal Processing: Image Communication, vol. 27, no. 4, pp. 271 – and H. Zhang, “Developing a Predictive Model of Quality of
287, 2012. [Online]. Available: http://www.sciencedirect.com/science/ Experience for Internet Video,” vol. 43, no. 4. New York,
article/pii/S0923596511001159 NY, USA: ACM, Aug. 2013, pp. 339–350. [Online]. Available:
[27] S. Akhshabi, L. Anantakrishnan, A. C. Begen, and C. Dovrolis, http://doi.acm.org/10.1145/2534169.2486025
“What Happens when HTTP Adaptive Streaming Players Compete for [46] M. Montagud, F. Boronat, H. Stokking, and R. van Brandenburg,
Bandwidth?” in Proceedings of the 22nd International Workshop on “Inter-Destination Multimedia Synchronization: Schemes, Use Cases
Network and Operating System Support for Digital Audio and Video, and Standardization,” Multimedia systems, vol. 18, no. 6, pp. 459–482,
ser. NOSSDAV ’12. New York, NY, USA: ACM, 2012, pp. 9–14. 2012.
[Online]. Available: http://doi.acm.org/10.1145/2229087.2229092 [47] B. Rainer and C. Timmerer, “Self-Organized Inter-Destination
[28] S. Bae, D. Jang, and K. Park, “Why Is HTTP Adaptive Streaming So Multimedia Synchronization For Adaptive Media Streaming,” in
Hard?” in Proceedings of the 6th Asia-Pacific Workshop on Systems, Proceedings of the 22nd ACM International Conference on
ser. APSys ’15. New York, NY, USA: ACM, 2015, pp. 12:1–12:8. Multimedia, ser. MM ’14. New York, NY, USA: ACM, 2014, pp. 327–
[Online]. Available: http://doi.acm.org/10.1145/2797022.2797031 336. [Online]. Available: http://doi.acm.org/10.1145/2647868.2654938
[29] G. Cermak, M. Pinson, and S. Wolf, “The Relationship Among Video [48] B. Rainer, S. Petscharnig, and C. Timmerer, “Merge and Forward:
Quality, Screen Resolution, and Bit Rate,” IEEE Transactions on Self-organized Inter-destination Multimedia Synchronization,” in
Broadcasting, vol. 57, no. 2, pp. 258–262, June 2011. Proceedings of the 6th ACM Multimedia Systems Conference, ser.
[30] C. Kreuzberger, D. Posch, and H. Hellwagner, “A Scalable Video MMSys ’15. New York, NY, USA: ACM, 2015, pp. 77–80. [Online].
Coding Dataset and Toolchain for Dynamic Adaptive Streaming Available: http://doi.acm.org/10.1145/2713168.2713185

[49] B. Rainer, S. Petscharnig, C. Timmerer, and H. Hellwagner, “Is One Conference, ser. MM ’16. New York, NY, USA: ACM, 2016, pp. 888–
Second Enough? Evaluating QoE for Inter-Destination Multimedia 897. [Online]. Available: http://doi.acm.org/10.1145/2964284.2964333
Synchronization using Human Computation and Crowdsourcing,” in [65] C. Mueller, S. Lederer, R. Grandl, and C. Timmerer, “Oscillation
2015 Seventh International Workshop on Quality of Multimedia Expe- Compensating Dynamic Adaptive Streaming over HTTP,” in 2015
rience (QoMEX), 2015, pp. 1–6. IEEE International Conference on Multimedia and Expo (ICME), June
[50] T. Stockhammer, “Dynamic Adaptive Streaming over HTTP –: 2015, pp. 1–6.
Standards and Design Principles,” in Proceedings of the Second [66] T.-Y. Huang, R. Johari, N. McKeown, M. Trunnell, and M. Watson,
Annual ACM Conference on Multimedia Systems, ser. MMSys ’11. “A Buffer-based Approach to Rate Adaptation: Evidence from a
New York, NY, USA: ACM, 2011, pp. 133–144. [Online]. Available: Large Video Streaming Service,” SIGCOMM Comput. Commun.
http://doi.acm.org/10.1145/1943552.1943572 Rev., vol. 44, no. 4, pp. 187–198, Aug. 2014. [Online]. Available:
[51] C. Liu, I. Bouazizi, and M. Gabbouj, “Rate Adaptation for http://doi.acm.org/10.1145/2740070.2626296
Adaptive HTTP Streaming,” in Proceedings of the Second Annual [67] K. Spiteri, R. Urgaonkar, and R. K. Sitaraman, “BOLA: Near-optimal
ACM Conference on Multimedia Systems, ser. MMSys ’11. New bitrate adaptation for online videos,” in IEEE INFOCOM 2016 – The
York, NY, USA: ACM, 2011, pp. 169–174. [Online]. Available: 35th Annual IEEE International Conference on Computer Communi-
http://doi.acm.org/10.1145/1943552.1943575 cations, April 2016, pp. 1–9.
[52] C. Liu, I. Bouazizi, M. M. Hannuksela, and M. Gabbouj, “Rate [68] C. Sieber, T. Hoßfeld, T. Zinner, P. Tran-Gia, and C. Timmerer, “Im-
Adaptation for Dynamic Adaptive Streaming over HTTP in Content plementation and user-centric comparison of a novel adaptation logic
Distribution Network,” Signal Processing: Image Communication, for DASH with SVC,” in 2013 IFIP/IEEE International Symposium
vol. 27, no. 4, pp. 288 – 311, 2012, modern Media Transport – on Integrated Network Management (IM 2013), May 2013, pp. 1318–
Dynamic Adaptive Streaming over HTTP (DASH). [Online]. Available: 1323.
http://www.sciencedirect.com/science/article/pii/S0923596511001135 [69] P. K. Yadav, A. Shafiei, and W. T. Ooi, “QUETRA: A Queuing
[53] Z. Li, X. Zhu, J. Gahm, R. Pan, H. Hu, A. C. Begen, and D. Oran, Theory Approach to DASH Rate Adaptation,” in Proceedings of
“Probe and Adapt: Rate Adaptation for HTTP Video Streaming At the 2017 ACM on Multimedia Conference, ser. MM ’17. New
Scale,” IEEE Journal on Selected Areas in Communications, vol. 32, York, NY, USA: ACM, 2017, pp. 1130–1138. [Online]. Available:
no. 4, pp. 719–733, April 2014. http://doi.acm.org/10.1145/3123266.3123390
[54] X. Xie, X. Zhang, S. Kumar, and L. E. Li, “piStream: Physical Layer [70] R. Huysegems, B. De Vleeschauwer, T. Wu, and W. Van Leekwijck,
Informed Adaptive Video Streaming over LTE,” in Proceedings of “SVC-based HTTP Adaptive Streaming,” Bell Labs Technical Journal,
the 21st Annual International Conference on Mobile Computing and vol. 16, no. 4, pp. 25–41, 2012.
Networking, ser. MobiCom ’15. New York, NY, USA: ACM, 2015, [71] C. Mueller, “Microsoft Smooth Streaming,” https://bitmovin.com/
pp. 413–425. [Online]. Available: http://doi.acm.org/10.1145/2789168. microsoft-smooth-streaming-mss/, 2015, online; accessed on Nov. 21,
2790118 2017.
[55] T. Andelin, V. Chetty, D. Harbaugh, S. Warnick, and D. Zappala, [72] R. Pantos and W. May, “HTTP Live Streaming,” https://www.ietf.org/
“Quality Selection for Dynamic Adaptive Streaming over HTTP rfc/rfc8216.txt, 2017, online; accessed on Dec. 20, 2017.
with Scalable Video Coding,” in Proceedings of the 3rd Multimedia [73] Apple, “QuickTime,” http://www.apple.com/sg/quicktime/, 2016, online;
Systems Conference, ser. MMSys ’12. New York, NY, USA: ACM, accessed on Nov. 21, 2017.
2012, pp. 149–154. [Online]. Available: http://doi.acm.org/10.1145/
[74] “Simplified Adaptive Video Streaming: Announcing support for HLS
2155555.2155580
and DASH in Windows 10,” https://goo.gl/gZM3mQ, 2015, online;
[56] M. Xiao, V. Swaminathan, S. Wei, and S. Chen, “DASH2M:
accessed on Dec. 20, 2017.
Exploring HTTP/2 for Internet Streaming to Mobile Devices,” in
[75] “Supported Media Formats | Android Developers,” https://developer.
Proceedings of the 2016 ACM on Multimedia Conference, ser. MM
android.com/guide/topics/media/media-formats.html, online; accessed
’16. New York, NY, USA: ACM, 2016, pp. 22–31. [Online].
on Dec. 20, 2017.
Available: http://doi.acm.org/10.1145/2964284.2964313
[57] S. Wei and V. Swaminathan, “Low Latency Live Video Streaming [76] Adobe, “ActionScript,” http://www.adobe.com/devnet/actionscript.
over HTTP 2.0,” in Proceedings of Network and Operating System html, 2016, online; accessed on Nov. 21, 2017.
Support on Digital Audio and Video Workshop. New York, [77] T. Cloonan and J. Allen, “Competitive Analysis of Adaptive Video
NY, USA: ACM, 2014, pp. 37:37–37:42. [Online]. Available: Streaming Implementations,” in SCTE Cable-Tec Expo Technical
http://doi.acm.org/10.1145/2578260.2578277 Wksp., 2011.
[58] K. Miller, A.-K. Al-Tamimi, and A. Wolisz, “QoE-Based Low- [78] D. Wu, Y. T. Hou, W. Zhu, Y.-Q. Zhang, and J. M. Peha, “Streaming
Delay Live Streaming Using Throughput Predictions,” ACM Trans. Video over the Internet: Approaches and Directions,” IEEE Transac-
Multimedia Comput. Commun. Appl., vol. 13, no. 1, pp. 4:1–4:24, tions on Circuits and Systems for Video Technology, vol. 11, no. 3, pp.
Oct. 2016. [Online]. Available: http://doi.acm.org/10.1145/2990505 282–300, 2001.
[59] J. Hao, R. Zimmermann, and H. Ma, “GTube: Geo-predictive Video [79] C. Müller, S. Lederer, and C. Timmerer, “An Evaluation of Dynamic
Streaming over HTTP in Mobile Environments,” in Proceedings of Adaptive Streaming over HTTP in Vehicular Environments,” in
the 5th ACM Multimedia Systems Conference, ser. MMSys ’14. New Proceedings of the 4th Workshop on Mobile Video, ser. MoVid ’12.
York, NY, USA: ACM, 2014, pp. 259–270. [Online]. Available: New York, NY, USA: ACM, 2012, pp. 37–42. [Online]. Available:
http://doi.acm.org/10.1145/2557642.2557647 http://doi.acm.org/10.1145/2151677.2151686
[60] H. Riiser, H. S. Bergsaker, P. Vigmostad, P. Halvorsen, and [80] Bitmovin, “MPEG-DASH vs. Commercial Players,” http://www.goo.
C. Griwodz, “A Comparison of Quality Scheduling in Commercial gl/TmazZ8, 2015, online; accessed on Nov. 21, 2017.
Adaptive HTTP Streaming Solutions on a 3G Network,” in [81] X. Yin, A. Jindal, V. Sekar, and B. Sinopoli, “A Control-Theoretic
Proceedings of the 4th Workshop on Mobile Video, ser. MoVid ’12. Approach for Dynamic Adaptive Video Streaming over HTTP,”
New York, NY, USA: ACM, 2012, pp. 25–30. [Online]. Available: SIGCOMM Comput. Commun. Rev., vol. 45, no. 4, pp. 325–338,
http://doi.acm.org/10.1145/2151677.2151684 Aug. 2015. [Online]. Available: http://doi.acm.org/10.1145/2829988.
[61] J. Yao, S. S. Kanhere, and M. Hassan, “Improving QoS in High- 2787486
Speed Mobility Using Bandwidth Maps,” IEEE Transactions on Mobile [82] C. Zhou, X. Zhang, L. Huo, and Z. Guo, “A Control-Theoretic
Computing, vol. 11, no. 4, pp. 603–617, April 2012. Approach to Rate Adaptation for Dynamic HTTP Streaming,” in 2012
[62] J. Yao, S. S. Kanhere, I. Hossain, and M. Hassan, “Empirical Eval- Visual Communications and Image Processing, Nov 2012, pp. 1–6.
uation of HTTP Adaptive Streaming under Vehicular Mobility,” in [83] A. Sobhani, A. Yassine, and S. Shirmohammadi, “A Video
NETWORKING 2011, J. Domingo-Pascual, P. Manzoni, S. Palazzo, Bitrate Adaptation and Prediction Mechanism for HTTP Adaptive
A. Pont, and C. Scoglio, Eds. Berlin, Heidelberg: Springer Berlin Streaming,” ACM Trans. Multimedia Comput. Commun. Appl.,
Heidelberg, 2011, pp. 92–105. vol. 13, no. 2, pp. 18:1–18:25, Mar. 2017. [Online]. Available:
[63] V. Singh, J. Ott, and I. D. Curcio, “Predictive Buffering for Streaming http://doi.acm.org/10.1145/3052822
Video in 3G Networks,” in 2012 IEEE International Symposium on a [84] L. De Cicco, V. Caldaralo, V. Palmisano, and S. Mascolo, “ELASTIC:
World of Wireless, Mobile and Multimedia Networks (WoWMoM), June A Client-Side Controller for Dynamic Adaptive Streaming over HTTP
2012, pp. 1–10. (DASH),” in 2013 20th International Packet Video Workshop, Dec
[64] B. Taani and R. Zimmermann, “Spatio-Temporal Analysis of 2013, pp. 1–8.
Bandwidth Maps for Geo-Predictive Video Streaming in Mobile [85] J. C. Doyle, B. A. Francis, and A. R. Tannenbaum, Feedback Control
Environments,” in Proceedings of the 2016 ACM on Multimedia Theory. Courier Corporation, 2013.

[86] K. Miller, E. Quacchio, G. Gennari, and A. Wolisz, “Adaptation Algorithm [104] Y. Li, “Deep Reinforcement Learning: An Overview,” CoRR, vol.
for Adaptive Streaming over HTTP,” in 2012 19th International abs/1701.07274, 2017. [Online]. Available: http://arxiv.org/abs/1701.
Packet Video Workshop (PV), May 2012, pp. 173–178. 07274
[87] J. Jiang, V. Sekar, and H. Zhang, “Improving Fairness, Efficiency, and [105] J. K. Rowling, Harry Potter and the Goblet of Fire. Bloomsbury,
Stability in HTTP-based Adaptive Video Streaming with FESTIVE,” 2000.
in Proceedings of the 8th International Conference on Emerging [106] S. Akhshabi, L. Anantakrishnan, C. Dovrolis, and A. C. Begen,
Networking Experiments and Technologies, ser. CoNEXT ’12. New “Server-based Traffic Shaping for Stabilizing Oscillating Adaptive
York, NY, USA: ACM, 2012, pp. 97–108. [Online]. Available: Streaming Players,” in Proceeding of the 23rd ACM Workshop on
http://doi.acm.org/10.1145/2413176.2413189 Network and Operating Systems Support for Digital Audio and Video,
[88] C. Zhou, C. Lin, X. Zhang, and Z. Guo, “TFDASH: A Fairness, ser. NOSSDAV ’13. New York, NY, USA: ACM, 2013, pp. 19–24.
Stability, and Efficiency Aware Rate Control Approach for Multiple [Online]. Available: http://doi.acm.org/10.1145/2460782.2460786
Clients over DASH,” CoRR, vol. abs/1704.08535, 2017. [Online]. [107] R. Houdaille and S. Gouache, “Shaping HTTP Adaptive Streams for a
Available: http://arxiv.org/abs/1704.08535 Better User Experience,” in Proceedings of the 3rd Multimedia Systems
[89] G. Tian and Y. Liu, “Towards Agile and Smooth Video Adaptation in Conference, ser. MMSys ’12. New York, NY, USA: ACM, 2012, pp.
Dynamic HTTP Streaming,” in Proceedings of the 8th International 1–9. [Online]. Available: http://doi.acm.org/10.1145/2155555.2155557
Conference on Emerging Networking Experiments and Technologies, [108] A. Detti, B. Ricci, and N. Blefari-Melazzi, “Tracker-Assisted Rate
ser. CoNEXT ’12. New York, NY, USA: ACM, 2012, pp. 109–120. Adaptation for MPEG DASH Live Streaming,” in IEEE INFOCOM
[Online]. Available: http://doi.acm.org/10.1145/2413176.2413190 2016 - The 35th Annual IEEE International Conference on Computer
Communications, April 2016, pp. 1–9.
[90] A. J. Smola and B. Schölkopf, “A Tutorial on Support Vector
[109] L. De Cicco, S. Mascolo, and V. Palmisano, “Feedback Control
Regression,” Statistics and Computing, vol. 14, no. 3, pp. 199–
for Adaptive Live Video Streaming,” in Proceedings of the Second
222, Aug. 2004. [Online]. Available: https://doi.org/10.1023/B:
Annual ACM Conference on Multimedia Systems, ser. MMSys ’11.
STCO.0000035301.49549.88
New York, NY, USA: ACM, 2011, pp. 145–156. [Online]. Available:
[91] C. Wang, A. Rizk, and M. Zink, “SQUAD: A Spectrum-based http://doi.acm.org/10.1145/1943552.1943573
Quality Adaptation for Dynamic Adaptive Streaming over HTTP,” [110] J. Bruneau-Queyreix, M. Lacaud, D. Negru, J. M. Batalla, and E. Bor-
in Proceedings of the 7th International Conference on Multimedia coci, “MS-Stream: A Multiple-Source Adaptive Streaming Solution
Systems, ser. MMSys ’16. New York, NY, USA: ACM, 2016, pp. 1:1– Enhancing Consumer’s Perceived Quality,” in 2017 14th IEEE Annual
1:12. [Online]. Available: http://doi.acm.org/10.1145/2910017.2910593 Consumer Communications Networking Conference (CCNC), Jan 2017,
[92] D. Havey, R. Chertov, and K. Almeroth, “Receiver Driven Rate pp. 427–434.
Adaptation for Wireless Multimedia Applications,” in Proceedings [111] E. Thomas, M. van Deventer, T. Stockhammer, A. C. Begen, M.-L.
of the 3rd Multimedia Systems Conference, ser. MMSys ’12. New Champel, and O. Oyman, “Applications and Deployments of Server
York, NY, USA: ACM, 2012, pp. 155–166. [Online]. Available: and Network Assisted DASH (SAND),” in Proc. Int. Broadcasting
http://doi.acm.org/10.1145/2155555.2155582 Convention Conf. (IBC), Amsterdam, The Netherlands, 2016.
[93] P. Juluri, V. Tamarapalli, and D. Medhi, “SARA: Segment-Aware Rate [112] R. K. P. Mok, X. Luo, E. W. W. Chan, and R. K. C.
Adaptation Algorithm for Dynamic Adaptive Streaming over HTTP,” Chang, “QDASH: A QoE-aware DASH System,” in Proceedings of
in 2015 IEEE International Conference on Communication Workshop the 3rd Multimedia Systems Conference, ser. MMSys ’12. New
(ICCW), June 2015, pp. 1765–1770. York, NY, USA: ACM, 2012, pp. 11–22. [Online]. Available:
[94] A. Beben, P. Wiśniewski, J. M. Batalla, and P. Krawiec, “ABMA+: http://doi.acm.org/10.1145/2155555.2155558
Lightweight and Efficient Algorithm for HTTP Adaptive Streaming,” [113] N. Bouten, R. de O. Schmidt, J. Famaey, S. Latré, A. Pras, and F. D.
in Proceedings of the 7th International Conference on Multimedia Turck, “QoE-driven In-Network Optimization for Adaptive Video
Systems, ser. MMSys ’16. New York, NY, USA: ACM, 2016, pp. 2:1– Streaming based on Packet Sampling Measurements,” Computer
2:11. [Online]. Available: http://doi.acm.org/10.1145/2910017.2910596 Networks, vol. 81, pp. 96 – 115, 2015. [Online]. Available:
[95] A. Bentaleb, A. C. Begen, R. Zimmermann, and S. Harous, “Want to http://www.sciencedirect.com/science/article/pii/S1389128615000468
Play DASH? A Game Theoretic Approach for Adaptive Streaming over [114] V. Krishnamoorthi, N. Carlsson, E. Halepovic, and E. Petajan,
HTTP,” in ACM MMSys, 2018. “BUFFEST: Predicting Buffer Conditions and Real-time Requirements
[96] R. B. Myerson, Game Theory. Harvard University Press, 2013. of HTTP(S) Adaptive Streaming Clients,” in Proceedings of the 8th
[97] M. Xing, S. Xiang, and L. Cai, “A Real-Time Adaptive Algorithm ACM on Multimedia Systems Conference, ser. MMSys’17. New
for Video Streaming over Multiple Wireless Access Networks,” IEEE York, NY, USA: ACM, 2017, pp. 76–87. [Online]. Available:
Journal on Selected Areas in Communications, vol. 32, no. 4, pp. 795– http://doi.acm.org/10.1145/3083187.3083193
805, April 2014. [115] D. P. Palomar and M. Chiang, “A Tutorial on Decomposition Methods
[98] A. Bokani, M. Hassan, S. Kanhere, and X. Zhu, “Optimizing HTTP- for Network Utility Maximization,” IEEE Journal on Selected Areas
Based Adaptive Streaming in Vehicular Environment Using Markov in Communications, vol. 24, no. 8, pp. 1439–1451, Aug 2006.
Decision Process,” IEEE Transactions on Multimedia, vol. 17, no. 12, [116] S. D’Aronco, L. Toni, and P. Frossard, “Price-Based Controller for
pp. 2297–2309, Dec 2015. Utility-Aware HTTP Adaptive Streaming,” IEEE MultiMedia, vol. 24,
no. 2, pp. 20–29, Apr 2017.
[99] S. Petrangeli, M. Claeys, S. Latré, J. Famaey, and F. De Turck, “A
[117] S. Petrangeli, J. Famaey, M. Claeys, S. Latré, and F. De Turck, “QoE-
Multi-Agent Q-Learning-based Framework for Achieving Fairness in
driven Rate Adaptation Heuristic for Fair Adaptive Video Streaming,”
HTTP Adaptive Streaming,” in 2014 IEEE Network Operations and
ACM Transactions on Multimedia Computing, Communications, and
Management Symposium (NOMS), May 2014, pp. 1–9.
Applications (TOMM), vol. 12, no. 2, p. 28, 2016.
[100] F. Chiariotti, S. D’Aronco, L. Toni, and P. Frossard, “Online Learning [118] V. Joseph and G. de Veciana, “NOVA: QoE-driven Optimization of
Adaptation Strategy for DASH Clients,” in Proceedings of the 7th DASH-Based Video Delivery in Networks,” in IEEE INFOCOM 2014
International Conference on Multimedia Systems, ser. MMSys ’16. - IEEE Conference on Computer Communications, April 2014, pp. 82–
New York, NY, USA: ACM, 2016, pp. 8:1–8:12. [Online]. Available: 90.
http://doi.acm.org/10.1145/2910017.2910603 [119] R. M. Gray and R. Gray, Probability, Random Processes, and Ergodic
[101] C. Zhou, C.-W. Lin, and Z. Guo, “mDASH: A Markov Decision- Properties. Springer, 1988.
Based Rate Adaptation Approach for Dynamic HTTP Streaming,” [120] J. Chen, R. Mahindra, M. A. Khojastepour, S. Rangarajan, and
IEEE Transactions on Multimedia, vol. 18, no. 4, pp. 738–751, April M. Chiang, “A Scheduling Framework for Adaptive Video Delivery
2016. over Cellular Networks,” in Proceedings of the 19th Annual
[102] H. Mao, R. Netravali, and M. Alizadeh, “Neural Adaptive Video International Conference on Mobile Computing & Networking, ser.
Streaming with Pensieve,” in Proceedings of the Conference of the MobiCom ’13. New York, NY, USA: ACM, 2013, pp. 389–400.
ACM Special Interest Group on Data Communication, ser. SIGCOMM [Online]. Available: http://doi.acm.org/10.1145/2500423.2500433
’17. New York, NY, USA: ACM, 2017, pp. 197–210. [Online]. [121] A. El Essaili, D. Schroeder, D. Staehle, M. Shehada, W. Kellerer, and
Available: http://doi.acm.org/10.1145/3098822.3098843 E. Steinbach, “Quality-of-Experience driven Adaptive HTTP Media
[103] M. Gadaleta, F. Chiariotti, M. Rossi, and A. Zanella, “D-DASH: Delivery,” in 2013 IEEE International Conference on Communications
A Deep Q-Learning Framework for DASH Video Streaming,” IEEE (ICC), June 2013, pp. 2480–2485.
Transactions on Cognitive Communications and Networking, vol. 3, [122] J. W. Kleinrouweler, S. Cabrero, R. van der Mei, and P. Cesar,
no. 4, pp. 703–718, Dec 2017. “Modeling Stability and Bitrate of Network-Assisted HTTP Adaptive

Streaming Players,” in Proceedings of the 2015 27th International on Selected Areas in Communications, vol. 34, no. 8, pp. 2130–2140,
Teletraffic Congress, ser. ITC ’15. Washington, DC, USA: IEEE Aug 2016.
Computer Society, 2015, pp. 177–184. [Online]. Available: http: [140] S. Petrangeli, N. Bouten, M. Claeys, and F. D. Turck, “Towards
//dx.doi.org/10.1109/ITC.2015.28 SVC-based Adaptive Streaming in Information Centric Networks,” in
[123] V. Ramamurthi and O. Oyman, “Video-QoE Aware Radio Resource 2015 IEEE International Conference on Multimedia Expo Workshops
Allocation for HTTP Adaptive Streaming,” in 2014 IEEE International (ICMEW), June 2015, pp. 1–6.
Conference on Communications (ICC), June 2014, pp. 1076–1081. [141] C. Xu, W. Quan, A. V. Vasilakos, H. Zhang, and G.-M. Muntean,
[124] V. Ramamurthi, O. Oyman, and J. Foerster, “Video-QoE Aware Re- “Information-Centric Cost-Efficient Optimization for Multimedia
source Management at Network Core,” in 2014 IEEE Global Commu- Content Delivery in Mobile Vehicular Networks,” Computer
nications Conference, Dec 2014, pp. 1418–1423. Communications, vol. 99, pp. 93 – 106, 2017. [Online]. Available:
[125] B. Han, F. Qian, L. Ji, V. Gopalakrishnan, and N. Bedminster, “MP- http://www.sciencedirect.com/science/article/pii/S0140366416302729
DASH: Adaptive Video Streaming Over Preference-Aware Multipath,” [142] A. Detti, B. Ricci, and N. Blefari-Melazzi, “Mobile Peer-to-Peer
in Proceedings of the 12th International on Conference on Emerging Video Streaming over Information-centric Networks,” Comput. Netw.,
Networking EXperiments and Technologies, ser. CoNEXT ’16. New vol. 81, no. C, pp. 272–288, Apr. 2015. [Online]. Available:
York, NY, USA: ACM, 2016, pp. 129–143. [Online]. Available: http://dx.doi.org/10.1016/j.comnet.2015.02.018
http://doi.acm.org/10.1145/2999572.2999606 [143] D. Kreutz, F. M. V. Ramos, P. E. Verissimo, C. E. Rothenberg,
[126] Z. Yan, J. Xue, and C. W. Chen, “Prius: Hybrid Edge Cloud and Client S. Azodolmolky, and S. Uhlig, “Software-Defined Networking: A
Adaptation for HTTP Adaptive Streaming in Cellular Networks,” IEEE Comprehensive Survey,” Proceedings of the IEEE, vol. 103, no. 1, pp.
Transactions on Circuits and Systems for Video Technology, vol. 27, 14–76, Jan 2015.
no. 1, pp. 209–222, Jan 2017. [144] J. Yang, K. Zhu, Y. Ran, W. Cai, and E. Yang, “Joint Admission Control
[127] A. H. Zahran, J. J. Quinlan, K. K. Ramakrishnan, and C. J. and Routing via Approximate Dynamic Programming for Streaming
Sreenan, “SAP: Stall-Aware Pacing for Improved DASH Video Video over Software-Defined Networking,” IEEE Transactions on
Experience in Cellular Networks,” in Proceedings of the 8th Multimedia, vol. 19, no. 3, pp. 619–631, March 2017.
ACM on Multimedia Systems Conference, ser. MMSys’17. New [145] P. Georgopoulos, Y. Elkhatib, M. Broadbent, M. Mu,
York, NY, USA: ACM, 2017, pp. 13–26. [Online]. Available: and N. Race, “Towards Network-wide QoE Fairness Using
http://doi.acm.org/10.1145/3083187.3083199 Openflow-assisted Adaptive Video Streaming,” in Proceedings
[128] T. T. Nguyen and G. Armitage, “A Survey of Techniques for Internet of the 2013 ACM SIGCOMM Workshop on Future Human-
Traffic Classification using Machine Learning,” IEEE Communications centric Multimedia Networking, ser. FhMN ’13. New York,
Surveys Tutorials, vol. 10, no. 4, pp. 56–76, Fourth 2008. NY, USA: ACM, 2013, pp. 15–20. [Online]. Available:
[129] M. D. F. De Grazia, D. Zucchetto, A. Testolin, A. Zanella, M. Zorzi, http://doi.acm.org/10.1145/2491172.2491181
and M. Zorzi, “QoE Multi-Stage Machine Learning for Dynamic Video [146] A. Farshad, P. Georgopoulos, M. Broadbent, M. Mu, and N. Race,
Streaming,” IEEE Transactions on Cognitive Communications and “Leveraging SDN to provide an in-network QoE measurement frame-
Networking, vol. 4, no. 1, pp. 146–161, March 2018. work,” in 2015 IEEE Conference on Computer Communications Work-
shops (INFOCOM WKSHPS), April 2015, pp. 239–244.
[130] G. Urvoy-Keller, “On the Stationarity of TCP Bulk Data Transfers,”
[147] H. Nam, K.-H. Kim, J. Y. Kim, and H. Schulzrinne, “Towards QoE-
in Proceedings of the 6th International Conference on Passive and
aware Video Streaming using SDN,” in 2014 IEEE Global Communi-
Active Network Measurement, ser. PAM’05. Berlin, Heidelberg:
cations Conference, Dec 2014, pp. 1317–1322.
Springer-Verlag, 2005, pp. 27–40. [Online]. Available: http://dx.doi.
[148] Q. Wang, K. Xu, R. Izard, B. Kribbs, J. Porter, K.-C. Wang, A. Prakash,
org/10.1007/978-3-540-31966-5_3
and P. Ramanathan, “GENI Cinema: an SDN-assisted scalable live
[131] Z. Akhtar, Y. S. Nam, R. Govindan, S. Rao, J. Chen, E. Katz-
video streaming service,” in 2014 IEEE 22nd International Conference
Bassett, B. Ribeiro, J. Zhan, and H. Zhang, “Oboe: Auto-
on Network Protocols, Oct 2014, pp. 529–532.
tuning Video ABR Algorithms to Network Conditions,” SIGCOMM
[149] S. Petrangeli, T. Wauters, R. Huysegems, T. Bostoen, and F. De Turck, “Software-
Comput. Commun. Rev., Aug. 2018. [Online]. Available: https:
Defined Network-based Prioritization to avoid Video Freezes in
//engineering.purdue.edu/~isl/papers/sigcomm18-final128.pdf
HTTP Adaptive Streaming,” International Journal of Network
[132] P. Georgopoulos, M. Broadbent, A. Farshad, B. Plattner, and N. Race, Management, vol. 26, no. 4, pp. 248–268. [Online]. Available:
“Using Software Defined Networking to enhance the delivery of https://onlinelibrary.wiley.com/doi/abs/10.1002/nem.1931
Video-on-Demand,” Computer Communications, vol. 69, pp. 79 – [150] J. W. Kleinrouweler, S. Cabrero, and P. Cesar, “Delivering Stable
87, 2015. [Online]. Available: http://www.sciencedirect.com/science/ High-quality Video: An SDN Architecture with DASH Assisting
article/pii/S0140366415002315 Network Elements,” in Proceedings of the 7th International
[133] G. Cofano, L. De Cicco, T. Zinner, A. Nguyen-Ngoc, P. Tran-Gia, and Conference on Multimedia Systems, ser. MMSys ’16. New
S. Mascolo, “Design and Experimental Evaluation of Network-assisted York, NY, USA: ACM, 2016, pp. 4:1–4:10. [Online]. Available:
Strategies for HTTP Adaptive Streaming,” in Proceedings of the 7th http://doi.acm.org/10.1145/2910017.2910599
International Conference on Multimedia Systems, ser. MMSys ’16. [151] A. Bentaleb, A. C. Begen, and R. Zimmermann, “SDNDASH:
New York, NY, USA: ACM, 2016, pp. 3:1–3:12. [Online]. Available: Improving QoE of HTTP Adaptive Streaming Using Software Defined
http://doi.acm.org/10.1145/2910017.2910597 Networking,” in Proceedings of the 2016 ACM on Multimedia
[134] D. Bhat, A. Rizk, M. Zink, and R. Steinmetz, “Network Assisted Conference, ser. MM ’16. New York, NY, USA: ACM, 2016, pp.
Content Distribution for Adaptive Bitrate Video Streaming,” in 1296–1305. [Online]. Available: http://doi.acm.org/10.1145/2964284.
Proceedings of the 8th ACM on Multimedia Systems Conference, ser. 2964332
MMSys’17. New York, NY, USA: ACM, 2017, pp. 62–75. [Online]. [152] A. Bentaleb, A. C. Begen, R. Zimmermann, and S. Harous, “SDNHAS:
Available: http://doi.acm.org/10.1145/3083187.3083196 An SDN-Enabled Architecture to Optimize QoE in HTTP Adaptive
[135] A. Seema, L. Schwoebel, T. Shah, J. Morgan, and M. Reisslein, Streaming,” IEEE Transactions on Multimedia, vol. 19, no. 10, pp.
“WVSNP-DASH: Name-Based Segmented Video Streaming,” IEEE 2136–2151, Oct 2017.
Transactions on Broadcasting, vol. 61, no. 3, pp. 346–355, Sept 2015. [153] A. Bentaleb, A. C. Begen, and R. Zimmermann, “ORL-SDN: Online
[136] D. E. Knuth, “Backus Normal Form vs. Backus Naur Form,” Commun. Reinforcement Learning for SDN-Enabled HTTP Adaptive Streaming,”
ACM, vol. 7, no. 12, pp. 735–736, Dec. 1964. [Online]. Available: ACM Multimedia Computing, Communications, and Applications, to
http://doi.acm.org/10.1145/355588.365140 appear, 2018.
[137] S. Lederer, C. Mueller, C. Timmerer, and H. Hellwagner, “Adaptive [154] A. Bentaleb, A. C. Begen, and R. Zimmermann, “QoE-Aware
Multimedia Streaming in Information-Centric Networks,” IEEE Net- Bandwidth Broker for HTTP Adaptive Streaming Flows in an SDN-
work, vol. 28, no. 6, pp. 91–96, Nov 2014. Enabled HFC Network,” IEEE Transactions on Broadcasting, pp. 1–15,
[138] C. Westphal, S. Lederer, D. Posch, C. Timmerer, A. Azgin, W. S. 2018.
Liu, C. Mueller, A. Detti, D. Corujo, J. Wang, M.-J. Montpetit, [155] C.-F. Lai, R.-H. Hwang, H.-C. Chao, M. M. Hassan, and A. Alamri,
and N. Murray, “Adaptive Video Streaming over Information-Centric “A Buffer-Aware HTTP Live Streaming Approach for SDN-enabled
Networking (ICN) – RFC 7933,” Internet Engineering Task Force, 5G Wireless Networks,” IEEE Network, vol. 29, no. 1, pp. 49–55, Jan
5177 Brandin Court Fremont, California 94538 USA, Tech. Rep., aug 2015.
2016. [Online]. Available: http://www.ietf.org/rfc/rfc7933.txt [156] A. Ganjam, F. Siddiqui, J. Zhan, X. Liu, I. Stoica, J. Jiang, V. Sekar,
[139] B. Rainer, D. Posch, and H. Hellwagner, “Investigating the Performance and H. Zhang, “C3: Internet-Scale Control Plane for Video Quality
of Pull-Based Dynamic Adaptive Streaming in NDN,” IEEE Journal Optimization,” in 12th USENIX Symposium on Networked Systems

1553-877X (c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/COMST.2018.2862938, IEEE
Communications Surveys & Tutorials
IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL.XX, NO.X, MONTH 201X

Design and Implementation (NSDI 15). Oakland, CA: USENIX on Network and Operating Systems Support for Digital Audio and
Association, 2015, pp. 131–144. [Online]. Available: https://www. Video. New York, NY, USA: ACM, 2015, pp. 25–30. [Online].
usenix.org/conference/nsdi15/technical-sessions/presentation/ganjam Available: http://doi.acm.org/10.1145/2736084.2736088
[157] J. Jiang, V. Sekar, H. Milner, D. Shepherd, I. Stoica, and H. Zhang, [174] R. Huysegems, J. van der Hooft, T. Bostoen, P. Rondao Alface,
“CFA: A Practical Prediction System for Video QoE Optimization,” S. Petrangeli, T. Wauters, and F. De Turck, “HTTP/2-Based Methods to
in 13th USENIX Symposium on Networked Systems Design and Improve the Live Experience of Adaptive Streaming,” in Proceedings
Implementation (NSDI 16). Santa Clara, CA: USENIX Association, of the 23rd ACM International Conference on Multimedia. New
2016, pp. 137–150. [Online]. Available: https://www.usenix.org/ York, NY, USA: ACM, 2015, pp. 541–550. [Online]. Available:
conference/nsdi16/technical-sessions/presentation/jiang http://doi.acm.org/10.1145/2733373.2806264
[158] Y. Sun, X. Yin, J. Jiang, V. Sekar, F. Lin, N. Wang, T. Liu, [175] G. Carlucci, L. De Cicco, and S. Mascolo, “HTTP over UDP:
and B. Sinopoli, “CS2P: Improving Video Bitrate Selection and An Experimental Investigation of QUIC,” in Proceedings of the
Adaptation with Data-Driven Throughput Prediction,” in Proceedings 30th Annual ACM Symposium on Applied Computing. New
of the 2016 ACM SIGCOMM Conference, ser. SIGCOMM ’16. New York, NY, USA: ACM, 2015, pp. 609–614. [Online]. Available:
York, NY, USA: ACM, 2016, pp. 272–285. [Online]. Available: http://doi.acm.org/10.1145/2695664.2695706
http://doi.acm.org/10.1145/2934872.2934898 [176] R. Netravali, A. Sivaraman, S. Das, A. Goyal, K. Winstein,
[159] J. Jiang, S. Sun, V. Sekar, and H. Zhang, “Pytheas: Enabling J. Mickens, and H. Balakrishnan, “Mahimahi: Accurate Record-and-
Data-Driven Quality of Experience Optimization Using Group- Replay for HTTP,” in 2015 USENIX Annual Technical Conference
Based Exploration-Exploitation,” in 14th USENIX Symposium (USENIX ATC 15). Santa Clara, CA: USENIX Association, 2015,
on Networked Systems Design and Implementation (NSDI pp. 417–429. [Online]. Available: https://www.usenix.org/conference/
17). Boston, MA: USENIX Association, 2017, pp. 393– atc15/technical-session/presentation/netravali
406. [Online]. Available: https://www.usenix.org/conference/nsdi17/ [177] X. Corbillon, A. Devlic, G. Simon, and J. Chakareski, “Viewport-
technical-sessions/presentation/jiang Adaptive Navigable 360-Degree Video Delivery,” CoRR, vol.
[160] E. Thomas, M. van Deventer, T. Stockhammer, A. C. Begen, and abs/1609.08042, 2016. [Online]. Available: http://arxiv.org/abs/1609.
J. Famaey, “Enhancing MPEG DASH Performance via Server and 08042
Network Assistance,” in Proceedings of the International Broadcasting [178] R. Ghaznavi-Youvalari, A. Zare, H. Fang, A. Aminlou, Q. Xie, M. M.
Convention (IBC) Conference. Amsterdam, The Netherlands: IET, Hannuksela, and M. Gabbouj, “Comparison of HEVC Coding Schemes
2015. for Tile-based Viewport-adaptive Streaming of Omnidirectional Video,”
[161] J. W. Kleinrouweler, B. Meixner, and P. Cesar, “Improving Video in 2017 IEEE 19th International Workshop on Multimedia Signal
Quality in Crowded Networks Using a DANE,” in Proceedings of Processing (MMSP), Oct 2017, pp. 1–6.
the 27th Workshop on Network and Operating Systems Support [179] M. Graf, C. Timmerer, and C. Mueller, “Towards Bandwidth Efficient
for Digital Audio and Video, ser. NOSSDAV’17. New York, Adaptive Streaming of Omnidirectional Video over HTTP: Design,
NY, USA: ACM, 2017, pp. 73–78. [Online]. Available: http: Implementation, and Evaluation,” in Proceedings of the 8th ACM on
//doi.acm.org/10.1145/3083165.3083167 Multimedia Systems Conference. New York, NY, USA: ACM, 2017,
[162] J. Famaey, S. Latré, N. Bouten, W. Van de Meerssche, pp. 261–271. [Online]. Available: http://doi.acm.org/10.1145/3083187.
B. De Vleeschauwer, W. Van Leekwijck, and F. De Turck, “On the 3084016
Merits of SVC-based HTTP Adaptive Streaming,” in 2013 IFIP/IEEE
[180] S. Petrangeli, V. Swaminathan, M. Hosseini, and F. De Turck,
International Symposium on Integrated Network Management (IM
“Improving Virtual Reality Streaming Using HTTP/2,” in Proceedings
2013), May 2013, pp. 419–426.
of the 8th ACM on Multimedia Systems Conference. New
[163] Y. Sanchez, T. Schierl, C. Hellge, T. Wiegand, D. Hong, D. D.
York, NY, USA: ACM, 2017, pp. 225–228. [Online]. Available:
Vleeschauwer, W. V. Leekwijck, and Y. L. Louedec, “Efficient
http://doi.acm.org/10.1145/3083187.3083224
HTTP-based Streaming using Scalable Video Coding,” Signal
[181] C. Timmerer, “Immersive Media Delivery: Overview of Ongoing Stan-
Processing: Image Communication, vol. 27, no. 4, pp. 329–342, 2012.
dardization Activities,” IEEE Communications Standards Magazine,
[Online]. Available: http://www.sciencedirect.com/science/article/pii/
vol. 1, no. 4, pp. 71–74, Dec 2017.
S0923596511001147
[164] U. S. M. Dayananda and V. Swaminathan, “Investigating Scalable [182] B. Choi, Y.-K. Wang, M. M. Hannuksela, Y. Lim, and A. Murtaza,
High Efficiency Video Coding for HTTP streaming,” in 2015 IEEE “Information Technology – Coded Representation of Immersive Media
International Conference on Multimedia Expo Workshops (ICMEW), (MPEG-I) – Part 2: Omnidirectional Media Format,” ISO/IEC 23090-2
June 2015, pp. 1–6. FDIS, Dec. 2017.
[165] C. Timmerer, “MPEG Column: 120th MPEG Meeting in Macau, [183] D. Podborski, E. Thomas, M. Hannuksela, S. Oh, T. Stockhammer,
China,” SIGMultimedia Rec., vol. 9, no. 3, pp. 4:4–4:4, Jan. 2018. and S. Pham, “Virtual Reality and DASH,” in Proceedings of the
[Online]. Available: http://doi.acm.org/10.1145/3178422.3178426 International Broadcasting Convention (IBC) Conference, 2017.
[166] M. Belshe and R. Peon, “SPDY protocol,” https://www.chromium.org/ [184] S. Lederer, C. Mueller, C. Timmerer, C. Concolato, J. Le Feuvre,
spdy/spdy-whitepaper, 2012. and K. Fliegel, “Distributed DASH Dataset,” in Proceedings of the
[167] M. Belshe, M. Thomson, and R. Peon, “Hypertext Transfer Protocol 4th ACM Multimedia Systems Conference, ser. MMSys ’13. New
version 2 (HTTP/2),” https://tools.ietf.org/html/rfc7540, 2015. York, NY, USA: ACM, 2013, pp. 131–135. [Online]. Available:
[168] J. Iyengar and M. Thomson, “QUIC: A UDP-Based http://doi.acm.org/10.1145/2483977.2483994
Multiplexed and Secure Transport,” https://tools.ietf.org/html/ [185] J. Le Feuvre, J.-M. Thiesse, M. Parmentier, M. Raulet, and C. Daguet,
draft-ietf-quic-transport-08, 2017, online; accessed on Dec. 18, 2017. “Ultra High Definition HEVC DASH Data Set,” in Proceedings
[169] C. Mueller, S. Lederer, C. Timmerer, and H. Hellwagner, “Dynamic of the 5th ACM Multimedia Systems Conference, ser. MMSys ’14.
Adaptive Streaming over HTTP/2.0,” in 2013 IEEE International New York, NY, USA: ACM, 2014, pp. 7–12. [Online]. Available:
Conference on Multimedia and Expo (ICME), July 2013, pp. 1–6. http://doi.acm.org/10.1145/2557642.2563672
[170] C. Timmerer and A. Bertoni, “Advanced Transport Options for the [186] J. J. Quinlan, A. H. Zahran, and C. J. Sreenan, “Datasets
Dynamic Adaptive Streaming over HTTP,” CoRR, vol. abs/1606.00264, for AVC (H.264) and HEVC (H.265) Evaluation of Dynamic
2016. [Online]. Available: http://arxiv.org/abs/1606.00264 Adaptive Streaming over HTTP (DASH),” in Proceedings of the 7th
[171] D. Bhat, A. Rizk, and M. Zink, “Not So QUIC: A Performance International Conference on Multimedia Systems, ser. MMSys ’16.
Study of DASH over QUIC,” in Proceedings of the 27th Workshop on New York, NY, USA: ACM, 2016, pp. 51:1–51:6. [Online]. Available:
Network and Operating Systems Support for Digital Audio and Video. http://doi.acm.org/10.1145/2910017.2910625
New York, NY, USA: ACM, 2017, pp. 13–18. [Online]. Available: [187] A. Zabrovskiy, C. Feldmann, and C. Timmerer, “Multi-codec DASH
http://doi.acm.org/10.1145/3083165.3083175 Dataset,” in Proceedings of the 9th ACM Multimedia Systems
[172] M. Xiao, V. Swaminathan, S. Wei, and S. Chen, “Evaluating and Conference, ser. MMSys ’18. New York, NY, USA: ACM, 2018,
Improving Push Based Video Streaming with HTTP/2,” in Proceedings pp. 438–443. [Online]. Available: http://doi.acm.org/10.1145/3204949.
of the 26th International Workshop on Network and Operating Systems 3208140
Support for Digital Audio and Video, ser. NOSSDAV ’16. New [188] J. J. Quinlan and C. J. Sreenan, “Multi-profile Ultra High Definition
York, NY, USA: ACM, 2016, pp. 3:1–3:6. [Online]. Available: (UHD) AVC and HEVC 4K DASH Datasets,” in Proceedings of the
http://doi.acm.org/10.1145/2910642.2910652 9th ACM Multimedia Systems Conference, ser. MMSys ’18. New
[173] W. Cherif, Y. Fablet, E. Nassor, J. Taquet, and Y. Fujimori, “DASH York, NY, USA: ACM, 2018, pp. 375–380. [Online]. Available:
Fast Start Using HTTP/2,” in Proceedings of the 25th ACM Workshop http://doi.acm.org/10.1145/3204949.3208130


ACKNOWLEDGMENTS

The authors would like to thank Prof. Saad Harous for his valuable feedback. This research has been supported in part by the National Natural Science Foundation of China under Grant no. 61472266 and by the National University of Singapore (Suzhou) Research Institute, 377 Lin Quan Street, Suzhou Industrial Park, Jiang Su, People’s Republic of China, 215123. Additionally, this work was supported in part by the Austrian Research Promotion Agency (FFG) under the Next Generation Video Streaming project "PROMETHEUS".

Christian Timmerer (M’08–SM’16) is an Associate Professor with Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria. He is a co-founder of Bitmovin Inc., San Francisco, CA, USA, as well as the CIO and the Head of Research and Standardization. He coauthored seven patents and more than 190 publications in workshops, conferences, journals, and book chapters. He participated in several EC-funded projects, notably DANAE, ENTHRONE, P2P-Next, ALICANTE, SocialSensor, ICoSOLE, and the COST Action IC1003 QUALINET. He also participated in ISO/MPEG work for several years, notably in the areas of MPEG-21, MPEG-M, MPEG-V, and MPEG-DASH. His research interests include immersive multimedia communications, streaming, adaptation, and quality of experience. Prof. Timmerer was the General Chair of WIAMIS 2008, QoMEX 2013, ACM MMSys 2016, and Packet Video 2018. Further information can be found at http://blog.timmerer.com.

Abdelhak Bentaleb received the M.S. degree in computing (network and multimedia) from Mohamed El Bachir El Ibrahimi University, Bordj Bou Arreridj, Algeria in 2011. He is currently working towards his Ph.D. degree in computer science at the School of Computing, the National University of Singapore (NUS), Singapore. His research interests include multimedia systems and communication, video streaming architectures, content delivery, distributed computing, computer networks and protocols, wireless communications, and mobile networks.

Bayan Taani received her B.Sc. degree in Network Engineering and Security from Jordan University of Science and Technology (JUST), Irbid, Jordan in 2014. She is currently a Ph.D. candidate in Computer Science, at the National University of Singapore (NUS), Singapore. Her research interests include multimedia systems and communications, adaptive video streaming, and virtual reality.

Ali C. Begen (S’98-M’07-SM’12) recently joined the computer science department at Ozyegin University, Turkey. Previously, he was a research and development engineer at Cisco, where he designed and developed algorithms, protocols, products, and solutions in the service provider and enterprise video domains. Currently, in addition to teaching and research, he provides consulting services to industrial, legal, and academic institutions through Networked Media, a company he co-founded. Begen has a PhD in electrical and computer engineering from Georgia Tech. He received a number of scholarly and industry awards, and he has editorial positions in prestigious magazines and journals in the field. He is a senior member of both the IEEE and ACM. In January 2016, he was elected distinguished lecturer by the IEEE Communications Society. Further information on his projects, publications, talks, teaching, standards, and professional activities can be found at http://ali.begen.net.

Roger Zimmermann (M’93–SM’07) received his M.S. and Ph.D. degrees from the University of Southern California (USC), Los Angeles, USA, in 1994 and 1998, respectively. He is currently an Associate Professor with the Department of Computer Science at the National University of Singapore (NUS), Singapore. He is also a Deputy Director with the Smart Systems Institute (SSI), and previously co-directed the Centre of Social Media Innovations for Communities at NUS. He has coauthored a book, seven patents, and more than 200 conference publications, journal articles, and book chapters. His research interests include streaming media architectures, distributed systems, mobile and geo-referenced video management, collaborative environments, spatio-temporal information management, and mobile location-based services. He is a distinguished member of the ACM and a senior member of the IEEE. Further information can be found at http://www.comp.nus.edu.sg/~rogerz/.
