Article
Transitioning Broadcast to Cloud
Yuriy Reznik, Jordi Cenzano and Bo Zhang *
Brightcove Inc., 290 Congress Street, Boston, MA 02210, USA; [email protected] (Y.R.);
[email protected] (J.C.)
* Correspondence: [email protected]; Tel.: +1-888-882-1880
Abstract: We analyze the differences between on-premise broadcast and cloud-based online video de-
livery workflows and identify technologies needed for bridging the gaps between them. Specifically,
we note differences in ingest protocols, media formats, signal-processing chains, codec constraints,
metadata, transport formats, delays, and means for implementing operations such as ad-splicing,
redundancy and synchronization. To bridge the gaps, we suggest specific improvements in cloud
ingest, signal processing, and transcoding stacks. Cloud playout is also identified as a critically
needed technology for convergence. Finally, based on all such considerations, we offer sketches of
several possible hybrid architectures, with different degrees of processing offloaded to the cloud,
that are likely to emerge in the future.
Keywords: broadcast; video encoding and streaming; cloud-based services; cloud playout
1. Introduction
Terrestrial broadcast TV was historically the first, and is still a broadly used, technology
for delivering visual information to the masses. Cable and DTH (direct-to-home) satellite
TV technologies came next, as highly successful evolutions and extensions of the broadcast
TV model [1,2].
Yet, broadcast has some limits. For instance, in its most basic form, it only enables
linear delivery. It also provides direct reach to only one category of devices: TV sets. To
reach other devices, such as mobiles, tablets, PCs, game consoles, etc., the most practical
option currently available is to send streams Over the Top (OTT). The OTT model utilizes
IP connections that many such devices already have, and Internet streaming as a delivery
mechanism [3–7]. The use of OTT/streaming also makes it possible to implement
interactive, non-linear, time-shifted TV, or DVR types of services.
Considering all such benefits and conveniences, many companies in the broadcast
ecosystem are now increasingly adding OTT services, complementing their traditional (e.g.,
terrestrial, cable, satellite) services or distribution models [8–11]. At a broader scale, we
must also recognize new standards and industry initiatives such as HbbTV [12], as well as
ATSC 3.0 [13,14], which are further blurring the boundaries between traditional broadcast
and OTT.
In other words, we now live in an era where hybrid broadcast + OTT distribution is
becoming the norm, and this raises the question of how such hybrid systems can be
deployed and operated most efficiently.
At present, the two extreme choices are:
• on-prem: everything, including playout systems, encoders, multiplexers, servers, and
other equipment for both broadcast and OTT distribution, is installed and operated on
premises; and
• cloud-based: almost everything is turned into software-based solutions and operated
using the infrastructure of cloud service providers, such as AWS, GCP, Azure, etc.
The on-prem model is well known: it is how all traditional broadcast systems have
always been built and operated. The cloud-based approach is a more recent development.
systems [24–32]. When streams are sent over IP, real-time UDP (User Datagram Protocol)-
based delivery protocols are normally used. Examples of such protocols include RTP [33],
SMPTE 2022-1 [34], SMPTE 2022-2 [35], Zixi [36], etc.
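As a small illustration of what such protocols build upon, the sketch below parses the fixed RTP header defined in RFC 1889 [33] (unchanged in its successor, RFC 3550); receivers use the sequence number to detect loss and reordering, which FEC schemes such as SMPTE 2022-1 [34] are designed to repair. This is an illustrative fragment only, not part of any production ingest stack.

```python
import struct

def parse_rtp_header(datagram: bytes) -> dict:
    """Parse the 12-byte fixed RTP header (RFC 3550, Section 5.1)."""
    if len(datagram) < 12:
        raise ValueError("datagram too short to be an RTP packet")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", datagram[:12])
    return {
        "version": b0 >> 6,          # always 2 for RTP
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,   # e.g., 33 = MP2T (MPEG-2 TS payload)
        "sequence": seq,             # gaps here indicate packet loss
        "timestamp": ts,
        "ssrc": ssrc,                # identifies the stream source
    }
```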
Pre-recorded content usually comes in the form of files, produced by studio encoders.
Again, only standard TV/broadcast video formats are used, and specific codec- and
container-level restrictions are applied (see e.g., [37]). Moreover, in most cases, the contri-
bution (or so-called “mezzanine”) encodings are done at rates that are considerably higher
than rates used for final distribution. This allows broadcast systems to start with “cleaner”
versions of the content.
In the case of online video platforms (OVPs), input content generally comes from a much
broader and more diverse set of sources, ranging from professional production studios and
broadcast workflows to user-generated content. Consequently, the bitrates, formats, and
quality of such streams can also vary greatly. This forces OVPs to be highly versatile,
robust, and tolerant on the ingest end.
The quality of links used to deliver content to OVPs may also vary greatly, from
dedicated connections to datacenters to the public Internet over local ISPs. UDP may or
may not be available.
In such a context, the live ingest protocol that has become most commonly used, remark-
ably enough, is RTMP (Real-Time Messaging Protocol) [38]. This is an old, Flash-era protocol
with many known limitations, but it works over TCP (Transmission Control Protocol)
and remains a popular choice for live-to-cloud ingest. Other protocols for live ingest include
SRT (Secure, Reliable Transport) [39] and RIST (Reliable Internet Stream Transport) [40].
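As a hedged illustration, the following fragment invokes FFmpeg to push a contribution feed to a cloud ingest point over RTMP or SRT. The endpoint URLs, bitrates, and stream key are hypothetical placeholders; in practice, the exact options follow the target platform's ingest specification.

```python
import subprocess

# Hypothetical ingest endpoints; real URLs are assigned by the platform.
RTMP_URL = "rtmp://ingest.example.com/live/streamKey"
SRT_URL = "srt://ingest.example.com:9000?mode=caller"

def push_rtmp(src: str) -> None:
    """Push a mezzanine file/feed to a cloud ingest point over RTMP (TCP)."""
    subprocess.run([
        "ffmpeg", "-re", "-i", src,       # -re: read input at its native rate
        "-c:v", "libx264", "-b:v", "6M",  # contribution-grade H.264
        "-c:a", "aac", "-b:a", "192k",
        "-f", "flv", RTMP_URL,            # RTMP requires the FLV muxer
    ], check=True)

def push_srt(src: str) -> None:
    """Push the same feed over SRT (UDP-based, with retransmissions)."""
    subprocess.run([
        "ffmpeg", "-re", "-i", src,
        "-c:v", "libx264", "-b:v", "6M",
        "-c:a", "aac", "-b:a", "192k",
        "-f", "mpegts", SRT_URL,          # SRT carries an MPEG-TS payload
    ], check=True)
```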
First, we notice that the number of encoded streams is different. In broadcast, each
channel is encoded as a single stream. In streaming, each input is encoded into several
output streams with different resolutions and bitrates. This is needed to accommodate
adaptive bitrate (ABR) delivery.
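To make this fan-out concrete, a hypothetical ABR ladder and the per-channel encoder jobs it implies could be expressed as follows; the resolutions and bitrates are illustrative only, as real ladders are tuned to the content and the target set of devices [42,43].

```python
# A hypothetical ABR ladder; actual ladders vary per service and title.
ABR_LADDER = [
    # (width, height, video_bitrate_kbps)
    (1920, 1080, 6000),
    (1280,  720, 3500),
    ( 960,  540, 2000),
    ( 640,  360, 1000),
    ( 416,  234,  400),
]

def encoder_jobs(channel: str) -> list[dict]:
    """One broadcast channel fans out into one encode per ladder rung."""
    return [
        {"input": channel, "width": w, "height": h, "bitrate_kbps": kbps}
        for (w, h, kbps) in ABR_LADDER
    ]
```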
There are also differences in codecs, encoding modes, and codec constraints. For exam-
ple, in broadcast, the use of constant bitrate (CBR) encoding is most common [24]. It forces
the codec to operate at a certain target bitrate, matching the bandwidth allocated to a
particular channel. The use of variable bitrate (VBR) encoding in broadcast is rare and only
allowed in the so-called statistical multiplexing (or statmux) regime [59,60], where the
multiplexer effectively drives dynamic bandwidth allocation across all channels such that
the total sum of their bitrates remains constant. In streaming, there is no need for CBR or
statmux modes. All streams are typically VBR-encoded, with additional constraints applied
on decoder buffer size and maximum bitrate [44].
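The contrast between the two rate-control regimes can be sketched in terms of FFmpeg/x264-style options, as below; the cap factor range comes from Table 2, while sizing the HRD buffer at roughly two seconds of the capped rate is an assumption for illustration, not a normative value.

```python
def capped_vbr_settings(target_kbps: int, cap_factor: float = 1.5) -> list[str]:
    """Streaming-style capped VBR: peak bitrate capped at 1.1x-1.5x of the
    target, HRD/VBV buffer bounded (here, ~2 s at the capped rate)."""
    maxrate = int(target_kbps * cap_factor)
    bufsize = 2 * maxrate
    # These map directly onto FFmpeg/x264 rate-control options:
    return ["-b:v", f"{target_kbps}k",
            "-maxrate", f"{maxrate}k",
            "-bufsize", f"{bufsize}k"]

def cbr_settings(channel_kbps: int) -> list[str]:
    """Broadcast-style CBR: floor, target, and ceiling all pinned to the
    bandwidth allocated to the channel."""
    return ["-b:v", f"{channel_kbps}k",
            "-minrate", f"{channel_kbps}k",
            "-maxrate", f"{channel_kbps}k",
            "-bufsize", f"{channel_kbps}k"]
```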
Table 2. Comparison of encoded video streams used in broadcast and Internet streaming.

Stream Characteristic | Broadcast Systems | Online Platforms/Web Streaming
Number of outputs | single | multiple (usually 3–10), as needed to support ABR delivery [4,42,43]
Preprocessing | denoising, MCTF-type filters [24,53] | rarely used
Video codecs | MPEG-2 [54], MPEG-4/AVC [55] | MPEG-4/AVC (most deployments); HEVC [56], AV1 [57] (special cases)
Codec profiles, levels | fixed for each format; applicable standards: ATSC A/53 P4 [26], ATSC A/72 P1 [29], ETSI TS 101 154 [30], SCTE 128 [32] | based on target set of devices [42]; guidelines: Apple HLS [44], ETSI TS 103 285 [46], CTA 5001 [58]
GOP length | 0.5 s | 2–10 s
GOP type | open, closed | closed
Error resiliency features | mandatory slicing of I/IDR pictures | N/A
Encoding modes | CBR, VBR (with statmux); many additional constraints apply, see ATSC A/53 P4 [26], ATSC A/54 A [27], ATSC A/72 P1 [29], ETSI TS 101 154 [30], SCTE 43 [31], SCTE 128 [32] | capped VBR; max. bitrate is typically capped to 1.1x–1.5x target bitrate; HRD buffer size is typically limited by codec profile + level constraints
VUI/HRD parameters | required | optional, usually omitted
VUI/colorimetry data | required | optional, usually included
VUI/aspect ratio | required | optional, usually included
Picture timing SEI | required in some cases (e.g., in film mode) | optional, usually omitted
Buffering period SEI | optional | optional, usually omitted
AFD/bar data/T.35 | required and carried in video ES | not used
Closed captions | required and carried in video ES | optional, may be carried out-of-band
GOP (Group of Pictures) lengths also differ significantly. In broadcast, GOPs
are typically 0.5 s, as required for channel switching. In streaming, GOPs can be 2–10 s
long, typically limited by the lengths of the segments used for delivery.
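For example, streaming encoders typically pin the keyframe interval to the segment duration, so that every segment begins with an IDR frame; a minimal sketch of this calculation (the frame rates and durations are illustrative):

```python
from fractions import Fraction

def keyframe_interval(fps: Fraction, segment_seconds: int) -> int:
    """GOP length (in frames) aligned to the segment duration, so each
    segment starts with an IDR frame (a closed GOP)."""
    frames = fps * segment_seconds
    if frames.denominator != 1:
        # e.g., 29.97 fps (30000/1001) needs a duration that yields an
        # integer frame count; this constraint is part of ladder design.
        raise ValueError("segment duration must be an integer frame count")
    return int(frames)

# 6 s segments at 30 fps -> a 180-frame GOP,
# versus ~15 frames (0.5 s) in a typical broadcast stream.
print(keyframe_interval(Fraction(30), 6))  # 180
```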
Broadcast streams also carry more metadata. They typically include the relevant video
buffering verifier (VBV) [54] or hypothetical reference decoder (HRD) parameters [55], as
well as picture structure-, picture timing-, and colorimetry-related information [55]. They also
carry CEA 608/708 closed captions [61,62] and active format description (AFD)/bar data
information [63–65]. In streaming, most of this metadata may be omitted.
Finally, there are also important differences in pre-processing. Broadcast encoders are
famous for the use of denoisers, MCTF-filters, and other pre-processing techniques applied
to make compression more efficient [24,53]. In streaming, the use of such techniques is only
beginning to emerge.
Table 3. Combination of streaming formats and DRMs that can be used to reach different devices. Orange tick marks
indicate possible, but less commonly used choices.
Descriptor) files) are provided, describing locations and properties of all such segments.
Such manifests are used by players (or streaming clients) to retrieve and play the content.
The carriage of metadata in streaming systems is also more diverse. Some metadata
can be embedded in media segments, while other metadata may be embedded in manifests,
carried as additional “sidecar” tracks of segment files, or delivered as “event” messages [6]
or ID3 tags [71].
For example, in addition to “broadcast-style” carriage of CEA 608/708 [61,62] closed
captions in video elementary streams, it is also possible to carry captions as separate tracks
of WebVTT [72] or TTML [73] segments, or as IMSC1 timed text data [74] encapsulated
in XML or ISOBMFF formats [45]. The preferred way of carriage depends on player
capabilities and may vary across platforms.
SCTE-35 information can be carried only at the manifest level in HLS, by either
manifest or in-band events in MPEG-DASH, and only in-band in MSS [7,45,75].
To manage such broad diversity of formats, DRMs, and metadata representations,
online video platforms commonly deploy so-called dynamic or just-in-time (JIT)
packaging mechanisms [23]. This is illustrated by the architecture shown in Figure 2. Instead
of proactively generating and storing all possible permutations of packaged streams on the
origin server, such a system stores all VOD content in a single intermediate representation
that allows fast transmuxing to all desired formats. The origin server works as a cache/proxy,
invoking JIT transmuxers to produce each version of the content only when a client device
requests it. Such logic is commonly accompanied by dynamic manifest generation,
matching the choices of formats, DRMs, and metadata representations to the capabilities of
the devices requesting them. This reduces the amount of cloud storage needed and also
increases the efficiency of CDN use when handling multiple content representations [23].
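The core of this JIT logic can be sketched as follows; the transmuxer table, cache, and storage interface here are hypothetical stand-ins for the dedicated packagers and object stores a real platform would use.

```python
from typing import Callable

# Hypothetical transmuxers from a single mezzanine representation to each
# delivery format; real platforms use dedicated packagers for this step.
TRANSMUXERS: dict[str, Callable[[bytes], bytes]] = {
    "hls-ts":   lambda mezz: mezz,  # placeholders for real transmux logic
    "hls-cmaf": lambda mezz: mezz,
    "dash":     lambda mezz: mezz,
}

CACHE: dict[tuple[str, str], bytes] = {}

def serve_segment(content_id: str, fmt: str, storage) -> bytes:
    """JIT origin logic: store one intermediate copy, package per format
    only when a client actually requests that format, then cache it."""
    key = (content_id, fmt)
    if key not in CACHE:
        mezzanine = storage.get(content_id)       # single stored copy
        CACHE[key] = TRANSMUXERS[fmt](mezzanine)  # transmux on first request
    return CACHE[key]
```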
As can be seen, the delivery formats and their supporting systems in OTT/streaming
are completely different from those used in broadcast.
2.5. Ad Processing
In broadcast systems, there are several types of ad slots: some are local and
expected to be filled by local stations, while others are regional or global and are filled
earlier in the delivery chain.
In all cases, insertions are done by splicing ads into the distribution TS streams, aided by
SCTE-35 [19] ad markers. Such markers (or cue tones) are inserted earlier, at the playout or
even production stages [20]. Ad splicers subsequently look for SCTE-35 markers embedded
in the TS, and then communicate with ad servers (normally over SCTE 30 [76]) to request
and receive the ad content that needs to be inserted. They then update the TS streams to
insert segments of ad content. Such a TS update is a fairly tedious process, involving
re-muxing, regeneration of timestamps, etc. It also requires both the main content and the ads
to be consistently encoded: they must have exactly the same codec parameters, HRD model, etc.
(see e.g., [77]).
In the online/streaming world, ad-related processing is quite different. Ads
are usually inserted/personalized on a per-stream/per-client basis, and the results of
viewers watching the ads (so-called ad impressions) are also registered, collected, and
subsequently used for monetization. This is all fully automated and has to work in real time
and at mass scale.
There are currently two models for ad insertion used in streaming: server-side
ad insertion (SSAI) and client-side ad insertion (CSAI) [45]. In the case of CSAI, most ad-
related processing resides in the client. The cloud only needs to deliver the content and SCTE-35
cue tones to the client. This scales well regardless of how cue tones are delivered; both
in-band and in-manifest carriage methods are adequate.
In the case of SSAI, most ad-related processing resides in the cloud. To operate it at high
scale and reasonable cost, such processing has to be extremely simple. In this context,
in-manifest carriage of SCTE-35 cue tones is strongly preferred, as it allows ad insertions
to be done by manipulating manifests.
For example, in the case of HLS, SCTE-35 markers in HLS playlists are substituted
with sections containing URLs to ad-content segments, with extra EXT-X-DISCONTINUITY
markers added at the beginning and end of such sections [75]. In the case of MPEG-DASH,
essentially the same functionality is achieved by using multiple periods [45]. The discontinuity
markers or period changes effectively force clients to reset their decoders when switching
between program and ad content. This prevents possible HRD buffer overflows and
other decodability issues during playback.
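A toy version of such a manifest rewrite is sketched below. It assumes SCTE-35 cues surface in the media playlist as EXT-X-CUE-OUT/EXT-X-CUE-IN tags, which is a common but non-normative convention; production SSAI systems also handle segment alignment, timing, and impression tracking.

```python
def insert_ads(playlist_lines: list[str],
               ad_segments: list[tuple[float, str]]) -> list[str]:
    """Replace the cue-out/cue-in span in an HLS media playlist with ad
    segments, bracketed by EXT-X-DISCONTINUITY tags."""
    out, in_break = [], False
    for line in playlist_lines:
        if line.startswith("#EXT-X-CUE-OUT"):
            in_break = True
            out.append("#EXT-X-DISCONTINUITY")  # reset decoder for the ad
            for duration, url in ad_segments:
                out.append(f"#EXTINF:{duration:.3f},")
                out.append(url)
        elif line.startswith("#EXT-X-CUE-IN"):
            in_break = False
            out.append("#EXT-X-DISCONTINUITY")  # reset back to program
        elif not in_break:
            out.append(line)  # program segments inside the break are dropped
    return out
```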
The observed differences in delays, random-access granularity, and possible
discontinuities in signals coming from cloud-based workflows are among the most critical
factors that must be considered when planning the migration of signal-processing
functionality to the cloud.
detection of segment cuts and identifies the types of temporal sampling patterns and artifacts
in each segment. Such information, along with bitstream metadata, is then passed to a chain
of filters, including artifact removal, temporal sampling conversion, color space conversion,
and scaling filters.
Figure 4. Decoding and format conversion chain needed to operate with cable/broadcast content.
Artifact-removal filters, such as deblocking and denoising operations, are among the
most basic techniques needed to work with broadcast signals. Deblocking filters are needed,
e.g., when working with MPEG-2 encoded content, as the MPEG-2 codec [54] does not have
in-loop filters and passes all such artifacts to the output. In Figure 5, we show how such
artifacts look, along with the cleaned output produced by our deblocking filter. Denoising is
also needed, especially when working with older (analog-converted) SD signals. Removal
of low-magnitude noise not only makes the signal cleaner, but also makes the job of the
subsequent encoder easier, enabling it to achieve better quality or a lower rate. We illustrate
this effect in Figure 6.
The temporal sampling conversion filter in Figure 4 performs conversions between pro-
gressive, telecine, and interlaced formats, as well as temporal interpolation and resampling
operations. As discussed earlier, this filter is driven by information from the content analy-
sis module. This way, e.g., a telecine segment can be properly converted back to progressive,
an interlaced segment properly deinterlaced, etc.
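A minimal approximation of this part of the Figure 4 chain can be assembled from stock FFmpeg filters, as sketched below; the filter choices and output resolution are illustrative assumptions, and a production stack would use more advanced conversions, such as the optical-flow-based algorithm referenced in [84].

```python
import subprocess

def convert_broadcast_to_ott(src: str, dst: str, deinterlace: bool) -> None:
    """A rough sketch of the conversion chain using stock FFmpeg filters."""
    filters = ["hqdn3d"]                         # light denoising
    if deinterlace:
        filters.append("yadif=mode=send_frame")  # basic deinterlacer [83]
    else:
        # Inverse telecine: field matching followed by frame decimation.
        filters += ["fieldmatch", "decimate"]
    filters.append("scale=1280:720")             # rescale to the output format
    subprocess.run(
        ["ffmpeg", "-i", src, "-vf", ",".join(filters),
         "-c:v", "libx264", "-c:a", "copy", dst],
        check=True)
```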
Figure 5. Example of MPEG-2 blocking artifacts (a) and their removal by the deblocking filter (b).
Figure 6. Example of using a denoising filter to improve the quality of the final transcoded signal.
The quality of temporal sampling conversion operations is critical. For example,
in Figure 7, we show the outputs of a basic deinterlacing filter (the FFMPEG “yadif” filter [83])
and a more advanced optical-flow-based algorithm [84]. It can be seen that the basic dein-
terlacer cannot maintain the continuity of field lines under high motion. Effects of this
nature can be very prominent in sports broadcast content.
Figure 7. Comparison of outputs of basic (a) and advanced (b) deinterlacing filters.
The use of the subsequent filters in Figure 4, such as the color space conversion and scaling
filters, is driven by possible differences in color spaces, SARs, and resolutions between the
input and output formats.
All such conversion operations need to be state of the art, or at least comparable in
quality to Teranex [85], Snell & Wilcox/Grass Valley KudosPro [86], and other standards-
converter boxes commonly used in post-production and broadcast.
Figure 9. Hybrid architecture with the Over the Top (OTT)/streaming workflow offloaded to the cloud.
To route streams to the cloud, the broadcast workflow produces contribution streams, one
for each channel, and then sends them over IP (e.g., using RTP+FEC or SRT) to the cloud.
The cloud platform receives such streams, performs necessary conversions, transcodes
them, and distributes them over CDNs to clients. As shown in Figure 9, the cloud platform
may also be used to implement DVR or time-shift TV-type functionality, DRM protection,
SSAI, analytics, etc. All standard techniques for optimizing multi-format/multi-screen
streaming delivery (dynamic packaging, dynamic manifest generation, optimized profiles,
etc. [23]) can also be employed in this case.
To make such a system work well, the main technologies/improvements needed
include:
• reliable real-time ingest, e.g., using RTP+FEC, SRT, RIST, or Zixi-type protocols
and/or a dedicated link, such as AWS Direct Connect [81];
• improvements in the signal-processing stack, achieving artifact-free conversion of broad-
cast formats to those used in OTT/streaming;
• improvements in metadata handling, including full pass-through of SCTE-35 and
compliant implementation of SSAI and CSAI functionality based on it.
Generally, however, hybrid architectures of this kind have already been deployed and
proven effective in practice. Some of the above-mentioned gap-closing technologies
have also been implemented. For instance, cloud ingest using RTP, SMPTE 2022-1, SMPTE
Figure 10. Hybrid architecture with ingest, playout, and OTT/streaming workflows offloaded to the cloud.
In addition to running playout, this system also runs the broadcast transcoders and the
multiplexer in the cloud. The final multiplex TS is then sent back to the on-prem distribution
system, but mostly only to be relayed to modulators and amplifiers or (via IP or ASI) to
next-tier stations or MVPD headends.
To make this system work, in addition to all the improvements mentioned earlier, the
following are further needed:
• broadcast-grade transcoders and a multiplexer natively implemented in the cloud;
• this includes implementation of statmux capability (a toy illustration of the statmux
principle is sketched below), generation and insertion of all related program and system
information, possible addition of datacast service capability, etc.
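The fragment below illustrates only the core statmux idea: splitting a fixed multiplex bitrate across programs in proportion to their instantaneous coding complexity, so that the sum of the allocated bitrates stays constant. The pool size, floor, and complexity values are illustrative assumptions, and real statmux controllers [59,60] operate in a closed loop with the encoders.

```python
def statmux_allocate(complexities: list[float], pool_kbps: int,
                     floor_kbps: int = 1500) -> list[float]:
    """Split a fixed channel pool across programs in proportion to their
    current coding complexity, with a per-program bitrate floor."""
    n = len(complexities)
    spare = pool_kbps - n * floor_kbps
    assert spare >= 0, "pool too small for the per-program floor"
    total = sum(complexities)
    return [floor_kbps + spare * c / total for c in complexities]

# Three programs sharing an ~38.8 Mbps 256-QAM cable multiplex:
print(statmux_allocate([5.0, 2.0, 1.0], pool_kbps=38_800))
# -> [22937.5, 10075.0, 5787.5]; the allocations always sum to the pool.
```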
This architecture is an extreme example, where practically all data- and processing-
intensive operations are migrated to the cloud. It is the most technically challenging to
implement, but also the most promising, as it enables the best utilization of the cloud and
all the benefits it brings.
Figure 12. The numbers of concurrent live encoders run by different users of a cloud-based OVP.
Figure 12 also shows that the number of concurrent live encoders in a cloud-based
system may change over time. There were periods when the system scaled them up, and
there were periods when it scaled them down. Such fluctuations generally relate to
the creation of new or termination of existing live streaming events, the management of
redundant streams, or the management of a pool of “available” encoders instantiated
speculatively so that new streams can be launched and streamed right away, without any
additional delays.
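A toy sizing rule consistent with this behavior might look as follows; the redundancy and warm-pool constants are purely illustrative, not values used by any particular platform.

```python
def encoders_to_run(active_streams: int, redundant_per_stream: int = 1,
                    warm_pool: int = 5) -> int:
    """One encoder per active stream, extra encoders for redundancy, plus
    a small pre-warmed pool so new events start without spin-up delay."""
    return active_streams * (1 + redundant_per_stream) + warm_pool
```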
The dynamics of the creation of new live streams in this system are further illustrated in
Figure 13. Here, we also observed high variation: from a single job to almost 100 jobs could
be created at each instant of time. This indicates that the composition of work processed by
this system includes not only constantly running 24/7 channels, but also a good volume of
special events (e.g., concerts, sports events, etc.), triggering the creation of new live delivery
chains. Cloud-based architectures are highly suitable for handling variable workloads of
this kind.
Figure 13. The numbers of new live streams created at each instant of time.
Next, in Figure 14, we show CDN bandwidth statistics reported for the same set of users
of the cloud-based delivery platform. The graphs show the amounts of data delivered at the
CDN edge, the data pulled from origin servers, and the amounts of CDN midgress traffic.
Here, we also observed significant variations. Around 8:00 to 11:00 on 12/20 (US Eastern
Standard Time), we saw a cascade of two significant spikes in traffic. The volume of data
increased almost 10x during this period. However, we also noticed that the amount of
origin traffic during the same period did not increase much. This illustrates that the CDNs
managed to absorb these spikes in traffic successfully. More generally, however, the amount
of origin traffic may also fluctuate significantly. To support such a variable load, cloud-based
delivery platforms provide means for auto-scaling and balancing the load on origin servers.
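The degree of CDN offload implied by such measurements can be expressed as a simple ratio; the numbers in the example below are illustrative, not values taken from Figure 14.

```python
def cdn_offload(edge_gbps: float, origin_gbps: float) -> float:
    """Fraction of delivered traffic absorbed by CDN caches. A spike where
    edge traffic grows ~10x while origin stays flat implies the offload
    ratio rises during the event."""
    return 1.0 - origin_gbps / edge_gbps

print(f"{cdn_offload(edge_gbps=100.0, origin_gbps=2.0):.1%}")  # 98.0%
```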
Finally, in Figure 15, we plot the numbers of concurrent players pulling live
streaming content. It can be observed that this figure looks somewhat similar to the
CDN-reported statistics. Around 8:00 to 11:00 on 12/20 (US Eastern Standard Time), we also
saw a cascade of two spikes, during which almost 1 M concurrent viewers joined. This explains
the spikes in CDN traffic noted earlier. Such spikes are quite common for popular events, and
the system is designed to handle them efficiently.
Figure 15. The numbers of concurrent streaming players pulling the content.
Naturally, the above statistics capture only a small example set of use cases of today’s
cloud-based OTT delivery platforms. However, they show that such platforms are
operational, mass-deployed, and capable of handling volumes of transcoding and media
delivery streams comparable to those used in broadcast distribution systems.
As also shown, the numbers of concurrent live events/streams, transcoders, trans-
muxers, origins, and other elements of the delivery chain can be highly variable in practice.
Cloud-based deployments are ideally suited for handling such variability in resource re-
quirements in a cost-efficient manner. Transitioning additional components of broadcast
workflows to the cloud will further reduce investments in hardware, will
simplify management and maintenance, and will make the overall design of such systems
much more flexible, extensible, and future-proof.
6. Conclusions
In this paper, we have studied the differences between on-premise broadcast and
cloud-based online video delivery workflows and identified the means needed for bridging
the gaps between them. Such means include improvements in cloud ingest, signal-processing
stacks, and transcoder capabilities, and, most importantly, a broadcast-grade cloud playout
system. To implement a cloud playout system, we have suggested an architecture em-
ploying an intra-only mezzanine format and associated processing blocks that can be easily
replicated and operated in a fault-tolerant fashion. We finally considered possible evolu-
tions of broadcast and cloud-based video systems and suggested several possible hybrid
architectures, with different degrees of processing offloaded to the cloud, that are likely to
emerge in the future. Examples of operational statistics observed in today’s mass-deployed
cloud-based media delivery systems were also shown. These statistics confirm that such
systems can indeed handle the load required for transitioning additional elements of
broadcast systems to the cloud.
Author Contributions: Conceptualization, Y.R. and J.C.; methodology, Y.R., J.C. and B.Z.; software,
J.C. and B.Z.; validation, Y.R., J.C. and B.Z.; formal analysis, Y.R.; investigation, Y.R. and J.C.;
resources, Y.R., J.C. and B.Z.; data curation, B.Z. and Y.R.; writing—original draft preparation, Y.R.;
writing—review and editing, J.C. and B.Z.; visualization, Y.R. and B.Z.; supervision, Y.R.; project
administration, Y.R.; funding acquisition, Y.R. All authors have read and agreed to the published
version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data sharing not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Pizzi, S.; Jones, G. A Broadcast Engineering Tutorial for Non-Engineers, 4th ed.; National Association of Broadcasters (NAB): Chicago,
IL, USA, 2014; 354p; ISBN-13: 978-0415733380; ISBN-10: 0415733380.
2. Luther, A.; Inglis, A. Video Engineering, 3rd ed.; McGraw-Hill: New York, NY, USA, 1999; 549p.
3. Wu, D.; Hou, Y.T.; Zhu, W.; Zhang, Y.; Peha, J.M. Streaming video over the internet: Approaches and directions. IEEE Trans.
Circuits Syst. Video Technol. 2001, 11, 282–300.
4. Conklin, G.J.; Greenbaum, G.S.; Lillevold, K.O.; Lippman, A.F.; Reznik, Y.A. Video coding for streaming media delivery on the
internet. IEEE Trans. Circuits Syst. Video Technol. 2001, 11, 269–281. [CrossRef]
5. Pantos, R.; May, W. HTTP Live Streaming, RFC 8216. IETF. Available online: https://tools.ietf.org/html/rfc8216 (accessed on
1 November 2020).
6. ISO/IEC 23009-1:2014. Information Technology—Dynamic Adaptive Streaming Over HTTP (DASH)—Part 1: Media Presen-
tation Description and Segment Formats. ISO/IEC, 2014. Available online: https://www.iso.org/about-us.html (accessed on
1 November 2020).
7. Microsoft Smooth Streaming. Available online: https://www.iis.net/downloads/microsoft/smooth-streaming (accessed on
1 November 2020).
8. Evens, T. Co-opetition of TV broadcasters in online video markets: A winning strategy? Int. J. Digit. Telev. 2014, 5, 61–74.
[CrossRef]
9. Nielsen Holdings Plc, Total Audience Report. 2020. Available online: https://www.nielsen.com/us/en/client-learning/tv/
nielsen-total-audience-report-february-2020/ (accessed on 1 November 2020).
10. Sandvine. The Global Internet Phenomena Report. 2019. Available online: https://www.sandvine.com/hubfs/Sandvine_
Redesign_2019/Downloads/Internet%20Phenomena/Internet%20Phenomena%20Report%20Q32019%2020190910.pdf (accessed
on 1 November 2020).
11. Frost & Sullivan. Analysis of the Global Online Video Platforms Market. Frost & Sullivan. 2014. Available online: https:
//store.frost.com/analysis-of-the-global-online-video-platforms-market.html (accessed on 1 November 2020).
12. ETSI TS 102 796. Hybrid Broadcast Broadband TV. ETSI, 2016. Available online: https://www.etsi.org/deliver/etsi_ts/102700_1
02799/102796/01.04.01_60/ts_102796v010401p.pdf (accessed on 1 November 2020).
13. ATSC A/331:2020. Signaling, Delivery, Synchronization, and Error Protection. ATSC, 2020. Available online: https://www.atsc.
org/atsc-documents/3312017-signaling-delivery-synchronization-error-protection/ (accessed on 1 November 2020).
14. Stockhammer, T.; Sodagar, I.; Zia, W.; Deshpande, S.; Oh, S.; Champel, M. Dash in ATSC 3.0: Bridging the gap between OTT
and broadcast. In Proceedings of the IET Conference Proceedings, IBC 2016 Conference, Amsterdam, The Netherlands, 8–12
September 2016; Volume 1, p. 24.
15. AWS Elemental. Video Processing and Delivery Moves to the Cloud, e-book. 2018. Available online: https://www.elemental.
com/resources/white-papers/e-book-video-processing-delivery-moves-cloud/ (accessed on 1 November 2020).
16. Fautier, T. Cloud Technology Drives Superior Video Encoding; In SMPTE 2019; SMPTE: Los Angeles, CA, USA, 2019; pp. 1–9.
17. ISO/IEC 13818-1:2019. Information Technology—Generic Coding of Moving Pictures and Associated Audio Information:
Systems—Part 1: Systems. ISO/IEC, 2019; Available online: https://www.iso.org/standard/75928.html (accessed on 1 November
2020).
18. ATSC A/65B. Program and System Information Protocol for Terrestrial Broadcast and Cable (PSIP). ATSC, 2013. Available
online: https://www.atsc.org/wp-content/uploads/2015/03/Program-System-Information-Protocol-for-Terrestrial-Broadcast-
and-Cable.pdf (accessed on 1 November 2020).
19. ANSI/SCTE 35 2007. Digital Program Insertion Cueing Message for Cable. SCTE, 2007. Available online: https://webstore.ansi.
org/standards/scte/ansiscte352007 (accessed on 1 November 2020).
20. ANSI/SCTE 67 2010. Recommended Practice for SCTE 35 Digital Program Insertion Cueing Message for Cable. SCTE, 2010.
Available online: https://webstore.ansi.org/standards/scte/ansiscte672010 (accessed on 1 November 2020).
21. Lechner, B.J.; Chernock, R.; Eyer, M.; Goldberg, A.; Goldman, M. The ATSC transport layer, including Program and System
Information (PSIP). Proc. IEEE 2006, 94, 77–101. [CrossRef]
22. Reznik, Y.; Lillevold, K.; Jagannath, A.; Greer, J.; Corley, J. Optimal design of encoding profiles for ABR streaming. In Proceedings
of the Packet Video Workshop, Amsterdam, The Netherlands, 12–15 June 2018. [CrossRef]
23. Reznik, Y.; Li, X.; Lillevold, K.; Peck, R.; Shutt, T.; Marinov, R. Optimizing Mass-Scale Multi-Screen Video Delivery. In Proceedings
of the 2019 NAB Broadcast Engineering and Information Technology Conference, Las Vegas, NV, USA, 6–11 April 2019.
24. Davidson, G.A.; Isnardi, M.A.; Fielder, L.D.; Goldman, M.S.; Todd, C.C. ATSC Video and Audio Coding. Proc. IEEE 2006,
94, 60–76. [CrossRef]
25. ATSC A/53 Part 1: 2013. ATSC Digital Television Standard: Part 1—Digital Television System. ATSC, 2013. Available online:
https://www.atsc.org/wp-content/uploads/2015/03/A53-Part-1-2013-1.pdf (accessed on 1 November 2020).
26. ATSC A/53 Part 4:2009. ATSC Digital Television Standard: Part 4—MPEG-2 Video System Characteristics. ATSC, 2009. Available
online: https://www.atsc.org/wp-content/uploads/2015/03/a_53-Part-4-2009-1.pdf (accessed on 1 November 2020).
27. ATSC A/54A. Recommended Practice: Guide to the Use of the ATSC Digital Television Standard. ATSC, 2003. Available online:
https://www.atsc.org/wp-content/uploads/2015/03/a_54a_with_corr_1.pdf (accessed on 1 November 2020).
28. ATSC A/72 Part 1:2015. Video System Characteristics of AVC in the ATSC Digital Television System. ATSC, 2015. Available
online: https://www.atsc.org/wp-content/uploads/2015/03/A72-Part-1-2015-1.pdf (accessed on 1 November 2020).
29. ATSC A/72 Part 2:2014. AVC Video Transport Subsystem Characteristics. ATSC, 2014. Available online: https://www.atsc.org/
wp-content/uploads/2015/03/A72-Part-2-2014-1.pdf (accessed on 1 November 2020).
30. ETSI TS 101 154. Digital Video Broadcasting (DVB): Implementation Guidelines for the use of MPEG-2 Systems, Video and
Audio in Satellite, Cable and Terrestrial Broadcasting Applications. Doc. ETSI TS 101 154, Annex B, 2019. Available
online: https://standards.iteh.ai/catalog/standards/etsi/af36a167-779e-4239-b5a7-89356c6c2dde/etsi-ts-101-154-v2.6.1-2019-
09 (accessed on 1 November 2020).
31. ANSI/SCTE 43 2015. Digital Video Systems Characteristics Standard for Cable Television. SCTE, 2015. Available online:
https://webstore.ansi.org/standards/scte/ansiscte432015 (accessed on 1 November 2020).
32. ANSI/SCTE 128 2010-a. AVC Video Systems and Transport Constraints for Cable Television. SCTE, 2010. Available online:
https://webstore.ansi.org/standards/scte/ansiscte1282010 (accessed on 1 November 2020).
33. Schulzrinne, H.; Casner, S.; Frederick, R.; Jacobson, V. RTP: A Transport Protocol for Real-Time Applications. RFC 1889. IETF,
1996. Available online: https://tools.ietf.org/html/rfc1889 (accessed on 1 November 2020).
34. SMPTE ST 2022-1:2007. Forward Error Correction for Real-Time Video/Audio Transport over IP Networks. ST 2022-1. SMPTE,
2007. Available online: https://ieeexplore.ieee.org/document/7291470/versions#versions (accessed on 1 November 2020).
35. SMPTE ST 2022-2:2007. Unidirectional Transport of Constant Bit Rate MPEG-2 Transport Streams on IP Networks. ST 2022-2;
SMPTE, 2007. Available online: https://ieeexplore.ieee.org/document/7291740 (accessed on 1 November 2020).
36. Zixi, LLC. Streaming Video over the Internet and Zixi. 2015. Available online: http://www.zixi.com/PDFs/Adaptive-Bit-Rate-
Streaming-and-Final.aspx (accessed on 1 November 2020).
37. OC-SP-MEZZANINE-C01-161026. Mezzanine Encoding Specification. Cable Television Laboratories, Inc., 2016. Available
online: https://community.cablelabs.com/wiki/plugins/servlet/cablelabs/alfresco/download?id=1d76e930-6d98-4de3-89ee-
9d0fb4b5292a (accessed on 1 November 2020).
38. Adobe Systems, Real-Time Messaging Protocol (RTMP) Specification. Version 1.0. 2012. Available online: https://www.adobe.
com/devnet/rtmp.html (accessed on 1 November 2020).
39. Haivision. Secure Reliable Transport (SRT). 2019. Available online: https://github.com/Haivision/srt (accessed on
1 November 2020).
40. VSF TR-06-1. Reliable Internet Stream Transport (RIST) Protocol Specification—Simple Profile. Video Services Forum.
2018. Available online: http://vsf.tv/download/technical_recommendations/VSF_TR-06-1_2018_10_17.pdf (accessed on
1 November 2020).
41. OC-SP-CEP3.0-I05-151104. Content Encoding Profiles 3.0 Specification. Cable Television Laboratories, Inc., 2015. Available
online: https://community.cablelabs.com/wiki/plugins/servlet/cablelabs/alfresco/download?id=c7eb769e-1020-402c-b2f2
-d839ee532945 (accessed on 1 November 2020).
42. Ozer, J. Encoding for Multiple Devices. Streaming Media Magazine. 2013. Available online: http://www.streamingmedia.com/
Articles/ReadArticle.aspx?ArticleID=88179&fb_comment_id=220580544752826_937649 (accessed on 1 November 2020).
43. Ozer, J. Encoding for Multiple-Screen Delivery. Streaming Media East. 2013. Available online: https://www.streamingmediablog.
com/wp-content/uploads/2013/07/2013SMEast-Workshop-Encoding.pdf (accessed on 1 November 2020).
44. Apple. HLS Authoring Specification for Apple Devices. 2019. Available online: https://developer.apple.com/documentation/
http_live_streaming/hls_authoring_specification_for_apple_devices (accessed on 1 November 2020).
45. DASH-IF. DASH-IF Interoperability Points, v4.3. 2018. Available online: https://dashif.org/docs/DASH-IF-IOP-v4.3.pdf
(accessed on 1 November 2020).
46. ETSI TS 103 285 v1.2.1. Digital Video Broadcasting (DVB); MPEG-DASH Profile for Transport of ISO BMFF Based DVB Services
Over IP Based Networks. 2020. Available online: https://www.etsi.org/deliver/etsi_ts/103200_103299/103285/01.03.01_60/ts_
103285v010301p.pdf (accessed on 1 November 2020).
47. SMPTE RP 145. Color Monitor Colorimetry. SMPTE, 1987. Available online: https://standards.globalspec.com/std/1284848/
smpte-rp-145 (accessed on 1 November 2020).
48. SMPTE 170M. Composite Analog Video Signal—NTSC for Studio Applications. SMPTE, 1994. Available online: https://
standards.globalspec.com/std/892300/SMPTE%20ST%20170M (accessed on 1 November 2020).
49. EBU Tech. 3213-E. E.B.U. Standard for Chromaticity Tolerances for Studio Monitors. EBU, 1975. Available online: https:
//tech.ebu.ch/docs/tech/tech3213.pdf (accessed on 1 November 2020).
50. ITU-R Recommendation BT.601. Studio Encoding Parameters of Digital Television For standard 4:3 and Wide Screen 16:9 Aspect
Ratios. ITU-R, 2011. Available online: https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.601-7-201103-I!!PDF-E.pdf
(accessed on 1 November 2020).
51. ITU-R Recommendation BT.709. Parameter Values for the HDTV Standards for Production and International Programme
Exchange. ITU-R. ITU-R, 2015. Available online: https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.709-6-201506-I!
!PDF-E.pdf (accessed on 1 November 2020).
52. IEC 61966-2-1:1999. Multimedia Systems and Equipment—Colour Measurement and Management—Part 2-1: Colour
Management—Default RGB Colour Space—sRGB. IEC, 1999. Available online: https://webstore.iec.ch/publication/6169
(accessed on 1 November 2020).
53. Brydon, N. Saving Bits—The Impact of MCTF Enhanced Noise Reduction. SMPTE J. 2002, 111, 23–28. [CrossRef]
54. ISO/IEC 13818-2:2013. Information Technology—Generic Coding of Moving Pictures and Associated Audio Information—Part 2:
Video. ISO/IEC, 2013. Available online: https://www.iso.org/standard/61152.html (accessed on 1 November 2020).
55. ISO/IEC 14496-10:2003. Information Technology—Coding of Audio-Visual Objects—Part 10: Advanced Video Coding. ISO/IEC,
2003. Available online: https://www.iso.org/standard/37729.html (accessed on 1 November 2020).
56. ISO/IEC 23008-2:2013. Information Technology—High Efficiency Coding and Media Delivery in Heterogeneous Environments—
Part 2: High Efficiency Video Coding. ISO/IEC, 2013. Available online: https://www.iso.org/standard/35424.html (accessed on
1 November 2020).
57. AOM AV1. AV1 Bitstream & Decoding Process Specification, v1.0.0. Alliance for Open Media. 2019. Available online: https:
//aomediacodec.github.io/av1-spec/av1-spec.pdf (accessed on 1 November 2020).
58. CTA 5001. Web Application Video Ecosystem—Content Specification. CTA WAVE, 2018. Available online: https://cdn.cta.tech/
cta/media/media/resources/standards/pdfs/cta-5001-final_v2_pdf.pdf (accessed on 1 November 2020).
59. Perkins, M.; Arnstein, D. Statistical multiplexing of multiple MPEG-2 video programs in a single channel. SMPTE J. 1995,
4, 596–599. [CrossRef]
60. Boroczky, L.; Ngai, A.Y.; Westermann, E.F. Statistical multiplexing using MPEG-2 video encoders. IBM J. Res. Dev. 1999,
43, 511–520. [CrossRef]
61. CEA-608-B. Line 21 data services. Consumer Electronics Association. 2008. Available online: https://webstore.ansi.org/standards/
cea/cea6082008r2014ansi (accessed on 1 November 2020).
62. CEA-708-B. Digital television (DTV) closed captioning. Consumer Electronics Association. 2008. Available online: https://www.
scribd.com/document/70239447/CEA-708-B (accessed on 1 November 2020).
63. CEA CEB16. Active Format Description (AFD) & Bar Data Recommended Practice. CEA, 2006. Available online: https:
//webstore.ansi.org/standards/cea/ceaceb162012 (accessed on 1 November 2020).
64. SMPTE 2016-1. Standard for Television—Format for Active Format Description and Bar Data. SMPTE, 2007. Available online:
https://www.techstreet.com/standards/smpte-2016-1-2009?product_id=1664006 (accessed on 1 November 2020).
65. OC-SP-EP-I01-130118. Encoder Boundary Point Specification. Cable Television Laboratories, Inc., 2013. Available online: https://
community.cablelabs.com/wiki/plugins/servlet/cablelabs/alfresco/download?id=2a4f4cc6-3763-40b9-9ace-7de923559187 (ac-
cessed on 1 November 2020).
66. FairPlay Streaming. Available online: https://developer.apple.com/streaming/fps/ (accessed on 1 November 2020).
67. PlayReady. Available online: https://www.microsoft.com/playready/overview/ (accessed on 1 November 2020).
68. Widevine. Available online: https://www.widevine.com/solutions/widevine-drm (accessed on 1 November 2020).
69. ISO/IEC 14496-12:2015. Information Technology—Coding of Audio-Visual Objects—Part 12: ISO Base Media File Format. 2015.
Available online: https://www.iso.org/standard/68960.html (accessed on 1 November 2020).
70. ISO/IEC 23000-19:2018. Information Technology—Coding of Audio-Visual Objects—Part 19: Common Media Application
Format (CMAF) for Segmented Media. 2018. Available online: https://www.iso.org/standard/71975.html (accessed on
1 November 2020).
71. ID3 Tagging System. Available online: http://www.id3.org/id3v2.3.0 (accessed on 1 November 2020).
72. W3C WebVTT. The Web Video Text Tracks. W3C, 2018. Available online: http://dev.w3.org/html5/webvtt/ (accessed on
1 November 2020).
73. W3C TTML1. Timed Text Markup Language 1. W3C, 2019. Available online: https://www.w3.org/TR/2018/REC-ttml1-20181
108/ (accessed on 1 November 2020).
74. W3C IMSC1. TTML Profiles for Internet Media Subtitles and Captions 1.0. W3C, 2015. Available online: https://dvcs.w3.org/
hg/ttml/raw-file/tip/ttml-ww-pro-33files/ttml-ww-profiles.html (accessed on 1 November 2020).
75. Apple. Incorporating Ads into A Playlist. Available online: https://developer.apple.com/documentation/http_live_streaming/
example_playlists_for_http_live_streaming/incorporating_ads_into_a_playlist (accessed on 1 November 2020).
76. ANSI/SCTE 30 2017. Digital Program Insertion Splicing API. SCTE, 2017. Available online: https://webstore.ansi.org/standards/
scte/ansiscte302017 (accessed on 1 November 2020).
77. ANSI/SCTE 172 2011. Constraints on AVC Video Coding for Digital Program Insertion. SCTE, 2011. Available online: https:
//webstore.ansi.org/preview-pages/SCTE/preview_SCTE+172+2011.pdf (accessed on 1 November 2020).
78. SMPTE 259:2008. SMPTE Standard—For Television—SDTV—Digital Signal/Data—Serial Digital Interface. SMPTE, 2008.
Available online: https://ieeexplore.ieee.org/document/7292109 (accessed on 1 November 2020).
79. SMPTE 292-1:2018. SMPTE Standard—1.5 Gb/s Signal/Data Serial Interface. SMPTE, 2018. Available online: https://ieeexplore.
ieee.org/document/8353270 (accessed on 1 November 2020).
80. SMPTE 2110-20:2017. SMPTE Standard—Professional Media over Managed IP Networks: Uncompressed Active Video. SMPTE,
2017. Available online: https://ieeexplore.ieee.org/document/8167389 (accessed on 1 November 2020).
81. AWS Direct Connect. Available online: https://aws.amazon.com/directconnect/ (accessed on 1 November 2020).
82. Azure ExpressRoute. Available online: https://azure.microsoft.com/en-us/services/expressroute/ (accessed on 1 November 2020).
83. FFMPEG Filter Documentation. Available online: https://ffmpeg.org/ffmpeg-filters.html (accessed on 1 November 2020).
84. Vanam, R.; Reznik, Y. Temporal Sampling Conversion using Bi-directional Optical Flows with Dual Regularization. In Proceedings
of the IEEE International Conference on Image Processing (ICIP), Abu Dhabi, UAE, 25–28 October 2020.