Encoding and Transcoding Guide V3

This document provides guidance on understanding and fine-tuning video compression parameters for Appear TV's new encoding solutions. It discusses tools for video pre-processing including motion compensated temporal filtering (MCTF) to remove noise and restore detail, and a pre-deblocking filter to reduce blocking artifacts from heavily compressed sources. It also covers video encoding, additional encoding tools, entropy coding, statistical multiplexing, and output formats. The goal is to describe key tools and parameters for advanced users to optimize their systems.

Uploaded by

IONUT SIMA


Application Note

Encoding and Transcoding parameters

A guide to help understand and fine tune Appear TV’s new compression solutions

Confidential

Table of Contents
1 INTRODUCTION
2 VIDEO PRE-PROCESSING TOOLS
3 VIDEO ENCODING (CORE)
4 VIDEO ENCODING (ADDITIONAL TOOLS)
5 ENTROPY CODING
6 STATISTICAL MULTIPLEXING
7 OUTGOING VIDEO FORMATS (BROADCAST MODE)

09/02/2015 2

1 Introduction
The new compression product launches occurring during 2014 and 2015 herald a new era of
excellence for Appear TV. Appear TV is already a very strong player in video compression
markets, and these revolutionary new modules build upon the considerable achievements that
have already been made, enabling all of the advantages of a fully modular solution to be
delivered with best in class performance.
And by best in class, we mean any class from any competing provider.

During the past few years, Appear TV has grown to bring innovation into several new markets.
From the introduction of high performance modular encoding and transcoding solutions which
brought a new dimension to the capability of our revolutionary chassis concept in 2010, we have
continued to add:

• Sophisticated statistical multiplexing functions
• New hardware controlled redundancy schemes
• Hardware solutions for linear OTT transcoding
• Many functionality enhancements to develop the solution into a true, 2nd generation
  headend architecture capable of bettering traditional systems in multiple key areas.

During 2014 the addition of new, ultra-high performance compression options will raise the
video performance to ‘best on market’ standards, making the platform equally applicable for the
most demanding tier 1 operators. Based on a new, ‘hybrid’ hardware platform that combines the
best efficiency and VQ properties of a hardware encoder with the flexibility associated with
software encoding, the new UNIVERSAL module series can be tailored to virtually any task.
Providing many times the processing power of current modules, the new UNIVERSAL range
either enables significant density improvements, for turn-around markets such as Cable, or
significant VQ improvements to beat even the best performing competing encoders on the
market today.

The full UNIVERSAL product family will include the following:

• Broadcast transcoder for high density and high VQ applications.
• OTT (multiscreen) transcoder, for best linear transcoding density and performance.
• OTT (multiscreen) encoder, to deliver the benefits of the new platform when content is
  presented as a baseband source.
• Broadcast encoder for high density and high VQ applications.

The superb performance of the new model line-up goes beyond the video compression stage
alone, and for broadcast applications also introduces a new and improved stat mux mechanism.
A new quad-core audio DSP will also be introduced to bring significant capability and density
benefits, such as the native ability to encode multi-channel Dolby audio.

Appear TV has always offered class leading packaging with a ‘do-anything’ capability to extract
maximum advantage from a fully modular solution, and without question, this is revolutionizing
the art of headend design. Today, Appear TV provides powerful systems that are more flexible,
space efficient and easy to use / maintain than ever before. Now, Appear TV can also provide a
standard of compression performance that is second to none.


This document moves through the encoding process to provide a stage by stage account of many
of the important tools that the new platform offers. It describes key basic principles, as well as
many of the available features, including internal parameters that advanced users may at some
stage wish to access to fine tune their systems.

If you are just looking for a summary to best configure the encoder / transcoder and wish to
omit the theory / background, please skip through to the summary table (section 8).

2 Video Pre-processing tools

i. MCTF (Motion Compensated temporal filtering)
Noise in content removes the natural redundancy that exists within it. This is very serious
because it is the redundancy within content that enables MPEG compression to achieve the
dramatic reductions in bitrate, with minimal impact on VQ, that are normally expected. Noise
can occur in several forms and frequency ranges, but in every case the encoder (and stat mux
system, if used) will treat it as content that must be faithfully encoded. Consequently, MPEG
encoding mode decisions will be made to attempt to encode the noise, and where applicable the
stat mux system will allocate more bandwidth than would otherwise be the case to encode what
it considers to be difficult content. This has three primary consequences.

1. The mode decisions made to try and encode the noise result in sub-optimal encoding of
the real picture content.
2. If stat mux is used, the higher than necessary bitrate allocations waste bandwidth, not
only for the noisy service but for all services since the bandwidth is taken from a
common resource pool.
3. The noise will usually not be successfully encoded anyway, and high frequency noise
coefficients especially are likely to still be quantized out. This means the outgoing video
will appear to be noise free, even though the presence of noise on the input will have caused
severe reductions in encoding efficiency. The service will have consumed a greater bitrate
to encode than would otherwise be necessary and will also have lower VQ.
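
The loss of redundancy can be illustrated with a small sketch. It is not the encoder's algorithm: it assumes a static scene, models noise as small random offsets (the `add_noise` helper, the noise amplitude and the seed are illustrative assumptions), and measures the frame to frame residual that a temporal predictor would be left to encode.

```python
import random

def add_noise(frame, amplitude, rng):
    """Return a copy of the frame with uniform random noise added to each pixel."""
    return [[p + rng.randint(-amplitude, amplitude) for p in row] for row in frame]

def residual_energy(prev, curr):
    """Sum of absolute differences between two frames (what prediction cannot remove)."""
    return sum(abs(a - b) for ra, rb in zip(prev, curr) for a, b in zip(ra, rb))

rng = random.Random(1)                     # fixed seed so the run is repeatable
scene = [[100] * 16 for _ in range(16)]    # a static, flat 16x16 picture (assumed test data)

# Two captures of the same static scene, without and with independent noise.
clean_residual = residual_energy(scene, scene)
noisy_residual = residual_energy(add_noise(scene, 4, rng), add_noise(scene, 4, rng))

print("clean residual:", clean_residual)   # static scene: nothing left to encode
print("noisy residual:", noisy_residual)   # the noise alone creates data to encode
```

With a clean source the second frame is fully predicted by the first. With noise, every pixel differs slightly, so bits are spent on data that carries no picture information.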

High frequency noise will only survive the DCT stage if very high video bitrates are used. This
is why a suspected noise problem should only be investigated by looking at the source of an
encoder, and never its output (unless it can be operated at maximum video bitrate as a test).
Since both mode decisions and stat mux allocation occur early on during encoding, noise
problems can only be tackled by pre-filtering specifically designed to remove the noise at source.
The MCTF filter is the tool designed to achieve this in Appear TV encoders, and has been
re-developed for the new Appear TV universal compression range to provide bi-directional noise
reduction that also restores the true detail within content, especially textures and edges, and
especially in low lighting conditions.
MCTF is enabled by default. However, the default setting is very non-aggressive. The
MctfStrength parameter adjusts this from 0 (weakest and default setting) to 7 (strongest). It is
possible to disable MCTF completely using the BYPASS command, where 0 enables the MCTF
engine and 1 disables it. The default is 0 (MCTF enabled).

The following illustrations show source content with noise, and how application of MCTF can
remove this. The first image is with noise, and the second is after processing with MCTF.


MCTF can cause noticeable softening of the image, which is likely to occur for aggressive
values higher than +3. For this reason, it is important to use the tool appropriately, with 0 or 1
covering the majority of typical scenarios.
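
As a hedged illustration of the ranges described in this section, the following sketch validates a pair of MCTF settings. MctfStrength and BYPASS are the parameter names used in this document; the `check_mctf` function, its return format and the warning text are assumptions made for the example and are not part of any Appear TV interface.

```python
def check_mctf(mctf_strength=0, bypass=0):
    """Validate MCTF settings and return (enabled, strength, warnings).

    Hypothetical helper: the ranges come from this document, the API does not.
    """
    if bypass not in (0, 1):
        raise ValueError("BYPASS must be 0 (MCTF enabled) or 1 (MCTF disabled)")
    if not 0 <= mctf_strength <= 7:
        raise ValueError("MctfStrength must be between 0 (weakest) and 7 (strongest)")
    warnings = []
    # Values above 3 are described in the text as likely to soften the image.
    if bypass == 0 and mctf_strength > 3:
        warnings.append("strength above 3 is likely to soften the image noticeably")
    return bypass == 0, mctf_strength, warnings

print(check_mctf())        # defaults: MCTF enabled at the weakest strength
print(check_mctf(5))       # enabled, with a softening warning
```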

ii. Pre-deblocking filter

When performing aggressive compression for DTH transmission, providing a good quality video
source into the encoder is important. However, many customers need to turn around poor sources
that have already been aggressively compressed before and have suffered a considerable VQ
reduction as a result. One of the most noticeable artefacts is blocking, especially on material that
has been encoded with MPEG2. Macroblock edges are sharp features, containing a wide
spectrum of high frequency content. The encoding stage cannot distinguish this from the wanted
image, and the presence of macroblock edges will remove the natural redundancy that exists in
the content by presenting a complex spectrum of wide ranging frequency components which the
encoder will try and preserve. The result creates inefficiencies for very similar reasons to the
noise case described above.
The most complex concatenation scenario occurs when a heavily compressed MPEG2 DTH feed
is used as the source for MPEG4 turnaround. In this scenario, the relative inefficiency of MPEG2
combines with a fixed macroblock structure to make visible blocking artefacts easily induced,
and therefore prevalent, on the MPEG4 encoder’s video source. MPEG4 encoders are usually
operated at very low video bitrates, and exploiting redundancy correctly is central to making
this possible. The presence of macroblock edges seriously impairs this process, causing multiple
sub-optimal mode decisions to be made.
This will also negatively affect stat mux. Fortunately, Appear TV has provided a pre-processing
tool that can smooth MPEG2 block edges prior to re-encoding. This filter is the PRE
DEBLOCKING filter.
Unlike the alpha / beta deblocking filter, which is specific to AVC encoding scenarios, the
pre-deblocking filter can also be used with MPEG2 outputs. It can also be used to provide some
HF softening if aggressively applied.

The following illustrations show a pre-compressed source exhibiting visible MPEG2 macroblock
edges before and after being passed through the de-blocking filter.


As with noise reduction, applying the deblocking filter too aggressively will soften and remove
detail, although with less danger of being extremely aggressive (as can be the case with MCTF).
The tool is adjustable between 0 and 3 and, with MPEG2 sources especially, should be used at
an aggressive setting.

iii. Detelecine
The encoder is provided with a de-telecine (inverse 3-2 pulldown) feature to help optimize
encoding efficiency when video has been through a telecine process. Telecine inserts additional
fields to take the rate up from the native film rate (24p) to NTSC (60i). It is possible to reverse
this process to get the original 24 frame progressive source back. This is often preferable when
a progressive source is required, for the following reasons:

• If the telecine 60i source is converted to 30p progressive then the interlaced fields will be
  ‘merged’ by performing temporal filtering. This will include the additional fields added
  by the telecine process.
• If the telecine 60i source is converted to progressive by removing the additional fields
  (inverse telecine) then the original 24p source will be restored.

Inverse telecine should only be applied to material that has been through the telecine process.
The encoder has two automatic triggers which can be used to activate de-telecine. The first is the
IT flag that signals whether telecine has been applied. If this is set, the encoder will apply
de-telecine automatically. The second is if the encoder detects repeated fields. Both of these
functions are set by enabling de-telecine in the GUI.
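
The 3-2 cadence and its reversal can be sketched as follows. This is a simplified illustration rather than the encoder's detector: film frames are represented by labels instead of pictures, and the function names `telecine` and `inverse_telecine` are assumptions made for the example. A real detector would compare field contents to find the repeats.

```python
def telecine(film_frames):
    """3-2 pulldown: emit 3 fields for one film frame, 2 for the next, and so on."""
    fields = []
    for index, frame in enumerate(film_frames):
        repeats = 3 if index % 2 == 0 else 2
        for _ in range(repeats):
            # Field parity simply alternates along the output field sequence.
            parity = "top" if len(fields) % 2 == 0 else "bottom"
            fields.append((frame, parity))
    return fields

def inverse_telecine(fields):
    """Collapse consecutive fields that came from the same film frame."""
    frames = []
    for frame, _parity in fields:
        if not frames or frames[-1] != frame:
            frames.append(frame)
    return frames

film = ["A", "B", "C", "D"]
fields = telecine(film)
print(len(fields))                # 10 fields, i.e. 5 interlaced frames from 4 film frames
print(inverse_telecine(fields))   # ['A', 'B', 'C', 'D']
```

Every 4 film frames become 10 fields, so 24 film frames become 60 fields, which is the 24p to 60i conversion described above.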

3 Video Encoding (core)

iv. Coding Mode (Field / Frame)

The coding mode defines how interlaced pictures are scanned prior to being encoded, with the
objective of best exploiting the redundancy between fields.

An interlaced frame consists of two fields. The ‘top’ field carries even numbered lines and the
‘bottom’ field carries odd numbered lines. The reason for interlacing is historical, and dates
back to the CRT days when it provided a convenient ‘analogue’ method of video compression by
allowing the effective picture rate to double and therefore reduce flicker.
Currently, interlaced video is a legacy problem for modern progressive displays and finely tuned
compression schemes that want to exploit commonality between frames. In this setting,
interlaced video does not fit since it does the reverse by presenting a frame as two offset fields.
The offset is not just by line, but also by time, making the difference between the two fields
largely dependent upon the degree of motion present.

Two legacy fixed methods exist for scanning interlaced content. The first is to treat each
interlaced field separately, and encode them both. This is FIELD mode, and when selected it will
result in the top field and bottom field of interlaced content being permanently encoded as
separate pictures. The second legacy option is FRAME mode, in which case the two fields will
be merged and encoded as one frame.


Field mode works best if there is motion throughout the entire picture. Frame mode works best if
there is no motion in the picture at all.
Obviously content will vary between these two extremes, and very often there will be motion
only within selective areas of the picture. Two adaptive options have been introduced into the
AVC toolkit for encoding interlaced content more intelligently. The first is Picture Adaptive
Field Frame (PAFF) which assesses the degree of motion and makes a frame by frame decision
on whether to encode in FIELD or FRAME mode.
The second is Macroblock Adaptive Field Frame, or MBAFF. When a frame is encoded, it is
encoded into a matrix of 16x16 macroblocks. MBAFF enables the field/frame encoding decision
to be made at macroblock level, to optimise for any type of content, especially where there are
static areas as well as areas of movement. MBAFF sets either FRAME or FIELD mode for each
pair of vertically adjacent macroblocks, and therefore operates with a granularity of 16x32 pixels.
The picture below illustrates how macroblocks have been encoded in these two different modes,
with FRAME mode (un-highlighted areas) marking stationary areas and FIELD mode
(highlighting the macroblock edges) for areas of movement.
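
One simple way to picture a picture-level field/frame decision is to compare how well neighbouring lines match (lines from opposite fields) against how well lines two apart match (lines from the same field). The sketch below is an illustrative heuristic only; the function names and test pictures are assumptions, and it is not the decision logic used by Appear TV encoders.

```python
def line_difference(picture, step):
    """Sum of absolute differences between each line and the line `step` rows below it."""
    return sum(abs(a - b)
               for r in range(len(picture) - step)
               for a, b in zip(picture[r], picture[r + step]))

def choose_coding_mode(picture):
    """Pick FIELD when same-field lines match better than neighbouring lines."""
    frame_cost = line_difference(picture, 1)   # neighbouring lines (opposite fields)
    field_cost = line_difference(picture, 2)   # lines belonging to the same field
    return "FIELD" if field_cost < frame_cost else "FRAME"

# Static content: a smooth vertical gradient, both fields captured from the same scene.
static = [[row * 10] * 8 for row in range(8)]
# Motion: the two fields differ strongly, so alternate lines disagree.
moving = [[0 if row % 2 == 0 else 200] * 8 for row in range(8)]

print(choose_coding_mode(static))   # FRAME
print(choose_coding_mode(moving))   # FIELD
```

Applying the same comparison to each macroblock pair instead of the whole picture gives the flavour of a macroblock-level decision.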

The Appear TV internal options are:

• Field only
• Frame only
• PAFF
• MBAFF

The default for AVC is PAFF and for MPEG2 is MBAFF.

Some early MPEG 2 decoders do not have sufficient memory to support field based encoding,
and so may fail if PAFF is enabled. Usually, this is seen as motion judder when there is
sufficient movement to trigger the encoder to use FIELD mode. This is why MBAFF is the
preferred option for MPEG2.
Progressive sources should always be encoded using FRAME mode.

The optimum modes are as follows:

• For interlaced content using MPEG4, use PAFF
• For interlaced content using MPEG2, use MBAFF
• For progressive content, use FRAME ONLY

i. Encoding Profile
Profiles exist within complex encoding standards to define a subset of allowed tools which can
be used to set the complexity of the encoded stream and therefore match it to the capability of
the decoder.
The following chart illustrates the profiles within H.264.

                      H264            H264   H264   H264   H264    MPEG2   MPEG2
                      Con. Baseline   Main   High   Hi10   Hi422   Main    High
B Frames              No              Yes    Yes    Yes    Yes     Yes     Yes
CABAC                 No              Yes    Yes    Yes    Yes     No      No
PAFF or MBAFF         No              Yes    Yes    Yes    Yes     Yes     Yes
Reference B Frames    No              Yes    Yes    Yes    Yes     No      No
4:2:2                 No              No     No     No     Yes     No      Yes
10-bit                No              No     No     Yes    Yes     No      No

Within the H.264 standard, three profiles are commonly used for broadcast. These are:

• Constrained Baseline. Often used in OTT applications with older and / or lower priced
  smartphones. To provide an indication, the Apple iPhone 3S was the first Apple device
  capable of supporting main profile decoding. Usually, with the simplest devices, either
  constrained baseline or baseline profile is all that is supported.
• Main Profile. The complexity difference between constrained baseline and main profile
  is very significant, because main profile introduces B frames. The encoding efficiency
  (resulting in improved video quality at a given bitrate) is therefore much improved. Large
  screen broadcast viewing requires this step-up in terms of efficiency, hence for AVC,
  main profile is the most commonly used profile for broadcast.
• High Profile. This introduces further tools to improve the compression gain, at the
  expense of additional complexity. The key tool is the 8x8 transform. Most modern
  decoders support high profile fully, and so its use is becoming more common.

Appear TV can support the H.264 Hi10P and HI422P modes in hardware, but does not support
either in standard broadcast mode. The Hi10P profile introduces 10 bit luminance quantization
and so increases luminance granularity from 256 to 1,024 quantization levels. This benefits both
AVC broadcast and contribution applications by reducing contouring on blanket backgrounds,
which AVC is prone to display because it uses an integer DCT and does not provide the DCT
noise that dithered out these artefacts with MPEG2. Since luminance sampling occurs before the
DCT stage, no increase in video bitrate is necessary as a result of using 10-bit precision: All that
is required is to ensure that the content is not continually compressed to its limits, enabling static
scenes to be moderately quantized to realize the benefits of improved luminance granularity on
reducing contouring. Although highly applicable to DTH broadcast, 10 bit precision is poorly
supported within the consumer decoder domain. HEVC will change this by making native 10 bit
support almost universal. Many 422 AVC encoders / decoders do support 10 bit, hence it has
become very popular for contribution.
The HI422P profile is almost exclusively used in high-end events contribution. 420 profiles
sub-sample chroma both horizontally and vertically, whereas 422 profiles sub-sample it
horizontally only. The benefits of 422 sampling become apparent after multiple decode / encode
generations, since vertical chroma resolution is maintained. By contrast, multiple 420
generations will lose chroma definition and suffer chroma ‘bleed’. The extent of the
problem will largely depend upon how accurately the downscaling and upscaling is done
between generations, which was a particular problem with MPEG2 (because the filter
characteristics between the encoder and decoder were poorly defined) compared to H.264 which
defines the filter characteristics fully.
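
The figures behind these two profiles are simple arithmetic, sketched below. The helper names are assumptions made for the example; the calculation only counts quantization levels and chroma samples and says nothing about how any particular encoder processes them.

```python
def quantization_levels(bit_depth):
    """Number of distinct sample values available at a given bit depth."""
    return 2 ** bit_depth

def chroma_samples(width, height, sampling):
    """Number of chroma samples (Cb plus Cr) carried per picture."""
    # (horizontal divisor, vertical divisor) applied to the luma resolution
    divisors = {"4:2:0": (2, 2), "4:2:2": (2, 1)}
    h_div, v_div = divisors[sampling]
    return 2 * (width // h_div) * (height // v_div)

print(quantization_levels(8), quantization_levels(10))   # 256 1024
print(chroma_samples(1920, 1080, "4:2:0"))                # 1036800
print(chroma_samples(1920, 1080, "4:2:2"))                # 2073600
```

A 4:2:2 picture therefore carries twice the chroma samples of a 4:2:0 picture at the same luma resolution, and 10-bit sampling gives four times as many levels as 8-bit.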

Appear TV broadcast encoders enable easy user selection between constrained baseline, main
and high profiles for AVC, and main profile for MPEG2.

i. Level
Defining a profile is still too wide to ensure compatibility with decoders, and the concept of
‘levels’ within a profile has been adopted to narrow the specification down further.
Levels primarily indicate the range of bitrates, frame rates and resolutions that should be
supported within a profile. The general concept is that the simpler the profile, the lower the
minimum resolution, bitrate and framerate options provided within the levels will be to support
the intended use of that device type (e.g. Baseline profile for low cost simple decoders powering
small, low resolution mobile phone displays).

The following table, courtesy of Wikipedia, lists all of the levels mandated within AVC.

Levels with maximum property values

         Max decoding speed               Max frame size               Max video bit rate for video coding layer (VCL), kbit/s
Level    Luma samples/s   Macroblocks/s   Luma samples   Macroblocks   Baseline, Extended    High Profile   High 10 Profile
                                                                       and Main Profiles
1        380,160          1,485           25,344         99            64                    80             192
1b       380,160          1,485           25,344         99            128                   160            384
1.1      768,000          3,000           101,376        396           192                   240            576
1.2      1,536,000        6,000           101,376        396           384                   480            1,152
1.3      3,041,280        11,880          101,376        396           768                   960            2,304
2        3,041,280        11,880          101,376        396           2,000                 2,500          6,000
2.1      5,068,800        19,800          202,752        792           4,000                 5,000          12,000
2.2      5,184,000        20,250          414,720        1,620         4,000                 5,000          12,000
3        10,368,000       40,500          414,720        1,620         10,000                12,500         30,000
3.1      27,648,000       108,000         921,600        3,600         14,000                17,500         42,000
3.2      55,296,000       216,000         1,310,720      5,120         20,000                25,000         60,000
4        62,914,560       245,760         2,097,152      8,192         20,000                25,000         60,000
4.1      62,914,560       245,760         2,097,152      8,192         50,000                62,500         150,000
4.2      133,693,440      522,240         2,228,224      8,704         50,000                62,500         150,000
5        150,994,944      589,824         5,652,480      22,080        135,000               168,750        405,000
5.1      251,658,240      983,040         9,437,184      36,864        240,000               300,000        720,000
5.2      530,841,600      2,073,600       9,437,184      36,864        240,000               300,000        720,000

The levels that Appear TV broadcast encoders support are as follows;

MPEG2: MPEG2 High. MPEG2 High 1440. MPEG2 Main. MPEG2 Auto mode.

AVC: H.264 1b. H.264 1.0. H.264 1.1. H.264 1.2. H.264 1.3. H.264 2.0. H.264 2.1. H.264 2.2.
H.264 3.0. H.264 3.1. H.264 3.2. H.264 4.0. H.264 4.1. H.264 4.2. H.264 Auto mode

The definition of auto mode may be perplexing, but simply means that the level will be
automatically selected from the parameters (video resolution, bitrate etc.) that have been entered
into the encoder. This is the default mode and means that you can almost forget about levels
when configuring the encoder, and will certainly not be bound by the constraints of any
particular level.
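
An automatic level choice of this kind can be sketched as a lookup against the limits in the levels table. The sketch below is an assumption about how such a selection could work, not a description of the Appear TV implementation. It covers only the levels listed as supported, uses the Baseline, Extended and Main bit rate column, and ignores other constraints such as the number of stored reference frames. The name `pick_level` is hypothetical.

```python
import math

# (level, max macroblocks/s, max frame size in macroblocks,
#  max VCL bit rate in kbit/s for Baseline, Extended and Main profiles)
LEVELS = [
    ("1",   1485,   99,   64),
    ("1b",  1485,   99,   128),
    ("1.1", 3000,   396,  192),
    ("1.2", 6000,   396,  384),
    ("1.3", 11880,  396,  768),
    ("2",   11880,  396,  2000),
    ("2.1", 19800,  792,  4000),
    ("2.2", 20250,  1620, 4000),
    ("3",   40500,  1620, 10000),
    ("3.1", 108000, 3600, 14000),
    ("3.2", 216000, 5120, 20000),
    ("4",   245760, 8192, 20000),
    ("4.1", 245760, 8192, 50000),
    ("4.2", 522240, 8704, 50000),
]

def pick_level(width, height, fps, bitrate_kbps):
    """Return the lowest level whose limits cover the stream, or None if none does."""
    macroblocks = math.ceil(width / 16) * math.ceil(height / 16)
    macroblocks_per_s = macroblocks * fps
    for name, max_mb_per_s, max_frame_mbs, max_kbps in LEVELS:
        if (macroblocks <= max_frame_mbs
                and macroblocks_per_s <= max_mb_per_s
                and bitrate_kbps <= max_kbps):
            return name
    return None

print(pick_level(720, 576, 25, 3000))     # 3
print(pick_level(1280, 720, 50, 6000))    # 3.2
print(pick_level(1920, 1080, 25, 8000))   # 4
```

For example, a 1920x1080 picture is 120 x 68 = 8,160 macroblocks, or 204,000 macroblocks per second at 25 frames per second, which first fits within the limits of level 4.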

v. GOP Structure
The GOP structure determines how many B frames will be used within a GOP. B frames are
extremely efficient because they are predicted in both forward and backward temporal directions
and therefore just convey change between two other references. The two references used to derive
a B frame can be I, P or (in H.264 only) even two other B frames if the ‘hierarchical B frame’
mode is enabled. Maximum use of B frames results in the most efficient coding, as long as the
content is easily predictable (contains limited motion). For high motion sequences, an over
reliance on B frames will introduce errors and will decrease VQ. In this case, a GOP structure
with less reliance on B frames is required. Before it was possible for an encoder to adjust GOP
structure dynamically, this setting was a compromise between maximizing efficiency and VQ for
static content and coping with dynamic content. This is why Appear TV encoders can adjust GOP
structure dynamically, within a maximum and minimum B frame range selected by the following
parameters. Of course, by defining the B frame ratio the GOP structure also defines how
regularly P frames occur.

vi. Maximum B frames

The GOP structure used to be critically important because it was a static setting. It can now be
made dynamic, so that the encoder chooses the optimum GOP structure depending upon the
degree of motion within the content. This makes it possible to select GOP structures with long
maximum B frame lengths quite safely. MPEG2 is a notable exception, since B frames are much
more prone to error and VQ will drop quickly if B frames are over used. For this reason, a
maximum B frame GOP length of IBBBP is supported by the MPEG-2 standard.

For Appear TV encoders, the general recommendation is to use a maximum setting of IBBP for
typical MPEG2 applications (2 B frames).

For AVC, you can start at IBBBP (3 B frames) but this can be increased to up to 7 B frames if
the set top boxes will support this. Most recent STBs will now support up to 7 B frames.
The ability to use up to 7 B frames efficiently is a strong differentiating factor for Appear TV
encoders, which can help them outperform competing products as long as the decoders being used
support this.

vii. Minimum B frames

This is the reverse of the above (maximum B frames). Together, the maximum and minimum B
frame values provide the constraints for the adaptive GOP management process.

The recommendation for Appear TV encoders is to set this to IP, for both MPEG2 and MPEG4
applications.
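
The relationship between the B frame settings and the resulting frame sequence can be sketched as follows. This is a simplified illustration: `gop_pattern` lists frame types in display order without modelling open or closed GOPs, and `choose_b_frames` uses an assumed linear mapping from a motion estimate to a B frame count. Neither is the adaptive GOP planner used by Appear TV encoders.

```python
def gop_pattern(gop_length, b_frames):
    """Display-order frame types for one GOP, for example 'IBBPBBP...'."""
    frames = ["I"]
    while len(frames) < gop_length:
        frames.extend(["B"] * b_frames + ["P"])
    return "".join(frames[:gop_length])

def choose_b_frames(motion, min_b, max_b):
    """Map a motion estimate in [0, 1] to a B frame count between min_b and max_b.

    Assumed rule: no motion selects max_b, maximum motion selects min_b.
    """
    motion = min(max(motion, 0.0), 1.0)
    return max_b - int(motion * (max_b - min_b) + 0.5)

print(gop_pattern(12, 2))                             # IBBPBBPBBPBB
print(gop_pattern(12, choose_b_frames(0.0, 0, 3)))    # IBBBPBBBPBBB
print(gop_pattern(12, choose_b_frames(1.0, 0, 3)))    # IPPPPPPPPPPP
```

With a minimum of IP (0 B frames) and a maximum of IBBBP (3 B frames), static content is coded with the longest B frame runs and high motion content falls back to P frames only.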

i. GOP Size (length) and structure (open and closed)

A GOP (group of pictures) will contain one I frame, and this will always be at or near the start of
the GOP to provide a reference picture. If the GOP is defined as ‘Closed’, the I frame will be the
first picture within the GOP. If the GOP is defined as open, then it can have preceding B pictures
which will use the last picture of the preceding GOP (a P frame) as a reference. The start of a
GOP with ‘open’ and ‘closed’ options has been illustrated below.


By definition, a closed GOP is built entirely of references that are contained within the GOP, but
an open GOP can also use a reference (last P frame) from the previous GOP. An open GOP
structure is therefore more efficient, because operating in closed GOP mode requires an
additional P frame to be included, whereas in open mode the use of external P references allows
this to be replaced with a B frame, which is more efficient.

The GOP length parameter will define for how long the GOP structure repeats until another I
frame is inserted and the process repeats again. Only one I frame is present per GOP, and so the
GOP length sets the naturally repeating I frame interval. Normally, the GOP length is defined in
terms of number of frames but an internal setting (GOP SIZE MODE) can allow alternative
counting via frame rate.
The I frame at the start of each GOP is described as ‘naturally occurring’ because Appear TV
encoders can also insert unplanned I frames to start a new GOP early as a result of the scene
change detection feature. When features such as scene change detection are enabled, GOP lengths
are always approximate and will be liable to change with video content.
Typically, long GOP settings result in the efficient encoding of predictable sequences but non-
predictable events (such as scene changes) require a reference I frame to be provided, ideally as
an IDR. Since GOP planning mechanisms differ between encoder manufacturers, setting equal
GOP lengths may produce very different comparative results between different encoder
manufacturers when dynamic tools such as scene change detection are enabled. For this reason,
GOP lengths should be set as recommended by the manufacturer, and should not be based on
values taken from dissimilar equipment.
Since Appear TV encoders are usually set to alter GOP length dynamically to changing content,
it is no longer necessary to be conservative about setting short GOP lengths unless required for
other reasons. For example, it might be necessary to use fixed short GOPs to provide regular
random access points (I frames) for quick channel changes, or to enhance video editing or
provide trick play modes. Viewing content in fast forward / reverse modes relies on frequent I
frame access points to make the video decodable, and therefore the I frame interval can set the
granularity of ‘scrubbing’ through content in trick play mode.

Generally, Appear TV encoders will perform best if long maximum GOP lengths are used.
Compared to many encoders, Appear TV GOP planning works optimally if longer GOP lengths
are set than may be considered normal. For H264, a GOP length of 44 will work well for many
broadcast scenarios. Making the GOP length divisible by 4 will optimize efficiency when
hierarchical B frame mode is used (AVC only), and this practice is recommended. For MPEG2,
the maximum GOP length should be shorter and 24 works optimally in many scenarios. Some
older MPEG2 decoders may be limited by the maximum GOP length of 15 specified in the
MPEG 2 DVD specification, where a GOP length of 12 was recommended for PAL frame rates
with 15 for NTSC.

There are further internal controls that also define GOP length. These include the MAX GOP
SIZE parameter which constrains dynamic GOP planning to ensure a preset GOP interval is
never exceeded. This can be useful in limiting the worst case channel change time, but does not
affect the average channel change time (which is determined by the average GOP length).
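
The distinction between worst case and average can be made concrete with a short calculation. The sketch considers only the wait for the next I frame and assumes the viewer tunes in at a uniformly random point within a GOP. Tuning, buffering and decoder delays are ignored, and the function names are assumptions made for the example.

```python
def worst_case_wait(max_gop_size, fps):
    """Longest wait in seconds for the next I frame, set by the maximum GOP size."""
    return max_gop_size / fps

def average_wait(average_gop_length, fps):
    """Mean wait in seconds for the next I frame when tuning in at a random point."""
    return average_gop_length / (2 * fps)

# GOP length of 44 frames at 25 frames per second
print(worst_case_wait(44, 25))   # 1.76
print(average_wait(44, 25))      # 0.88
```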

ii. GOP control


If an Appear TV transcoder is being used, then the internal ‘GOP Control’ parameter can set the
re-encode stage to follow the incoming GOP structure if it is detected that the original GOP
structure is closed. The default setting is disabled, so the transcoder will use the GOP structure
that has been manually defined.

iii. Reference B frames

The H.264 standard allows a B frame to be used as a reference for another B frame. In GOPs
with a high ratio of B frames, it enables the reference point to be brought closer within the GOP
than it would otherwise be if P frames were the only reference option, since these can become
distant when multiple B frames are present. As a result, VQ can be significantly improved, for
low motion sequences in particular. The RefBFrames parameter controls this, with 0 disabling B
references (may be required to address some legacy set top box issues) and 1 enabling B
references (default). For best VQ, it is recommended that this feature is enabled and that the
GOP length is a value that is divisible by 4 (e.g. 40).

iv. 8x8 transform

This parameter is only available in H.264 high profile mode. The 8x8 transform is a high profile
feature that is offered as an alternative to the 4x4 transform. When enabled, the encoder can
select 8x8 to provide VQ improvements, particularly the preservation of detail in complex
sequences.
This is controlled by the ‘8x8 Transform’ setting which can be set to 0 (off) or 1 (enabled, the
default setting for AVC high profile mode).
It is recommended that 8x8 transform is always enabled when in high profile mode.

v. Scaling Matrix
An MPEG 2 encoder will divide each macroblock into 8x8 pixel blocks, and each will undergo
DCT to derive the vertical and horizontal frequency components that describe these 8x8 pixels in
frequency terms. At the top left is the DC component, followed by the low frequency components
which get higher in frequency as you move right (and down). At this stage, it is possible to
transform between the amplitude representation and frequency representation by performing DCT /
inverse DCT stages quite freely, because the DCT stage is the precursor to the lossy compression
that will take place and is not the source of it by itself.
What the DCT does do is prepare the co-efficients for lossy compression by presenting what
were discrete pixel amplitude values as a series of frequency co-efficients that describe the 8x8
block in ascending frequency component order. This is important because lossy compression will
be performed by discriminating on a frequency basis, so that high frequency (and therefore high
detail) components of the block being processed can be eliminated if necessary to meet the video
bitrate constraints that have been set.
Following DCT, the 8x8 pixel values that formerly represented the amplitude of each pixel are
transformed into 8x8 co-efficients that represent the frequency content of the entire block.
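
The reversibility of the DCT stage can be checked with a short sketch. The code below is a plain, unoptimised orthonormal 8x8 DCT-II and its inverse, written for illustration only; real encoders use fast integer approximations, and the function names here are assumptions rather than anything defined by Appear TV or the MPEG standards.

```python
import math

N = 8  # block size


def _scale(k):
    # Normalisation that makes the transform orthonormal (exactly invertible).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(pos, freq):
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * N))


def dct_2d(block):
    """Forward DCT: pixel amplitudes -> frequency co-efficients (DC at [0][0])."""
    out = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical frequency
        for u in range(N):      # horizontal frequency
            total = sum(block[y][x] * _basis(x, u) * _basis(y, v)
                        for y in range(N) for x in range(N))
            out[v][u] = _scale(u) * _scale(v) * total
    return out


def idct_2d(coeffs):
    """Inverse DCT: frequency co-efficients -> pixel amplitudes."""
    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            out[y][x] = sum(_scale(u) * _scale(v) * coeffs[v][u]
                            * _basis(x, u) * _basis(y, v)
                            for v in range(N) for u in range(N))
    return out


# A flat grey block has no detail, so only the DC co-efficient is non-zero.
flat = [[100] * N for _ in range(N)]
flat_coeffs = dct_2d(flat)

# A block with a horizontal ramp survives a DCT / inverse DCT round trip.
ramp = [[16 * x for x in range(N)] for _ in range(N)]
restored = idct_2d(dct_2d(ramp))
```

Without quantisation the round trip returns the original pixels (to within floating point error), which is the sense in which the DCT alone is not the lossy step.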

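[Figure: 8x8 block of DCT co-efficients. The DC value is at the top left corner. Horizontal
co-efficients run from low frequency to high frequency across the block, and vertical
co-efficients run from low frequency to high frequency down the block.]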

The aim of the lossy compression process is to favour the frequency components that the eye is
sensitive to, and which would therefore be missed, and to eliminate the higher frequencies that
are of lesser significance. In transforming the block data into the frequency domain, the DCT has
presented the data in an ideal form for this. The first step is to multiply the co-efficients by a
Quantisation Scale (Q scale) factor. The second step is to divide the values by a pre-defined
quantisation matrix, which defines a divisor for each frequency component. The result is then
rounded to a whole number.
The Q scale factor enables the degree of quantisation to be varied, and will be set by the rate
control loop. It is a global value. The quantisation matrix enables the divisor of each DCT
frequency co-efficient to be set unequally, to ‘weight’ the co-efficients in terms of importance.
Furthermore, the quantisation matrix can be defined with the objective of reducing as many
higher frequency components as possible to zero to provide efficient and targeted lossy
compression.

[Figure: 8 x 8 DCT co-efficients following division by the quantisation matrix. Values are
rounded following division.]

These small values are efficient to transmit. They need to be sent to the receiver where they will
be re-scaled using the same quantisation matrix and passed through an inverse DCT stage to
recover what will hopefully be a good objective approximation of the original pixel values, with
the zero values being the result of the ‘lossy compression’ stage. It is the zero values that are the
contributors of errors.
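
The quantise and re-scale round trip can be sketched as follows. This is an illustration under stated assumptions, not the arithmetic of any particular standard: the Q scale is folded into the divisor so that the step size for each co-efficient is its matrix entry multiplied by the Q scale, and a 2x2 block with made-up matrix values is used to keep the example short.

```python
import math


def round_half_away(value):
    """Round to the nearest whole number, with halves rounded away from zero."""
    return int(math.copysign(math.floor(abs(value) + 0.5), value))


def quantise(coeffs, matrix, q_scale):
    """Encoder side: divide each co-efficient by its step size and round."""
    return [[round_half_away(c / (w * q_scale)) for c, w in zip(c_row, w_row)]
            for c_row, w_row in zip(coeffs, matrix)]


def rescale(levels, matrix, q_scale):
    """Decoder side: multiply the transmitted levels back up by the step size."""
    return [[level * w * q_scale for level, w in zip(l_row, w_row)]
            for l_row, w_row in zip(levels, matrix)]


# Hypothetical values: larger divisors are applied to the higher frequencies.
coeffs = [[800, 30], [-12, 3]]
matrix = [[8, 16], [16, 32]]

levels = quantise(coeffs, matrix, q_scale=2)
approx = rescale(levels, matrix, q_scale=2)
print(levels)   # small whole numbers, cheap to transmit
print(approx)   # approximation of the original co-efficients
```

In this example the two smallest co-efficients are quantised to zero and cannot be recovered by the re-scaling step, which is where the loss occurs.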
For MPEG4, 4x4 is the ‘core’ transform. There are also 4x4 and 2x2 options (using the Hadamard
transform) and 8x8 in high profile only.

The ScalingMatrix parameter specifies the scaling matrix used, with 1 being optimised for use
with CABAC entropy coding. The default is 1.
The Intra DC precision parameter specifies the bit depth for DC co-efficients and therefore sets
the degree of precision. It has 5 possible settings: 0 = 8 bits, 1 = 9 bits, 2 = 10 bits, 3 = 11 bits
and 4 = auto (default). The use of 11 bits is not supported in MPEG2 mode. The use of AUTO is
strongly recommended to prevent banding or blockiness on some types of video content.

The QScale type parameter sets the Qscale values used as DCT coefficient multipliers. Two
options are available: 0 = linear (which sets a linear scale from 1 to 32) and 1 = non linear (which
is the default and sets the scale from 0.5 to 56 to provide greater granularity). The default (non
linear) should always be used and will achieve noticeably improved performance when the
encoder is being pushed to its limits (for example, high resolutions at very low bitrates).

Sending these compressed values involves further stages of compression. This is lossless data
compression (entropy coding).

vi. Rate Control Mode

The rate control mode determines whether the encoder operates in CBR (constant bitrate) mode,
Capped VBR (capped variable bitrate) mode or statistical multiplexing mode.
Statistical multiplexing mode is not selected directly from the GUI, but is selected automatically
when the encoded service is added to a stat mux group. Upon leaving the stat mux group, a
service will be set to CBR mode automatically using the last CBR bitrate that was set as the
default.

CBR mode implements a feedback loop between the DCT and the output buffer (CPB or
‘coded picture buffer’) to keep the output bitrate constant. The feedback loop is generally
known as the ‘rate control’ mechanism.

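[Figure: CBR rate control loop. The DCT feeds the CPB buffer, which delivers a fixed output
bitrate (e.g. 2Mb/s). Rate control feeds back from the CPB buffer to the DCT.]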

The video bitrate at the output of the DCT is not constant, and will vary with content and picture
coding type. The rate control mechanism works to maintain the CPB buffer occupancy at a
consistent level (within defined limits) by adjusting the degree of quantisation applied to the
DCT co-efficients. The CPB buffer characteristics are important to ensure interoperability with
the decode buffer implemented within MPEG compliant set top boxes. The standard buffer model
incurs appreciable delay and was specified with DTH applications in mind. It is large enough to
accommodate peaks in incoming bitrate associated with events such as I frame insertion without
having to spontaneously re-quantise to compensate for peaks, which could result in I frame
pulsing. The CPB buffer size is pre-set on the Appear TV GUI to provide optimum VQ and set
top box interoperability, and changing it (to try to reduce delay, for example) should not be
attempted without first consulting the Appear TV support team.
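
The feedback principle can be sketched with a toy simulation. This is not Appear TV's rate control algorithm: the proportional update rule, the assumption that frame size is inversely proportional to the quantiser value, and all names and numbers below are hypothetical and chosen only to show a quantiser reacting to buffer occupancy.

```python
def simulate_cbr(complexities, bits_out_per_frame, buffer_size,
                 q_start=10.0, q_min=1.0, q_max=50.0):
    """Toy CBR loop: coarser quantisation as the output buffer fills.

    Returns a list of (quantiser, frame_bits, buffer_occupancy) per frame.
    """
    occupancy = buffer_size / 2.0   # aim to sit around half full
    q = q_start
    history = []
    for complexity in complexities:
        fullness = occupancy / buffer_size
        # Above half full: raise q (fewer bits). Below half full: lower q.
        q = min(q_max, max(q_min, q * (1.0 + (fullness - 0.5))))
        frame_bits = complexity / q          # assumed content model
        occupancy += frame_bits - bits_out_per_frame
        occupancy = min(buffer_size, max(0.0, occupancy))
        history.append((q, frame_bits, occupancy))
    return history


# 2 Mb/s at 25 frames/s drains 80,000 bits from the buffer per frame period.
content = [1_000_000] * 20 + [3_000_000] * 20   # easy content, then difficult
trace = simulate_cbr(content, bits_out_per_frame=80_000, buffer_size=1_000_000)
for q, bits, occupancy in trace[::5]:
    print(f"q={q:5.1f}  frame bits={bits:9.0f}  occupancy={occupancy:9.0f}")
```

A larger buffer lets the loop absorb a large frame without an abrupt jump in the quantiser, at the cost of more delay, which is the trade-off described above.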

VBR mode changes the mode of operation to allow the output bitrate to vary; the outgoing video
bitrate is no longer held constant. In this mode, the output bitrate changes in response to
image complexity, so that the incoming material is always encoded at the bitrate needed to meet
a pre-set threshold for VQ. This is ideal for recording encoded files to disk, since video quality is
maintained to a consistently high standard whatever the difficulty of the content, and the file size
is kept minimal because there is no wastage and only the bitrate required to encode a picture to
the defined quality threshold is used. With capped mode, you can place a cap on the maximum
video bitrate that is allowed.

The BitRateFormat parameter specifies if the video bitrate sets the video elementary stream
bitrate, or the video transport stream rate which is inclusive of MPEG TS packetisation
overheads. The default (in common with most manufacturers) is elementary stream rate.
The MaxPicSize parameter can limit the maximum picture data size to comply with limitations
in some legacy devices. An example is the Motorola Cherry picker with some legacy firmware
versions. The default setting is 0 (no artificial limit) but it is possible to define a limit in bytes up
to a maximum of 4,294,967,295 bytes. There should be no reason to use this parameter except in
exceptional circumstances which should always be guided by advice from the Appear TV
engineering team.
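
The difference between the two BitRateFormat interpretations can be estimated from the transport stream packet structure: each 188 byte packet carries a 4 byte header, leaving at most 184 bytes of payload. The sketch below gives only a lower bound on the overhead, since it ignores PES headers, adaptation fields and stuffing; the function names are illustrative.

```python
TS_PACKET_BYTES = 188
TS_HEADER_BYTES = 4
TS_PAYLOAD_BYTES = TS_PACKET_BYTES - TS_HEADER_BYTES  # 184


def es_to_ts_rate(es_rate_bps):
    """Minimum transport stream rate needed to carry an elementary stream rate."""
    return es_rate_bps * TS_PACKET_BYTES / TS_PAYLOAD_BYTES


def ts_to_es_rate(ts_rate_bps):
    """Maximum elementary stream rate that fits in a transport stream rate."""
    return ts_rate_bps * TS_PAYLOAD_BYTES / TS_PACKET_BYTES


print(f"{es_to_ts_rate(2_000_000):.0f} b/s")  # TS rate for a 2 Mb/s video ES
```

The packet header alone therefore adds a little over 2% to the elementary stream rate.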

4 Video Encoding (additional tools)

i. Adaptive Quantisation
Sometimes it is preferable to take a linear approach towards prioritising how a picture is encoded
and treat everything equally. This is certainly the approach to be taken if the encoder is
undergoing automated assessment for objective VQ using PQA or DMOS, which will compare
each pixel equally. However, it is not necessarily the right approach when a human observer is
assessing VQ, since the human visual system is fairly insensitive to certain degradations (such as
if the picture borders are less well defined) but more focussed on others, such as the detail
towards the centre of the picture and particularly any human facial areas that are present. The
adaptive quantisation parameter is the master switch for enabling adaptive tools such as border
processing. When enabled, bitrate will be targeted towards areas of interest for a typical human
observer. This will mean a higher perceived VQ when viewed subjectively, but a lower DMOS or
PSNR value when assessed objectively. The default value is enabled, and so this feature MUST
BE DESELECTED when performing automated tests.
The difference between a human observer and the readings of evenly weighted tests such as
DMOS is the reason why objective testing should have a limited role in encoder testing, and
should not be used as the exclusive test for making encoder vendor selections.

In Appear TV encoders, the adaptive QP toolset covers border processing, saliency detection
(detects prominent features), consistency enhancer to enhance borders / edges, eye tracker,
texture classifier / skin tone identification, low light detection, grass detection, logo and banner
detection and noise / mosquito and ringing rejection. For best VQ, it is recommended that
adaptive QP is enabled except when performing automated quality measurements.

ii. Quantisation Table
This parameter applies to MPEG2 only. It will affect H.264 operation adversely if set to anything
other than 0 (default) in H.264 mode.
With MPEG2, it is possible to trade intra predicted picture artefacts between either

• A tendency to produce sharply defined macroblock edges

OR
• Reduced macroblock edging but with greater edge ringing / mosquitoes.

This is determined by adjusting the quantisation steps within the DCT using this parameter.
Possible values range from 0 (sharp macroblock edges, default) to 4 (smoothed macroblock
edges with a prevalence of ringing / mosquito artefacts).
This is an advanced, internal setting that will require careful adjustment if not left at default.

iii. Fade Detection

Fades are characterized by static scenes that simply change in luminance intensity. When a fade
is detected, this feature alters the B-picture structure to efficiently encode the fade as B pictures.
This will take place as long as the GOP structure has been placed in dynamic mode, and has not
been fixed. If a fixed GOP has been set then fade detection will automatically be deactivated.
The recommendation to achieve best VQ is to always enable fade detection.

iv. Scene change detection

A series of pictures containing predictable motion can be economically encoded using the
standard MPEG processes that exploit the redundancy in this content. However, scene changes
by definition present a completely new picture with minimal or no correlation to previous
pictures. The redundancy within picture content at a scene change can be non-existent and the
decoder will require a new reference before it can start using efficient predictive based methods
again. The I (Intra) frame provides this reference. When a scene change is detected, enabling this
feature will automatically adjust the GOP structure so that the first new frame after the scene
change is I frame encoded. This provides ‘clean’ scene changes as long as the intra frame can be
supported from a bitrate perspective, since the file size of an intra-encoded picture is
comparatively high. This makes I frame insertion a very useful tool for encoders operating in
standard delay mode, where the CPB (Coded Picture Buffer) is set large enough to smooth out
the peak in absolute bitrate. The technique also works extremely well with stat mux systems with
good forward bitrate planning (‘look-ahead’).
Although Appear TV can (with an internal setting) provide the I frame as a standard I frame, the
default is to flag it as an IDR (Instantaneous Decoder Refresh). This means that the decoder will
mark any pictures that it has already decoded in its picture reference buffer as ‘unused for
reference’. The IDR frame will therefore mark a clear point in time when the receiver will only
predict in the forward direction, from the IDR frame onwards. Marking the I frame as an IDR is
therefore the correct option whenever there really is no correlation between the pictures before
and after the scene transition has occurred.
IDR insertion is not possible for OTT applications where IDRs signal the chunk boundary
points. IDR insertion may also not be feasible in applications where splicers are being used.
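
The principle of scene change driven I frame insertion can be sketched as below. This is a deliberately crude illustration and not the Appear TV detector: it uses the mean absolute difference between successive frames with a hypothetical threshold, treats frames as small lists of luma values, and omits B frames and GOP length limits altogether.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute luma difference between two frames of equal size."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def plan_picture_types(frames, threshold=40.0):
    """Return 'I' for the first frame and for any frame after a detected cut."""
    types = ["I"]
    for previous, current in zip(frames, frames[1:]):
        is_cut = mean_abs_diff(previous, current) > threshold
        types.append("I" if is_cut else "P")
    return types


# Two frames of a dark scene followed by two frames of a bright scene.
dark = [[10, 12], [11, 10]]
bright = [[200, 205], [198, 202]]
picture_types = plan_picture_types([dark, dark, bright, bright])
print(picture_types)
```

With look-ahead, a decision like this can be made before the main encoder reaches the cut, so that bitrate for the larger intra picture can be planned in advance.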

v. IDR Frequency
The IDR Frequency parameter enables rules to be set to define repeating IDR points. In AVC
mode, it has five settings (0 to 4). Mode 0 (default for all broadcast applications) declares no I
frames as IDR. Mode 1 declares every I frame as an IDR (compromising efficiency but
providing very regular splice points). Mode 2 defines every 2nd I frame as an IDR, mode 3 every
3rd I frame and mode 4 every 4th I frame. For applications requiring regular access points, this
enables the frequency versus compression efficiency trade-off to be set according to requirements.
In MPEG 2 mode, the setting has a ‘switch’ function with only mode 0 or 1 being valid. Mode 0
sets OPEN GOP mode and mode 1 sets CLOSED GOP mode.
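
The AVC behaviour of the five modes can be expressed as a simple rule. The sketch below is an interpretation of the description above, not Appear TV code: it assumes I frames are counted from 1 and that the Nth, 2Nth, 3Nth (and so on) I frames are the ones flagged as IDR. The function name is illustrative.

```python
def is_idr(i_frame_number, idr_frequency):
    """Decide whether the given I frame (counted from 1) is flagged as an IDR."""
    if idr_frequency not in (0, 1, 2, 3, 4):
        raise ValueError("IDR Frequency must be between 0 and 4 in AVC mode")
    if idr_frequency == 0:
        return False                      # mode 0: no I frame is an IDR
    return i_frame_number % idr_frequency == 0


# Mode 2: every 2nd I frame is an IDR.
mode_2_pattern = [is_idr(n, 2) for n in range(1, 7)]
print(mode_2_pattern)
```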

vi. Weighted Prediction

The H.264 standard provides a tool for helping encode fades efficiently. An encoder exploits the
temporal redundancy within content by dividing the image into blocks, and can efficiently transmit
areas of motion using motion estimation to detect and communicate areas of change. However,
when luminance is changing, it can cause the motion estimation process to fail, since blocks that
should be correlated may not be because luminance has significantly changed. The H.264 standard
supports the use of a weighting factor to help predict luminance changes. When used correctly,
significant reductions in the bitrate required to encode pictures with varying luminance levels
can be obtained. In Appear TV encoders, weighted prediction and fade detection should be used
together to obtain maximum benefit. The feature is available in both main and high profile
modes.
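
The benefit of a weighting factor during a fade can be shown with a small numeric sketch. It is a floating point simplification for illustration only: H.264 signals integer weights and offsets, and the way the weight is estimated here (ratio of mean luminance) is an assumption, not the method used by any particular encoder.

```python
def sad(block, prediction):
    """Sum of absolute differences: a common measure of prediction error."""
    return sum(abs(p - q) for p, q in zip(block, prediction))


def estimate_weight(reference, current):
    """Estimate a luminance weight as the ratio of mean levels (assumed method)."""
    reference_mean = sum(reference) / len(reference)
    current_mean = sum(current) / len(current)
    return current_mean / reference_mean if reference_mean else 1.0


# The same (static) block in two frames, with the second faded to half brightness.
reference = [100, 120, 140, 160]
current = [50, 60, 70, 80]

weight = estimate_weight(reference, current)
weighted_prediction = [round(weight * p) for p in reference]

plain_error = sad(current, reference)               # prediction without weighting
weighted_error = sad(current, weighted_prediction)  # prediction with weighting
print(weight, plain_error, weighted_error)
```

Without weighting the block looks like a poor match even though nothing has moved; with weighting the residual left to encode is far smaller.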

5 Entropy Coding
i. CAVLC / CABAC
Quantised transform co-efficients from the DCT (which have been ‘zig-zag scanned’) can be
further compressed using a lossless data compression stage which is undone in the decoder to
recover the original co-efficients. Lossless compression works by recognizing patterns that occur
more commonly, and representing them with shorter code words than patterns occurring less
frequently. This is known as entropy coding, and its contribution to the encoding gain of
the overall system is very significant.

MPEG2 uses Huffman coding, which uses a fixed coding table. The first H.264 option, CAVLC,
is very similar but can select one of four fixed VLC tables depending upon the data to be
encoded and the ratio of trailing 1’s and 0’s in the samples. The decoder needs to know what the
shortened variable codes represent, in terms of the true original word length, and one of six
Exp-Golomb codes is defined in the standard to enable this. The additional flexibility provided
with CAVLC provides a big step up in efficiency compared with Huffman coding.
The second H.264 option is even more complex and is CABAC. This has the ability to
dynamically adapt to different content, and can result in compression gains of between 5 and 10 %
depending on content.
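
The idea of giving shorter code words to more common values can be illustrated with the unsigned order-0 Exp-Golomb code, which H.264 uses for many of its syntax elements. The sketch below is a minimal illustration that works on strings of '0' and '1' characters rather than a real bitstream; the function names are assumptions.

```python
def exp_golomb_encode(value):
    """Encode a non-negative integer as an unsigned order-0 Exp-Golomb code."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]              # value + 1 written in binary
    return "0" * (len(binary) - 1) + binary  # prefixed by (length - 1) zeros


def exp_golomb_decode(bits):
    """Decode a single unsigned order-0 Exp-Golomb code word."""
    leading_zeros = len(bits) - len(bits.lstrip("0"))
    return int(bits[leading_zeros:2 * leading_zeros + 1], 2) - 1


for value in (0, 1, 2, 3, 7):
    print(value, exp_golomb_encode(value))
```

Small values, which are the most common after quantisation, receive the shortest code words, and the decoder recovers every value exactly because the number of leading zeros tells it how many bits follow.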

The CABAC parameter sets either CAVLC mode (0) or CABAC mode (1, which is default).

6 Statistical Multiplexing

Statistical multiplexing takes the same approach towards fixing the video quality of a service to a
consistent level and also achieves this by making the video bitrate of the service the variable
factor. However, rather than treat each service in isolation, statistical multiplexing uses the total
bandwidth allocated to multiple video services as a resource pool. For example, imagine a DTT
multiplex providing a total capacity for video of 22Mb/s, shared between 12xSD AVC services.
The system will treat the 22Mb/s as a ‘pool’ (the statmux pool) to be shared between all services.
How well the bitrate actually gets distributed between the services, so that best use is made of it,
depends upon the implementation of the statistical multiplexing system.

Appear TV’s second generation statistical multiplexing is designed to be second to none. It is an
exhaustive, performance optimised design that works as follows;

Step 1: Each encoder analyses the video content it is encoding periodically.

[Figure: Encoder 1, Encoder 2, Encoder 3, Encoder 4, Encoder 5, etc.]

Appear TV’s second generation statistical multiplexing system analyses the incoming video at
10ms intervals to determine the instantaneous encoding complexity. This is achieved by
equipping each encoder with a pre-encode stage that exists only to perform this assessment. The
pre-encode stage performs an intra-frame encode ahead of the main encoder, which has its input
artificially delayed by the look ahead buffer.

[Figure: Encoder 1 shown in detail. The video input feeds a pre-encode stage, which sends stat
mux metrics to the stat mux controller, and a look ahead buffer, which feeds the main encoder
output. Encoders 2 to 5 are also shown.]

The stat mux controller receives bitrate requests from all of the encoders and transcoders
participating in the stat mux group. Appear TV equalises the timing / delay between different
module types, meaning that encoders and transcoders working in either HD or SD mode and
with MPEG2 or MPEG4 outputs can all be mixed within a stat mux group.

The role of the stat mux controller is to decide how the total available bitrate (statmux pool) is
divided to achieve optimum results. The following is taken into account when determining the
actual bitrate that will be allocated for each encoder during the 10ms time segment;

• The metrics from the pre-encode stage which indicate the bitrate required to encode
optimally. This is the encoder’s estimation of the bitrate needed to enable it to encode the
content to a high degree of VQ. The accuracy of these metrics is critically important for
optimum operation of the system.
• The quality weighting that the customer has set for the service, which provides either a
positive or a negative bias for that service. If positive, it highlights the service as being
important and instructs the arbitration process to favour it compared to others with lower
priority weighting. This ensures the bitrate requested is usually allowed, resulting in
higher and more consistent VQ. If negative, it lowers the priority of the service so that if
necessary, the bitrate allocation for the service can be considerably lower than requested.
This can make the variation in VQ for the service less well controlled since the optimum
video bitrate may not be allocated, which may suit a low priority service.
• The B (min) parameter. The service will never be allocated less than the video bitrate
defined in B (min).
• The B (max) parameter. The service will never be allocated more than the video bitrate
defined in B (max).
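
The way these four inputs interact can be sketched with a simple allocator. This is not Appear TV's arbitration algorithm, which is not described in detail here: the sketch assumes every service first receives its B (min), that the rest of the pool is shared in proportion to requested bitrate multiplied by quality weighting, and that anything a service cannot take because of its B (max) is offered to the remaining services. All names and values are hypothetical.

```python
def allocate_pool(pool, services):
    """Share a stat mux pool between services (all bitrates in Mb/s).

    Each service is a dict with keys: name, request, weight, b_min, b_max.
    """
    allocation = {s["name"]: s["b_min"] for s in services}
    remaining = pool - sum(allocation.values())
    if remaining < 0:
        raise ValueError("pool is smaller than the sum of the B (min) values")
    active = list(services)
    while remaining > 1e-9 and active:
        demand = sum(s["request"] * s["weight"] for s in active)
        if demand <= 0:
            break
        unit = remaining / demand
        still_active = []
        granted = 0.0
        for s in active:
            share = s["request"] * s["weight"] * unit
            headroom = s["b_max"] - allocation[s["name"]]
            grant = min(share, headroom)      # never exceed B (max)
            allocation[s["name"]] += grant
            granted += grant
            if grant < headroom:
                still_active.append(s)        # can still accept more bitrate
        remaining -= granted
        active = still_active
    return allocation


services = [
    {"name": "sport", "request": 6.0, "weight": 1.0, "b_min": 1.0, "b_max": 8.0},
    {"name": "news", "request": 2.0, "weight": 1.0, "b_min": 1.0, "b_max": 8.0},
    {"name": "shopping", "request": 2.0, "weight": 0.5, "b_min": 1.0, "b_max": 2.0},
]
result = allocate_pool(12.0, services)
print(result)
```

In this example the low priority service is held at its B (max) while the other two services share the rest of the pool in proportion to their requests.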

The role of B (min) is largely historical and dates back to the early days of statistical
multiplexing. Initially, there was no look ahead and the bitrate estimation and bitrate
planning stages were rudimentary by today’s standards. It was possible for systems to set
very low video bitrates during easy content, and then not be able to respond fast enough to
capture spontaneous events such as pans and scene changes which require a significant
increase in bitrate to encode without artefacts. Additionally, many systems only sampled
every few tens of frames. The B (min) parameter was provided to compensate for the lack of
accuracy and responsiveness in these early systems by preventing the bitrate of premium
channels from being set too low.
In contrast, the dedicated pre-encoding stage of Appear TV’s new statistical multiplexing
system is able to gauge real bitrate requirements very accurately. The delay before the real
encode takes place provides time for a proper bitrate arbitration process to take place, and
provides advance visibility of scene cuts and pans in particular. Very often, the encoded
picture type will have to change as well (for example, I frame insertion at a scene change).
The Appear TV system combines ultra-rapid 10ms sampling with the capability to allow the
entire process to be planned in advance, well before the primary encoder encodes the section
of content. Although Appear TV has retained the B (min) setting, in reality it is redundant
and can be safely set to its minimum value of 250kb/s.

The B (max) setting limits the maximum video bitrate that a service will be allocated and can
be useful in the following scenarios;

• When turning around and re-multiplexing individual statistically multiplexed
services, the B (max) represents the peak video bitrate that the service could attain. It
is therefore the bitrate that must be reserved to pass the service through without
problems unless it is going to be transcoded. This only applies when services are
separated from their original stat mux groups since if the whole group of services is
passed through, the total bitrate required will be fixed and will be the stat mux pool
bitrate.
• B (max) can be used as a further constraint (with the priority weighting) to limit the
overall bandwidth utilisation that a non-priority service can have. It is therefore a
useful fine tuning parameter.

The diagram below completes the stat mux system overview from a top level perspective.

[Figure: stat mux system overview. The pre-encode stage of each encoder sends stat mux metrics
(requested bitrate) to the stat mux controller, which also receives requests from the other
encoders and returns an allocated bitrate to each main encoder. The video input of each encoder
passes through a look ahead buffer to the main encoder output. The controller receives the
bitrate requests for all services and determines the allowed bitrate on the basis of:
- Bitrate request
- Stat mux group pool size
- Quality weighting of the service
- B (min) and B (max) of the service]

Although the Appear TV statistical multiplexing system is fully automatic, its role is to optimise
bandwidth distribution and getting the most out of it still requires taking great care in setting the
MPEG configuration of the encoders optimally. Additionally, it can also place a greater
emphasis on using the pre-processing tools correctly. This is because statistical multiplexing is
exploiting the variation that exists between content. To be truly effective, a stat mux group
should contain a good mixture of content types and ideally have at least 8 SD services or four
HD services within the stat mux group. The statistical variation within a service will be nullified
if it appears to have difficult content all of the time, which is what the presence of random noise
will do. Poor quality sources, or those derived from analogue feeds, must therefore be assessed
for such artefacts at the encoder source (and not at the output of the encoder) with the MCTF and
pre-deblocking filters used fairly aggressively, especially if low frequency analogue noise is
seen. At typical DTH bitrates, even harsh unfiltered noise will usually be quantised out by the
DCT during compression, so the output of the encoder will appear clean. However, it will still
have caused the statistical multiplexing system to over-allocate bitrate for the service, and the
noise and sharp edges of previous MPEG2 macroblocks will jointly cause sub-optimal MPEG
encoding mode decisions to be made within the encoder. The result will be sub-optimal encoding
efficiency for the service, with bandwidth stealing from the rest of the stat mux group. Applying
the filtering tools provided can stop this from happening.

Another method for verifying the performance of the stat mux group is to use the graphical
reporting tools provided by the system. These are able to show the instantaneous and long term
performance of services in terms of bitrate variation, QP, and longer term historical bitrate
utilisation for a service.

7 Outgoing video formats (broadcast mode)

Various options exist for setting the output video format. The options will be automatically
filtered by Appear TV, so that only sensible options that are supported in the mode being used
are displayed to users. The full suite is controlled by the following parameters.

WIDTH controls the desired width of the outgoing video. Options are 1920, 1280 and 720.
HEIGHT controls the height. Options are 1080, 720, 480 and 576 lines.
Frame Rate specifies the outgoing frame rate. Options are 23.976, 24, 25, 29.97, 30, 50, 59.94,
60, 12, 15, 10, 9.9, 11.99, 12.5, 14.99.
The encoder cannot convert from fractional to integer frame rates and vice versa.
VIDEO FORMAT specifies if the outgoing video is INTERLACED or PROGRESSIVE.
ASPECT RATIO specifies square or widescreen, with 4x3, 16x9 and 14x9 as possible options.
DETELECINE specifies if 3-2 pulldown is enabled for the outgoing video. This is either enabled
or disabled.
MADEINT enables advanced 8 field motion-adaptive pixel-adaptive low-angle de-interlacing.
De-interlacing to provide a progressive output has become a complex subject that is worthy of
some explanation. In general, there are two types of method used. INTRA FIELD interpolation is
used when motion is present, and INTER FIELD is used when motion is not present.
Considering INTRA FIELD methods first, the simplest type is line repetition. By definition,
INTRA FIELD requires only the current field to reconstruct the missing lines needed to convert
from interlaced to progressive. Simply repeating lines is crude, and is unable to exploit any
temporal redundancy that exists between successive frames, but it has been used because of its
simplicity. A slightly more complex method is linear interpolation, where missing lines are
created by averaging the lines above and below. Although this performs better, both methods are
poor and suffer from excessive aliasing and jitter.
INTER FIELD methods use other fields to provide de-interlacing. This method can de-interlace
stationary objects perfectly. For example, if an object is stationary, then simply combining the
odd and even fields (field repetition) will provide perfect results. If the object is moving, then
this will not be the case because the movement will cause the odd and even fields to no longer be
identical and representative of a single de-interlaced picture. The result will be severe blurring
around the moving object.
Bi-linear field interpolation uses the average of previous and future lines to de-interlace the
current missing line. This method can work very well for stationary objects, but will exacerbate
any motion problems because the motion over an even longer time period is taken into account.
Any high performance de-interlacer must use a combination of INTRA and INTER field
methods. It is clear that INTER FIELD can only be used with pixels that have no motion, and
this needs to be determined by comparing past and future fields. For best performance, the
comparison must be done only between identical pixels on the same line. For various reasons, if
the comparison is made between just two fields, there is a tendency to falsely declare movement
when there actually is none. If this is increased to three fields, movement can be missed. Four
fields provides a good compromise between complexity and accuracy. Since the Appear TV de-
interlacer is performance focussed, it actually uses eight fields of comparison.
Pixels where motion is detected use an advanced INTRA FIELD method which generates the
non-stationary pixels from adjacent pixels within the same field, but with a boundary edge
detection algorithm that ensures the new pixel values are correlated to real objects.
MADEINT is enabled by default, and should not be disabled. Doing so will de-activate this
advanced processing and will result in poorer de-interlacing performance.
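
The combination of INTER FIELD and INTRA FIELD methods can be sketched per pixel as follows. This is a much reduced illustration and not the MADEINT algorithm: motion is judged from only two fields (the previous and next fields of opposite parity) with a hypothetical threshold, moving pixels use plain linear interpolation rather than edge-directed interpolation, and fields are small lists of luma rows.

```python
def deinterlace(current_top, previous_bottom, next_bottom, threshold=10):
    """Build a progressive frame from a top field and its neighbouring bottom fields.

    current_top holds lines 0, 2, 4, ... of the output frame; previous_bottom and
    next_bottom hold candidate values for lines 1, 3, 5, ...
    """
    field_height = len(current_top)
    frame = []
    for i in range(field_height):
        above = current_top[i]
        below = current_top[i + 1] if i + 1 < field_height else current_top[i]
        missing = []
        for x in range(len(above)):
            moving = abs(previous_bottom[i][x] - next_bottom[i][x]) > threshold
            if moving:
                # INTRA FIELD: average the lines above and below in this field.
                missing.append((above[x] + below[x]) // 2)
            else:
                # INTER FIELD: the pixel is static, so reuse the other field.
                missing.append(previous_bottom[i][x])
        frame.append(list(above))
        frame.append(missing)
    return frame


top = [[10, 10], [30, 30]]
bottom_before = [[22, 99], [40, 40]]   # second pixel of the first line is moving
bottom_after = [[22, 0], [40, 40]]
progressive = deinterlace(top, bottom_before, bottom_after)
print(progressive)
```

Static pixels keep the full vertical detail of the other field, while moving pixels fall back to interpolation within the current field, which avoids combining two different moments in time.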
HORSHARPNESS controls a filtering stage that is applied when changing the horizontal
resolution. This is to better conceal the artefacts that can result from resolution changes. The
filter characteristics can be set between -10 and +4, with 0 being default. High negative values
increase aliasing and artefacts in the vertical direction but provide increased sharpness, while
more positive values decrease artefacts but also decrease sharpness in the vertical direction.
VERSHARPNESS performs exactly the same function as HORSHARPNESS and has exactly
the same control range, but operates in the vertical direction.

8 Recommended Settings table.

This section only covers the settings that influence VQ. Some of these are only available in the
high-VQ mode.

Feature What is it? Recommended setting Recommended

(Visual / subjective) Setting (PQA /
Objective)

Horizontal A pre-filter that If the other encoder Off. Never have

Rescale when set, reduces settings have been rescaling enabled
the Horizontal video configured optimally and when performing
resolution into the you still see excessive objective tests.
encoder. MPEG artefacts on the
outgoing video, then
consider rescaling. It is
often best to set this
parameter last. When
performing dramatic
bitrate reductions
especially when encoding
to MPEG2, reducing
horizontal resolution may
provide subjective
benefits by reducing the
level of visible MPEG
artefacts.

Pre de- A pre-filter that For high quality sources Off. Never have
blocking filter removes high- can be set to off or 1. For this feature
frequency artefacts MPEG2 DTH sources, set enabled when
such as macroblock to at least 2. When performing
edges. Particularly performing dramatic objective tests.
useful for MPEG2 decreases in video
sources that have bitrate, can be set
been aggressively aggressively to provide
compressed for some high frequency
DTH. Overuse will softening before the

09/02/2015 25
Confidential

result in natural high encoder stage.

frequency details
being removed from
content.

MCTF (Motion The primary noise It is difficult to assess the MCTF should be
Compensated reduction tool. This degree of noise in set to OFF. This
Temporal pre-filter is adaptive content unless you can means completely
Filtering) and designed to view the source. For most off, which currently
remove random turnaround, its best to requires a
noise whilst leaving assume it is there and set command to
wanted content. Use moderate MCTF filtering activate MCTF
too aggressively and (setting 2) especially if bypass via Telnet.
it will remove fine you are stat muxing the
details from your source. For clean sources,
image. you can set it to 1 or off
(but OFF is a mimimum
filter setting and still
applies MCTF at its
lowest setting.

Skin tone detection
What is it? Applies less quantisation and therefore expends more of the available bitrate budget on ‘skin’. Ideally, this should make facial areas, which are natural ‘regions of interest’ for the viewer, more detailed. This setting belongs to a complex family of ‘adaptive QP’ functions which includes border processing, texture enhancers, edge detection etc.
Recommended setting: This setting is very subjective. Recommend it is used mildly, if at all, for most situations. Set to 1 typically.
Recommended setting (objective tests): Irrelevant because you must ensure that ALL of the adaptive QP features are deactivated. This can currently only be done using Telnet. There is a master switch for all adaptive QP tools. Also, be aware that if the unit is re-powered or re-booted, the original default database setting will turn the features on again!

GOP structure
What is it? Defines the extent to which temporal prediction is used. This must match the use case and the content.
Recommended setting: Set to auto unless the customer has special reasons for wanting to see GOP restrictions put in place.
Recommended setting (objective tests): Set to Auto.

GOP length
What is it? This sets the maximum distance between I frames. The GOP planner can and will often reset the GOP in response to the action of scene change detection and other tools. Some customers genuinely need to restrict maximum GOP lengths to limit channel change time (IPTV customers mainly). Some will ask for GOP length to be set the same as another encoder to make it a ‘fair test’. This is not correct and is an invalid argument. See main narrative for the reasons why.
Recommended setting: For MPEG2, 24 is a good place to start. For AVC, set to 48. Use up to 60 for non-demanding channels. Appear TV GOP lengths are typically set above average to achieve best performance.
Recommended setting (objective tests): Same settings as subjective.

AVC high profile
What is it? Always use high profile unless the customer prohibits this for STB compatibility reasons. It makes the 8x8 transform available, which you still need to enable separately.
Recommended setting: Choose HIGH profile and check 8x8 transform.
Recommended setting (objective tests): Choose HIGH profile and check 8x8 transform.

Weighted Prediction and fade detection
What is it? For AVC, these help motion searches during predictable transitions. See narrative for more info.
Recommended setting: Enable both.
Recommended setting (objective tests): Enable both.

MPEG2 scan mode
What is it? Determines how zig-zag scanning takes place.
Recommended setting: Alternate.
Recommended setting (objective tests): Alternate.

Reference B frames
What is it? Enhances long GOP accuracy by bringing the reference closer, although the reference is another B frame.
Recommended setting: Enable.
Recommended setting (objective tests): Enable.

Open / closed GOPs
What is it? Determines whether temporal prediction can run between GOPs.
Recommended setting: Open, unless the customer has a genuine reason for using closed GOPs.
Recommended setting (objective tests): Open.

CPB buffer
What is it? This is a critical buffer at the output of the encoder. See narrative for further info. Adjusting this buffer can cause interoperability issues with the decoder. It will change the latency, and for transcoders will put the lip sync of pass through services out. Setting the buffer too low will negatively affect VQ, and will cause visible I frame pulsing and impair the fidelity of intra pictures, and therefore references.
Recommended setting: Default.
Recommended setting (objective tests): Default.
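
The GOP length guidance in the table trades compression efficiency against channel change time. As a rough illustration, the longest gap between I frames is the maximum GOP length divided by the frame rate. The sketch below assumes 25 fps and 29.97 fps sources, and the helper name `max_iframe_wait_s` is hypothetical rather than part of any Appear TV interface. It ignores decoder buffering and any earlier I frames inserted by scene change detection.

```python
from fractions import Fraction

def max_iframe_wait_s(gop_length_frames, frame_rate):
    """Upper bound, in seconds, on the gap between I frames when a GOP
    runs to its configured maximum length (hypothetical helper)."""
    if gop_length_frames <= 0 or frame_rate <= 0:
        raise ValueError("GOP length and frame rate must be positive")
    return float(Fraction(gop_length_frames) / Fraction(frame_rate))

# GOP lengths quoted in the table; the frame rates are assumptions.
for gop in (24, 48, 60):
    for fps in (Fraction(25), Fraction(30000, 1001)):
        wait = max_iframe_wait_s(gop, fps)
        print(f"GOP {gop:2d} @ {float(fps):.2f} fps -> {wait:.2f} s")
```

At 25 fps a 48-frame GOP bounds the I frame gap at 1.92 s, which shows why customers concerned with channel change time may ask for shorter GOPs.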


Notes:

Field Frame: This setting is now configured automatically according to the following rules:
• Progressive content. Always Frame.
• AVC encoding. PAFF.
• MPEG2 encoding. MBAFF. (PAFF can also be used with good results but can affect very old MPEG2 STBs with low decoder RAM. If this happens, motion areas will exhibit motion judder.)
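
The automatic rules in the note above can be sketched as a small lookup. This is only an illustration of the stated rules: the function name `coding_mode`, its arguments and the returned strings are hypothetical and do not correspond to any Appear TV command or API.

```python
def coding_mode(codec, interlaced):
    """Pick the field/frame coding mode following the rules in the note
    (hypothetical helper, not an Appear TV interface)."""
    if not interlaced:
        # Progressive content is always coded as frames.
        return "FRAME"
    # Defaults stated in the note for interlaced content.
    modes = {"AVC": "PAFF", "MPEG2": "MBAFF"}
    try:
        return modes[codec.upper()]
    except KeyError:
        raise ValueError(f"unsupported codec: {codec!r}") from None

print(coding_mode("AVC", interlaced=True))
print(coding_mode("MPEG2", interlaced=True))
print(coding_mode("AVC", interlaced=False))
```

Keeping the progressive check first mirrors the note: the codec only matters once the content is known to be interlaced.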
