Understanding HD Part 1
Part One
Understanding HD with Avid 1
Chapter 1
Video formats
and sampling
In many cases it is very useful to have a key (or alpha) signal associated with the pictures. A key is essentially a full image but in luminance only. So then it is logical to add a fourth number 4, as in 4:2:2:4.

Technically 4:4:4 can denote full sampling of RGB or Y, Cr, Cb component signals – but it is rarely used for the latter. RGB may have an associated key channel, making 4:4:4:4.

Occasionally people go off-menu and do something else – like over-sampling which, with good processing, can improve picture quality. In this case you might see something like 8:8:8 mentioned. That would be making two samples per pixel for RGB.

This sampling ratio system is used for both SD and HD. Even though HD sampling frequencies are generally 5.5 times higher, 4:2:2 sampling is the standard for HD studios.

Why 4?

Logic would dictate that the first number, representing a 1:1 relationship with the pixels, should be 1 but, for many good (and some not so good) reasons, television standards are steeped in legacy. Historically, in the early 1970s, the first television signals to be digitised were coded NTSC and PAL. In both cases it was necessary to lock the sampling frequency to that of the colour subcarrier (SC) – work which led to what is now the ITU-R BT.601 standard for SD sampling. ‘601’ defines luminance sampling at 13.5MHz (giving 720 pixels per active line) and each of the colour difference signals at half that rate – 6.75MHz.

The final twist in this tale is that someone then noticed that 13.5MHz was nearly the same as the 14.3MHz of 4 x NTSC subcarrier. Had he looked a little further he might have seen a much nearer relationship to 3 x PAL SC, and a whole swathe of today’s terminology would be that much different! But so it was that the number that might have been 3, and should have been 1, became 4.

As HD sampling rates are 5.5 times faster than those for SD, the commonly used studio 4:2:2 sampling actually represents 74.25MHz for Y and 37.125MHz for Cr and Cb.

1080I

Short for 1080 lines, interlace scan. This is the very widely used HD line format which is defined as 1080 lines, 1920 pixels per line, interlace scan. The 1080I statement alone does not specify the frame rate which, as defined by SMPTE and ITU, can be 25 or 30Hz.

See also: Common Image Format, Interlace, ITU-R BT.709, Table 3
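The sampling ratios above translate directly into arithmetic. The sketch below (Python; the function name and the 4-pixel-wide, 2-line reading of the J:a:b notation are my own illustration, not from the text) counts samples per block and compares the common schemes against full 4:4:4 sampling:

```python
# Sketch: relative data cost of J:a:b chroma subsampling (illustrative names).
def samples_per_4x2_block(j, a, b):
    """Samples for a 4-pixel-wide, 2-line block under the J:a:b convention:
    j = luma samples per row of 4, a = chroma samples in the first row,
    b = chroma samples in the second row. Two chroma components (Cr, Cb)."""
    luma = 2 * j              # two rows of luminance
    chroma = 2 * (a + b)      # Cr and Cb each contribute a + b samples
    return luma + chroma

full = samples_per_4x2_block(4, 4, 4)            # 4:4:4 -> 24 samples
print(samples_per_4x2_block(4, 2, 2) / full)     # 4:2:2 -> 16/24
print(samples_per_4x2_block(4, 1, 1) / full)     # 4:1:1 -> 12/24
print(samples_per_4x2_block(4, 2, 0) / full)     # 4:2:0 -> 12/24
```

On this reading, 4:2:2 carries two thirds of the data of 4:4:4, while 4:1:1 and 4:2:0 both carry half – they differ in whether colour detail is halved horizontally or vertically.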
13.5MHz

The luminance sampling frequency for SD, as defined in ITU-R BT.601. Being 6 x 2.25MHz, it comfortably exceeds twice the highest frequency, 5.5MHz, of luminance detail information present in SD images. Digital sampling of most HD standards samples luminance at 74.25MHz, which is 5.5 times 13.5MHz.

See also: 2.25MHz, ITU-R BT.601

2.25MHz

This is the lowest common multiple of the 525/59.94 and 625/50 television line frequencies, being 15.734265kHz and 15.625kHz respectively. Although seldom mentioned, its importance is great as it is the basis for all digital component sampling frequencies, both at SD and HD.

See also: 13.5MHz

24P

Short for 24 frames, progressive scan. In most cases this refers to the HD picture format with 1080 lines and 1920 pixels per line (1080 x 1920/24P). The frame rate is also used for SD at 480 and 576 lines with 720 pixels per line. This is often used as an offline for an HD 24P edit, or to create a pan-and-scan version of an HD down-conversion. Displays working at 24P usually use the double-shuttering technique – like film projectors – to show each image twice and reduce flicker when viewing this low rate of images.

24PsF

24P Segmented Frame. This blurs some of the boundaries between film and video, as video is captured in a film-like way, formatted for digital recording, and can pass through existing HD video infrastructure. Like film, entire images are captured at one instant rather than by the usual line-by-line TV scan down the image, which means the bottom can be scanned 1/24 of a second after the top. The images are a pure electronic equivalent of a film shoot and telecine transfer – except the video recorder operates at film rate (24 fps), not at television rates. The footage has more of a filmic look but, with the low frame rate, movement portrayal can be poor.

25PsF and 30PsF rates are also included in the ITU-R BT.709-4 recommendation.

See also: ITU-R BT.709

601

See ITU-R BT.601

709

See ITU-R BT.709

720P

Short for 720 lines, progressive scan. Defined in SMPTE 296M and a part of both the ATSC and DVB television standards, the full format is 1280 pixels per line, 720 lines and 60 progressively scanned pictures per second. It is mainly used by the particular broadcasters who transmit 720P. Its 60 progressively scanned pictures per second offer the benefits of progressive scan at a high enough picture refresh rate to portray action well. It has advantages for sporting events, smoother slow-motion replays etc.

74.25MHz

The sampling frequency commonly used for luminance (Y) or RGB values of HD video. Being 33 x 2.25MHz, the frequency is a part of the hierarchical structure used for SD and HD. It is a part of SMPTE 274M and ITU-R BT.709.

See also: 2.25MHz
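The frequency relationships quoted in the 2.25MHz, 13.5MHz and 74.25MHz entries are easy to verify. A small check (Python; the variable names are mine):

```python
# Check the 2.25MHz-based hierarchy of sampling frequencies described above.
line_625 = 15_625.0      # Hz, 625/50 line frequency
line_525 = 15_734.265    # Hz, 525/59.94 line frequency
base = 2_250_000.0       # 2.25MHz

print(base / line_625)               # exactly 144 line periods
print(round(base / line_525, 4))     # ~143 - a near-exact multiple too

print(6 * base / 1e6)                # 13.5  (MHz) - SD luminance (ITU-R BT.601)
print(33 * base / 1e6)               # 74.25 (MHz) - HD luminance
print((33 * base) / (6 * base))      # 5.5 - the HD/SD ratio quoted above
```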
…further complicated by SD using 4:3 and 16:9 (widescreen) images which all use the same pixel and line counts. Care is needed to alter pixel aspect ratio when moving between systems using different pixel aspect ratios so that objects retain their correct shape.

With both 4:3 and 16:9 images and displays in use, some thought is needed to ensure a shoot will suit its target displays. All HD, and an increasing proportion of SD, shoots are 16:9 but many SD displays are 4:3. As most HD productions will also be viewed on SD, clearly keeping the main action in the middle ‘4:3’ safe area would be a good idea – unless the display is letterboxed.

See also: ARC

Chrominance (or Chroma)

The colour information in a video signal, as distinct from the luminance.

Common Image Format

An image format that is widely used and denoted ‘Common Image Format’ by the ITU. The idea is to promote the easy exchange of image information nationally and internationally.

See HD-CIF

Colour space

The space encompassed by a colour system. Examples are: RGB, YCrCb, HSL (hue, saturation and luminance) for video, CMYK for print and XYZ for film. Moving between media, platforms or applications can require a change of colour space. This involves complex image processing, so care is needed to get the right result. Also, repeated changes of colour space can lead to colours drifting off.

It is important to note that when converting from YCrCb to RGB, more bits are required in the RGB colour space to maintain the dynamic range. For example, if the YCrCb colour space video is 8 bits per component then the RGB colour space video will need to be 10 bits.

Component video

Most traditional digital television equipment handles video in the component form: as a combination of pure luminance Y, and the pure colour information carried in the two colour difference signals R-Y and B-Y (analogue) or Cr, Cb (digital). The components are derived from the RGB delivered by imaging devices, cameras, telecines, computers etc.

Part of the reasoning for using components is that it allows colour pictures to be compressed. The human eye can see much more detail in luminance than in the colour information (chrominance), so converting RGB to Y, (R-Y), (B-Y) allows the colour resolution to be reduced with little visible effect. For professional digital video applications, the colour difference signals are usually sampled at half the frequency of the luminance – as in 4:2:2 sampling. There are also other types of component digital sampling, such as 4:1:1 with less colour detail (used in DV), and 4:2:0 used in MPEG-2.

Co-sited sampling

Where samples of luminance and chrominance are all taken at the same instant. This is designed so that the relative timing (phase) of all signal components is symmetrical and not skewed by the sampling system. Sampling is usually co-sited but there is a case of 4:2:0 sampling being interstitial – with chrominance samples made between the luminance samples.

See also: 4:2:2 etc.
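The component conversion described here can be sketched in a few lines. This uses the ITU-R BT.601 luma weights purely as an illustration – BT.709 uses different coefficients, and real systems add scaling and offsets:

```python
# Sketch: RGB to Y, (R-Y), (B-Y) components, with BT.601 luma weights as an example.
def rgb_to_components(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance: weighted sum of RGB
    return y, r - y, b - y                 # (R-Y) and (B-Y) colour differences

# A neutral grey carries no colour difference content - the chroma terms vanish:
y, cr, cb = rgb_to_components(0.5, 0.5, 0.5)
print(round(y, 6), round(cr, 6), round(cb, 6))  # 0.5 0.0 0.0
```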
DTV

Digital Television. This is a general term that covers both SD and HD digital formats.

Gamut (colour)

The range of possible colours available in an imaging system. The red, blue and green phosphors on television screens, and the RGB colour pick-up CCDs or CMOS chips in cameras, define the limits of the colours that can be displayed – the colour gamut. Between the camera and viewer’s screen there are many processes, many using component 4:2:2 video. However, not all component value combinations relate to valid RGB colours (for example, combinations where Y is zero). Equipment that generates images directly in component colour space, such as some graphics machines, can produce colours within the component range that are invalid in RGB, and which can also exceed the limits allowed for PAL and NTSC.

There is potential for overloading equipment – especially transmitters, which may cut out to avoid damage! There is equipment that clearly shows areas of out-of-gamut pictures so that they can be adjusted before they cause problems.

HDTV

High Definition Television. This has been defined in the USA by the ATSC and others as having a resolution of approximately twice that of conventional television (meaning analogue NTSC – implying 486 visible lines) both horizontally and vertically. As HD’s 1080 x 1920 image size is close to the 2K used for film, there is a crossover between film and television. This is even more the case if using a 16:9 window of 2K, as here there is very little difference in size. It is generally agreed that any format containing at least twice the standard definition format on both H and V axes is high definition.

After some initial debate about the formats available to prospective HD producers and television stations, the acceptance of 1080-HD video at various frame rates as a common image format by the ITU has made matters far more straightforward. While television stations may have some latitude in their choice of format, translating, if required, from the common image formats should be routine and give high quality results.

[Figure: 2K, HD and SD image sizes – comparing the 2048 x 1536 2K film raster with the 1920 x 1080 and 1280 x 720 HD rasters and the 720-pixel-wide, 576- and 480-line SD rasters]
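The closeness of the rasters in the figure can be put into numbers. A quick comparison (Python; the 2048 x 1152 dimension for the ‘16:9 window of 2K’ is my inference from the 16:9 ratio, not a figure from the text):

```python
# Compare the pixel counts of the rasters discussed above.
rasters = {
    "2K film (4:3)":  (2048, 1536),
    "2K 16:9 window": (2048, 1152),   # 2048 * 9/16 = 1152 (inferred)
    "1080-HD":        (1920, 1080),
}
for name, (w, h) in rasters.items():
    print(f"{name}: {w} x {h} = {w * h / 1e6:.2f} Mpixel")
# The 16:9 window of 2K is only ~14% larger than 1080-HD -
# the "very little difference in size" noted above.
```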
System nomenclature

A term used to describe television standards. The standards are mostly written in a self-explanatory form but there is room for confusion concerning vertical scanning rates. For example, 1080/60I implies there are 60 interlaced fields per second that make up 30 frames. Then 1080/30P describes 30 frames per second, progressively scanned. The general rule appears to be that the final figure always indicates the number of vertical refreshes per second.

However, Table 3 (below) uses a different method. It defines frame rates (numbers of complete frames) and then defines whether they are interlaced or progressive. So here the ‘frame rate code 5’ is 30Hz, which produces 30 vertical refreshes when progressive, and 60 when interlaced. Be careful!

Table 3 lists no fewer than 18 DTV formats for SD and HD. Initially, this led to some confusion about which should be adopted for whatever circumstances. Now most HD production and operation is centred on the 1080-line formats with 24P, 25P or 60I vertical scanning, and the 720-line formats at 50P and 60P.

See also: Interlace, Progressive
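The ‘final figure is vertical refreshes per second’ rule can be captured in a short parser (a sketch; the function name and regular expression are mine, and cover only the lines/rate/scan form shown above):

```python
import re

# Apply the nomenclature rule above: the trailing number counts vertical
# refreshes per second; for interlace, two fields make one frame.
def frames_per_second(name):
    lines, refreshes, scan = re.fullmatch(r"(\d+)/(\d+)([IP])", name).groups()
    return int(refreshes) // 2 if scan == "I" else int(refreshes)

print(frames_per_second("1080/60I"))  # 30 frames from 60 fields
print(frames_per_second("1080/30P"))  # 30 frames, progressively scanned
```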
Chapter 2
Video Compression:
Concepts
Video compression reduces the amount of data or bandwidth used to describe moving pictures. Digital video needs vast amounts of data to describe it and there have long been various methods used to reduce this for SD. And as HD has up to a six times bigger requirement of 1.2Gb/s, requiring 560GB per hour of storage, the need for compression is even more pressing.

Compression – General

Exactly which type and how much compression is used depends on the application. Consumer delivery (DVD, transmission, etc) typically uses very high compression (low data rates) as the bandwidth of the channels is quite small. For production and online editing, much lighter compression (higher data rates) is used, as good picture quality needs to be maintained through all the stages leading to the final edited master.

Video compression methods are all based on the principle of removing information that we are least likely to miss – so-called ‘redundant’ picture detail. This applies to still images as well as video and cinema footage, and takes the form of several techniques that may be used together. Digital technology has allowed the use of very complex methods which have been built into low cost, mass produced chips.

First, our perception of colour (chroma) is not as sharp as it is for black and white (luminance), so the colour resolution is reduced to half that of luminance (as in 4:2:2). This is used in colour television (NTSC, PAL and digital). Similarly, fine detail with little contrast is less noticeable than bigger objects with higher contrast. To access these, a process called DCT resolves 8 x 8 pixel blocks of digital images into frequencies and amplitudes – the basis of the digital video compression schemes in use today, including AVR, DV, HDV, JPEG (but not JPEG2000) and the I-frames of MPEG-1, 2 and 4, and Windows Media 9. A further reduction is made using Huffman coding, a purely mathematical process that reduces repeated data.

MPEG-2 and the more recent MPEG-4 add another layer of compression by analysing what changes from frame to frame, following the movement of 16 x 16-pixel macroblocks of the pictures. Then, for much of the time, they can send just the movement information, called motion vectors, that makes up the predictive (B and P) frames, which contain much less data than I-frames. Whole pictures (I-frames, more data) are sent only a few times a second. MPEG-2 compression is used in all forms of digital transmission and DVDs as well as for HDV. The more refined and efficient MPEG-4 is now being introduced for some HD services, and is set to become widely used for new television services.

Each of these techniques does a useful job but needs to be applied with some care in the production chain. Multiple compression (compress/de-compress) cycles may occur while moving along the chain, causing a build-up of compression errors. Also, as many compression schemes are designed around what looks good to us, they may not be so good in production, post production and editing. This particularly applies in processes, such as keying and colour correction, that depend on greater image fidelity than we can see, so disappointing results may ensue from otherwise good-looking compressed originals.

See also: AVR, Component video, DV, DNxHD, Huffman coding, JPEG, JPEG2000, MPEG-2, MPEG-4

Blocks

See DCT
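The introduction’s headline figures can be reproduced with simple arithmetic. A sketch (Python; it assumes 10-bit 4:2:2 sampling of the active 1920 x 1080 picture at 30 frames per second, which is one way to arrive at numbers of this order):

```python
# Roughly reproduce the intro's "1.2Gb/s" and "560GB per hour" for HD video.
width, height, fps, bits = 1920, 1080, 30, 10
luma_bps = width * height * fps * bits    # Y at full resolution
chroma_bps = luma_bps                     # Cr + Cb, each at half the luma rate
total_bps = luma_bps + chroma_bps

print(total_bps / 1e9)                    # ~1.24 Gb/s
print(total_bps * 3600 / 8 / 1e9)         # ~560 GB for one hour
```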
Codec

Codec is short for coder/decoder – usually referring to a compression engine. Confusingly, the term is often misused to describe just a coder or decoder.

Compression ratio

This is the ratio of the uncompressed (video or audio) data to the compressed data. It does not define the resulting picture or sound quality, as the effectiveness of the compression system needs to be taken into account. Even so, in studio applications compression is usually between 2:1 and 7:1 for SD (and D1 and D5 uncompressed VTRs are also available), whereas compression for HD is currently approximately between 6:1 and 14:1 – as defined by VTR formats – and is I-frame only. For transmission, the actual values depend on the broadcaster’s use of the available bandwidth, but around 40:1 is common for SD and somewhat higher, 50:1 or 60:1, for HD (also depending on format). These use both I-frames and predictive frames to give the greater compression.

HDV records data to tape at 19-25 Mb/s – a rate comparable with HD transmission and a compression ratio of around 40:1, depending on the standard used.

Transmission and video recorders in general work at a constant bit rate so, as the original pictures may include varying amounts of detail, the quality of the compressed images varies. DVDs usually work on a constant quality/variable bit rate principle, so the compression ratio slides up and down according to the demands of the material, to give consistent results. This is part of the reason why DVDs can look so good while only averaging quite low bit rates – around 4 Mb/s.

Compression-friendly

Material that looks good after compression is sometimes referred to as ‘compression friendly’. This can become important in transmission where very limited data bandwidth is available and high compression ratios have to be used. Footage with large areas of flat colour, little detail and little movement compresses very well: for example, cartoons, head-and-shoulder close-ups and some dramas. As MPEG-2 compression looks at spatial detail as well as movement in pictures, an excess of both may show at the output as poor picture quality. This often applies to fast-moving sports – for instance football.

Poor technical quality can be compression unfriendly. Random noise will be interpreted as movement by an MPEG-2 or MPEG-4 encoder, so it wastes valuable data space conveying unwanted movement information. Movement portrayal can also be upset by poor quality frame-rate conversions that produce judder on movement, again increasing unwanted movement data to be transmitted at the expense of spatial detail. Such circumstances also increase the chance of the movement estimation going wrong – producing ‘blocking’ in the pictures.

Errors can be avoided by the use of good quality equipment throughout the production chain. Also, the choice of video format can help. For example, there is less movement in 25 progressively scanned images than in 50 interlaced fields, so the former compress more easily. The efficiency increase is typically 15-20 percent.

DCT

Discrete Cosine Transform is used as a first stage of many digital video compression schemes, including JPEG, MPEG-2 and MPEG-4. It converts 8 x 8 pixel blocks of pictures to express them as frequencies and amplitudes. This may not reduce the data but it does arrange the image information so that it can be reduced. As the high frequency, low amplitude detail is least noticeable, its coefficients are progressively reduced, some often to zero, to fit the required file size per picture (constant bit rate) or to achieve a specified quality level. It is this reduction process, known as quantization, which actually reduces the data.

For VTR applications the file size is fixed and the compression scheme’s efficiency is shown in its ability to use all the file space without overflowing it. This is one reason why a quoted compression ratio is not a complete measure of picture quality.

DCT takes place within a single picture and so is intra-frame (I-frame) compression. It is a part of the most widely used compression in television today.

See also: AVR, Compression ratio, DV, JPEG, MPEG-2, MPEG-4

GOP

Group Of Pictures – as in MPEG-2 and MPEG-4 video compression. This is the number of frames from one I-frame to the next, the frames between being predictive (types B and P). ‘Long GOP’ usually refers to MPEG-2 and MPEG-4 coding. For transmission the GOP is often as long as half a second – 13 or 15 frames (25 or 30fps) – which helps to achieve the required very high compression ratios.

I B B P B B P B B P B B I

Studio applications of MPEG-2 have very short GOPs: Betacam SX has a GOP of 2, IMX has 1 (i.e. I-frame only – no predictive frames), which means cutting at any frame is straightforward. Other formats such as DV, DVCPRO HD, HDCAM and D5-HD do not use MPEG but are also I-frame only.

See also: MPEG-2, MPEG-4

I-frame only (aka I-frame)

Short for intra-frame only.

Inter-frame compression

Video compression that uses information from several successive video frames to make up the data for its compressed ‘predictive’ frames. The most common example is MPEG-2 with a GOP greater than 1. Such an MPEG-2 stream contains a mix of both I-frames and predictive B and P (Bi-directional predictive and Predictive) frames. Predictive frames cannot be decoded in isolation from the rest of the GOP, so the whole GOP must be decoded. This is an efficient coding system that is good for transmission but it does not offer the flexibility needed for accurate editing, as the stream can only be cut at GOP boundaries. It also requires estimation of the movement from picture to picture, which is complex and not always accurate – leading to ‘blockiness’.

See also: GOP, MPEG-2, MPEG-4
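The 12-frame GOP pictured above can be examined directly (a toy sketch; the closing I in the diagram is the first frame of the next GOP, so the string below holds 12 frames):

```python
# The MPEG-2 style GOP illustrated above: one I-frame, then B and P frames.
gop = "IBBPBBPBBPBB"
print(len(gop))                                        # 12 frames - about half
                                                       # a second at 25 fps
print(gop.count("I"), gop.count("P"), gop.count("B"))  # 1 3 8
# Clean edits can only fall where a new GOP starts, i.e. on the I-frames:
cut_points = [i for i, f in enumerate(gop) if f == "I"]
print(cut_points)                                      # [0]
```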
Quantization
Quantizing is the process used in DCT-based compression
schemes, including AVC, JPEG, MPEG-2 and MPEG-4, to
reduce the video data in an I frame. DCT allows quantizing
to selectively reduce the DCT coefficients that represent
the highest frequencies and lowest amplitudes that make
up the least noticeable elements of the image. As many
are reduced to zero, significant data reduction is realised.
Using a fixed quantizing level will produce a constant
quality of output with a data rate that varies according to
the amount of detail in the images. Alternatively,
quantizing can be varied to produce constant data rate,
but variable quality, images. This is useful where the data
must be fitted into a defined size of store or data channel
– such as a VTR or a transmission channel. The success in
nearly filling, but never overflowing, the storage is one
measure of the efficiency of DCT compression schemes.
NB: Quantization has a second meaning.
See Video Formats section
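The quantizing step described above can be shown with a toy example (Python; the 4 x 4 block of ‘coefficients’ and the divisor rule are invented for illustration – they are not a real codec table):

```python
# Toy quantization of a 4x4 block of "DCT coefficients": larger divisors for
# higher-frequency (bottom-right) positions drive small values to zero.
coeffs = [
    [240, 60, 12,  4],
    [ 55, 20,  6,  2],
    [ 10,  5,  3,  1],
    [  4,  2,  1,  1],
]

def quantize(block, q):
    # Divisor grows with distance from the top-left (low-frequency) corner.
    return [[round(v / (q * (1 + i + j))) for j, v in enumerate(row)]
            for i, row in enumerate(block)]

out = quantize(coeffs, 8)
zeros = sum(row.count(0) for row in out)
print(out[0][0], zeros)  # the large low-frequency term survives;
                         # most high-frequency terms become zero
```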
Chapter 3
Video Compression:
Formats
This is the practical side of compression, showing the systems and formats that are used. Some are proprietary, in which case the company involved is mentioned.

AVC

See MPEG-4

AVR

AVR is a range of Motion-JPEG video compression schemes devised by Avid Technology for use in its ABVB hardware-based non-linear systems. An AVR is referred to as a constant quality M-JPEG resolution since the same quantization table (of coefficients) is applied to each frame of a video clip during digitization. For any given AVR, the actual compressed data rate will increase as the complexity of the imagery increases. For example, a head shot typically results in a low data rate while a crowd shot from a sporting event will yield a high data rate. To avoid system bandwidth problems, AVRs utilize a mode of rate control called rollback which prevents the compressed data rate from increasing beyond a preset limit for a sustained period. So, when the data rate exceeds the rollback limit on a given frame, high spatial frequency information is simply discarded from subsequent frames until the rate returns to a tolerable level.

See also: DCT, JPEG

DVC

DVC is the compression used in DV equipment and is standardised in IEC 61834. It is a DCT-based, intra-frame scheme achieving 5:1 compression, so that 8-bit video sampling of 720 x 480 at 4:1:1 (NTSC) or 720 x 576 at 4:2:0 (PAL) produces a 25 Mb/s video data rate. The same is used for DV, DVCAM, Digital8 and DVCPRO (where PAL is 4:1:1). It achieves good compression efficiency by applying several quantizers at the same time, selecting the nearest result below 25Mb/s for recording to tape.

DNxHD

Avid DNxHD encoding is designed to offer quality at a significantly reduced data rate and file size, and it is supported by the family of Avid editing systems. Engineered for editing, it allows any HD material to be handled on SD-original Avid systems. Any HD format can be encoded, edited, have effects added, be colour corrected and the project finished.

There is a choice of compression image formats to suit requirements. Some of the formats are:

Format      DNxHD 220x   DNxHD 185x   DNxHD 185   DNxHD 145   DNxHD 120
Bit depth   10 bit       10 bit       8 bit       8 bit       8 bit
Frame rate  29.97 fps    25 fps       25 fps      25 fps      25 fps
Data rate   220 Mb/s     184 Mb/s     184 Mb/s    145 Mb/s    120 Mb/s

Avid DNxHD maintains the full raster, is sampled at 4:2:2 and uses highly optimised coding and decoding techniques, so image quality is maintained over multiple generations and processes. When you’re ready, master to any format you need.

DNxHD efficiency enables collaborative HD workflow using networks and storage designed to handle SD media. So, for example, Avid Unity shared media networks are HD-ready today! Cost-effective, real-time HD workflows can be
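The table’s data rates imply compression ratios in the region of 5-6:1 against uncompressed HD. A rough check (Python; it assumes the same 10-bit 4:2:2 1920 x 1080 baseline used in the Concepts chapter, so the figures are indicative only):

```python
# Indicative compression ratios for two DNxHD formats against uncompressed
# 10-bit 4:2:2 1920x1080 video at the matching frame rate.
def uncompressed_mbps(fps, bits=10):
    return 1920 * 1080 * fps * bits * 2 / 1e6   # Y plus Cr+Cb (each half rate)

for name, mbps, fps in [("DNxHD 220x", 220, 29.97), ("DNxHD 185x", 184, 25)]:
    print(f"{name}: about {uncompressed_mbps(fps) / mbps:.1f}:1")
```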