Introduction To Video Compression Techniques
TI Training Material
Agenda
- Video Compression Overview
- Motivation for creating standards
- What do the standards specify
- Brief review of video compression
- Current video compression standards: H.261, H.263, MPEG-1/2/4
- Advanced video compression standards: H.264, VC1, AVS
Data Rate
Uncompressed 30.4 Mbps 60.8 Mbps 248.8 Mbps 1.33 Gbps
- Video on digital storage media (CD-ROM): 1.5 Mb/s
- Digital television: > 2 Mb/s
- Video telephony over PSTN: < 33.6 kb/s
- Object-based coding, synthetic content, interactivity: variable
- From low bitrate coding to HD encoding, HD-DVD, surveillance, video conferencing: variable
H.264
Achieving Compression
Reduce redundancy and irrelevancy.
Sources of redundancy:
- Temporal: adjacent frames are highly correlated.
- Spatial: nearby pixels are often correlated with each other.
- Color space: RGB components are correlated among themselves.
- Scalar quantization of DCT coefficients
- Run-length and Huffman coding of the non-zero quantized DCT coefficients
Video Structure
MPEG Structure
Zig-zag
Quantize
Block Encoding
Original image block:
139 144 150 159
144 151 155 161
149 153 160 162
153 156 163 160
DCT (DC component and AC components):
-1 -17 -9 -2
-12 -6 -2 0
-5 -3 2 1
Quantize:
79 0 -1 0
-2 -1 0 0
-1 -1 0 0
0 0 0 0
Zigzag:
79 0 -2 -1 -1 -1 0 0 -1 0 0 0 0 0 0 0
Run-length code (run, level):
(0, 79) (1, -2) (0, -1) (0, -1) (0, -1) (2, -1) (0, 0)
Huffman code:
10011011100011...
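The zigzag and run-length steps of this example can be reproduced with a short Python sketch. The function names are hypothetical, and the use of a (0, 0) pair as the end-of-block marker is an assumption read off the numbers on the slide. The Huffman stage is left out.

```python
# Zigzag scan order for a 4x4 block, as (row, column) pairs.
ZIGZAG_4X4 = [
    (0, 0), (0, 1), (1, 0), (2, 0),
    (1, 1), (0, 2), (0, 3), (1, 2),
    (2, 1), (3, 0), (3, 1), (2, 2),
    (1, 3), (2, 3), (3, 2), (3, 3),
]


def zigzag_scan(block):
    """Read a 4x4 block in zigzag order (low frequencies first)."""
    return [block[r][c] for r, c in ZIGZAG_4X4]


def run_length_encode(coeffs):
    """Return (run, level) pairs: run = zeros preceding each non-zero level.

    A final (0, 0) pair marks the end of the block (assumed convention).
    """
    pairs = []
    run = 0
    for level in coeffs:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append((0, 0))
    return pairs


# Quantized coefficients from the example above.
quantized = [
    [79, 0, -1, 0],
    [-2, -1, 0, 0],
    [-1, -1, 0, 0],
    [0, 0, 0, 0],
]
scanned = zigzag_scan(quantized)
pairs = run_length_encode(scanned)
print(scanned)
print(pairs)
```

The trailing zeros collapse into the single end-of-block pair, which is where most of the saving comes from.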
Result of Coding/decoding
Original block: 139 144 150 159 144 151 155 161 149 153 160 162 153 156 163 160
Reconstructed block: 144 156 155 160 146 150 156 161 149 152 157 161 152 154 158 162
Errors: -5 -2 0 1 -4 1 1 2 -5 -1 3 5 -1 0 1 -2
Examples
Video Compression
Main addition over image compression: exploit the temporal redundancy.
Predict the current frame based on previously coded frames.
Types of coded frames:
- I-frame: intra-coded frame, coded independently of all other frames
- P-frame: predictively coded frame, coded based on a previously coded frame
- B-frame: bi-directionally predicted frame, coded based on both previous and future coded frames
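Because a B-frame depends on a future frame, that future reference has to be coded before the B-frame itself. The sketch below, with a hypothetical function name and a made-up frame pattern, reorders a display-order sequence into one feasible coding order. It assumes each B-frame references the nearest I- or P-frame on either side.

```python
def coding_order(display_types):
    """Map frames from display order to a feasible coding order.

    B-frames are held back until the next I- or P-frame (their future
    reference) has been emitted.
    """
    order = []
    pending_b = []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)
        else:
            order.append(index)      # code the reference frame first
            order.extend(pending_b)  # then the B-frames that needed it
            pending_b = []
    # Trailing B-frames have no future reference; a real encoder would
    # handle them differently. This sketch simply appends them.
    order.extend(pending_b)
    return order


display = ["I", "B", "B", "P", "B", "B", "P"]
order = coding_order(display)
print(order)  # display indices in the order they would be coded
```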
Conditional Replenishment
Residual Coding
H.261 (1990)
Goal: real-time, two-way video communication
Key features:
- Low delay (150 ms)
- Low bit rates (p x 64 kb/s)
Technical details:
- Uses I- and P-frames (no B-frames)
- Full-pixel motion estimation
- Search range +/- 15 pixels
- Low-pass filter in the feedback loop
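Full-pixel motion estimation over a +/- 15 pixel range can be sketched as an exhaustive block-matching search. This is only an illustration: the SAD cost, the function names and the synthetic frames are assumptions, and the standard does not mandate any particular search strategy.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in `cur` and a displaced block in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size=16, search_range=15):
    """Exhaustive full-pixel search; returns ((dx, dy), cost) with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best_mv = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= bx + dx <= width - size) and (0 <= by + dy <= height - size)
            if not inside:
                continue  # candidate block would leave the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Tiny synthetic demo (small block and range keep it fast; values are not 8-bit pixels).
SIZE = 24
ref = [[3 * x * x + 5 * y * y + x * y for x in range(SIZE)] for y in range(SIZE)]
# The current frame shows the reference content displaced by 2 columns and 1 row.
cur = [[ref[(y + 1) % SIZE][(x + 2) % SIZE] for x in range(SIZE)] for y in range(SIZE)]
mv, cost = full_search(cur, ref, bx=8, by=8, size=4, search_range=3)
print(mv, cost)
```

With the default arguments the search visits 31 x 31 candidate positions per 16x16 block, which is why practical encoders usually replace the exhaustive loop with a faster search pattern.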
H.263 (1995)
Goal: communication over conventional analog telephone lines (< 33.6 kb/s)
Enhancements to H.261:
- Reduced overhead information
- Improved error resilience features
- Algorithmic enhancements: half-pixel motion estimation with larger motion search range
- Four advanced coding modes:
  - Unrestricted motion vector mode
  - Advanced prediction mode (median MV predictor using 3 neighbors)
  - PB-frame mode
  - OBMC (overlapped block motion compensation)
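The median MV predictor using 3 neighbors can be illustrated as a component-wise median. This is a minimal sketch: the function name and the sample vectors are hypothetical, and the choice of which three neighboring blocks to use is not shown.

```python
def median_mv_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of three neighbouring motion vectors (x, y)."""
    pred_x = sorted([mv_a[0], mv_b[0], mv_c[0]])[1]
    pred_y = sorted([mv_a[1], mv_b[1], mv_c[1]])[1]
    return (pred_x, pred_y)


# Hypothetical neighbour vectors and actual vector of the current block.
pred = median_mv_predictor((2, -1), (4, 0), (3, 5))
actual = (4, 1)
# Only the difference between the actual vector and the prediction is coded.
mvd = (actual[0] - pred[0], actual[1] - pred[1])
print(pred, mvd)
```

The median is taken per component, so the predictor need not equal any one of the three neighbor vectors.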
MPEG-2 (1993)
- Superset of MPEG-1 to support higher bit rates, higher resolutions, and interlaced pictures
- Original goal: support interlaced video from conventional television; eventually extended to support HDTV
- Provides field-based coding and scalability tools
MPEG-4 (1993)
Primary goals: new functionalities, not better compression
- Object-based or content-based representation
- Separate coding of individual visual objects
- Content-based access and manipulation
- Integration of natural and synthetic objects
- Interactivity
- Communication over error-prone environments
- Includes frame-based coding techniques from earlier standards
MV Prediction- MPEG-4
Field prediction for frame pictures: the MB to be predicted is split into top-field pels and bottom-field pels. Each 16x8 field block is predicted separately, with its own motion vector (P-frame) or two motion vectors (B-frame).
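The split into field blocks amounts to separating the even and odd rows of the macroblock. A minimal sketch, assuming the macroblock is a list of 16 rows of 16 samples and that the top field holds the even rows (the function name is hypothetical):

```python
def split_fields(macroblock):
    """Split a 16x16 frame macroblock into two 16x8 field blocks.

    The top field takes rows 0, 2, 4, ...; the bottom field takes rows 1, 3, 5, ...
    """
    top_field = macroblock[0::2]
    bottom_field = macroblock[1::2]
    return top_field, bottom_field


# Synthetic macroblock: every sample holds its own row index.
mb = [[row] * 16 for row in range(16)]
top, bottom = split_fields(mb)
print(len(top), len(top[0]))      # 8 rows of 16 samples
print([r[0] for r in top])        # even frame rows
print([r[0] for r in bottom])     # odd frame rows
```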
[Figure: H.264 encoder block diagram showing control data, quantized transform coefficients, scaling and inverse transform, entropy coding, decoder, intra/inter selection, and motion estimation]
H.264
New elements introduced:
- Every macroblock is split in one of 7 ways
- Up to 16 mini-blocks (and as many MVs)
H.264
- Improved motion estimation
- De-blocking filter inside the motion estimation/compensation loop
- Integer 4x4 DCT approximation, which eliminates:
  - the problem of mismatch between different implementations
  - the problem of encoder/decoder drift
- Arithmetic coding for MVs and coefficients
- Compute SATD (Sum of Absolute Transformed Differences) instead of SAD: the cost of the transformed differences (i.e. residual coefficients) for a 4x4 block, using a 4x4 Hadamard transform
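The difference between SAD and SATD for a 4x4 block can be sketched as follows. The helper names are hypothetical, and the result is left unnormalized, which is an assumption: implementations often scale the sum (for example by 1/2).

```python
# 4x4 Hadamard matrix (entries are +1 / -1 only, so the transform is integer-exact).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose4(m):
    return [list(row) for row in zip(*m)]


def difference(cur, pred):
    return [[cur[i][j] - pred[i][j] for j in range(4)] for i in range(4)]


def sad4x4(cur, pred):
    """Sum of absolute differences in the pixel domain."""
    return sum(abs(v) for row in difference(cur, pred) for v in row)


def satd4x4(cur, pred):
    """Sum of absolute values of the Hadamard-transformed difference block."""
    transformed = matmul4(matmul4(H4, difference(cur, pred)), transpose4(H4))
    return sum(abs(v) for row in transformed for v in row)


pred = [[9] * 4 for _ in range(4)]
flat = [[10] * 4 for _ in range(4)]        # differs from pred by 1 everywhere
impulse = [row[:] for row in pred]
impulse[0][0] += 16                        # differs from pred in one sample only
print(sad4x4(flat, pred), satd4x4(flat, pred))
print(sad4x4(impulse, pred), satd4x4(impulse, pred))
```

Both test blocks have the same SAD, but the single-sample difference spreads over every transform coefficient and so gets a much higher SATD. This is the sense in which SATD tracks the cost of coding the residual coefficients more closely than SAD.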
H.264/AVC
- Half-sample positions are obtained by applying a 6-tap filter (1, -5, 20, 20, -5, 1).
- Quarter-sample positions are obtained by averaging samples at integer and half-sample positions.
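A one-dimensional sketch of the two interpolation steps is shown below. The slide gives only the filter taps. The rounding offset, the division by 32 (the taps sum to 32) and the clipping to 8-bit range are assumptions based on the usual fixed-point formulation, and the function names are hypothetical.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # 6-tap filter from the text; the taps sum to 32


def clip_pixel(value):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))


def half_sample(samples, i):
    """Half-sample value between samples[i] and samples[i + 1].

    Needs two integer samples to the left of i and three to the right.
    """
    window = samples[i - 2 : i + 4]
    acc = sum(t * s for t, s in zip(TAPS, window))
    return clip_pixel((acc + 16) >> 5)  # round and divide by 32


def quarter_sample(a, b):
    """Quarter-sample value: rounded average of two neighbouring positions."""
    return (a + b + 1) >> 1


row = [10, 10, 10, 10, 50, 50, 50, 50]
h = half_sample(row, 3)          # between row[3] = 10 and row[4] = 50
q = quarter_sample(row[3], h)    # between row[3] and the half-sample position
print(h, q)
```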
H.264/AVC Profiles
H.264/AVC Features
Support for multiple reference pictures, which gives significant compression gains when motion is periodic in nature.
H.264/AVC Features
For interlaced video, the encoder has two options:
- Combine the two fields and code them as one single coded frame (frame mode).
- Keep the two fields separate and code them as separate coded fields (field mode).
H.264/AVC Features
H.264/AVC Features
- Arbitrary slice ordering: since each slice can be decoded independently, slices can be sent out of order.
- Redundant pictures: the encoder has the flexibility to send redundant pictures, which can be used when data is lost.
Comparison
Feature                 MPEG4                             WMV9                              H.264
Prediction block size   16*16, 8*8                        16*16, 16*8, 8*8, 4*4             4*4, 4*8, 8*4, 8*8, 8*16, 16*8, 16*16
Intra prediction        AC prediction (transform domain)  AC prediction (transform domain)  Intra prediction (spatial domain)
Entropy coding          VLC                               VLC                               CAVLC, CABAC
Reference pictures      One picture                       Two (interlace)                   Multiple pictures
(unlabeled)             No                                No                                Yes
(unlabeled)             No (optional)                     Yes                               Yes
Transform               8*8 DCT                           4*4, 4*8, 8*4, 8*8                4*4, 8*8 (High Profile) integer DCT
RD Comparison
[Figure: transform engine applied to the top neighbor block and to the current block]
H.264/AVC uses a new approach to the prediction of intra blocks: the prediction is done in the spatial domain rather than in the frequency domain, as in other codecs.
H.264/AVC uses the reconstructed but unfiltered data of the neighboring macroblocks to predict the samples of the current macroblock.
Predicting from samples in the pixel domain gives better compression for intra blocks in an inter frame.
It also allows better compression, and hence more flexible bit-rate control, by making it possible to remove redundancy along multiple directions.
H.264/AVC also has a 16x16 mode, which aims to provide better compression for flat regions of a picture at a lower computational cost.
It supports 4 direction modes and is available for 16x16 luminance blocks and 8x8 chrominance blocks.
Intra16x16PredMode
Name of Intra16x16PredMode
DC (prediction mode)
[Figure: numbering of the prediction directions for the 4x4 intra prediction modes]
Intra-Prediction Process
1. Determining the prediction mode (only for a 4x4 block size mode).
2. Determination of samples to predict the block data.
3. Predict the block data.
Determining the prediction mode (Only for a 4x4 block size mode)
- A flag in the bit-stream indicates whether the prediction mode is present in the bit-stream or has to be calculated implicitly.
- In the implicit case, the prediction mode is the minimum of the prediction modes of neighbors A and B.
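The mode derivation can be sketched as below. The text only states the flag and the minimum rule. The fallback to the DC mode when a neighbor is unavailable, and the way an explicitly signalled value skips over the predicted mode, are assumptions about how the remaining eight modes are usually signalled. All names are hypothetical.

```python
DC_MODE = 2  # assumed mode number of DC prediction, used when a neighbour is missing


def predicted_mode(mode_a, mode_b):
    """Implicit prediction: the minimum of the modes of neighbours A and B."""
    if mode_a is None or mode_b is None:
        return DC_MODE  # assumption: fall back to DC if a neighbour is unavailable
    return min(mode_a, mode_b)


def decode_mode(use_predicted_flag, remaining_mode, mode_a, mode_b):
    """Return the 4x4 intra prediction mode (0..8) for the current block."""
    pred = predicted_mode(mode_a, mode_b)
    if use_predicted_flag:
        return pred  # nothing else is read from the bit-stream
    # The explicit value (0..7) indexes the eight modes other than `pred`.
    return remaining_mode if remaining_mode < pred else remaining_mode + 1


print(decode_mode(True, None, 3, 5))   # implicit: min(3, 5)
print(decode_mode(False, 1, 3, 5))     # explicit value below the prediction
print(decode_mode(False, 3, 3, 5))     # explicit value skips over the prediction
```

Since the predicted mode never needs to be sent explicitly, eight values are enough to cover the other modes.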
To predict a 4x4 block (a-p), a set of 13 samples (A-M) from the neighboring pixels is chosen. For an 8x8 chrominance block, a set of 17 neighboring pixels is chosen as sample values. Similarly, for predicting a 16x16 luminance block, a set of 33 neighboring pixels is selected as the samples.
Intra-Prediction Process
DC prediction mode
M I J K L
A X X X X
B X X X X
C X X X X
D X X X X
E F G H
X = Mean
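DC prediction fills the whole block with the mean of the neighboring samples. The sketch below uses the four samples above and the four to the left of a 4x4 block, with integer rounding. The parameter names and the rounding rule are assumptions, and the case where some neighbors are unavailable is not handled.

```python
def dc_predict_4x4(above, left):
    """Predict a 4x4 block as the rounded mean of 4 samples above and 4 to the left."""
    total = sum(above) + sum(left)
    mean = (total + 4) >> 3  # divide by 8 with rounding
    return [[mean] * 4 for _ in range(4)]


# Hypothetical reconstructed neighbour samples.
above = [100, 102, 104, 106]
left = [98, 99, 101, 102]
block = dc_predict_4x4(above, left)
for row in block:
    print(row)
```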
The dependence of a block's prediction samples on its neighbors, which may themselves be part of the current MB, prevents parallel processing of block data.
Each of the 16 blocks in a given MB can choose any one of the nine prediction modes, and the entire processing changes with each mode. Each mode uses a totally different mathematical weighting function to derive the predicted data from the samples.
Coarse quantization of the block-based image transform produces disturbing blocking artifacts at the block boundaries of the image. The second source of blocking artifacts is motion-compensated prediction. Motion-compensated blocks are generated by copying interpolated pixel data from different locations of possibly different reference frames. When later P/B frames reference these images with blocky edges, the blocking artifacts propagate into the interiors of the current blocks, worsening the situation further.
Original Frame
Reference frame
- Ensures a certain level of quality.
- Potentially removes the need for an extra frame buffer at the decoder.
- Improves both objective and subjective quality of video streams, because filtered reference frames offer higher quality prediction for motion compensation.
- Last process in the frame decoding, which ensures that all the top/left neighbors have been fully reconstructed and are available as inputs for de-blocking the current MB.
- Applied to the edges of all 4x4 blocks except at the boundaries of the picture.
- Filtering of the block edges of any slice can be selectively disabled by means of a flag.
- Vertical edges are filtered first (left to right), followed by horizontal edges (top to bottom).
[Figure: 16*16 macroblock, vertical edges]
Of these 8 pixel samples the de-block filter updates 6 pixels for a luminance block and 4 pixels for a chrominance block.
[Figure: the eight samples p3 p2 p1 p0 q0 q1 q2 q3 across a block edge]
Is it just a low-pass filter? We want to filter only blocking artifacts, not genuine edges.
Content-dependent boundary filtering strength: the boundary strengths are a method of implementing adaptive filtering for a given edge, based on:
- MB type
- Reference picture ID
- Motion vector
- Other MB coding parameters
The boundary strength for a chrominance block is determined from the boundary strength of the corresponding luminance macroblock.
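One way to picture the boundary-strength decision is the rule set sketched below. The specific values (4 down to 0), the order of the tests, and the motion-vector threshold of one integer sample (4 in quarter-sample units) follow the rules commonly described for progressive H.264 video and should be read as assumptions. The dictionary keys are hypothetical.

```python
def boundary_strength(p, q, is_macroblock_edge):
    """Boundary strength (0..4) for the edge between neighbouring blocks p and q.

    Each block is a dict with hypothetical keys:
      intra (bool), has_coefficients (bool), ref (reference picture id),
      mv ((x, y) in quarter-sample units).
    """
    if p["intra"] or q["intra"]:
        return 4 if is_macroblock_edge else 3   # strongest filtering around intra blocks
    if p["has_coefficients"] or q["has_coefficients"]:
        return 2                                # coded residual on either side
    if p["ref"] != q["ref"]:
        return 1                                # predicted from different pictures
    if abs(p["mv"][0] - q["mv"][0]) >= 4 or abs(p["mv"][1] - q["mv"][1]) >= 4:
        return 1                                # motion differs by a full sample or more
    return 0                                    # no filtering


inter = {"intra": False, "has_coefficients": False, "ref": 0, "mv": (0, 0)}
intra = dict(inter, intra=True)
coded = dict(inter, has_coefficients=True)
moved = dict(inter, mv=(4, 0))
print(boundary_strength(intra, inter, True))
print(boundary_strength(coded, inter, False))
print(boundary_strength(moved, inter, False))
print(boundary_strength(inter, inter, False))
```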
The blocking artifacts are most noticeable in very smooth regions, where the pixel values do not change much across the block edge. Therefore, in addition to the boundary strength, a filtering threshold based on the pixel values is used to determine whether the de-blocking process should be carried out for the current edge.
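The threshold test can be sketched as a check on the sample differences around the edge. The parameter names `alpha` and `beta` and the example values are assumptions. In a real codec these thresholds depend on the quantization parameter, which is not modelled here.

```python
def should_filter_edge(p1, p0, q0, q1, bs, alpha, beta):
    """Decide whether to de-block the edge between p0 and q0.

    Filter only when the boundary strength is non-zero and the step across
    the edge, and the gradients on each side, are small enough to look like
    a coding artifact rather than a genuine edge in the picture.
    """
    return (
        bs > 0
        and abs(p0 - q0) < alpha
        and abs(p1 - p0) < beta
        and abs(q1 - q0) < beta
    )


# Smooth area with a small step across the block edge: treated as an artifact.
print(should_filter_edge(100, 100, 104, 104, bs=2, alpha=10, beta=4))
# Large step across the edge: treated as a genuine edge and left alone.
print(should_filter_edge(100, 100, 160, 160, bs=2, alpha=10, beta=4))
```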