NCG Project: Integrate x264 Into Captivate and Ensure Equivalent/better Quality
NCG Project: Integrate x264 Into Captivate and Ensure Equivalent/better Quality
Project Details
Aim: To Integrate x264 (Lossless version) into Captivate and provide video quality equivalent to or better than current output generated by Main Concept encoder. Platform: Windows 7 64 bit. Workflow: Primary aim is to create lossless MP4 from recording process. The MP4 can be played on external players. Exceptions: This MP4 will not have any audio and will NOT play in captivate edit area.
Overview of H264/x264
H. 264 or MPEG-4 Part 10: A document of standards specifying the decoding process. It specifies the way in which the decoder will receive the encoded file i.e. the format of encoded stream. Therefore, encoding process and algorithms are open to individual interpretations. x264: It is the encoder which encodes raw video data into H. 264. It includes algorithms and processes for encoding.
Keywords
Macro block Prediction Intra/Inter Motion Compensation Transformation Quantization Entropy Encoding
Application workflow
Capture: Captures the Windows screen at specified resolution and provides RGB Buffers. Conversion: x264 works only in YUV color space. Hence, a suitable conversion is required from RGB to YUV. Encoding: The raw YUV frames are encoded to H264 format using x264 libraries (API). Writing: The native H264 frames are written in a container format (mp4) which can be played back by media players.
Problems Encountered
Linking Issues: x264, ffmpeg, libav etc. being open source are configured for Linux systems. We had to build them from scratch to obtain usable libraries on windows. Performance Issues: The problem requires that as frames are being captured, these are encoded in parallel (or immediately after) since it is highly memory intensive to store the RGB Buffers. (E.g. A single RGB frame of 1920 X 1080 Resolution will take up 1920 X 1080 X 3 X 8 bits = 49766400 bits = 6220800 bytes = 5.93 MB. At 25fps, one second of video requires about 148MB of storage). We started with the libav libraries (which are an interface to the x264 libraries). However, they did not allow us to directly control the encoding process. So, that prompted us to shift to libx264 and directly use their API. We also found that the major reason for the performance lag was the amount of processing done for motion compensation. So, reducing the level of motion compensation, we were able to speed up encoding.
Results
Specifications: H264 Lossless output in YUV 4:4:4 color space using x264. Minimal motion compensation ultrafast preset for encoding. Provides non-lagging output for static as well as nonstatic content captured at all resolutions (including full HD 1920 X 1080). The application was tested to capture a full HD raw video (1920 X 1080). Approximate file size (1920 X1080): a) Static(application) -> 20 sec 47.6 MB (19966 Kbps) b) Non static(video) -> 10 min 5.36 GB (76825 Kbps)
Learning
Learning related to the Video Domain H264 (an ongoing process). Worked with mentor (Chandra) to include good design patterns in the code and decouple the individual parts as much as possible.
THANK YOU!!