
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2020.2999910, IEEE Access.

Effective Chroma Subsampling and Luma Modification for RGB Full-color Images Using the Multiple Linear Regression Technique
KUO-LIANG CHUNG1 (Senior Member, IEEE), JEN-SHUN CHENG1, AND HONG-BIN YANG1
1 Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, 10672, Taiwan, R.O.C.
Corresponding author: Kuo-Liang Chung (e-mail: [email protected]).
This work was supported by Grants MOST-107-2221-E-011-108-MY3 and MOST-108-2221-E-011-077-MY3.

ABSTRACT Differing from the traditional chroma subsampling performed on the YUV image converted from a RGB full-color image, in this paper we propose a novel and effective chroma subsampling and luma modification (CSLM) method. For each 2×2 YUV block, a newly reconstructed 2×2 RGB full-color block-distortion model is first proposed, and then a multiple linear regression approach is proposed to realize our CSLM method such that the reconstructed 2×2 RGB full-color block-distortion can be minimized, achieving significant quality improvement of the reconstructed RGB full-color image. Based on the Kodak and IMAX datasets, the comprehensive experimental results demonstrate that on the versatile video coding (VVC) platform VTM-8.0, our method achieves substantial quality and quality-bitrate tradeoff improvements of the reconstructed RGB full-color images relative to six traditional methods and three state-of-the-art methods.

INDEX TERMS Chroma subsampling, Distortion model, Luma modification, Multiple linear regression,
Quality-bitrate tradeoff, RGB full-color image, Versatile video coding (VVC).

I. INTRODUCTION

As a target image for human visual perception, the RGB full-color image I^RGB is the most important medium. In the traditional coding system, as shown in Fig. 1, prior to compression, I^RGB is first converted to the YUV image I^YUV by the following RGB-to-YUV conversion formula:

\begin{bmatrix} Y_i \\ U_i \\ V_i \end{bmatrix} = \begin{bmatrix} 0.257 & 0.504 & 0.098 \\ -0.148 & -0.291 & 0.439 \\ 0.439 & -0.368 & -0.071 \end{bmatrix} \begin{bmatrix} R_i \\ G_i \\ B_i \end{bmatrix} + \begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix}    (1)

where in each 2×2 YUV block, (Yi, Ui, Vi), 1 ≤ i ≤ 4, denotes the converted YUV triple-value in zigzag order, and (Ri, Gi, Bi) denotes the collocated original RGB triple-value. Traditionally, the study on chroma subsampling for I^YUV at the server side, as shown in the upper part of Fig. 1, can be classified into two categories, namely, 4:2:0 and 4:2:2. 4:2:0 subsamples the (U, V)-pair for each 2×2 UV block B^UV; 4:2:2 subsamples the (U, V)-pair for each row of B^UV. In this paper, our discussion focuses on 4:2:0, although it is applicable to 4:2:2. 4:2:0 has been used in Blu-ray discs (BDs) and digital versatile discs (DVDs) for recording movies, sports, and so on.

After decompressing the encoded subsampled YUV image by the decoder, each decoded subsampled YUV image is upsampled at the client side, as shown in the lower part of Fig. 1. Furthermore, each upsampled YUV pixel is converted into an RGB full-color pixel by the following YUV-to-RGB conversion formula:

\begin{bmatrix} R_i \\ G_i \\ B_i \end{bmatrix} = \begin{bmatrix} 1.164 & 0 & 1.596 \\ 1.164 & -0.391 & -0.813 \\ 1.164 & 2.018 & 0 \end{bmatrix} \begin{bmatrix} Y_i - 16 \\ U_i - 128 \\ V_i - 128 \end{bmatrix}    (2)

As a result, the reconstructed RGB full-color image is produced.

Note that all discussion in this paper can be applied to other color spaces, such as the YCbCr color space, because the RGB-to-YCbCr and YCbCr-to-RGB color conversions are also linear, like the color conversion between the RGB space and the YUV space in Eqs. (1)-(2). The digital YUV data in Eqs. (1)-(2) are originally converted from analog signals, while the YCbCr data are originally digital. The YUV color space is often used in analogue color TV broadcasting; the YCbCr color space is often used in digital video and BT.601.
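To make Eqs. (1)-(2) concrete, a minimal sketch in Python with NumPy is given below. It is written for this discussion only (the function and variable names and the sample block are ours, not taken from the paper); it simply applies the two rounded conversion matrices to the four pixels of a 2×2 block, so the round trip is only approximately lossless.

import numpy as np

# Eq. (1): RGB-to-YUV matrix and offset (coefficients as printed in the paper).
M_RGB2YUV = np.array([[ 0.257,  0.504,  0.098],
                      [-0.148, -0.291,  0.439],
                      [ 0.439, -0.368, -0.071]])
OFFSET = np.array([16.0, 128.0, 128.0])

# Eq. (2): YUV-to-RGB matrix applied to (Y - 16, U - 128, V - 128).
M_YUV2RGB = np.array([[1.164,  0.000,  1.596],
                      [1.164, -0.391, -0.813],
                      [1.164,  2.018,  0.000]])

def rgb_to_yuv(rgb):                        # rgb: (..., 3) array of (R, G, B)
    return rgb @ M_RGB2YUV.T + OFFSET       # Eq. (1)

def yuv_to_rgb(yuv):                        # yuv: (..., 3) array of (Y, U, V)
    return (yuv - OFFSET) @ M_YUV2RGB.T     # Eq. (2)

# The four RGB triples of one 2x2 block, listed in zigzag order (illustrative values).
block_rgb = np.array([[200.0, 120.0, 60.0],
                      [198.0, 118.0, 62.0],
                      [190.0, 110.0, 70.0],
                      [185.0, 105.0, 75.0]])
block_yuv = rgb_to_yuv(block_rgb)
print(np.round(yuv_to_rgb(block_yuv) - block_rgb, 2))   # small residual due to rounded coefficients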


FIGURE 1. The chroma subsampling model in the coding system for the input RGB full-color image.

A. RELATED WORK
In this subsection, we introduce six traditional chroma subsampling methods and three state-of-the-art chroma subsampling methods [12], [1], [6]. All of them will be included in the comparative methods.

1) Six traditional chroma subsampling methods: We first introduce the six traditional chroma subsampling methods, which are 4:2:0(A), 4:2:0(L), 4:2:0(R), 4:2:0(MPEG-B) [7], 4:2:0(BRIGHT), and 4:2:0(BRIGHT_MEAN). For each 2×2 UV block B^UV, 4:2:0(A) subsamples the (U, V)-pair by averaging the four U and V components of B^UV. 4:2:0(L) and 4:2:0(R) subsample the (U, V)-pairs by averaging the chroma components in the left and right columns of B^UV, respectively. For convenience, 4:2:0(A), 4:2:0(L), and 4:2:0(R) are depicted in Fig. 2. 4:2:0(MPEG-B) determines the subsampled (U, V)-pair by performing the 13-tap filter with mask [2, 0, -4, -3, 5, 19, 26, 19, 5, -3, -4, 0, 2]/64 on the top-left location of B^UV. 4:2:0(BRIGHT) determines the subsampled (U, V)-pair from the location with the largest luma value in the collocated 2×2 luma block B^Y. 4:2:0(BRIGHT_MEAN) is equal to 4:2:0(BRIGHT) when the ratio of the largest luma value in B^Y over the smallest one is larger than 2; otherwise, it is equal to 4:2:0(A).

FIGURE 2. The depiction of three traditional chroma subsampling methods. (a) 4:2:0(A). (b) 4:2:0(L). (c) 4:2:0(R).
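As a concrete illustration of the simplest variants above, the sketch below (Python with NumPy; the function name, the zigzag index convention, and the sample values are our own assumptions, and 4:2:0(MPEG-B)'s 13-tap filtering across neighboring blocks is omitted for brevity) shows how 4:2:0(A), 4:2:0(L), 4:2:0(R), 4:2:0(BRIGHT), and 4:2:0(BRIGHT_MEAN) pick one (U, V)-pair per 2×2 block.

import numpy as np

def subsample_420(block_uv, block_y=None, mode="A"):
    """block_uv: 4x2 array of (U, V) pairs of one 2x2 block in zigzag order
    (index 0: top-left, 1: top-right, 2: bottom-left, 3: bottom-right);
    block_y: length-4 array of the collocated luma values (needed for the BRIGHT modes)."""
    if mode == "A":                         # 4:2:0(A): average all four pairs
        return block_uv.mean(axis=0)
    if mode == "L":                         # 4:2:0(L): average the left column
        return block_uv[[0, 2]].mean(axis=0)
    if mode == "R":                         # 4:2:0(R): average the right column
        return block_uv[[1, 3]].mean(axis=0)
    if mode == "BRIGHT":                    # 4:2:0(BRIGHT): pair at the brightest luma location
        return block_uv[int(np.argmax(block_y))]
    if mode == "BRIGHT_MEAN":               # fall back to averaging on low-contrast blocks
        if block_y.max() / max(block_y.min(), 1e-9) > 2.0:
            return block_uv[int(np.argmax(block_y))]
        return block_uv.mean(axis=0)
    raise ValueError("unknown mode")

uv = np.array([[110.0, 140.0], [112.0, 138.0], [115.0, 135.0], [118.0, 132.0]])
y  = np.array([60.0, 90.0, 75.0, 80.0])
print(subsample_420(uv, y, "A"), subsample_420(uv, y, "BRIGHT"))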
The common weakness of the above traditional chroma subsampling methods is the failure to take the reconstructed RGB full-color block distortion into account, limiting the quality improvement of the reconstructed RGB full-color image.

2) Three state-of-the-art chroma subsampling methods: To overcome the weakness existing in the traditional methods, three state-of-the-art methods were developed.

Based on the new edge-directed interpolation (NEDI) [5], Zhang et al. [12] proposed an interpolation-dependent image downsampling (IDID) method for chroma subsampling; their combination IDID-NEDI, in which IDID is used at the server side and NEDI is used at the client side, can tackle the chroma downsampling well.

Lin et al. [6] proposed a modified chroma 4:2:0(A) subsampling method, namely modified 4:2:0(A), by considering the influence of the truncation and carry operations; among the four considered variants, they select the best subsampled (U, V)-pair. At the client side, they improved the previous chroma upsampling process [10] by considering the distance between each estimated upsampled chroma value and its three neighboring (TN) pixels; their combination, modified 4:2:0(A)-TN, achieves good quality of the reconstructed RGB full-color image.

After performing a chroma subsampling, e.g. 4:2:0(A), under the COPY-based chroma upsampling process, in which each estimated (U, V)-pair of each 2×2 UV block B^UV just copies the subsampled (U, V)-pair of B^UV, Chung et al. [1] proposed a pixel-based approach to adjust each luma pixel in the 2×2 luma block to improve the quality of the reconstructed RGB full-color image. The first weakness in [1] is that the COPY-based chroma upsampling process assumed at the server side is too simple to meet the chroma upsampling capability at the client side, such as the BILI (bilinear) upsampling method, thus limiting the quality improvement. The second weakness in [1] is the failure to consider chroma subsampling and luma adjustment simultaneously to achieve better quality improvement.

B. CONTRIBUTIONS
In this paper, we propose a novel chroma subsampling and luma modification (CSLM) method for I^RGB. The three contributions of this paper are clarified in the following aspects.

In the first contribution of our CSLM method, for each 2×2 YUV block, a novel reconstructed 2×2 RGB full-color block-distortion model is proposed. Considering the neighboring subsampled (U, V)-pairs of the current 2×2 UV block, we deploy the BILI interpolation into the block-distortion model to better meet the chroma upsampling capability at the client side.

In the second contribution of CSLM, the reconstructed 2×2 RGB full-color block-distortion model can be transformed to an overdetermined system by a multiple linear regression approach. Furthermore, the matrix pseudoinverse technique is applied to determine the subsampled (U, V)-pair and the four luma values such that the reconstructed block-distortion can be minimized.

In the third contribution, based on the Kodak and IMAX datasets, the comprehensive experimental results demonstrated that on the versatile video coding (VVC) platform VTM-8.0 [8], our CSLM method achieves significant quality and quality-bitrate tradeoff improvement of the reconstructed RGB full-color images relative to the six traditional methods and three state-of-the-art methods in [12], [1], [6]. Here, the quality metrics used are CPSNR (color peak signal-to-noise ratio), SSIM (structure similarity index) [9], and FSIMc (feature similarity index) [11]; the quality-bitrate tradeoff metric is illustrated by the RD curves (rate-distortion curves) for different quantization parameter (QP) values. In addition, based on the video sequence "Boat", which can be accessed from [6], the quality-bitrate tradeoff merit of our method is reported.

The rest of this paper is organized as follows. In Section II, the reconstructed 2×2 RGB full-color block-distortion model is first presented, and then the corresponding overdetermined system is derived. In Section III, the matrix pseudoinverse technique is applied to determine the subsampled (U, V)-pair and the four modified luma values. In Section IV, the comprehensive experimental results are demonstrated to justify the significant quality and quality-bitrate tradeoff merits of our CSLM method. In Section V, some concluding remarks are addressed.

II. THE RECONSTRUCTED BLOCK-DISTORTION MODEL AND THE DERIVED OVERDETERMINED SYSTEM
In this section, we first present a mathematical model to estimate the reconstructed 2×2 RGB full-color block-distortion, and then we derive the overdetermined system by deploying the two chroma subsampled parameters and four luma modification parameters into the block-distortion model.

A. ESTIMATING THE RECONSTRUCTED 2×2 RGB FULL-COLOR BLOCK-DISTORTION MODEL
Before presenting the proposed reconstructed 2×2 RGB full-color block-distortion model at the server side, we first describe how to estimate the four (U, V)-pairs of the current 2×2 UV block B^UV by referring to the eight neighboring subsampled (U, V)-pairs of B^UV.

FIGURE 3. Notations used in the estimation of the 2×2 chroma block B^rec,UV at the server side.

Because our CSLM method combines chroma subsampling and luma modification together, and is performed on I^YUV in a row-major order, for the current 2×2 UV block B^UV, the eight neighboring subsampled (U, V)-pairs of B^UV consist of four already known subsampled (U, V)-pairs obtained by our CSLM method and four future subsampled (U, V)-pairs which can be obtained by performing any traditional chroma subsampling scheme, e.g. 4:2:0(A), on the four future reference 2×2 UV blocks. As depicted in Fig. 3, the eight reference subsampled (U, V)-pairs are denoted by (Utl, Vtl), (Ut, Vt), (Utr, Vtr), (Ul, Vl), (Ur, Vr), (Ubl, Vbl), (Ub, Vb), and (Ubr, Vbr).

The subsampled (U, V)-pair of the current 2×2 block B^UV is denoted by the parameter-pair (Us, Vs). Let the four estimated (U, V)-pairs of B^UV be denoted by (U1', V1'), (U2', V2'), (U3', V3'), and (U4', V4'), as shown in Fig. 3. Without loss of generality, we derive the estimation only for (U3', V3') in detail, and then provide the general estimation formula for the four estimated (U, V)-pairs of B^UV.

For estimating (U3', V3'), the four reference subsampled (U, V)-pairs are (Ul, Vl), (Ubl, Vbl), (Ub, Vb), and (Us, Vs). As shown in Fig. 3, the subsampled (U, V)-pair (Ubl, Vbl) is defined to be located at the origin (0, 0); (Ul, Vl), (Ub, Vb), and (Us, Vs) are thus located at (0, 1), (1, 0), and (1, 1), respectively. According to the bilinear interpolation, (U3', V3') is estimated by

(U_3', V_3') = (3/4)(1 − 1/4)(U_s, V_s) + (1 − 3/4)(1 − 1/4)(U_l, V_l) + (1 − 3/4)(1/4)(U_bl, V_bl) + (3/4)(1/4)(U_b, V_b)
             = (9/16)(U_s, V_s) + (3/16)(U_l, V_l) + (1/16)(U_bl, V_bl) + (3/16)(U_b, V_b)
             = ((9/16)U_s + (3/16)U_l + (1/16)U_bl + (3/16)U_b, (9/16)V_s + (3/16)V_l + (1/16)V_bl + (3/16)V_b)
             = ((9/16)U_s + δ(U_3'), (9/16)V_s + δ(V_3'))    (3)

In the same way as for (U3', V3'), the estimations of (U1', V1'), (U2', V2'), and (U4', V4') follow. In general, we have

(U_i', V_i') = ((9/16)U_s + δ(U_i'), (9/16)V_s + δ(V_i'))    (4)

with

δ(U_1') = (3/16)U_l + (1/16)U_tl + (3/16)U_t,    δ(V_1') = (3/16)V_l + (1/16)V_tl + (3/16)V_t,
δ(U_2') = (3/16)U_r + (1/16)U_tr + (3/16)U_t,    δ(V_2') = (3/16)V_r + (1/16)V_tr + (3/16)V_t,
δ(U_3') = (3/16)U_l + (1/16)U_bl + (3/16)U_b,    δ(V_3') = (3/16)V_l + (1/16)V_bl + (3/16)V_b,
δ(U_4') = (3/16)U_r + (1/16)U_br + (3/16)U_b,    δ(V_4') = (3/16)V_r + (1/16)V_br + (3/16)V_b.    (5)
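A short numerical sketch of Eqs. (3)-(5) follows (Python with NumPy; the dictionary keys and the sample values are illustrative assumptions, not part of the paper). It computes the four δ pairs of Eq. (5) from the eight neighboring subsampled pairs and then evaluates Eq. (4) for a given candidate (Us, Vs).

import numpy as np

def delta_terms(n):
    """n: dict of the eight neighboring subsampled (U, V) pairs keyed by
    'tl', 't', 'tr', 'l', 'r', 'bl', 'b', 'br'; each value is a length-2 array.
    Returns a (4, 2) array whose ith row is (delta(Ui'), delta(Vi')) of Eq. (5)."""
    d1 = (3 * n['l'] + 1 * n['tl'] + 3 * n['t']) / 16.0
    d2 = (3 * n['r'] + 1 * n['tr'] + 3 * n['t']) / 16.0
    d3 = (3 * n['l'] + 1 * n['bl'] + 3 * n['b']) / 16.0
    d4 = (3 * n['r'] + 1 * n['br'] + 3 * n['b']) / 16.0
    return np.stack([d1, d2, d3, d4])

def estimate_uv(us_vs, deltas):
    # Eq. (4): (Ui', Vi') = (9/16) * (Us, Vs) + (delta(Ui'), delta(Vi'))
    return 9.0 / 16.0 * np.asarray(us_vs) + deltas

neighbors = {k: np.array(v, dtype=float) for k, v in {
    'tl': (120, 130), 't': (122, 128), 'tr': (124, 126), 'l': (118, 132),
    'r': (126, 124), 'bl': (116, 134), 'b': (121, 129), 'br': (127, 123)}.items()}
print(estimate_uv((123, 127), delta_terms(neighbors)))   # the four estimated (Ui', Vi') pairs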

B. DERIVING THE OVERDETERMINED SYSTEM
Considering the ith entry of the current 2×2 YUV block B^YUV, 1 ≤ i ≤ 4, its YUV triple-value is denoted by (Yi, Ui, Vi). By Eqs. (3)-(5), the ith entry of the estimated 2×2 UV block of B^UV is denoted by (Ui', Vi'); the ith modified luma value is denoted by Yi'. Replacing the three parameters Yi, Ui, and Vi at the right side of Eq. (2) with Yi', Ui', and Vi', 1 ≤ i ≤ 4, respectively, it yields the following three equalities:

R_i' = 1.164 (Y_i' − 16) + 1.596 (V_i' − 128)
G_i' = 1.164 (Y_i' − 16) − 0.391 (U_i' − 128) − 0.813 (V_i' − 128)    (6)
B_i' = 1.164 (Y_i' − 16) + 2.018 (U_i' − 128)

For 1 ≤ i ≤ 4, it yields the following twelve equalities:

R_1' = 1.164 (Y_1' − 16) + 1.596 (V_1' − 128)
G_1' = 1.164 (Y_1' − 16) − 0.391 (U_1' − 128) − 0.813 (V_1' − 128)
B_1' = 1.164 (Y_1' − 16) + 2.018 (U_1' − 128)
R_2' = 1.164 (Y_2' − 16) + 1.596 (V_2' − 128)
G_2' = 1.164 (Y_2' − 16) − 0.391 (U_2' − 128) − 0.813 (V_2' − 128)
B_2' = 1.164 (Y_2' − 16) + 2.018 (U_2' − 128)
R_3' = 1.164 (Y_3' − 16) + 1.596 (V_3' − 128)    (7)
G_3' = 1.164 (Y_3' − 16) − 0.391 (U_3' − 128) − 0.813 (V_3' − 128)
B_3' = 1.164 (Y_3' − 16) + 2.018 (U_3' − 128)
R_4' = 1.164 (Y_4' − 16) + 1.596 (V_4' − 128)
G_4' = 1.164 (Y_4' − 16) − 0.391 (U_4' − 128) − 0.813 (V_4' − 128)
B_4' = 1.164 (Y_4' − 16) + 2.018 (U_4' − 128)

From Eq. (7), the reconstructed 2×2 RGB full-color block-distortion model is naturally denoted by BD(Y1', Y2', Y3', Y4', U1', U2', U3', U4', V1', V2', V3', V4'). Because, by Eqs. (3)-(5), U1', U2', U3', and U4' are functions of the parameter Us, and V1', V2', V3', and V4' are functions of the parameter Vs, the reconstructed 2×2 RGB full-color block-distortion is defined by

BD(Y_1', Y_2', Y_3', Y_4', U_s, V_s) = Σ_{i=1}^{4} [(R_i − R_i')^2 + (G_i − G_i')^2 + (B_i − B_i')^2]    (8)

Ideally, the solution of X (= (Y1', Y2', Y3', Y4', Us, Vs)) aims to zeroize the reconstructed 2×2 RGB full-color block-distortion in Eq. (8). At the right side of Eq. (7), for 1 ≤ i ≤ 4, we replace Ui' and Vi' with ((9/16)Us + δ(Ui')) and ((9/16)Vs + δ(Vi')) (see Eq. (4)), respectively. Therefore, ideally, the solution of X aims to satisfy the following overdetermined system:

R_1 = 1.164 (Y_1' − 16) + 1.596 ((9/16)V_s + δ(V_1') − 128)
G_1 = 1.164 (Y_1' − 16) − 0.391 ((9/16)U_s + δ(U_1') − 128) − 0.813 ((9/16)V_s + δ(V_1') − 128)
B_1 = 1.164 (Y_1' − 16) + 2.018 ((9/16)U_s + δ(U_1') − 128)
R_2 = 1.164 (Y_2' − 16) + 1.596 ((9/16)V_s + δ(V_2') − 128)
G_2 = 1.164 (Y_2' − 16) − 0.391 ((9/16)U_s + δ(U_2') − 128) − 0.813 ((9/16)V_s + δ(V_2') − 128)
B_2 = 1.164 (Y_2' − 16) + 2.018 ((9/16)U_s + δ(U_2') − 128)
R_3 = 1.164 (Y_3' − 16) + 1.596 ((9/16)V_s + δ(V_3') − 128)    (9)
G_3 = 1.164 (Y_3' − 16) − 0.391 ((9/16)U_s + δ(U_3') − 128) − 0.813 ((9/16)V_s + δ(V_3') − 128)
B_3 = 1.164 (Y_3' − 16) + 2.018 ((9/16)U_s + δ(U_3') − 128)
R_4 = 1.164 (Y_4' − 16) + 1.596 ((9/16)V_s + δ(V_4') − 128)
G_4 = 1.164 (Y_4' − 16) − 0.391 ((9/16)U_s + δ(U_4') − 128) − 0.813 ((9/16)V_s + δ(V_4') − 128)
B_4 = 1.164 (Y_4' − 16) + 2.018 ((9/16)U_s + δ(U_4') − 128)

In the above overdetermined system, there are six parameters, namely Y1', Y2', Y3', Y4', Us, and Vs, to be solved. Because it is intractable to solve X such that all equalities in Eq. (9) are totally satisfied, in the next subsection, a matrix pseudoinverse technique is proposed to solve X such that the sum of the squared errors between the left side and the right side of Eq. (9) is minimized, obtaining the best solution of X.

III. DETERMINING THE SUBSAMPLED CHROMA PAIR AND MODIFIED LUMA VALUES
In this section, we first transform the overdetermined system in Eq. (9) to a matrix form, and then we show that it can be solved by the matrix pseudoinverse technique, determining the solution of X for each 2×2 YUV block B^YUV. Finally, the whole procedure to realize our CSLM method is provided.

A. SOLVING THE OVERDETERMINED SYSTEM WITH THE MATRIX PSEUDOINVERSE TECHNIQUE
Moving the constant terms at the right side of Eq. (9) to the left side, in matrix form, Eq. (9) is expressed as Eq. (10). Let

b = \begin{bmatrix}
R_1 - 1.596\,δ(V_1') + 222.912 \\
G_1 + 0.391\,δ(U_1') + 0.813\,δ(V_1') - 135.488 \\
B_1 - 2.018\,δ(U_1') + 276.928 \\
R_2 - 1.596\,δ(V_2') + 222.912 \\
G_2 + 0.391\,δ(U_2') + 0.813\,δ(V_2') - 135.488 \\
B_2 - 2.018\,δ(U_2') + 276.928 \\
R_3 - 1.596\,δ(V_3') + 222.912 \\
G_3 + 0.391\,δ(U_3') + 0.813\,δ(V_3') - 135.488 \\
B_3 - 2.018\,δ(U_3') + 276.928 \\
R_4 - 1.596\,δ(V_4') + 222.912 \\
G_4 + 0.391\,δ(U_4') + 0.813\,δ(V_4') - 135.488 \\
B_4 - 2.018\,δ(U_4') + 276.928
\end{bmatrix}

which is often called the response vector [2]. Therefore, Eq. (10) is simplified to Eq. (11), where the matrix T is often called the design matrix [2].

Based on the geometry relation, (b − TX) is perpendicular to the range of T, namely R(T), which is spanned by the column vectors of T. Therefore, it yields T^t(b − TX) = 0, and then the normal equation T^t b = T^t T X holds. Because the design matrix T is of full rank and the rank is 6, the pseudoinverse (T^t T)^{-1} T^t exists [2]. Therefore, with our CSLM method, the solution of X for Eq. (11) can be obtained by

X = (T^t T)^{-1} T^t b    (12)

where the design matrix T and the response vector b have been defined in Eqs. (10)-(11).

In fact, the pseudoinverse (T^t T)^{-1} T^t can be computed in advance as a fixed 6×12 matrix, which is shown in Eq. (13), achieving the execution-time reduction effect.

B. THE WHOLE PROCEDURE TO REALIZE OUR CSLM METHOD
Consequently, using our CSLM method, for the current 2×2 YUV block B^YUV, by Eqs. (11)-(12), the four modified luma values, Y1', Y2', Y3', and Y4', and the two subsampled chroma values, Us and Vs, can be determined quickly such that the reconstructed 2×2 RGB full-color block-distortion is minimized in the least-squares sense. The whole procedure to realize our CSLM method is listed below.

Procedure: CSLM
Input: the 2×2 RGB block B^RGB and the converted 2×2 YUV block B^YUV.
Output: the solution of X = (Y1', Y2', Y3', Y4', Us, Vs).
Step 1: By Eqs. (4)-(5), estimate the four reconstructed (U, V)-pairs of B^UV.
Step 2: By Eqs. (10)-(11), obtain the response vector b and the design matrix T.
Step 3: By Eq. (13), obtain the pseudoinverse matrix S = (T^t T)^{-1} T^t.
Step 4: By Eq. (12), calculate X = Sb to determine the four modified luma values and the subsampled (U, V)-pair, (Us, Vs).
Return X.

\begin{bmatrix}
R_1 - 1.596\,δ(V_1') + 222.912 \\
G_1 + 0.391\,δ(U_1') + 0.813\,δ(V_1') - 135.488 \\
B_1 - 2.018\,δ(U_1') + 276.928 \\
R_2 - 1.596\,δ(V_2') + 222.912 \\
G_2 + 0.391\,δ(U_2') + 0.813\,δ(V_2') - 135.488 \\
B_2 - 2.018\,δ(U_2') + 276.928 \\
R_3 - 1.596\,δ(V_3') + 222.912 \\
G_3 + 0.391\,δ(U_3') + 0.813\,δ(V_3') - 135.488 \\
B_3 - 2.018\,δ(U_3') + 276.928 \\
R_4 - 1.596\,δ(V_4') + 222.912 \\
G_4 + 0.391\,δ(U_4') + 0.813\,δ(V_4') - 135.488 \\
B_4 - 2.018\,δ(U_4') + 276.928
\end{bmatrix}
=
\begin{bmatrix}
1.164 & 0 & 0 & 0 & 0 & 0.8977 \\
1.164 & 0 & 0 & 0 & -0.220 & -0.4573 \\
1.164 & 0 & 0 & 0 & 1.135 & 0 \\
0 & 1.164 & 0 & 0 & 0 & 0.8977 \\
0 & 1.164 & 0 & 0 & -0.220 & -0.4573 \\
0 & 1.164 & 0 & 0 & 1.135 & 0 \\
0 & 0 & 1.164 & 0 & 0 & 0.8977 \\
0 & 0 & 1.164 & 0 & -0.220 & -0.4573 \\
0 & 0 & 1.164 & 0 & 1.135 & 0 \\
0 & 0 & 0 & 1.164 & 0 & 0.8977 \\
0 & 0 & 0 & 1.164 & -0.220 & -0.4573 \\
0 & 0 & 0 & 1.164 & 1.135 & 0
\end{bmatrix}
\begin{bmatrix}
Y_1' \\ Y_2' \\ Y_3' \\ Y_4' \\ U_s \\ V_s
\end{bmatrix}    (10)

b = T X    (11)

(T^t T)^{-1} T^t =
\begin{bmatrix}
 0.2790 &  0.3409 &  0.2392 & -0.0074 &  0.0545 & -0.0472 & -0.0074 &  0.0545 & -0.0472 & -0.0074 &  0.0545 & -0.0472 \\
-0.0074 &  0.0545 & -0.0472 &  0.2790 &  0.3409 &  0.2392 & -0.0074 &  0.0545 & -0.0472 & -0.0074 &  0.0545 & -0.0472 \\
-0.0074 &  0.0545 & -0.0472 & -0.0074 &  0.0545 & -0.0472 &  0.2790 &  0.3409 &  0.2392 & -0.0074 &  0.0545 & -0.0472 \\
-0.0074 &  0.0545 & -0.0472 & -0.0074 &  0.0545 & -0.0472 & -0.0074 &  0.0545 & -0.0472 &  0.2790 &  0.3409 &  0.2392 \\
-0.0371 & -0.0727 &  0.1098 & -0.0371 & -0.0727 &  0.1098 & -0.0371 & -0.0727 &  0.1098 & -0.0371 & -0.0727 &  0.1098 \\
 0.1098 & -0.0920 & -0.0178 &  0.1098 & -0.0920 & -0.0178 &  0.1098 & -0.0920 & -0.0178 &  0.1098 & -0.0920 & -0.0178
\end{bmatrix}    (13)
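The least-squares solve of Eqs. (9)-(13), i.e., roughly Steps 2-4 of Procedure CSLM, can be written compactly as below (Python with NumPy; an illustrative sketch under the conventions of Eqs. (5)-(6), not the authors' implementation, and the sample δ values are hypothetical). It assembles the 12×6 design matrix T and the 12-entry response vector b for one 2×2 block from its original RGB values and the δ terms, and then recovers X = (Y1', Y2', Y3', Y4', Us, Vs) with the fixed pseudoinverse of Eq. (13); np.linalg.lstsq would return the same least-squares solution.

import numpy as np

# Rows of Eq. (2): coefficients of (Y - 16, U - 128, V - 128) for R', G', B'.
C = np.array([[1.164,  0.000,  1.596],
              [1.164, -0.391, -0.813],
              [1.164,  2.018,  0.000]])

def build_system(block_rgb, deltas):
    """block_rgb: 4x3 original (R, G, B) values of the 2x2 block (zigzag order);
    deltas: 4x2 array of (delta(Ui'), delta(Vi')) from Eq. (5).
    Returns the 12x6 design matrix T and the 12-entry response vector b of Eqs. (10)-(11)."""
    T = np.zeros((12, 6))
    b = np.zeros(12)
    for i in range(4):                      # the four pixels of the block
        dU, dV = deltas[i]
        for c in range(3):                  # c = 0, 1, 2 for the R, G, B rows
            r = 3 * i + c
            T[r, i] = C[c, 0]               # coefficient of Yi'
            T[r, 4] = C[c, 1] * 9.0 / 16.0  # coefficient of Us (from Eq. (4))
            T[r, 5] = C[c, 2] * 9.0 / 16.0  # coefficient of Vs (from Eq. (4))
            # all constant terms of Eq. (9) moved to the left side
            b[r] = (block_rgb[i, c] - C[c, 1] * dU - C[c, 2] * dV
                    + C[c, 0] * 16.0 + (C[c, 1] + C[c, 2]) * 128.0)
    return T, b

def solve_cslm(block_rgb, deltas):
    T, b = build_system(block_rgb, deltas)
    S = np.linalg.inv(T.T @ T) @ T.T        # the fixed 6x12 pseudoinverse of Eq. (13)
    return S @ b                            # X = (Y1', Y2', Y3', Y4', Us, Vs), Eq. (12)

# Illustrative call with an arbitrary RGB block and hypothetical delta values.
rgb = np.array([[200., 120., 60.], [198., 118., 62.], [190., 110., 70.], [185., 105., 75.]])
d = np.tile([70.0, 72.0], (4, 1))
print(np.round(solve_cslm(rgb, d), 3))

Since T does not depend on the image content, S can be computed once and reused for every 2×2 block, which is exactly the execution-time reduction mentioned above.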

IV. EXPERIMENTAL RESULTS
Based on the Kodak dataset with 24 images [4] and the IMAX dataset with 18 images [3], all the considered experiments are carried out on the VTM-8.0 platform. To compare the quality performance among the considered chroma subsampling methods, the three quality metrics used are CPSNR, SSIM, and FSIMc. Besides the three quality merits of our CSLM method, the quality-bitrate tradeoff merit of our method is also demonstrated for different QP values. In addition, the luma mean-preserving effect of CSLM is reported, and the execution time comparison of the considered methods is also made. Finally, based on the video sequence "Boat", the quality-bitrate tradeoff merit of our method is also reported.

All the considered methods are implemented on a computer with an Intel Core i7-4790 CPU at 3.6 GHz and 24 GB RAM. The operating system is Microsoft Windows 10 (64-bit), and the program development environment is Visual C++ 2017.

A. CPSNR, SSIM, AND FSIMC MERITS
The quality metric CPSNR is defined by

CPSNR = (1/N) Σ_{n=1}^{N} 10 log_{10}(255^2 / MSE)    (14)

in which N denotes the number of test images used in the dataset and MSE = (1/(3WH)) Σ_{p∈P} Σ_{c∈{R,G,B}} [I_{n,c}^{RGB}(p) − I_{n,c}^{rec,RGB}(p)]^2, where W×H denotes the size of the test image; I_{n,c}^{RGB}(p) and I_{n,c}^{rec,RGB}(p) denote the c-color values of the pixels at position p in the nth input RGB full-color image and the reconstructed one, respectively.
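Read as code, Eq. (14) is the average, over the N test images, of the per-image color PSNR; a minimal sketch (Python with NumPy, illustrative only) is:

import numpy as np

def cpsnr(originals, reconstructions):
    """originals, reconstructions: equal-length lists of HxWx3 RGB arrays.
    Returns the CPSNR of Eq. (14)."""
    total = 0.0
    for org, rec in zip(originals, reconstructions):
        diff = org.astype(np.float64) - rec.astype(np.float64)
        mse = np.mean(diff ** 2)       # 1/(3WH) * sum over all pixels and the R, G, B planes
        total += 10.0 * np.log10(255.0 ** 2 / mse)
    return total / len(originals)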
SSIM [9] is used to measure the product of the luminance, contrast, and structure similarity preserving effect between the original image and the reconstructed image. For I^RGB, the SSIM value is measured by the mean of the three SSIM values for the R, G, and B color planes. To measure the FSIMc metric value [11], we first utilize the contrast-invariant feature "phase congruency (PC)" and the minor feature "gradient magnitude" to obtain the local quality map. Further, we utilize PC as a weighting function to calculate the quality score as the FSIMc metric value. Note that the available code for FSIMc can be accessed from [11]. To justify the CPSNR, SSIM, and FSIMc merits of our CSLM method, we set QP to zero and the related results are computed by passing through the compression and decompression process.

For the reconstructed RGB full-color images, Table 1 tabulates the quality comparison in terms of CPSNR, SSIM, and FSIMc. Here, the three chroma upsampling processes at the client side, namely COPY, BILI, and BICU, are included. From Table 1, we observe that our CSLM method under the BILI chroma upsampling process has the highest CPSNR, SSIM, and FSIMc when compared with the eighteen combinations of the six considered traditional chroma subsampling methods and the three considered chroma upsampling processes.

In terms of CPSNR, SSIM, and FSIMc, Table 2 tabulates the quality comparison among the proposed CSLM-BILI combination and the other three state-of-the-art combinations, IDID-NEDI [12], 4:2:0(A)-LM-BILI [1], and modified 4:2:0(A)-TN [6]. From Table 2, we observe that our combination has the highest CPSNR, SSIM, and FSIMc when compared with the three state-of-the-art combinations.

TABLE 1. CPSNR, SSIM, AND FSIMc MERITS (QP = 0) OF OUR CSLM METHOD RELATIVE TO THE CONSIDERED EIGHTEEN COMBINATIONS.

Upsampling   Subsampling            CPSNR     SSIM     FSIMc     Time (s)
COPY         4:2:0(A)               40.6484   0.9761   0.99984   0.00671
COPY         4:2:0(L)               39.2445   0.9703   0.99957   0.00637
COPY         4:2:0(R)               39.1950   0.9701   0.99957   0.00532
COPY         4:2:0(MPEG-B)          38.4808   0.9664   0.99938   0.00731
COPY         4:2:0(BRIGHT)          37.6030   0.9639   0.99910   0.00325
COPY         4:2:0(BRIGHT_MEAN)     40.3534   0.9762   0.99973   0.00335
BILI         4:2:0(A)               41.8774   0.9793   0.99973
BILI         4:2:0(L)               40.9531   0.9767   0.99962
BILI         4:2:0(R)               40.6006   0.9762   0.99960
BILI         4:2:0(MPEG-B)          42.6818   0.9803   0.99972
BILI         4:2:0(BRIGHT)          39.3171   0.9718   0.99931
BILI         4:2:0(BRIGHT_MEAN)     41.6642   0.9793   0.99968
BICU         4:2:0(A)               43.1302   0.9834   0.99984
BICU         4:2:0(L)               42.1107   0.9784   0.99965
BICU         4:2:0(R)               41.0471   0.9776   0.99964
BICU         4:2:0(MPEG-B)          42.9225   0.9814   0.99968
BICU         4:2:0(BRIGHT)          38.9529   0.9700   0.99928
BICU         4:2:0(BRIGHT_MEAN)     42.5837   0.9826   0.99976
BILI         Proposed CSLM          44.1290   0.9863   0.99985   0.06991
TABLE 2. CPSNR, SSIM, AND FSIMc MERITS (QP = 0) OF OUR CSLM-BILI COMBINATION RELATIVE TO THE THREE STATE-OF-THE-ART COMBINATIONS.

Combination                  CPSNR     SSIM     FSIMc     Time (s)
IDID-NEDI [12]               43.0151   0.9819   0.99974   9.06112
4:2:0(A)-LM-BILI [1]         42.6105   0.9816   0.99897   0.13663
modified 4:2:0(A)-TN [6]     43.0305   0.9828   0.99985   0.01119
Proposed CSLM-BILI           44.1290   0.9863   0.99985   0.06991

B. QUALITY-BITRATE TRADEOFF MERIT, LUMA MEAN PRESERVATION EFFECT, AND EXECUTION TIME COMPARISON
In this subsection, we first present the quality-bitrate tradeoff merit of our CSLM method, and then present the luma mean preservation effect. Finally, the execution time comparison is reported.

1) Quality-bitrate tradeoff merit: When setting QP = 0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, and 51, the quality-bitrate tradeoff of each considered method is depicted by the RD curve for the reconstructed RGB full-color images. The bitrate of one compressed dataset is defined by

bitrate = B / N    (15)

where B denotes the total number of bits required to compress the N test images in that dataset. On the VVC platform, the RD curves corresponding to the Kodak dataset and the IMAX dataset are depicted in Fig. 4(a) and Fig. 4(b), respectively, in which the X-axis denotes the average bitrate required and the Y-axis denotes the average CPSNR value of the reconstructed RGB full-color images. Fig. 4 indicates that under the same bitrate, our CSLM method has the highest CPSNR among the nine considered methods. Based on the testing video sequence "Boat", Fig. 5 indicates that under the same bitrate, our CSLM method still has the highest CPSNR.

2) Luma mean preservation effect: The luma mean-loss of our CSLM method is measured by

(1/N) Σ_{k=1}^{N} |Ī_k^Y − Ī_k^{rec,Y}|    (16)

where N denotes the number of test images in the dataset; Ī_k^Y and Ī_k^{rec,Y} denote the luma mean values of the original kth luma image and the reconstructed one, respectively.
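Eqs. (15)-(16) are plain averages; a two-function sketch (Python with NumPy, names are ours) reads:

import numpy as np

def dataset_bitrate(total_bits, num_images):
    return total_bits / num_images          # Eq. (15): B / N

def luma_mean_loss(original_lumas, reconstructed_lumas):
    # Eq. (16): average absolute difference between the per-image luma means
    return float(np.mean([abs(o.mean() - r.mean())
                          for o, r in zip(original_lumas, reconstructed_lumas)]))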


FIGURE 4. The quality-bitrate tradeoff merit of our CSLM method. (a) For the Kodak dataset on VTM-8.0. (b) For the IMAX dataset on VTM-8.0.

Although most of the comparative methods do not consider modifying the luma values in chroma subsampling, their luma mean-loss values are the same, empirically 0.0028 dB, due to the floating-point-to-integer conversion error before compression; this indicates a nearly perfect luma mean-preservation effect. For the Kodak dataset, the average luma mean-loss value of our CSLM is 0.0157 dB, and for the IMAX dataset, the average luma mean-loss of our CSLM is 0.0151 dB. On average, the luma mean-loss value is 0.0154 dB, indicating a good luma mean-preservation effect of our CSLM method.

3) Execution time comparison: Table 1 tabulates the execution time (in seconds) comparison among the six traditional chroma subsampling methods and our CSLM method. For simplicity, the execution time of each traditional chroma subsampling method is listed once in Table 1. Although our method takes more time than the traditional methods, it has clearly better quality. In Table 2, we observe that besides the quality merit, our CSLM method is also much faster than the two state-of-the-art methods IDID [12] and 4:2:0(A)-LM [1]; our CSLM method has worse execution time performance but better quality performance relative to modified 4:2:0(A)-TN [6].

V. CONCLUSION
We have presented the proposed CSLM chroma subsampling method for RGB full-color images. First, a newly reconstructed 2×2 RGB full-color block-distortion model is presented. Then, an overdetermined system is derived to deploy the two chroma subsampled parameters and four luma modification parameters into the distortion model. Furthermore, we show that the derived overdetermined system can be solved by the matrix pseudoinverse technique, determining the solution of the required chroma subsampled pair and four modified luma values for each 2×2 YUV block. Finally, a whole procedure is provided to realize our CSLM method. Based on the Kodak and IMAX datasets, the comprehensive experimental results have justified the quality and quality-bitrate tradeoff merits of our CSLM method relative to six traditional chroma subsampling methods and three state-of-the-art methods. In addition, based on the video sequence "Boat", the quality-bitrate tradeoff merit of our method has been justified.

How to extend the delivered process to estimate the four chroma pairs of each 2×2 chroma block, as described in Subsection II.A, using other nonlinear upsampling processes, such as the bicubic interpolation-based estimation process, is our first future work. In our second future work, we hope to combine CSLM and the discrete cosine transform-based (DCT-based) downsampling approach, and then compare it with the current work [13], in which the downsampling process is only done in the DCT domain and nothing is done on the chroma subsampling and luma modification prior to the compression.

FIGURE 5. The quality-bitrate tradeoff merit of our CSLM method for the video sequence “Boat” on VTM-8.0.

ACKNOWLEDGEMENT
We appreciate the help of Associate Editor Prof. Z. Pan and the two anonymous reviewers for their valuable comments to improve the manuscript. We also appreciate the proofreading help of Ms. Catherine Harrington to improve the manuscript.

REFERENCES
[1] K. L. Chung, T. C. Hsu, and C. C. Huang, "Joint chroma subsampling and distortion-minimization-based luma modification for RGB color images with application," IEEE Transactions on Image Processing, vol. 26, no. 10, pp. 4626-4638, Oct. 2017.
[2] B. N. Datta, Numerical Linear Algebra and Applications, 1st ed. CA, USA: Brooks/Cole Publishing Company, 1995, pp. 315-324.
[3] IMAX True Color Image Collection. Accessed: Aug. 2014. [Online]. Available: http://www.comp.polyu.edu.hk/~cslzhang/CDM_Dataset.htm
[4] Kodak True Color Image Collection. Accessed: Aug. 2014. [Online]. Available: http://r0k.us/graphics/kodak
[5] X. Li and M. T. Orchard, "New edge-directed interpolation," IEEE Transactions on Image Processing, vol. 10, no. 10, pp. 1521-1527, Oct. 2001.
[6] T. L. Lin, Y. C. Yu, K. H. Jiang, C. F. Liang, and P. S. Liaw, "Novel chroma sampling methods for CFA video compression in AVC, HEVC and VVC," IEEE Transactions on Circuits and Systems for Video Technology, early access, 2019.
[7] Spatial Scalability Filters, document ISO/IEC JTC1/SC29/WG11 ITU-T SG 16 Q.6, Jul. 2005.
[8] Versatile Video Coding (VVC). [Online]. Available: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM
[9] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error measurement to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.
[10] Y. C. Yu, J. W. Jhang, X. Wei, H. W. Tseng, Y. Wen, and Z. Liu, "Chroma upsampling for YCbCr 420 videos," IEEE International Conference on Consumer Electronics, pp. 163-164, June 2017.
[11] L. Zhang, L. Zhang, X. Mou, and D. Zhang, "FSIM: A feature similarity index for image quality assessment," IEEE Transactions on Image Processing, vol. 20, no. 8, pp. 2378-2386, Aug. 2011.
[12] Y. Zhang, D. Zhao, J. Zhang, R. Xiong, and W. Gao, "Interpolation-dependent image downsampling," IEEE Transactions on Image Processing, vol. 20, no. 11, pp. 3291-3296, Nov. 2011.
[13] S. Zhu, C. Cui, R. Xiong, Y. Guo, and B. Zeng, "Efficient chroma sub-sampling and luma modification for color image compression," IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 5, pp. 1559-1563, May 2019.


KUO-LIANG CHUNG (SM'01) received his B.S., M.S., and Ph.D. degrees from National Taiwan University, Taipei, R.O.C. in 1982, 1984, and 1990, respectively. He has been a Chair Professor of the Department of Computer Science and Information Engineering at National Taiwan University of Science and Technology, Taipei, R.O.C. since 2009. He was the recipient of the Distinguished Research Award (2004-2007; 2019-2022) and the Distinguished Research Project Award (2009-2012) from the Ministry of Science and Technology of R.O.C. In 2020, he received the K. T. Li Fellow Award from the Institute of Information and Computing Machinery, R.O.C. He has been an Associate Editor of the Journal of Visual Communication and Image Representation since 2011. His research interests include deep learning, image processing, and video compression.

JEN-SHUN CHENG received his B.S. degree in Computer Science and Information Engineering from the Fu Jen Catholic University, New Taipei City, R.O.C., in 2016. He is currently working towards his M.S. degree in Computer Science and Information Engineering at the National Taiwan University of Science and Technology, Taipei, Taiwan. His research interests include image processing and video compression.

HONG-BIN YANG is a junior majoring in Computer Science and Information Engineering at the National Taiwan University of Science and Technology, R.O.C., Taiwan. He has participated in some programming contests such as ACM ICPC. Under the supervision of Prof. K. L. Chung, he is currently working on a project in image processing.


This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
