IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 8, NO. 6, JUNE 1999

Significance-Linked Connected Component Analysis for Wavelet Image Coding


Bing-Bing Chai, Member, IEEE, Jozsef Vass, Student Member, IEEE, and Xinhua Zhuang, Senior Member, IEEE
Abstract: Recent success in wavelet image coding is mainly attributed to recognition of the importance of data organization and representation. Several very competitive wavelet coders have been developed, namely, Shapiro's embedded zerotree wavelets (EZW), Servetto et al.'s morphological representation of wavelet data (MRWD), and Said and Pearlman's set partitioning in hierarchical trees (SPIHT). In this paper, we develop a novel wavelet image coder called significance-linked connected component analysis (SLCCA) of wavelet coefficients that extends MRWD by exploiting both within-subband clustering of significant coefficients and cross-subband dependency in significant fields. Extensive computer experiments on both natural and texture images show convincingly that the proposed SLCCA outperforms EZW, MRWD, and SPIHT. For example, for the Barbara image, at 0.25 b/pixel, SLCCA outperforms EZW, MRWD, and SPIHT by 1.41 dB, 0.32 dB, and 0.60 dB in PSNR, respectively. It is also observed that SLCCA works extremely well for images with a large portion of texture. For eight typical 256 × 256 grayscale texture images compressed at 0.40 b/pixel, SLCCA outperforms SPIHT by 0.16 dB–0.63 dB in PSNR. This outstanding performance is achieved without using any optimal bit allocation procedure. Thus both the encoding and decoding procedures are fast.

Index Terms: Clustering, connected component, image coding, morphology, significance-link, subband, wavelet.

Manuscript received May 15, 1997; revised June 1, 1998. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Antonio Ortega. B.-B. Chai is with Sarnoff Corporation, Princeton, NJ 08543 USA (e-mail: [email protected]). J. Vass and X. Zhuang are with the Department of Computer Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA (e-mail: [email protected]; [email protected]). Publisher Item Identifier S 1057-7149(99)04065-8.

I. INTRODUCTION

Since its introduction for speech coding in the 1970s, subband coding [1] has become a very active research area for image and video compression. Wavelet theory [2]–[4] provides a fundamental insight into the structure of subband filters that leads to a more productive approach to designing the filters [2], [5]. This is evidenced by the discovery of symmetric biorthogonal wavelet bases with compact support [2], [6], which are instantly converted into more desirable linear phase filters while maintaining the necessary perfect reconstruction. Thus subband and wavelet are often used interchangeably in the literature. Most of the subband image coders published recently are based on the pyramidal (or dyadic) wavelet decomposition shown in Fig. 1. Conventional wavelet or subband image coders [6], [7] mainly exploit the energy compaction property of subband decomposition by using optimal bit allocation strategies. The drawback is apparent in that all zero-valued wavelet coefficients, which convey little information, must be represented and encoded, consuming a significant portion of the bit budget. Although this type of wavelet coder provides superior visual quality by eliminating the blocking effect in comparison to block-based image coders such as JPEG [8], its objective performance measured by peak signal-to-noise ratio (PSNR) increases only moderately.

In recent years, we have seen an impressive advance in wavelet image coding. The success is mainly attributed to innovative strategies for data organization and representation of wavelet-transformed images which exploit the statistical properties in a wavelet pyramid one way or the other. There are three representatives of such top-ranked wavelet image coders, namely, Shapiro's embedded zerotree wavelet coder (EZW) [9], Servetto et al.'s morphological representation of wavelet data (MRWD) [10], [11], and Said and Pearlman's set partitioning in hierarchical trees (SPIHT) [12]. Both EZW and SPIHT exploit cross-subband dependency of insignificant wavelet coefficients, while MRWD performs within-subband clustering of significant wavelet coefficients. As a result, the PSNR of reconstructed images is consistently raised by 1–3 dB over block-based transform coders.

In this paper, we propose a novel strategy for data organization and representation for wavelet image coding termed significance-linked connected component analysis (SLCCA). SLCCA strengthens MRWD by exploiting not only within-subband clustering of significant coefficients but also cross-subband dependency in the significant fields. The cross-subband dependency is effectively exploited by using the so-called significance-link between a parent cluster and a child cluster.
The key components of SLCCA include multiresolution discrete wavelet decomposition, connected component analysis of significant fields within subbands, significance-link registration across subbands, and bit-plane encoding of magnitudes of significant coefficients by adaptive arithmetic coding.

The rest of the paper is organized as follows. In Section II, we discuss the statistical properties of wavelet-transformed images. In Section III, we analyze and compare the data organization and representation strategies used by EZW, MRWD, and SPIHT. Our wavelet image coding algorithm, SLCCA, is presented in Section IV. Section V presents the performance evaluation. In Section V-A, the performance of SLCCA is evaluated against other wavelet coders. A performance comparison between SLCCA and the most recently published state-of-the-art codecs is given in Section V-B. The last section concludes the paper.

1057–7149/99$10.00 © 1999 IEEE

CHAI et al.: WAVELET IMAGE CODING

Fig. 1. Wavelet pyramid. (a) Three-scale wavelet decomposition for the Lena image. (b) Illustration of parent-child relationship between subbands at different scales.

II. STATISTICAL PROPERTIES OF WAVELET-TRANSFORMED IMAGES

Discrete-wavelet-transformed images demonstrate the following statistical properties, whose exploitation continually proves to be important for image compression:
1) spatial-frequency localization;
2) energy compaction;
3) within-subband clustering of significant coefficients;
4) cross-subband similarity;
5) decay of magnitude of wavelet coefficients across subbands.

Each wavelet coefficient contains only features from a local segment of an input image, i.e., it is spatially localized. Since subband coding decomposes an image into a few frequency bands with almost no overlap, each subband is frequency localized with nearly independent frequency content. In brief, each wavelet coefficient represents information in a certain frequency range at a certain spatial location.

A natural image is typically composed of a large portion of homogeneous and textured regions, together with a rather small portion of edges including perceptually important object boundaries. Homogeneous regions have the least variation and mostly consist of low frequency components; textured regions have moderate variation and consist of a mixture of low and high frequency components; and edges show the most variation and are mostly composed of high frequency components. Accordingly, the wavelet transform compacts most of the energy distributed over homogeneous and textured regions into the lowpass subband. Each time a lowpass subband at a fine resolution is decomposed into four subbands at a coarser resolution, critical sampling is applied that allows the newly generated lowpass subband to be represented by using only one-fourth the size of the original lowpass subband. Repeating this decomposition process on an image will effectively compact the energy into few wavelet coefficients.

A wavelet coefficient $c$ is called significant with respect to a predefined threshold $T$ if $|c| \ge T$; otherwise, it is deemed insignificant. An insignificant coefficient is also known as a zero coefficient. Due to the absence of high frequency components in homogeneous regions and the presence of high frequency components in textured regions and around edges, significant coefficients in highpass subbands usually indicate the occurrence of edges or textures with high energy. In other words, they are indicative of prominent discontinuity or prominent changes, a phenomenon which tends to be clustered. The within-subband clustering of the Lena image is shown in Figs. 1(a) and 2(a).

Relative to a given wavelet coefficient, all coefficients at finer scales of similar orientation which correspond to the same spatial location are called its descendants; accordingly, the given coefficient is called their ancestor. Specifically, the coefficient at the coarse scale is called the parent and all four coefficients corresponding to the same spatial location at the next finer scale of similar orientation are called children [Fig. 1(b)]. These concepts were introduced by Lewis and Knowles in [13]. Although the linear correlation between the values of parent and child wavelet coefficients has been empirically found to be extremely small, there is additional dependency between the magnitudes of parent and children. Experiments showed that the correlation between the squared magnitude of a child and the squared magnitude of its parent tends to be between 0.2 and 0.6 with a strong concentration around 0.35 [9]. Although it appears to be difficult to characterize and make full use of this cross-subband magnitude similarity, a reasonable conjecture based on experience with real-world images is that the magnitude of a child is smaller than the magnitude of its parent.
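Energy compaction, critical sampling, and the clustering of significant coefficients can be illustrated on a toy example. The sketch below is not the transform used in this paper: it assumes a single level of the orthonormal 2-D Haar split (chosen only because it fits in a few lines), a synthetic 8 × 8 test image, and a hypothetical threshold of 10. The names `haar_level`, `energy`, and `significance_map` are illustrative.

```python
def haar_level(img):
    """One level of an orthonormal 2-D Haar split of an even-sized image.

    Returns (ll, dh, dv, dd): the lowpass band and three detail bands,
    each one-fourth the size of the input (critical sampling).
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    ll = [[0.0] * cols for _ in range(rows)]
    dh = [[0.0] * cols for _ in range(rows)]
    dv = [[0.0] * cols for _ in range(rows)]
    dd = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2.0  # local average (lowpass)
            dh[i][j] = (a - b + c - d) / 2.0  # difference between columns
            dv[i][j] = (a + b - c - d) / 2.0  # difference between rows
            dd[i][j] = (a - b - c + d) / 2.0  # diagonal difference
    return ll, dh, dv, dd


def energy(band):
    """Sum of squared values of a 2-D band."""
    return sum(v * v for row in band for v in row)


def significance_map(band, threshold):
    """1 where |coefficient| >= threshold, 0 elsewhere."""
    return [[1 if abs(v) >= threshold else 0 for v in row] for row in band]


# Synthetic image: a smooth ramp with one sharp vertical edge at column 5.
image = [[10 + i + j + (40 if j >= 5 else 0) for j in range(8)] for i in range(8)]
ll, dh, dv, dd = haar_level(image)
total = energy(image)
ll_share = energy(ll) / total                 # fraction of energy in the lowpass band
edge_map = significance_map(dh, threshold=10)  # significant detail coefficients
```

On this image the lowpass band, one-fourth the size of the input, retains more than 90% of the energy, and the only significant detail coefficients form a single column along the edge, which is the clustering behavior described above.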
By assuming a Markov random field as the image model, we are able to prove that statistically the magnitude of wavelet coefficients decays from the parent to its children. More precisely, if we measure the coefficient magnitude by its variance, we can prove that a parent has larger variance than its children [14]. This provides strong theoretical support for EZW, SPIHT, and SLCCA.

Fig. 2. Significance map for six-scale wavelet decomposition, q = 11. (a) Significance map after quantization: white pixels denote insignificant coefficients and black pixels denote significant coefficients. (b) The marker image (after removing clusters having only one significant coefficient) shows the seed positions for the 689 clusters. Note that only 16 seed positions need to be transmitted explicitly with the use of significance-link. (c) The transmitted significance map. White pixels denote insignificant coefficients that are not encoded. Black and gray pixels denote encoded significant and insignificant wavelet coefficients, respectively.

III. OVERVIEW OF DATA ORGANIZATION AND REPRESENTATION STRATEGIES

There exist two different approaches to an efficient organization and representation of wavelet coefficients in the literature [9], [11], [12]. While EZW and SPIHT use a regular tree structure or set-partitioned tree structure to approximate insignificant fields across subbands, MRWD finds irregular clusters of significant fields within subbands.

It is widely accepted from source coding theory that, in general, an image compression technique grows computationally more complex as it becomes more efficient. EZW interrupts this tendency by achieving outstanding performance with very low computational complexity. It efficiently identifies and approximates arbitrarily shaped zero regions across subbands by the union of highly constrained tree-structured zero regions called zerotrees. Meanwhile, it defines the significant fields everywhere outside these zero regions through progressively refining the magnitudes of coefficients. It is apparent that each zerotree can be effectively represented by its root symbol. On the other hand, there may still be many zero coefficients which cannot be included in the highly structured zerotrees. These isolated zeros remain expensive to represent.
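The distinction between a zerotree root and an isolated zero can be made concrete with a small sketch. It is a simplification, not Shapiro's coder: it assumes one orientation of the pyramid stored as a list of 2-D arrays from coarsest to finest, the usual dyadic indexing in which the children of position (i, j) are the four positions starting at (2i, 2j), and a single fixed threshold. The names `children`, `is_zerotree_root`, and `classify` are hypothetical.

```python
def children(i, j):
    """Positions of the four children at the next finer scale (dyadic indexing assumed)."""
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1), (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]


def is_zerotree_root(levels, k, i, j, threshold):
    """True if the coefficient at level k and all of its descendants are insignificant."""
    if abs(levels[k][i][j]) >= threshold:
        return False
    if k + 1 == len(levels):  # finest level: no descendants left to check
        return True
    return all(is_zerotree_root(levels, k + 1, ci, cj, threshold)
               for ci, cj in children(i, j))


def classify(levels, k, i, j, threshold):
    """Label one coefficient as significant, zerotree root, or isolated zero."""
    if abs(levels[k][i][j]) >= threshold:
        return "significant"
    if is_zerotree_root(levels, k, i, j, threshold):
        return "zerotree root"
    return "isolated zero"


# Two scales of one orientation, coarsest first (toy values).
levels = [
    [[2, 5]],
    [[1, 0, 3, 12],
     [0, 2, 1, 0]],
]
labels = [classify(levels, 0, 0, j, threshold=8) for j in range(2)]
```

Here the first coarse coefficient stands for an entire insignificant tree with a single symbol, while the second is insignificant but has a significant child (12) and therefore has to be coded as an isolated zero.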

TABLE I
PERFORMANCE COMPARISON (PSNR [dB]) ON 512 × 512 LENA IMAGE

SPIHT seeks to enhance EZW by partitioning the cross-subband tree structure into three parts, i.e., tree root, children of the root, and nonchild descendants of the root, the last part comprising a majority of the population in the tree structure. When a child coefficient is found significant, EZW represents and encodes all four grandchild coefficients separately even if all nonchild descendants are insignificant. By contrast, SPIHT treats all the insignificant nonchild descendants as a set and employs a single symbol to represent and encode it. This fine set partitioning strategy leads to an impressive increase in PSNR of 0.86–0.94 dB over EZW on the Lena image (see Table I), indicating that SPIHT exploits cross-subband dependency more efficiently than EZW.

Different from EZW and SPIHT, MRWD [10], [11] directly forms irregularly shaped clusters of significant coefficients within subbands. The clusters within a subband are progressively delineated by insignificant boundary zeros through the morphological conditioned dilation operation [15], which utilizes a structuring element to control the shape and size of clusters as well as the formation of boundaries. Details on cluster formation will be presented in the following section. With MRWD, the boundary zeros of each cluster still need to be represented, but the expensive cost of representing and encoding isolated zeros in EZW is largely avoided. As a result, MRWD consistently outperforms EZW. For instance, it gains 0.78–0.95 dB over EZW on the Lena image, as shown in Table I. Nevertheless, in the early version of MRWD [10], a seed for each cluster, i.e., a pixel from which a cluster originates, needs to be specified and its positioning information is encoded as overhead. Since a large number of clusters are involved, the overall overhead may take up a significant portion of the bit budget. Our new coding algorithm is developed based on our knowledge of this early version of MRWD.
In its latest version [11], MRWD keeps its main feature, i.e., the within-subband clustering. However, the coding method has been changed. All wavelet coefficients are coded regardless of their significance. The seed of each cluster is specified by transmitting a special symbol. In addition, a context-based adaptive arithmetic model is employed for entropy coding, where the context is based on the significance of the parent coefficient. The use of this adaptive arithmetic model is an implicit exploitation of the cross-subband similarity, which is expected to improve the compression. The experimental results show that the latest MRWD indeed outperforms the early version (Table I).

IV. SIGNIFICANCE-LINKED CONNECTED COMPONENT ANALYSIS

In this section, the key features of our wavelet coder SLCCA are first described. Then a complete algorithm is presented.

Fig. 3. Structuring elements used in conditioned dilation.

A. Formation of Connected Components within Subbands

First, we review some morphological operations relevant to our algorithm. A more detailed discussion of mathematical morphology can be found in [15]–[17]. A binary image can be considered as a subset of $Z \times Z$, where $Z$ denotes the set of numbers used to index a row or column position on a binary image. Pixels are in this subset if and only if they have the binary value 1 on the image. The dilation of set $A \subseteq Z \times Z$ with set $B \subseteq Z \times Z$ is defined by

$$A \oplus B = \bigcup_{b \in B} (A)_b$$

where $(A)_b$ denotes the translation of $A$ by $b$ and $\bigcup$ the pixel-wise union. For a structuring element $B$ that contains the origin (such as those used in SLCCA and shown in Fig. 3), the dilation operation produces an enlarged set containing the original set $A$ and some neighboring pixels. Let $A$ denote the set to be reconstructed and the marker $M$ be an arbitrary subset of $A$. Then the conditioned dilation operation is defined as follows [15]:

$$M \oplus|_A B = (M \oplus B) \cap A$$

where $\cap$ is the pixel-wise intersection. If we let

$$M_0 = M, \qquad M_n = M_{n-1} \oplus|_A B, \quad n = 1, 2, \ldots \tag{1}$$

then a cluster is formed when $M_n = M_{n-1}$.

Since a rather large portion of wavelet coefficients are usually insignificant and significant coefficients within subbands tend to be clustered, organizing and representing each subband as irregularly shaped clusters of significant coefficients provides an efficient way for encoding. Clusters are progressively constructed by using conditioned dilation, resulting in an effective segmentation of the within-subband significant field. The idea was sketched in [10]. In the following, we discuss the selection of structuring elements.

In the case of clustering in the significance field, the binary image $A$ represents the significance map, i.e.,

$$A(x, y) = \begin{cases} 1, & \text{if the wavelet coefficient at location } (x, y) \text{ is significant} \\ 0, & \text{otherwise.} \end{cases}$$

The marker $M$ contains the seeds of each cluster. Traditionally, a connected component is defined based on one of three types of connectivity: 4-connected, 8-connected, and 6-connected, each requiring a geometric adjacency of two neighboring pixels. Since the significant coefficients in the wavelet field are only loosely clustered, the conventional definition of connected component will produce too many components, affecting the coding efficiency. Thus we may use symmetric structuring elements with a size larger than a 3 × 3 square. But we still call the segments generated by conditioned dilation connected components even if they are not geometrically connected. Some structuring elements tested in our experiments are shown in Fig. 3. The ones in Fig. 3(a) and (b) generate 4- and 8-connectivity, respectively. The structuring elements in Fig. 3(c) and (d) represent a diamond of size 13 and a 5 × 5 square, respectively. These latter two may not preserve geometric connectivity but perform better than the former in terms of coding efficiency.

To effectively delineate a significant cluster, all zero coefficients within the neighborhood of each significant coefficient in the cluster need to be marked as the boundary of the cluster. By increasing the size of the structuring element, the number of connected components decreases. On the other hand, a larger structuring element results in more boundary zero coefficients. The optimal choice of the size of the structuring element is determined by the cost of encoding boundary zeros versus that of encoding the positional information of connected components. Since the significance-link technique, which will be presented in the next subsection, largely reduces the positioning cost, relatively smaller structuring elements can be selected for connected component analysis.

The progressive cluster detection by the conditioned dilation operation is illustrated in Fig. 4, where an image size of 5 × 5 is assumed and the 4-connected structuring element shown in Fig. 3(a) is used. The significance map is shown in Fig. 4(a). In the example, one pixel is chosen as the seed [Fig. 4(b)], and the remaining seven steps of the recursive cluster detection are shown in Figs. 4(c)–(i). As extremely small clusters usually do not produce discernible visual effects, and these clusters render a higher insignificant-to-significant coefficient ratio than large clusters, they are eliminated to avoid more expensive coding cost.

The connected component analysis is illustrated in Fig. 2. The significance map obtained by quantizing all wavelet coefficients with a uniform scalar quantizer of step size $q = 11$ is shown in Fig. 2(a). There are 22 748 significant wavelet coefficients after quantization, forming 1654 clusters using the structuring element shown in Fig. 3(c). After removing connected components having only one significant coefficient, the number of clusters is reduced to 689. The final encoded significance map is shown in Fig. 2(c) with the marker image shown in Fig. 2(b). It is clear that only a small fraction of zero coefficients are encoded.

Fig. 4. Demonstration of the progressive cluster detection by using conditioned dilation on a simple example. White pixels denote insignificant coefficients that are not coded. Black and gray pixels denote encoded significant (S) and insignificant (I) coefficients, respectively. (a) The significance map and (b) the seed position. (c)–(i) Steps of the algorithm. The final transmitted string is SISIIISISIISISIIII.

B. Significance-Link in Wavelet Pyramid

The cross-subband similarity among insignificant coefficients in the wavelet pyramid has been exploited in EZW and SPIHT, which greatly improves the coding efficiency. On the other hand, it is found that the spatial similarity in the wavelet pyramid is not strictly satisfied, i.e., an insignificant parent does not guarantee that all four children are insignificant. The isolated zero symbol used in EZW indicates the failure of such a dependency. The similarity described by the zerotree in EZW and the similarity described by both the zerotree and the set of all insignificant second-generation descendants in SPIHT are more of a reality when a large threshold is used. As was stated in [9] and [18], when the threshold decreases (for embedding) to a certain point, the tree structure or set-partitioned-tree structure is no longer efficient.

In the proposed algorithm, as opposed to EZW and SPIHT, we attempt to exploit the spatial similarity among significant coefficients. However, we do not seek a very strong parent-child dependency for each and every significant coefficient. Instead, we try to predict the existence of clusters at finer scales. As pointed out before, statistically, the magnitudes of wavelet coefficients decay from a parent to its children. It implies that in a cluster formed within a fine subband, there likely exists a significant child whose parent at the coarser subband is also significant. In other words, a significant child can likely be traced back to its parent through this significance linkage. It is crucial to note that this significance linkage relies on a much looser spatial similarity.

Now, we define significance-link formally. Two connected components or clusters are called significance-linked if the significant parent belongs to one component, and at least one of its children is significant and lies in another component (Fig. 5).
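The two notions introduced so far, clusters grown by iterating (1) and significance-links between clusters at adjacent scales, can be sketched together as follows. This is an illustrative reading rather than the authors' implementation: significance maps are represented as Python sets of (row, column) positions, the structuring element is a 4-connected cross in the spirit of Fig. 3(a), seeds are taken in raster order, and the children of (r, c) are assumed to start at (2r, 2c). All function names are hypothetical.

```python
# 4-connected structuring element containing the origin (assumed analogue of Fig. 3(a)).
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def dilate(points, element):
    """Dilation: union of the translations of `points` by every offset in `element`."""
    return {(r + dr, c + dc) for (r, c) in points for (dr, dc) in element}


def grow_cluster(significant, seed, element=CROSS):
    """Iterate M_n = (M_{n-1} dilated by B) intersected with A until M_n == M_{n-1}."""
    marker = {seed}
    while True:
        grown = dilate(marker, element) & significant
        if grown == marker:
            return marker
        marker = grown


def find_clusters(significant, element=CROSS):
    """Segment a significance map into clusters, taking seeds in raster order."""
    remaining = set(significant)
    clusters = []
    for seed in sorted(significant):
        if seed in remaining:
            cluster = grow_cluster(significant, seed, element)
            clusters.append(cluster)
            remaining -= cluster
    return clusters


def significance_linked(parent_cluster, child_cluster):
    """True if a significant parent in one cluster has a significant child in the other."""
    for (r, c) in parent_cluster:
        kids = {(2 * r, 2 * c), (2 * r, 2 * c + 1), (2 * r + 1, 2 * c), (2 * r + 1, 2 * c + 1)}
        if kids & child_cluster:
            return True
    return False


# Toy significance maps: a 4x4 coarse subband and the 8x8 subband one scale finer.
coarse = {(0, 0), (0, 1), (1, 1), (3, 3)}
fine = {(2, 3), (2, 4), (3, 4), (7, 0)}
coarse_clusters = find_clusters(coarse)
fine_clusters = find_clusters(fine)
linked = significance_linked(coarse_clusters[0], fine_clusters[0])
```

In this toy case the fine cluster containing (2, 3) is reachable from the parent at (1, 1), so its position would not need to be sent explicitly, whereas the isolated coefficient at (7, 0) has no significant parent and would still require its own seed (or be discarded as an extremely small cluster).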
If the positional information of the significant parent in the first component is available, the positional information for the second component can be inferred through marking the parent as having a significance-link. Since there are generally many significant coefficients in connected components, the likelihood of finding a significance-link between two components is fairly high. Marking the significance-link costs much less than directly encoding the position, and a significant saving on encoding cluster positions is thus achieved. An experiment has been conducted to test the effectiveness of significance-link. Two versions of the algorithm are tested under the condition that all parameters are set the same, except that one version uses significance-link while the other does not. It has been shown that the saving from using significance-link increases as the bit rate increases, ranging from 527 bytes (at 0.25 b/pixel) to 3103 bytes (at 1 b/pixel) for the Lena image. Above all, the use of significance-link is the major difference between SLCCA and MRWD.

TABLE II
PERFORMANCE COMPARISON (PSNR [dB]) ON 512 × 512 BARBARA IMAGE

Fig. 5. Illustration of significance-link. The values are the magnitudes of quantized coefficients. Nonzero values denote significant coefficients.

C. Bit-Plane Organizing and Adaptive Arithmetic Coding

As in most image compression algorithms, the last step of SLCCA involves entropy coding, for which adaptive arithmetic coding [19] is employed. In contrast to a fixed model arithmetic coder, which works well for a stationary Markov source, the adaptive arithmetic coder updates the corresponding conditional probability estimation every time the coder visits a particular context. For the data stream generated by a nonstationary source such as natural images, the conditional probabilities or local probability distributions may vary substantially from one section to another. The knowledge of the local probability distributions acquired by an adaptive model is more robust than global estimates and follows the local statistical variation well. In comparison to the fixed model arithmetic coder, the adaptive arithmetic coder is thus able to achieve higher compression.

In order to exploit the full strength of an adaptive arithmetic coder, it is preferable to organize outcomes of a nonstationary Markov source into such a stream that each local probability distribution is in favor of one source symbol.
This is the basic idea behind the well-known lossless bit-plane coding [20], in which an original image is divided into bit-planes with each bit-plane being encoded separately. Since more significant bit-planes generally contain large uniform areas, the entropy coding techniques can be more efficient.
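Why bit-plane organization helps an adaptive coder can be seen on a short list of integers. The sketch below does not implement the arithmetic coder of [19]; as a stand-in it assumes a simple adaptive binary model whose two symbol counts start at one, and it reports the ideal code length (the sum of $-\log_2 p$ over the coded bits) that an arithmetic coder driven by that model would approach. The sample values and the names `bit_planes` and `adaptive_code_length` are illustrative.

```python
import math


def bit_planes(values):
    """Split non-negative integers into bit-planes, most significant plane first."""
    width = max(values).bit_length()
    return [[(v >> b) & 1 for v in values] for b in range(width - 1, -1, -1)]


def adaptive_code_length(bits):
    """Ideal code length in bits under an adaptive binary model (counts start at 1)."""
    counts = [1, 1]
    total_bits = 0.0
    for bit in bits:
        p = counts[bit] / (counts[0] + counts[1])  # current estimate for this symbol
        total_bits += -math.log2(p)
        counts[bit] += 1                           # update the model after coding
    return total_bits


# Mostly small values with one large value, so the high planes are nearly all zeros.
values = [1, 2, 1, 3, 1, 1, 2, 9, 1, 2, 1, 1, 4, 1, 2, 1]
planes = bit_planes(values)
costs = [adaptive_code_length(plane) for plane in planes]
```

For these sixteen values the most significant plane contains a single 1 and costs about 8.1 bits under the assumed model, while the least significant plane, with its mix of zeros and ones, costs about 16.2 bits, slightly more than one bit per symbol.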

This idea is employed by SLCCA to encode the magnitude of significant coefficients in each subband. The magnitude of each significant coefficient is converted into a binary representation with a fixed length determined by the maximum magnitude in the subband. Generally, there are more coefficients with small magnitudes than those with large magnitudes, implying that the more significant bit-planes would contain a lot more zeros than ones. Accordingly, the adaptive arithmetic coder would generate more accurate local probability distributions in which the conditional probabilities for 0 symbols are close to one for the more significant bit-planes.

The context used to determine the conditional probability model of a significant coefficient is related to the significance status of its parent and its eight neighbors. Let $p$ denote the significance status of the parent, i.e., $p = 1$ if the parent pixel is significant, otherwise $p = 0$. Let $n$ denote the number of significant coefficients in a 3 × 3 causal neighborhood of the current pixel. The adaptive context is selected by the pair $(p, n)$, which yields a total of $2 \times 9 = 18$ possible models.

The bit-plane encoding idea is also used in both EZW and SPIHT, but in a different manner. In EZW, for instance, the idea is realized through progressive transmission of magnitudes, with the 0 bits before the first 1 bit being encoded as either zerotree or isolated zero.

D. Description of SLCCA Algorithm

In the following, we summarize the previous three subsections with the encoding algorithm of SLCCA. Four symbols are used to encode the shape of clusters: POS, NEG, ZERO, and LINK. POS or NEG represents the sign of a significant coefficient. ZERO represents an insignificant coefficient that delineates the boundary of a cluster. LINK marks the presence of a significance-link. The magnitudes of significant coefficients are encoded in bit-plane order with two symbols: 0 and 1.
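The context selection of Section IV-C, which governs how these 0 and 1 symbols are modeled, can be sketched as below. Several details are assumptions made for illustration: the neighborhood is taken to be the eight positions surrounding the current one (clipped at the subband border), and the 18 contexts are numbered as 9p + n, which is only one of many equivalent labelings. The function name `context_index` and the per-context count table are hypothetical.

```python
def context_index(sig_map, parent_significant, r, c):
    """Pick one of 18 models from parent significance and the neighbor count.

    sig_map is a 2-D list of 0/1 significance flags for the current subband.
    """
    rows, cols = len(sig_map), len(sig_map[0])
    n = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue  # the current position itself is not a neighbor
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                n += sig_map[rr][cc]
    p = 1 if parent_significant else 0
    return 9 * p + n  # assumed numbering: p in {0, 1}, n in {0, ..., 8}


# One pair of adaptive symbol counts per context (illustrative initialization).
models = [[1, 1] for _ in range(18)]

sig_map = [
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
]
ctx_center = context_index(sig_map, parent_significant=True, r=1, c=1)
ctx_corner = context_index(sig_map, parent_significant=False, r=0, c=0)
```

Each magnitude bit would then be coded with, and afterwards update, the counts stored in `models[ctx]`, so that coefficients in busy neighborhoods and coefficients in sparse neighborhoods are modeled with separate statistics.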
Three lists of coefficients are maintained in the algorithm: list of scan order (LSO), list of child clusters (LCC), and list of significant coefficients (LSC). All these lists are first-in-first-out queues. Each entry in the lists is identified by a coordinate $(i, j)$; $c(i, j)$ denotes the coefficient at position $(i, j)$.

BEGIN SLCCA-encode( )
Step 1: Form a subband pyramid and quantize all wavelet coefficients with a uniform scalar quantizer. The quantization step size is selected such that the target bit rate is satisfied.
Step 2: Perform connected component analysis of significant coefficients within each subband using conditioned dilation and remove extremely small connected components.
Step 3: Form LSO containing all coefficient positions in the subband pyramid as follows. Starting from the coarsest subband, scan subbands according to the order indicated in Fig. 1(b). Within each subband, scan the coefficients from left to right, top to bottom. Go to the next finer scale after all coefficients in the current scale have been scanned.
Step 4: Encoding clusters.
  4.1. Start of a new cluster. For every entry $(i, j)$ in LSO, if $c(i, j)$ is significant and has not yet been encoded:
    4.1.1. encode the position $(i, j)$, i.e., $(i, j)$ is the seed of a cluster;
    4.1.2. call Encode-significant-coeff$(i, j)$.
  4.2. Encode child clusters. For each entry $(i, j)$ in LCC:
    4.2.1. remove $(i, j)$ from LCC;
    4.2.2. for each of the four child positions $(k, l)$ represented by the entry:
      4.2.2.1. if $c(k, l)$ is significant and has not been encoded, go to Step 4.2.3;
      4.2.2.2. if $c(k, l)$ is insignificant and has not been encoded, encode a ZERO symbol.
    4.2.3. Call Encode-significant-coeff$(k, l)$.
Step 5: Encode the magnitude of significant coefficients, i.e., all the entries from the LSC list, in bit-plane order using the adaptive arithmetic coder.
END SLCCA-encode( )

BEGIN Encode-significant-coeff$(i, j)$
Step 1: Encode the sign (POS or NEG) of $c(i, j)$ and put the position $(i, j)$ at the end of LSC.
Step 2: If $c(i, j)$ is the parent of a child cluster that has not been linked to any other coefficient, then:
  2.1. encode a special symbol (LINK);
  2.2. move the child position, which represents all four children of $c(i, j)$, to the end of LCC. This indicates that the child cluster has been linked.
Step 3: Expanding a cluster. For every $(k, l)$ in a predefined neighborhood of $(i, j)$:
  3.1. if $c(k, l)$ is significant and has not been encoded, then call Encode-significant-coeff$(k, l)$;
  3.2. if $c(k, l)$ is insignificant, then encode a ZERO symbol.
END Encode-significant-coeff

The decoding algorithm is straightforward and can be obtained by simply reversing the encoding process.

Fig. 6. Coding results for 512 × 512 Barbara image. (a) Original; reconstructed images (b) at 0.5 b/pixel, PSNR = 31.89 dB, (c) at 0.25 b/pixel, PSNR = 28.18 dB, and (d) at 0.125 b/pixel, PSNR = 25.36 dB.

TABLE III
PERFORMANCE COMPARISON (PSNR [dB]) OF SPIHT, MRWD, AND SLCCA ON DIFFERENT 512 × 512 NATURAL IMAGES

V. PERFORMANCE EVALUATION

A. Comparison of Algorithms Using Different Data Organization Strategies

SLCCA is evaluated on eight natural 512 × 512 grayscale images, i.e., Lena, Barbara, baboon, couple, man, boat, tank, and Goldhill. The performance is compared with three wavelet coders: EZW, MRWD, and SPIHT. In SLCCA, each original image is decomposed into a six-scale subband pyramid using the 9/7 biorthogonal filters [6]. There is no optimal bit allocation carried out in SLCCA. Instead, all wavelet coefficients are quantized with the same uniform scalar quantizer. As usual, the distortion is measured by peak signal-to-noise ratio (PSNR), defined as

$$\mathrm{PSNR} = 20 \log_{10} \frac{255}{\mathrm{RMSE}} \quad \text{[dB]}$$

where RMSE is the root mean-squared error between the original and reconstructed images. All the reported bit rates are computed from the actual file sizes.
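The distortion measure can be computed directly from its definition. The sketch below assumes 8-bit grayscale images stored as nested lists of equal size, so the peak value is 255; the function name `psnr` and the tiny test images are illustrative.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_error = 0.0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for o, r in zip(row_o, row_r):
            squared_error += (o - r) ** 2
            count += 1
    rmse = math.sqrt(squared_error / count)
    if rmse == 0.0:
        return math.inf  # identical images: distortion-free
    return 20.0 * math.log10(peak / rmse)


original = [[100, 100], [100, 100]]
reconstructed = [[101, 99], [101, 99]]  # every pixel off by one gray level
value = psnr(original, reconstructed)
```

An RMSE of one gray level corresponds to $20 \log_{10} 255 \approx 48.13$ dB, and every halving of the RMSE adds about 6.02 dB, which gives a sense of scale for the gains of a few tenths of a decibel reported in the tables.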

Fig. 7. The 256 × 256 texture images. From left to right, top to bottom: fingerprint, sweater, grass, pig skin, raffia, sand, water, and wool.

MRWD, SPIHT, and SLCCA all use the same 9/7 biorthogonal filters and six-scale dyadic wavelet decomposition. EZW also uses six-scale dyadic wavelet decomposition. However, a somewhat older filter proposed in [21] is used to obtain the wavelet transform. Table I shows the PSNR comparison on the Lena image at different bit rates. SLCCA consistently outperforms EZW, both versions of MRWD, and SPIHT. Compared to EZW, SLCCA gains 1.03 dB in PSNR on average. When compared to the latest version of MRWD [11], SLCCA is superior by 0.16 dB on average. Compared to SPIHT, SLCCA gains 0.13 dB on average.


Fig. 8. Fingerprint images. (a) Original image. (b) Reconstructed image by SLCCA at 0.444 b/pixel, PSNR = 35.65 dB.

TABLE IV PERFORMANCE COMPARISON (PSNR [dB]) OF SPIHT AND SLCCA ON 256 × 256 TEXTURE IMAGES AT 0.4 B/PIXEL

TABLE V PERFORMANCE COMPARISON (PSNR [dB]) ON 768 × 768 FINGERPRINT IMAGE

Table II compares the performance of SLCCA, EZW, MRWD, and SPIHT on the Barbara image. On average, SLCCA is superior to EZW by 1.42 dB, and to SPIHT by 0.47 dB. SLCCA also outperforms MRWD [11] by 0.33 dB on average. The original Barbara image and the reconstructed images at 0.125 b/pixel, 0.25 b/pixel, and 0.5 b/pixel are shown in Figs. 6(a)-(d), respectively.

The comparison between SLCCA, MRWD, and SPIHT on the rest of the test images is shown in Table III. SLCCA consistently outperforms both SPIHT and MRWD. It appears that SLCCA performs significantly better than SPIHT for images that are rich in texture; see, for instance, the results for Barbara, baboon, boat, and tank. For images that are relatively smooth, the performance of SLCCA and SPIHT gets closer, as indicated by the results for Goldhill, couple, and man. A similar observation holds for MRWD [11], i.e., for the texture-rich images, MRWD outperforms SPIHT in general, indicating that clustering is superior to the zerotree structure for texture images.

To further verify the above observation, we compare the performance of SLCCA and SPIHT on eight typical 256 × 256 grayscale texture images, i.e., fingerprint, sweater, grass, pig skin, raffia, sand, water, and wool, as shown in Fig. 7. The results at 0.4 b/pixel are summarized in Table IV, indicating that SLCCA consistently outperforms SPIHT by 0.16 dB to 0.63 dB. An explanation is as follows. When textured images are encoded, the wavelet transform is unlikely to yield many large zero regions because homogeneous regions are lacking. Thus, the advantage of using an insignificant tree as in EZW, or an insignificant part-of-tree structure as in SPIHT, is weakened. On the other hand, SLCCA uses significance-based clustering and significance-based between-cluster linkage, which are not affected by the existence of textures.

Finally, we apply SLCCA to fingerprint image compression, which represents a very important issue demanding the best solution. As is well known, the digitized fingerprints of a person may require 10 Mbyte of storage without any compression. With such a huge amount of data, the real-time transmission of uncompressed fingerprints becomes impossible. The FBI has developed a fingerprint image compression algorithm called wavelet scalar quantization (WSQ) [22]. Table V lists the coding results of the 768 × 768 fingerprint image from WSQ, SPIHT, and SLCCA. Again, SLCCA outperforms SPIHT by an average of 0.26 dB. At 0.444 b/pixel, or 18:1 compression, SLCCA yields a PSNR of 35.65 dB as opposed to WSQ's 34.43 dB, corresponding to a 1.22 dB improvement. The original and reconstructed images from SLCCA at 0.444 b/pixel are shown in Fig. 8. The coding results along with the images are also available at the homepage of the Multimedia Communications and Visualization Laboratory at http://meru.cecs.missouri.edu.
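The fingerprint figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes an 8 b/pixel grayscale original, which the text does not state explicitly but which is consistent with 0.444 b/pixel being described as 18:1 compression; the function names are hypothetical.

```python
def compression_ratio(bit_rate_bpp, bits_per_pixel=8):
    """Ratio of original to compressed size, assuming an 8 b/pixel original."""
    return bits_per_pixel / bit_rate_bpp


def compressed_size_bytes(width, height, bit_rate_bpp):
    """Approximate compressed size in bytes at the given bit rate."""
    return width * height * bit_rate_bpp / 8


ratio = compression_ratio(0.444)
size = compressed_size_bytes(768, 768, 0.444)
gain = 35.65 - 34.43  # SLCCA PSNR minus WSQ PSNR at 0.444 b/pixel

print(f"{ratio:.1f}:1")  # 18.0:1
print(round(size))       # 32735
print(f"{gain:.2f} dB")  # 1.22 dB
```

Under this assumption, the 768 × 768 fingerprint image occupies roughly 32 kbyte after compression, compared with 576 kbyte uncompressed.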


TABLE VI PERFORMANCE COMPARISON OF THE SFQ, EQ, OC, AND SLCCA ON LENA, BARBARA, AND GOLDHILL IMAGES

B. Comparison with the State-of-the-Art

To the best of our knowledge, there have been three other very competitive wavelet coders proposed recently, namely, Xiong et al.'s space-frequency quantization (SFQ) [23], LoPresto et al.'s estimation quantization (EQ) [24], and Joshi et al.'s optimal classification (OC) [25]. In SFQ, the zerotree structure is optimized (using the Lagrange multiplier method) in the operational rate-distortion sense for a given target bit rate. The optimization procedure yields remarkable performance at the price of high computational complexity. In EQ, the wavelet coefficients are modeled as a generalized Gaussian distribution with zero mean and unknown variance. For each subband, the variance of the wavelet coefficients is estimated by a maximum likelihood estimator. Then, each wavelet coefficient is quantized with an off-line designed optimal quantizer scaled to match the estimated variance of the wavelet field. The distinct feature of EQ is the backward adaptive magnitude estimation and quantization of a wavelet coefficient based on its quantized neighboring coefficients. In OC, the variance of the coefficients in each subband is estimated by an iterative algorithm, after which small blocks of coefficients are classified into a given number of classes based on their variance. Then, each class is modeled by a generalized Gaussian density and optimal bit allocation is carried out among classes from all the subbands.

The performance evaluation is given in Table VI. The performance comparison between the above-mentioned coding algorithms and SLCCA is fairly difficult due to the different types of filters and wavelet decompositions used by the different algorithms. SLCCA and SFQ both use the same wavelet filter and dyadic wavelet decomposition. The performance of SLCCA and SFQ for the Lena and Goldhill images is comparable, i.e., SFQ slightly outperforms SLCCA by 0.03 dB and 0.08 dB on average for the Lena and Goldhill images, respectively. For the Barbara image, SFQ exceeds SLCCA by 0.22 dB on average. EQ is superior to SLCCA by 0.35 dB and 0.21 dB on average for the Lena and Goldhill images, respectively. Nevertheless, EQ uses a 4-scale dyadic wavelet transform with the 10/18 normalized biorthogonal filter set [26], which gives a slightly superior performance when compared to the 9/7 filter used in SLCCA. Finally, OC uses a 22-band decomposition. For the Lena image at 0.25 b/pixel, SLCCA is slightly superior to OC by 0.03 dB, and at 0.5 b/pixel and 1.0 b/pixel OC exceeds SLCCA by 0.28 and 0.83 dB, respectively. Unlike SLCCA, the exact bit rate control of OC is not solved. Thus, at 0.25 b/pixel and 1.0 b/pixel the actual rate by OC exceeds the target rate.

VI. CONCLUSION

A new image coding algorithm termed significance-linked connected component analysis is proposed in this paper. The algorithm takes advantage of two properties of the wavelet decomposition: 1) the within-subband clustering of significant coefficients and 2) the cross-subband dependency in significant fields. The significance-link is employed to represent the positional information for clusters at finer scales, which greatly reduces the positional information overhead. The magnitudes of significant coefficients are coded in bit-plane order so that the local statistic in the bit stream matches the probability model in adaptive arithmetic coding to achieve further saving in bit rate. Extensive computer experiments justify that SLCCA is among the state-of-the-art image coding algorithms reported in the literature. As no optimization is involved, both the encoding and decoding procedures are fast.

ACKNOWLEDGMENT

The authors would like to thank the reviewers for their valuable comments and suggestions.

REFERENCES

[1] J. W. Woods, Ed., Subband Image Coding. Boston, MA: Kluwer, 1991.
[2] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA: SIAM, 1992.
[3] O. Rioul and M. Vetterli, "Wavelets and signal processing," IEEE Signal Processing Mag., vol. 8, pp. 14–38, Oct. 1991.
[4] M. Vetterli and J. Kovačević, Wavelets and Subband Coding. Englewood Cliffs, NJ: Prentice-Hall, 1995.
[5] S. G. Mallat, "A theory for multiresolution signal decomposition: The wavelet representation," IEEE Trans. Pattern Anal. Machine Intell., vol. 11, pp. 674–693, July 1989.
[6] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, "Image coding using wavelet transform," IEEE Trans. Image Processing, vol. 1, pp. 205–220, Apr. 1992.
[7] N. Farvardin and N. Tanabe, "Subband image coding using entropy-coded quantization," in Proc. SPIE Conf. Image Processing Algorithms and Techniques, 1990, vol. 1244, pp. 240–254.
[8] G. K. Wallace, "The JPEG still picture compression standard," Commun. ACM, vol. 34, pp. 30–44, Apr. 1991.
[9] J. M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Trans. Signal Processing, vol. 41, pp. 3445–3462, Dec. 1993.
[10] S. Servetto, K. Ramchandran, and M. T. Orchard, "Wavelet based image coding via morphological prediction of significance," in Proc. IEEE Int. Conf. Image Processing, Oct. 1995, pp. 530–533.
[11] S. Servetto, K. Ramchandran, and M. T. Orchard, "Image coding based on morphological representation of wavelet data," IEEE Trans. Image Processing, to be published.
[12] A. Said and W. A. Pearlman, "A new, fast, and efficient image codec based on set partitioning in hierarchical trees," IEEE Trans. Circuits Syst. Video Technol., vol. 6, pp. 243–250, June 1996.
[13] A. S. Lewis and G. Knowles, "A 64 Kb/s video codec using the 2-D wavelet transform," in Proc. Data Compression Conf., Snowbird, UT, 1991.
[14] X. Li and X. Zhuang, "The decay and correlation properties in wavelet transform," Tech. Rep., Univ. Missouri, Columbia, Mar. 1997.
[15] L. Vincent, "Morphological grayscale reconstruction in image analysis: Applications and efficient algorithms," IEEE Trans. Image Processing, vol. 2, pp. 176–201, Apr. 1993.
[16] R. M. Haralick, S. R. Sternberg, and X. Zhuang, "Image analysis using mathematical morphology," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-9, pp. 532–550, July 1987.


[17] R. M. Haralick and L. G. Shapiro, Computer and Robot Vision. Reading, MA: Addison-Wesley, 1992.
[18] A. Said and W. A. Pearlman, "An image multiresolution representation for lossless and lossy compression," IEEE Trans. Image Processing, vol. 5, pp. 1303–1310, Sept. 1996.
[19] I. H. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic coding for data compression," Commun. ACM, vol. 30, pp. 520–540, June 1987.
[20] M. Rabbani and P. W. Jones, Digital Image Compression Techniques. Bellingham, WA: SPIE, 1991.
[21] E. H. Adelson and E. Simoncelli, "Orthogonal pyramid transforms for image coding," in Proc. SPIE Conf. Visual Communications and Image Processing, 1987, vol. 845, pp. 50–58.
[22] J. N. Bradley, C. M. Brislawn, and T. Hopper, "The FBI wavelet/scalar quantization standard for grayscale fingerprint image compression," in Proc. SPIE Conf. Visual Communications and Image Processing, 1993.
[23] Z. Xiong, K. Ramchandran, and M. T. Orchard, "Space-frequency quantization for wavelet image coding," IEEE Trans. Image Processing, vol. 6, pp. 677–693, May 1997.
[24] S. M. LoPresto, K. Ramchandran, and M. T. Orchard, "Image coding based on mixture modeling of wavelet coefficients and a fast estimation-quantization framework," in Proc. Data Compression Conf., 1997.
[25] R. L. Joshi et al., "Comparison of different methods of classification in subband coding of images," IEEE Trans. Image Processing, vol. 6, pp. 1473–1486, Nov. 1997.
[26] B. Usevitch, "Optimal bit allocation for biorthogonal wavelet coding," in Proc. Data Compression Conf., Snowbird, UT, 1996, pp. 387–395.

Jozsef Vass (S'97) received the Dipl. Eng. degree in electrical engineering from the Technical University of Budapest, Hungary, and the M.S. degree in electrical engineering from the University of Missouri, Columbia, where he is currently pursuing the Ph.D. degree in the Department of Computer Engineering and Computer Science. He was with NASA Goddard Space Flight Center, Greenbelt, MD, in the summer of 1996, working on the development of robust algorithms for automatic cloud height estimation. His research interests include speech, image, and video compression for multimedia communications, networking, computer vision, image processing, and pattern recognition. He has authored over 20 refereed technical journal and conference publications.

Bing-Bing Chai (M'88) received the B.S. degree in physics from Peking University, Beijing, China, in 1990, and the M.S. degree in medical physics and the Ph.D. degree in electrical engineering, both from the University of Missouri, Columbia, in 1992 and 1997, respectively. From 1993 to 1997, she was a teaching and research assistant in the Department of Electrical and Computer Engineering, University of Missouri. In October 1997, she joined the Multimedia Technology Laboratory, Sarnoff Corporation, Princeton, NJ. Her research interests include video and image compression, multimedia signal processing, digital communication, and networking.

Xinhua Zhuang (SM'92) received the B.S., M.S., and Ph.D. degrees in mathematics from Peking University, Beijing, China, in 1959, 1960, and 1963, respectively. He is currently Professor of computer engineering and computer science at the University of Missouri, Columbia. He has been a consultant to Siemens, Panasonic, NeoPath Inc., and NASA. He has been affiliated with a number of schools and research institutes including Hannover University, Germany, Zhejiang University, China, the University of Washington, Seattle, the University of Michigan, Ann Arbor, the Virginia Polytechnic Institute and State University of Virginia, Blacksburg, and the Research Institute of Computers. He has over 200 publications in the areas of signal processing, speech recognition, image processing, machine vision, pattern recognition, and neural networks, and was a contributor to seven books. Dr. Zhuang was Associate Editor for IEEE TRANSACTIONS ON IMAGE PROCESSING from 1993 to 1995. Since 1997, he has served as Chairman of the Benchmarking and Software Technique Committee for the International Association of Pattern Recognition. He has received awards from NSF, NASA High Performance Computing and Communications, NASA Innovative Research, and NATO Advisory Group of Aerospace Research and Development.
