IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 9, SEPTEMBER 2014

4031

Self-Similarity and Spectral Correlation Adaptive Algorithm for Color Demosaicking Joan Duran and Antoni Buades Abstract— Most common cameras use a CCD sensor device measuring a single color per pixel. The other two color values of each pixel must be interpolated from the neighboring pixels in the so-called demosaicking process. State-of-the-art demosaicking algorithms take advantage of interchannel correlation locally selecting the best interpolation direction. These methods give impressive results except when local geometry cannot be inferred from neighboring pixels or channel correlation is low. In these cases, they create interpolation artifacts. We introduce a new algorithm involving nonlocal image self-similarity in order to reduce interpolation artifacts when local geometry is ambiguous. The proposed algorithm introduces a clear and intuitive manner of balancing how much channel-correlation must be taken advantage of. Comparison shows that the proposed algorithm gives state-of-the-art methods in several image bases. Index Terms— Image demosaicking, Bayer pattern, color filter array (CFA), directional interpolation, nonlocal filtering.

I. I NTRODUCTION IGITAL color images are usually represented by three color values at each pixel. Nevertheless, most common cameras use a CCD sensor device measuring a single color per pixel. The other two color values of each pixel must be interpolated from the neighboring ones in the so-called demosaicking process. The selected configuration of the CCD sensor usually follows the Bayer color filter array [1], which we shall call Bayer CFA and abbreviate in CFA. Out of a group of four pixels, two are green (in quincunx), one is red and one is blue. This configuration provides equal horizontal and vertical sampling frequency for each color. The subsampling factor is 4 for the red and the blue channels, and 2 for the green channel [2]. Color demosaicking techniques have been extensively reviewed in the literature [3]–[6]. Due to the CFA configuration, the interpolation of the green channel is performed in a first step by all state-of-the-art algorithms. Then, as first noted by Cok [7] and Hibbard [8], and due to channel

D

Manuscript received February 3, 2014; revised April 9, 2014 and June 3, 2014; accepted July 10, 2014. Date of publication July 23, 2014; date of current version August 11, 2014. This work was supported by the Ministerio de Ciencia e Innovación under Grant TIN2011-27539. During this work, J. Duran benefited from a fellowship of the Conselleria d’Educació, Cultura i Universitats of the Govern de les Illes Balears for the realization of his Ph.D. thesis, which has been selected under an operational program co-financed by the European Social Fund. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Xin Li. J. Duran is with the Universitat de les Illes Balears, Palma de Mallorca 07122, Spain (e-mail: [email protected]). A. Buades is with the Universitat de les Illes Balears, Palma de Mallorca 07122, Spain, and also with the Center for Mathematical Studies and their Applications, École Normale Supérieure de Cachan, Cachan Cedex 94235, France (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2014.2341928

inter-correlation, the differences or ratios of the red and blue components with the interpolated green one are interpolated instead of the red and blue channels directly. The overall performance of the algorithm depends crucially on the quality of the interpolated green. Demosaicking techniques concentrate on locally estimating the most suitable direction of interpolation for the green channel. The method by Cok [7] bilinearly interpolated the green channel while Hibbard [8] used anisotropic interpolation. Hamilton and Adams [9] were the first to use not only the green channel but also the red and blue ones for analyzing the local configuration of the image around the pixel of interest and interpolate the green value. The authors proposed to interpolate the missing green values horizontally or vertically depending on the first order derivatives of the sub-sampled green and the second order derivatives of the sub-sampled red (blue). In many cases, the ambiguity of the local configuration in the CFA makes nearly impossible to take such a decision. Recently, many authors [10]–[14] proposed to take this decision a posteriori once the green channel or full color image has been interpolated completely horizontally and completely vertically. The methods differentiate on how they combine or choose between these two values. Hirakawa et al [10] select at each pixel by using a color image homogeneity measure. Menon et al [13] use the horizontal and vertical interpolated green channel and chooses depending on the smoothness of the differences green-red and green-blue where the red or blue values are available. Zhang et al [11] filtered the chromatic components red-green and blue-green of the vertically and horizontally interpolated images by a LMMSE algorithm. The LMMSE algorithm averages each pixel with its neighboring ones and corrects the obtained value depending on the variance of the pixels that gave performed average. The filtered chromatic components are combined at each pixel depending on these variances. Similar approaches were proposed in [12] and [14]. Iterative algorithms have been proposed in [15]–[17]. These algorithms start with an initial condition and, following in a wide sense the above commented strategies, iteratively force the three color channels to have the same high frequencies. Demosaicking algorithms have been mainly tested on the Kodak database. Images in this bases have fewer color saturated regions, but are challenging by their Nyquist frequency details. The large inter-channel correlation of these images explains why most methods assume that high frequencies of the three channels are highly correlated. These methods give impressive results in this basis except when local geometry cannot be inferred from neighboring pixels. Indeed, local methods can create interpolation artifacts such

1057-7149 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

4032

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 9, SEPTEMBER 2014

as aliasing, erroneously interpolated structures or the so-called zipper effect. Buades et al [3] showed that these interpolation artifacts could be eliminated by involving image self-similarity and redundancy. The authors adapted the nonlocal-means filter [18] in order to infer high frequency information by transportation from known pixels to unknown pixels. More recently, Mairal et al [19] and Yu et al [20] proposed demosaicking methods based on sparse representation of images. These algorithms assume that patches in natural images admit a sparse representation over a dictionary. Many methods have emerged to deal with the IMAX collection, introduced by Li et al [5]. Images in this basis have many more saturated colors and edges separating colored regions than the Kodak one. Many of these algorithms taken advantage of nonlocal self-similarity of the image [21]–[25]. However, most part of algorithms are still designed to deal with exclusively one of the databases. That is, algorithms are conceived completely differently depending on the amount of channel-correlation in the test images being used. As a consequence, they give state-of-the-art results in one basis but usually poor results in the other one. We outline the recent methods by Getreuer [26] and Kiku et al [27], [28] being able to give state-of-the-art results in both databases. Getreuer’s demosaicking algorithm [26] is based on total variation along curves [29], and first estimates the image contour orientations directly from the mosaicked data using contours stencils. The demosaicking is performed as an energy minimization, using a graph regularization adapted according to the orientation estimates. The objective energy functional consists in two terms. The first one regularizes the luminance to suppress zipper artifacts while the second term regularizes the chrominance to suppress color artifacts. Kiku et al [27] proposed a strategy that consists in the interpolation of the so-called residual differences, that is, the differences between observed and tentatively estimated pixel values. The tentative estimate is generated by upsampling the observed pixel values by using the guided filter [30]. The authors incorporated the proposed residual interpolation into the gradient based threshold free algorithm introduced previously by Pekkucuksen and Altunbasak [31], [32]. In a posterior paper, Kiku et al [28] proposed to estimate the tentative pixel values by minimizing the Laplacian energies of the residuals. Our aim is to propose a single algorithm for which channel inter-correlation is encoded in a clear and intuitive manner and automatically adapted depending on the image. The demosaicking technique proposed in this paper reconstructs the original image by means of a two-step algorithm. First, a local directional interpolation is proposed. This local interpolation combines different directionally interpolated images depending on its chromatic smoothness. It automatically evaluates the degree of channel inter-correlation that must be taken advantage of by the algorithm. In a second step, a nonlocal filtering based in NL-means algorithm [18] is applied. This method is different from previous nonlocal demosaicking strategies [3], [22] since it filters the channel differences instead of the channels themselves. This nonlocal filtering is also automatically adapted to the degree of inter-channel correlation on the initial image.

The rest of the paper is organized as follows. Section II describes the local directional interpolation algorithm. Section III introduces the channel inter-correlation parameter. The nonlocal demosaicking filtering is detailed in Section IV. Finally, Section V is dedicated to quantitative and qualitative performance evaluation of the proposed algorithm and comparison with state-of-the-art demosaicking techniques on several image basis. II. L OCAL D IRECTIONAL I NTERPOLATION W ITH A P OSTERIORI D ECISION Most methods first interpolate the green channel and then use this estimation to interpolate the red and blue channels. This is due to the higher sampling rate of the green component which permits an easier reconstruction of the main geometry and texture than the red and blue ones. Once the green values are interpolated, the correlation among channels permits a better reconstruction of the red and blue components. We proceed in the same way by first estimating the green channel. At each red and blue position four interpolated values of the green are computed along north, south, east and west directions. Instead of deciding for each pixel which is the dominant direction and interpolate according to it, we take this decision a posteriori once a full color image has been reconstructed for each direction. Therefore, for each of these directionally interpolated green channels, we reconstruct the red and blue components by a bilinear interpolation on the channel differences green-red and green-blue. Finally, a decision of the most suitable approximation is made on the basis of all reconstructed images. A. Green Interpolation We first introduce some general notations. Let  denote the image domain. Furthermore, let  R , G and  B denote the subset of the image grid where the values of the red, green and blue channels are respectively available. Consider a CFA block and focus on a pixel (i, j ) ∈ / G where the green color has to be estimated. Note that the green value is instead known on the upper, lower, right, and left pixels. These values can be used to approximate the gradient along north (n), south (s), east (e) and west (w) directions. Let us denote the intensity of the green channel by G i, j and of the red (blue) channel by Ci, j , with C ∈ {R, B}. Because of the channel inter-correlation, the differences between the green and red (blue) values are smoother than the green channel itself. Therefore, these color differences di, j = G i, j −Ci, j can be estimated more accurately than the green colors directly. Once these differences are interpolated, the missing green value is recovered as G i, j = Ci, j + di, j . We estimate the differences di, j along each direction as follows:  1 Ci, j + Ci, j −2 , di,n j = G i, j −1 − 2  1 s Ci, j + Ci, j +2 , di, j = G i, j +1 − 2  1 Ci, j + Ci+2, j , di,e j = G i+1, j − 2

DURAN AND BUADES: SELF-SIMILARITY AND SPECTRAL CORRELATION ADAPTIVE ALGORITHM

 1 Ci, j + Ci−2, j . (1) 2 Now, the missing green component along each direction can be easily interpolated by means of   ni, j = G i, j −1 + 1 Ci, j − Ci, j −2 , G 2  1 s  G i, j = G i, j +1 + Ci, j − Ci, j +2 , 2  1 e  G i, j = G i+1, j + Ci, j − Ci+2, j , 2  1 w  (2) G i, j = G i−1, j + Ci, j − Ci−2, j . 2 Let us somehow justify formulas (1)-(2). For instance, consider the interpolation process in the north direction. If there is and edge along this direction, then upper neighboring pixels  have similar values, that is, 12 Ci, j + Ci, j −2  Ci, j −1 , where Ci, j −1 stands for the real value of red (blue) channel at (i, j − 1) whenever it could be known. Consequently, the missing green value can be written as di,wj = G i−1, j −

i, j = G i, j −1 + (Ci, j − Ci, j −1 ). G Note that Ci, j − Ci, j −1 is a first order discrete gradient in the north direction. Therefore, this procedure reconstructs the green component by imposing it to have the high frequencies of the red and blue channels. B. Red and Blue Interpolation With each of the four locally interpolated green channels, we estimate the red and blue by exploiting the spectral correlation. As pointed out by Cok et al [7], the difference images between red (blue) and green mostly contain lowfrequency components and can be more accurately interpolated than red and blue channels themselves. Let m stand for the direction along which the green channel has been previously interpolated (north, south, east or west). Consider a pixel (i, j ) ∈ / C where the channel C ∈ {R, B} has to be estimated, and define the following vector: m C Gm i, j = Ci, j − G i, j .

Here G m i, j denotes either an original CFA green sample or an interpolated one. A bilinear interpolation is then applied to estimate C G m i, j according to the CFA configuration: - If (i, j ) ∈ G and (i + 1, j ) ∈ C :  1 m m  (3a) C G i, j = C G m i−1, j + C G i+1, j . 2 - If (i, j ) ∈ G and (i + 1, j ) ∈ / C :  1 m m  C G i, j = C G m (3b) i, j −1 + C G i, j +1 . 2 - If (i, j ) ∈ / G : 1 m m  C G i, j = C G m i−1, j −1 + C G i+1, j −1 4  m (3c) +C G m i−1, j +1 + C G i+1, j +1 . Finally, the missing red (blue) value is recovered as m  i,mj = C G i, j + G m C i, j .

4033

C. Decision The previous process provides four fully color images, each one interpolated in a different direction. In order to weight each estimate, we compute the variation of the chrominance at each pixel along the four directions. For this task we decompose each RGB image into the YUV space according to the following linear combination: Y = α R R + αG G + α B B, U = R − Y, V = B − Y, where α R = 0.299, αG = 0.587 and α B = 0.114. This color system separates the geometric information contained in Y from the chromatic information contained in U and V . Let us denote by R n , G n and B n the color components of the demosaicked image obtained from the north interpolation of green channel and subsequent bilinear reconstruction of red and blue channels. Let U n and V n stand for the associated chromatic components in the YUV space. Similarly, denote the vector components regarding south, east and west interpolations. For each direction, we compute the variation of the chromatic components by using a local neighborhood of L pixels in the same direction of interpolation: ∇i,n j

∇i,s j

∇i,e j

∇i,wj





1 := L

X ∈{U,V }



1 := L

X ∈{U,V }



1 := L

X ∈{U,V }



1 := L

X ∈{U,V }

L  

X i,n j −l − X i,n j

2

12 ,

l=1



L  

X i,s j +l − X i,s j

2

12 ,

l=1



L  

e e X i+l, j − X i, j

2

12 ,

l=1



L  

w X i−l, j



X i,wj

2

2| .

(4)

l=1

In general terms, a large gradient along one direction means that there is an edge or image feature across it and, thus, it is suitable to avoid averaging in that direction. We simply let the weight assigned to a directional estimate be inversely proportional to the corresponding gradient: ωi,mj =

1 , ∀m ∈ {n, s, e, w}, +ε

∇i,mj

(5)

where ε > 0 is a small real parameter that avoids dividing by too small values or even zero. The weights are normalized at each pixel as ωi,mj

=

i,mj ω Wi, j

, ∀m ∈ {n, s, e, w},

with Wi, j = ωi,n j + ωi,s j + ωi,e j + ωi,wj . Finally, the four directional estimates are fused in the color RGB space by means of i, j = ωi,n j Ri,n j + ωi,s j Ri,s j + ωi,e j Ri,e j + ωi,wj Ri,wj , R i, j = ωi,n j G ni, j + ωi,s j G si, j + ωi,e j G ei, j + ωi,wj G w G i, j i, j = ωi,n j Bi,n j + ωi,s j Bi,s j + ωi,e j Bi,e j + ωi,wj Bi,wj . B

4034

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 9, SEPTEMBER 2014

TABLE I AVERAGE OF THE C HROMATIC G RADIENTS C OMPUTED ON THE R EFERENCE I MAGES IN F IG . 1 (K ODAK BASIS ), F IG . 2 (IMAX BASIS ) AND F IG . 3 (D IGITAL P HOTOGRAPHS ). T HE B OUNDING PARAMETER FOR THE G RADIENT OF THE L UMINANCE U SED IN (6) I S S ET TO M = 13. W E A LSO D ISPLAY T HIS M EASURE FOR THE I NTERPOLATED I MAGES O BTAINED F ROM THE A LGORITHM P ROPOSED IN S ECTION II, W HICH C ORRESPOND TO C OLUMNS E NTITLED β = 1. F INALLY, A FTER I NTRODUCING THE C HANNEL I NTER -C ORRELATION PARAMETER β, W E C OMPUTE THE AVERAGE OF THE C HROMATIC G RADIENTS ON THE I NTERPOLATED I MAGES IN A LL BASIS W ITH β = 0.7

III. C HANNEL C ORRELATION I DENTIFICATION Most part of algorithms are designed to deal with only one of the basis, Kodak or IMAX. That is, algorithms are conceived completely differently depending on the amount of channel-correlation in the test images being used. Our aim is to propose a single algorithm for which channel inter-correlation is encoded in a clear and intuitive manner. The strong differences between Kodak and IMAX basis can be illustrated by comparing the mean value of the chromatic gradients on several images of both collections. For this purpose, we project the RGB coordinates of each reference image into the YUV space. We compute then the norm of the gradient of the chromatic components U and V . We compute the mean value of these gradients taking into account only pixels with a luminance gradient above a certain bounding parameter M > 0. We restrict the mean to contrasted parts of the image in order to eliminate the effect of large constant zones which are easily interpolated and carry no information about the chromatic regularity of the image. More specifically, we define the set of pixels I = {(i, j ) : |(∇Y )i, j | > M}, and compute the L 1 −norm of the chromatic gradients as |∇U | =

 1   |Ui+1, j − Ui, j | + |Ui, j +1 − Ui, j | , 2|I | (i, j )∈I

 1   |Vi+1, j − Vi, j | + |Vi, j +1 − Vi, j | . (6) |∇V | = 2|I | (i, j )∈I

Finally, the mean value of |∇U | and |∇V | is obtained. Table I displays the results for M = 13. As expected, chromatic gradients are much larger on IMAX than on Kodak. Table I also displays the average of the chromatic gradients of the interpolated images obtained with the algorithm proposed in Section II. We observe that these values are more similar to the original ones for images in the Kodak database than for images in the IMAX collection. Indeed, the gradient

of the chromatic components for the IMAX basis is drastically reduced. In order to balance the assumption on chromatic regularity by the proposed algorithm for each image, we introduce a parameter 0 < β ≤ 1. At each pixel (i, j ) ∈ / G , the green interpolation along each direction is finally written as  ni, j = G i, j −1 + β Ci, j G 2  si, j = G i, j +1 + β Ci, j G 2  β e  = G i+1, j + Ci, j G i, j 2 β w Ci, j G i, j = G i−1, j + 2

 − Ci, j −2 ,  − Ci, j +2 ,  − Ci+2, j ,  − Ci−2, j .

Once the green channel is filled, the missing red (blue) sample at each pixel (i, j ) ∈ / C , C ∈ {R, G}, is recovered as  i, j = C G i, j + βG i, j , C where C G i, j = Ci, j − βG i, j has been estimated by bilinear interpolation as in (3). Table I displays the mean chromatic gradient for images in the Kodak and IMAX basis, as well as for the interpolated images using β = 1 (initial algorithm) and β = 0.7. These values illustrate the correlation between the chromatic regularity of the image and the value of β. Indeed, β = 1 is a good estimation for images with relatively small chromatic gradient (Kodak basis). We experimentally checked that β = 0.7 is a good value to better approach the chromatic gradient of the original images for the IMAX basis. In order to corroborate this observation we repeated the same experience in a set of photographs taken by ourselves (Fig. 3). It is not clear which of the Kodak or IMAX basis better reflects natural images. The photographs taken by ourselves seem to have a similar amount of chromaticity information to the Kodak basis. However, a complete study on a huge set of randomly chosen images should be carried out in order to clarify this point. For testing demosaicking algorithms, more

DURAN AND BUADES: SELF-SIMILARITY AND SPECTRAL CORRELATION ADAPTIVE ALGORITHM

and more data basis should appear, as for example the recently proposed [33]. In order to quantify this chromaticity, the value of β could naturally range between zero and one, meaning zero that there’s no correlation among channels. It is clear that even for IMAX images a strong correlation among channels exist. We experimentally checked that a minimal value of 0.7 could be used. Automatic Selection of β Parameter: In order to provide an unsupervised selection of β we use a function of the chromatic gradient. As we do not dispose of the original image, the estimation will be computed using the interpolated image obtained from the original algorithm described in Section II. In view of the results displayed in Table I, we establish that β = 0.7 is an adapted value for images with a chromatic gradient above 3.25 while β = 1.0 is more adapted for images with a lower chromatic gradient. We used the following decreasing function of the chromatic gradient satisfying above requirements: β(t) = 1 −

0.3 , 1 + e490−150t

(7)

where t = 12 (|∇U | + |∇V |), and |∇U | and |∇V | are computed as in (6). IV. N ONLOCAL F ILTERING Classical demosaicking methods, exploiting only local regularity of the image, might create color artifacts and erroneous interpolations when the interpolation direction is not clear. Buades et al [3] showed that these interpolation artifacts could be eliminated by involving image self-similarity and redundancy. The SSD algorithm introduced in [3] was inspired by the NL-means denoising algorithm introduced in [18]. The fact that this method correctly removes noise and spurious oscillations of the image while preserving the geometric and texture features is a proof that images are self-similar and this property can be taken advantage of. For an exhaustive description of the NL-means filter and its application to several image processing tasks see [3], [18], [34], [35]. The second step of our proposed strategy takes advantage of all image self-imilarities as the NL-means [18] and SSD algorithm [3] do. A. Self-Similarity Driven Demosaicking The NL-means algorithm was first adapted to mosaicked images by Buades et al [3]. The authors proposed a selfsimilarity driven demosaicking algorithm (SSD) involving two steps. In the first one, the nonlocal-means filter is used to infer high frequency information by transportation from known pixels to unknown pixels. In the second step, the chromatic components are regularized by a very local median filter. The SSD algorithm interpolates the missing pixel values of each channel by averaging original CFA values of the same channel with a similar color neighborhood in u0 , where u0 denotes an initial color image previously estimated. Therefore, the expression applies for each channel C ∈ {R, G, B}

as NL[C](x) =

4035

 − dρ (u0 (x),u0 (y)) 1 h2 e C(y), x ∈ / C , C (x) y∈C

where dρ (u0 (x), u0 (y)) denotes the distance of patches of size ρ centered at x and y and C (x) denotes the normalization factor given by  − dρ (u0 (x),u0 (y)) h2 e . C (x) = y∈C

The parameter h controls the decay of the exponential function and, thus, the decay of the weights as a function of the Euclidean distances. Note that color intercorrelation by SSD algorithm is enforced by computing the distance in the three-channel image but not using the channel differences. Finally, in order to gradually correct the erroneous structures and artifacts of u0 , a coarse to fine strategy that refines at each step the similarity search by reducing the value of h is proposed. B. Exploiting Nonlocal Regularity of Channel Differences Instead of directly interpolating the missing values of all three channels as SSD does, the green channel is first interpolated. This interpolation needs the initialization provided by the local directional algorithm introduced in Sections II-III. This initialization is used for computing patch distances as well as for providing the values which will be averaged. The proposed algorithm averages both original CFA values and interpolated values by the initialization. As for the local directional algorithm, the color differences G − C, being C ∈ {R, B} the CFA color of the current pixel, are first interpolated and then added back to the original value of the CFA. The chromatic regularity assumption of the algorithm is balanced by the same value of β used for the initial interpolation. Given the initialization u0 = (R0 , G 0 , B0 ), the averaging process for the green channel in the discrete setting reads  ω(x, y) (G 0 (y) − βC0 (y)) G(x) = y∈

+ βC0 (x), x ∈ / G ,

(8a)

where C0 = R0 if x ∈  R and C0 = B0 if x ∈  B . Indeed, the values of R0 and B0 used in the previous formula coincide with original values since they were already available in the CFA. However, both original and interpolated green values take part in the previous average. The interpolation of the red and blue channels take advantage of the already filled green channel. The averaging process writes  ω(x, y) (C0 (y) − βG(y)) C(x) = y∈

+ βG(x), x ∈ / C ,

(8b)

where C = {R, B}. The weight distribution is given in all cases by ω(x, y) =

0 (y)) 1 − d (u0 (x),u h2 e , (x)

(8c)

4036

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 9, SEPTEMBER 2014

with (x) =



e



d (u0 (x),u0 (y)) h2

.

TABLE II RMSE IN RGB C OORDINATES B ETWEEN R EFERENCE I MAGES F ROM

(8d)

y∈

K ODAK BASIS (F IG . 1) AND THE R ESULTS P ROVIDED BY E ACH C OMPARED D EMOSAICKING A LGORITHM

The parameter h controlling the decay of the exponential function will depend on the chromatic regularity of the image (see next section for more details). C. Implementation Details The search for similarity described in (8) can be performed in the whole image. However, for computational purposes, the filtering process is restricted to pixels at a certain distance from the current pixel (search window). That is, the weight distribution will be zero for pixels such that x − y > K , for a certain parameter K > 0. Therefore, the distance between x and y is measured as  d (u0 (x), u0 (y)) = u0 (x + z) − u0 (y + z)2 z∈N0

if x − y ≤ K , and d (u0 (x), u0 (y)) = ∞ otherwise. Here N0 is a s × s discrete window centered at (0, 0) (comparison window). In general, the smaller the distance d(x, y) is, the more similar the configurations of u0 around x and y are. Based on this distance, only the N most similar pixels to x, y1 , . . . , y N , take part in the average. Furthermore, the weight of the reference pixel, ω(x, x), is set to the maximum of the weights ω(x, yn ) for yn = x. This setting avoids an excessive weighting of the reference point. The value of h is adapted to the chromatic gradient of the image as proposed for the β parameter. Once again, there is a clear difference between the value of h required by images with few color saturated regions (h = 32) and the one which better performs for images with strong edges separating saturated colors areas (h = 1). We use here the same bound for the chromatic gradient as for the β estimation. That is, we fix h = 1 for images such that 12 (|∇U | + |∇V |) > 3.25 and h = 32 for images with a lower chromatic gradient. We use the following continuous decreasing function satisfying above requirements: 31 , h(t) = 32 − 1 + e490−150t where t = 12 (|∇U | + |∇V |), and |∇U | and |∇V | are computed as in (6). The final algorithm chain adapts first the value of the channel inter-correlation parameter β and of the filtering parameter h to the chromatic gradients of the interpolated image obtained from Section II. After that, the algorithm applies the proposed local directional interpolation adapted to β. The obtained demosaicked image is used as the initialization u0 for the nonlocal algorithm applying to channel differences, above presented. V. E XPERIMENTAL R ESULTS This section is devoted to a detailed performance comparison between the proposed algorithm and some state-of-the-art demosaicking techniques.

We compare with the following methods: Hamilton-Adams (HA) [9], Directional Linear Minimum Mean Square-Error Image Demosaicking (DLMMSE) [11], Self-Similarity Driven Demosaicking (SSD) [3], Local Directional Interpolation with Nonlocal Adaptive Thresholding (LDNAT) [22], Image Demosaicking with Contour Stencils (CS) [26], and MinimizedLaplacian Residual Interpolation (MLRI) [28]. The DLMMSE and LDNAT methods were specifically conceived to get an outstanding performance on the Kodak and on the IMAX collection, respectively. On the other hand, CS and MLRI are recent developed techniques providing pretty good results on both databases. However, LDNAT, CS and MLRI have greater computational complexity than the proposed algorithm. We implemented HA in C/C++, whereas for LDNAT and MLRI Matlab codes were downloaded from their respective webpages. Reliable software and online demos on the other approaches can be found at IPOL webpage [36]–[38]. All parameters appearing in each algorithm were selected as given in the original papers. For the proposed method, the very same values of the parameters were fixed once and for all. The search for similar pixels is restricted to a search window of size 21 × 21. The comparison window is performed by a flat Euclidean distance on a 3 × 3 neighborhood. This window has shown to be robust and small enough to take care of details and fine structures. Finally, the size of the local neighborhood used in (4) is L = 3, the tresholding parameter used in (5) is ε = 10−8 , the bounding parameter for the gradient of the luminance in (6) is M = 13, and the number of similar pixels from the support zone used in the nonlocal average is N = 10. Numerical Comparison: We will evaluate the spectral quality by means of the root mean squared error (RMSE), which is a simple metric that directly takes the difference in pixel values into account. If we denote by u r = (u r1 , u r2 , u r3 ) the reference image and by u = (u 1 , u 2 , u 3 ) the demosaicked one, then  2

3 r 1 x∈ u c (x) − u c (x) RMSE = . 3 ||2 c=1

Tables II and III display the RMSE in RGB coordinates for images in Figs. 1 and 2. In general terms, all methods perform significantly better on the Kodak database than on the IMAX collection. One observes that DLMMSE, CS and the proposed algorithm have a similar performance on Kodak,

DURAN AND BUADES: SELF-SIMILARITY AND SPECTRAL CORRELATION ADAPTIVE ALGORITHM

4037

Fig. 1.

Selection set of 768 × 512 pixel images from Kodak basis. Critical details are used for comparative testing in the experimental section.

Fig. 2.

Selection set of 400 × 400 pixel images from IMAX basis. Critical details are used for comparative testing in the experimental section.

TABLE III RMSE IN RGB C OORDINATES B ETWEEN R EFERENCE I MAGES F ROM IMAX BASIS (F IG . 2) AND THE R ESULTS P ROVIDED BY E ACH C OMPARED D EMOSAICKING A LGORITHM

TABLE V RMSE IN RGB C OORDINATES B ETWEEN R EFERENCE I MAGES IN F IG . 3 AND THE

R ESULTS P ROVIDED BY E ACH C OMPARED D EMOSAICKING A LGORITHM

TABLE IV T OTAL AVERAGE OF THE RMSE OVER K ODAK AND IMAX BASIS

being the error of the first two methods larger than the one of our model in most examples. Although DLMMSE is pointed out as one of the best algorithms on Kodak, it does not work satisfactory on IMAX. On the other hand, MLRI gives the lower RMSE for IMAX collection. Only LDNAT and ours achieve to get close to it, but LDNAT is not at all competitive for Kodak images. The average of the RMSE over both basis in Table IV reveals that MLRI and the new-proposed method give the lowest error, although our method outperforms all compared techniques. Table V displays the RMSE in RGB coordinates between the reference images in Fig. 3 and the demosaicked images provided by each method. The error of the proposed model remains the lowest one for almost every photograph, although CS, DLMMSE and MLRI also perform well. In general, HA, SSD and LDNAT are not able to provide as good results as the above techniques do. These conclusions are reinforced by the average of the RMSE over the proposed database. Visual Comparison: A low performance in the RMSE generally entails a rejection by human visual inspection. In spite of

this, any numerical criterion cannot fully replace human evaluation, which still is the most important criterion to judge the performance of demosaicking algorithms. This performance must be evaluated on edges, textures and several geometric details such as corners, diagonals and fine patterns. Figs. 4-7 illustrate a wide-ranging comparative visual quality assessment of demosaicked color images. Critical details extracted from the Kodak basis (Fig. 1), the IMAX basis (Fig. 2) and our own collection of digital photographs (Fig. 3) are used for comparative visual testing. Fig. 4 points out how difficult it is for all demosaicking algorithms to make guesses at corners or at crossing points, even though the image is mainly gray. Note that all techniques except ours either modify the original pattern of the window blind or introduce several color artifacts. Only the proposed method gives a fully visual acceptable solution in this case. Natural scenes contain large gray regions. False colors can been created in these zones by demosaicking algorithms. Fig. 5 reveals that most of the methods under comparison incur a high risk of creating wrong color spots. We observe that HA, SSD, LDNAT, CS and MLRI present color aliasing and spots since they are not able to correctly choose between horizontal and vertical interpolation. However, DLMMSE and the new-proposed algorithm reconstruct accurately the original image and the solutions illustrate significant perception gain.

4038

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 9, SEPTEMBER 2014

Fig. 3.

Selection set of 700 × 700 pixel photographs. Critical details are used for comparative testing in the experimental section.

Fig. 4. Kodak image. This figure illustrates how the pattern of the window blinds is distorted due to wrong guesses at corners and crossing points. On the other hand, relevant color artifacts also appear in most of the demosaicked images. Only the proposed algorithm gives a fully visual acceptable solution since it achieves to preserve the original window pattern as well as it is able to reduce the color spectral distortion.

Fig. 5. Kodak image. This figure shows how most of the compared demosaicking algorithms incur false colors in low-frequency image regions. Note that HA, SSD, LDNAT, CS and MLRI present color aliasing and spots. DLMMSE and the new-proposed technique reconstruct accurately the original image and the solutions illustrate significant perception gain.

This means that very few color artifacts will be created by these methods in gray zones of the images. Since images in the Kodak database have very few color saturated regions, these methods have a very good performance in this basis. Due to the configuration of the CFA mask, the green channel has one of every two pixel values fixed in each row and each column. When a demosaicking algorithm fails to interpolate

the green channel, the interpolating artifacts are manifested as artificial on-off image patterns. This effect, which was called “zipper effect” in [39], is commonly created when the red and blue channels have high frequencies that are different from the green ones. For this reason, the zipper effect concentrates on the edges separating colored regions. Fig. 6 illustrates this phenomenon on the IMAX basis. Note that DLMMSE,

DURAN AND BUADES: SELF-SIMILARITY AND SPECTRAL CORRELATION ADAPTIVE ALGORITHM

4039

Fig. 6. IMAX image. This figure illustrates the zipper effect on color edges due to erroneous color frequency copying (this is the case of SSD) or due to erroneous interpolation of diagonal patterns (this is the case of DLMMSE and MLRI). Only LDNAT, CS and the proposed demosaicking algorithm give a reasonable solution without significant on-off image patterns.

Fig. 7. Digital photograph. This figure illustrates how most of the compared techniques lead to false colors and spectral distortion. All results except ours and the one provided by CS algorithm present significant color aliasing and spots.

SSD and MLRI show significant zipper effect because of erroneous color frequency copying or because diagonal patterns are not interpolated correctly. LDNAT, CS and the proposed algorithm exhibit a higher performance in the compared visual analysis, although some zipper always remains. Let us finally comment about the results on our own database. Most of the algorithms under comparison create color spots not only in quasi-gray images as indicated in Fig. 5, but also in high-frequency regions. This phenomenon is illustrated in Fig. 7, where one observes that all results except ours and the one provided by CS suffer from

color aliasing. Both methods reconstruct accurately the original colors and the demosaicked images illustrate significant perception gain. VI. C ONCLUSIONS This paper introduced a new demosaicking algorithm taking advantage of image self-similarity and inter-channel correlation. A first demosaicked image is filled by deciding a posteriori among a set of local directionally interpolated images. In a second step, a patch based algorithm takes advantage of image similarities to refine the locally interpolated image.

4040

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 9, SEPTEMBER 2014

Both algorithms interpolate channel differences instead of directly interpolating the channels themselves. A parameter measuring the degree of inter-channel correlation is introduced and used for balancing the channel differences in the algorithms. Experiments have shown that most part of state-of-the-art algorithms are designed to deal only with images with a certain amount of channel correlation. The proposed algorithm automatically adapts to all type of images. It also outperforms state-of-the-art algorithms in most cases, both visually and in terms of error with the original image. R EFERENCES [1] B. Bayer, “Color imaging array,” U.S. Patent 3 971 065, Jul. 20, 1976. [2] D. Alleysson, S. Susstrunk, and J. Herault, “Linear demosaicing inspired by the human system,” IEEE Trans. Image Process., vol. 14, no. 4, pp. 439–449, Apr. 2005. [3] A. Buades, B. Coll, J.-M. Morel, and C. Sbert, “Self-similarity driven color demosaicking,” IEEE Trans. Image Process., vol. 18, no. 6, pp. 1192–1202, Jul. 2009. [4] B. K. Gunturk, J. Glotzbach, Y. Altunbasak, R. W. Schafer, and R. Mersereau, “Demosaicking: Color filter array interpolation,” IEEE Signal Process. Mag., vol. 22, no. 1, pp. 44–54, Jan. 2005. [5] X. Li, B. Gunturk, and L. Zhang, “Image demosaicing: A systematic survey,” in Proc. SPIE, vol. 6822, 2008, p. 68221J. [6] D. Menon and G. Calvagno, “Color image demosaicking: An overview,” Image Commun., vol. 26, nos. 8–9, pp. 518–533, Oct. 2011. [7] D. Cok, “Signal processing method and apparatus for producing interpolated chrominance values in a sampled color image signal,” U.S. Patent 4 642 678, Feb. 10, 1987. [8] R. Hibbard, “Apparatus and method for adaptively interpolating a full color image utilizing luminance gradients,” U.S. Patent 5 382 976, Jan. 17, 1995. [9] J. Hamilton and J. Adams, “Adaptive color plan interpolation in single sensor color electronic camera,” U.S. Patent 5 629 734, Mar. 13, 1997. [10] K. Hirakawa and T. W. Paks, “Adaptive homogeneity-directed demosaicing algorithm,” IEEE Trans. Image Process., vol. 14, no. 3, pp. 360–369, Mar. 2005. [11] D. Zhang and X. Wu, “Color demosaicking via directional linear minimum mean square-error estimation,” IEEE Trans. Image Process., vol. 14, no. 12, pp. 2167–2178, Dec. 2005. [12] K.-H. Chung and Y.-H. Chan, “Color demosaicing using variance of color differences,” IEEE Trans. Image Process., vol. 15, no. 10, pp. 2944–2955, Oct. 2006. [13] D. Menon, S. Andriani, and G. Calvagno, “Demosaicing with directional filtering and a posteriori decision,” IEEE Trans. Image Process., vol. 16, no. 1, pp. 132–141, Jan. 2007. [14] D. Paliy, V. Katkovnik, R. Bilcu, S. Alenius, and K. Egiazarian, “Spatially adaptive color filter array interpolation for noiseless and noisy data,” Int. J. Imag. Syst. Technol., vol. 17, no. 3, pp. 105–122, Oct. 2007. [15] B. K. Gunturk, Y. Altunbasak, and R. M. Mersereau, “Color plane interpolation using alternating projections,” IEEE Trans. Image Process., vol. 11, no. 9, pp. 997–1013, Sep. 2002. [16] X. Li, “Demosaicing by successive approximation,” IEEE Trans. Image Process., vol. 14, no. 3, pp. 370–379, Mar. 2005. [17] R. Kolta, H. A. Aly, and W. Fakhr, “A hybrid demosaicking algorithm using frequency domain and wavelet methods,” in Proc. Int. Conf. Image Inf. Process. (ICIIP), 2011, pp. 1–6. [18] A. Buades, B. Coll, and J.-M. Morel, “A review of image denoising algorithms, with a new one,” SIAM Multiscale Model. Simul., vol. 4, no. 2, pp. 490–530, 2005. [19] J. Mairal, M. Elad, and G. Sapiro, “Sparse representation for color image restoration,” IEEE Trans. Image Process., vol. 17, no. 1, pp. 53–69, Jan. 2008. [20] G. Yu, G. Sapiro, and S. Mallat, “Solving inverse problems with piecewise linear estimators: From Gaussian mixture models to structured sparsity,” IEEE Trans. Image Process., vol. 21, no. 5, pp. 2481–2499, May 2012. [21] M. Nawrath and J. Jakel, “Deriving color images from noisy Bayer data using local demosaicking and non-local denoising,” in Proc. 4th Int. Congr. Image Signal Process. (CISP), vol. 2. 2011, pp. 668–672.

[22] L. Zhang, X. Wu, A. Buades, and X. Li, “Color demosaicking by local directional interpolation and nonlocal adaptive thresholding,” J. Electron. Imag., vol. 20, no. 2, p. 023016, Jun. 2011. [23] D. Gao, X. Wu, G. Shi, and L. Zhang, “Color demosaicking with an image formation model and adaptive PCA,” J. Vis. Commun. Image Represent., vol. 23, no. 7, pp. 1019–1030, Oct. 2012. [24] G.-G. Wang, X.-C. Zhu, and Z.-L. Gan, “Image demosaicing by non-local similarity and local correlation,” in Proc. 11th Int. Conf. Signal Process. (ICSP), vol. 2. Oct. 2012, pp. 806–810. [25] S. Kasar and S. Ruikar, “Image demosaicking by nonlocal adaptive thresholding,” in Proc. Int. Conf. Image Process. & Pattern Recognit. (ICSIPR), 2013, pp. 34–38. [26] P. Getreuer, “Color demosaicing with contour stencils,” in Proc. 17th Int. Conf. Digit. Signal Process. (DSP), Jul. 2011, pp. 1–6. [27] D. Kiku, Y. Monno, M. Tanaka, and M. Okutomi, “Residual interpolation for color image demosaicking,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Jul. 2013, pp. 2304–2308. [28] D. Kiku, Y. Monno, M. Tanaka, and M. Okutomi, “Minimized-Laplacian residual interpolation for color image demosaicking,” in Proc. SPIE, Mar. 2014, p. 90230L. [29] P. Getreuer, “Contour stencils: Total variation along curves for adaptive image interpolation,” SIAM J. Imag. Sci., vol. 4, no. 3, pp. 954–979, 2011. [30] K. He, J. Sun, and X. Tang, “Guided image filtering,” in Proc. 11th Eur. Conf. Comput. Vis. (ECCV), vol. 6311. 2010, pp. 1–14. [31] I. Pekkucuksen and Y. Altunbasak, “Gradient based threshold free color filter array interpolation,” in Proc. 17th IEEE Int. Conf. Image Process. (ICIP), Sep. 2010, pp. 137–140. [32] I. Pekkucuksen and Y. Altunbasak, “Multiscale gradients-based color filter array interpolation,” IEEE Trans. Image Process., vol. 22, no. 1, pp. 157–165, Jan. 2013. [33] S. Andriani, H. Brendel, T. Seybold, and J. Goldstone, “Beyond the kodak image set: A new reference set of color image sequences,” in Proc. 20th IEEE Int. Conf. Image Process., Melbourne, VIC, Australia, Sep. 2013, pp. 2289–2293. [34] J. Duran, A. Buades, B. Coll, and C. Sbert, “A nonlocal variational model for pansharpening image fusion,” SIAM J. Imag. Sci., vol. 7, no. 2, pp. 761–796, Apr. 2014. [35] A. Buades, B. Coll, and J.-M. Morel, “Nonlocal image and movie denoising,” Int. J. Comput. Vis., vol. 76, no. 2, pp. 123–139, Feb. 2008. [36] A. Buades, B. Coll, J.-M. Morel, and C. Sbert, “Self-similarity driven demosaicking,” Image Process. Line, vol. 18, no. 6, pp. 1192–1202, Jun. 2011. [37] P. Getreuer, “Zhang-Wu directional LMMSE image demosaicking,” Image Process. Line, vol. 1, to be published. [38] P. Getreuer, “Image demosaicking with contour stencils,” Image Process. Line, vol. 2, to be published. [39] W. Lu and Y. Tan, “Color filter array demosaicking: New method and performance measures,” IEEE Trans. Image Process., vol. 12, no. 10, pp. 1194–1210, Oct. 2003.

Joan Duran received the bachelor’s degree in mathematics from the University of Balearic Islands, Palma, Spain, in 2010, and the M.S. degree in advanced mathematics and mathematical engineering from the Polytechnic University of Catalonia, Barcelona, Spain, in 2011. He is currently pursuing the Ph.D. degree in mathematics. His research is focused on mathematical analysis of digital images, in particular, image restoration, demosaicking, and satellite imaging.

Antoni Buades received the Ph.D. degree in mathematics from the University of Balearic Islands, Palma, Spain, in 2006. He is currently an Associate Researcher with CMLA-CNRS, Paris, France, and the University of Balearic Islands. His research is focused on mathematical analysis of digital images, in particular, image and video restoration, demosaicking, and medical imaging.

Self-similarity and Spectral Correlation Adaptive Algorithm for Color Demosaicking.

Most common cameras use a CCD sensor device measuring a single color per pixel. The other two color values of each pixel must be interpolated from the...
3MB Sizes 0 Downloads 5 Views