IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 36, NO. 2, FEBRUARY 2014

A Physically-Based Approach to Reflection Separation: From Physical Modeling to Constrained Optimization

Naejin Kong, Yu-Wing Tai, Member, IEEE, and Joseph S. Shin, Member, IEEE

Abstract—We propose a physically-based approach to separating reflection using multiple polarized images, with the background scene captured behind glass. The input consists of three polarized images, each captured from the same viewpoint but with a different polarizer angle separated by 45 degrees. The output is a high-quality separation of the reflection and background layers from each of the input images. A main technical challenge of this problem is that the mixing coefficient for the reflection and background layers depends on the angle of incidence and the orientation of the plane of incidence, which vary spatially over the pixels of an image. Exploiting physical properties of polarization for a double-surfaced glass medium, we propose a multiscale scheme which automatically finds the optimal separation of the reflection and background layers. Through experiments, we demonstrate that our approach can generate results superior to those of previous methods.

Index Terms—Reflection separation, image enhancement, polarized light, computational photography

1 INTRODUCTION

We address the problem of reflection separation for images such as photographs of scenes taken through glass windows or photographs of objects placed inside glass showcases in retail stores and museums. By separating out the contribution of reflection, one can enhance a captured image to better see the desired scene. Since light reflected off the surface of a reflective medium is polarized [1], a common practice to reduce the effect of reflection is to place a polarizer in front of the camera lens to filter out the polarized reflected light. However, this works fully only if the image is captured at Brewster's angle (around 56 degrees for glass reflection), which is rarely the case in practice. Consequently, weak reflection still remains in the filtered image.

For reflection separation, we exploit physical properties of reflection and transmission for a double-surfaced glass medium, where both the reflected light and the transmitted light are polarized. Derived from the physical equations of polarization, the combined effect of reflection and transmission can be modeled by the following equation (see Section 3):

  I(x) = α(x) L_R(x)/2 + [1 − α(x)] L_B(x)/2,   (1)

. N. Kong is with the Max-Planck-Institut für Intelligente Systeme, Spemannstrasse 41, Tübingen 72076, Germany.
. Y.-W. Tai is with the Department of Electrical Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea.
. J.S. Shin is with the Department of Computer Science, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea.

Manuscript received 13 Apr. 2012; revised 1 Nov. 2012; accepted 5 Feb. 2013; published online 15 Feb. 2013. Recommended for acceptance by D. Forsyth. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TPAMI-2012-04-0286. Digital Object Identifier no. 10.1109/TPAMI.2013.45.

where I is the intensity of light received by an image sensor, L_R is the intensity of light from the scene reflected off the glass surface, L_B is the intensity of light from the background scene behind the glass, and α is a mixing coefficient. The value of α depends on the refractive index of glass, the orientation of the plane of incidence, the angle of incidence, and the polarizer angle at pixel position x. Assuming that the camera response function of the image sensor is linear, I is equal to the image recorded by the sensor. Under this assumption, our goal is to estimate L_R and L_B, given I captured with a polarizer.

The form of (1) is similar to that of a matting equation [2]. In the typical matting problem, there are a large number of foreground/background pixels with α equal to either 0 or 1. In the reflection separation problem, however, α is rarely equal to 0 or 1 but varies over an image between 0 and 1. In addition, conventional matting algorithms tend to blur the foreground/background regions where α is between 0 and 1, since the focus of the matting problem is natural image composition, whereas in reflection separation it is desirable that the reflection/background layers be sharp and clear. Hence, although (1) is similar to the matting equation, conventional matting algorithms are not applicable to our reflection separation problem.

Solving (1) with a single input image is an ill-posed problem. Our main contribution is to show that the reflection separation problem can be solved automatically by using three input images, each captured with a polarizer angle separated by 45 degrees, while the mixing coefficient α is allowed to be spatially varying. In most previous methods, α is assumed to be constant over an image, which is often invalid for real images; therefore, the results produced with these methods may not be satisfactory for real-world examples.
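As a concrete illustration of (1), the forward model can be evaluated per pixel. The following NumPy sketch is our own illustrative code (the function name and all numeric values are assumptions, not the authors' implementation); it composites a polarized observation from given layers and a spatially varying mixing coefficient:

```python
import numpy as np

def compose_polarized_image(alpha, L_R, L_B):
    """Forward model of Eq. (1): I(x) = alpha(x) L_R(x)/2 + [1 - alpha(x)] L_B(x)/2."""
    return alpha * L_R / 2.0 + (1.0 - alpha) * L_B / 2.0

# Spatially varying mixing coefficient (illustrative values only).
alpha = np.array([[0.2, 0.4], [0.6, 0.8]])
L_R = np.full((2, 2), 0.9)   # light reflected off the glass
L_B = np.full((2, 2), 0.3)   # light transmitted from the background
I = compose_polarized_image(alpha, L_R, L_B)
```

Where α is larger, the reflected layer contributes more to the observed intensity, which is exactly the spatial variation the method must recover.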
Based on the reflection model in (1), our method achieves high-quality reflection

Fig. 1. (a), (b), (c) Input images, (d) the estimated background layer, (e) the estimated reflection layer. Note that image details as well as image noise are favorably reconstructed in (d) and (e). We employed gamma correction for better visual presentation.

separation results for various examples. We show that our results are superior to those of previous methods, both quantitatively on synthetic examples and qualitatively on real-world examples. Fig. 1 illustrates one of our real-world examples.

A shorter version of this work appeared in [3]. The current version extends our conference version with further technical details on reflection separation. In addition, we present a multiscale scheme to accelerate reflection separation and provide additional results. For self-completeness, we also provide a detailed discussion of reflectance/transmittance for a single- or double-surfaced medium.

The remainder of this paper is organized as follows: We discuss previous methods related to reflection separation in Section 2. Our reflection model in (1) is derived in Section 3. Details of our reflection separation method are presented in Section 4. Section 5 shows the results of our method and their quantitative and qualitative comparisons to those of previous methods, followed by robustness tests of our algorithm. Finally, we conclude this paper with discussions in Section 6.

2 RELATED WORK

Reflection separation methods can be categorized into two groups: image-based methods and physically-based methods. We start by reviewing image-based methods. Early work by Ohnishi et al. [4] used multiple polarized images with different polarizer angles for reflection separation. They regarded the minimum intensity image as the background layer, and the difference between the maximum and minimum intensity images as the reflection layer. However, weak reflection may still remain in the recovered background image because reflection with partial polarization is not fully separated by a polarizer. Farid and Adelson [5] presented a method to further reduce remaining reflection based on independent component analysis (ICA). Bronstein et al. [6] generalized the ICA-based method to allow multiple polarized images, while improving its accuracy and efficiency based on sparsity of large image gradients. Multiple polarized images with different polarizer angles were also used in [7]. In that work, the reflection model considered the spatially varying mixing coefficient. However, the model dealt only with single-surface reflection and did not account for polarization of the transmitted light. Moreover, user corrections were often required to enhance

the quality of the reflection separation. On the other hand, our physically-based reflection model deals with double-surface reflection and the spatially varying mixing coefficient of both layers. In addition, our solution method is fully automatic. Sarel and Irani [8] and Yan et al. [9] also dealt with the spatially varying mixing coefficient of reflection. Similar to the approach in [7], their approaches are based on the assumption that the structures from the two sources, the reflection and background layers, are statistically uncorrelated. This statistical assumption may fail in the presence of overlapping edges from different layers. Adopting a physically-based approach, we avoid this assumption to better handle such overlapping edges. Levin and Weiss [10], [11] used a single image and user-provided pixel-wise gradient locations labeled as belonging to either the reflection or background layer. An automatic method by Levin et al. [12] found the most likely decomposition that minimizes the total number of edges and corners in the recovered layers by using a database of natural images. However, these methods may not work well for a complex image containing many intersections of edges from the reflection and background layers: The manual gradient labeling would become very hard for the method in [10], [11], and the desired decomposition may not be obtained by database search for the method in [12]. On the other hand, our method mainly relies on physical properties of polarization that are less affected by complex edge intersections. Projection of gradients between a pair of flash and no-flash images can be used to separate reflection in a flash image. Agrawal et al. [13] detected edges introduced by reflection based on the coherency of gradient directions between the images, and separated these edges by taking the projection between the flash image gradients and the no-flash image gradients.
In [14], they adopted structure tensors of the flash and no-flash images to better detect the edges from reflection, and separated them using an affine transformation. They assumed that there is neither reflection nor saturation in the flash image. However, it is hard to obtain such a flash image, because a flash that is too strong saturates most pixels, while one that is too weak leaves reflection remaining. Another useful image-based cue is "misalignment" of image contents between layers. Irani et al. [15] and Szeliski et al. [16] used temporal misalignment of each layer in the intensity domain in a sequence of


Fig. 2. (a) Reflectance R or transmittance T is expressed as the sum of two orthogonal polarized components, that is, R = R⊥ + R∥ and T = T⊥ + T∥. (b) R and T vary with respect to the angle of incidence θ; R is completely polarized only at Brewster's angle (around 56 degrees for glass reflection). The relative strength of each polarized component in R or T is shown as a curve in a different color. (c) The amount of light received by a camera depends on the amount of polarization in the reflected and transmitted light.

images, and Gai et al. [17], [18] detected the temporal misalignment in the gradient domain via gradient sparsity. Schechner et al. [19] used the focus difference between the background and reflected scenes. Tsin et al. [20] solved the stereo matching problem in the presence of superimposed reflection. These methods assume that a static reflection layer is defocused by convolution with a single defocus blur kernel [19], or is transformed between images due to stereo motion [20] or general motion such as camera movement, glass surface movement, or target object movement [15], [18].

Schechner et al. [21], [22] proposed a physically-based method that separates reflection by exploiting physical properties of polarization. They assumed that the mixing coefficient for the reflection and background layers is static over an image and that the two layers are statistically independent of each other. Under these assumptions, the angle of incidence is chosen by maximizing the statistical independence of the layers in terms of mutual information [22] or their cross covariance [21].

Except for the work in [7], [8], [9], all of the above methods assume that the mixing coefficient for the reflection and background layers is spatially invariant over the image, which is rarely satisfied for a real polarized image. Based on physical properties of polarization, our reflection model uses an alpha matte to model the spatially varying mixing coefficient (see Section 3). We also present a fully automatic method to determine the mixing coefficient.

3 REFLECTION MODEL

In this section, we discuss the physically-based reflection model in (1). We derive the model from physical properties of polarization in Section 3.1. For self-completeness, we also provide a detailed discussion of reflectance and transmittance for a single- or double-surfaced medium in Section 3.2.

3.1 Model Derivation
We describe properties of polarization in reflection and transmission for a double-surfaced medium such as a sheet of glass. Light reflected off or transmitted through a glass surface is partially polarized and expressed as the sum of two orthogonal polarized components, which are perpendicular and parallel to the plane of incidence, as illustrated in Fig. 2a. We define R as the reflectance that models the relative strength of light reflected off a glass surface, T as the transmittance that models the relative strength of light transmitted through the glass surface, θ as the angle of incidence, and φ as the polarizer angle. Subscripts ⊥ and ∥ are added to R and T to denote the polarized components perpendicular and parallel to the plane of incidence, respectively. Hence, R⊥ and R∥ represent the two orthogonal polarized components of R such that R = R⊥ + R∥. Similarly, T = T⊥ + T∥. We define φ∥ as the angle for the orientation of the intersection line between the polarizer and the plane of incidence, and φ⊥ = φ∥ + 90°. Each polarized component of R and T is a function that depends on θ and the refractive index n of the reflection medium. Exact forms of R(θ, n) and T(θ, n) are given in Section 3.2. Since n = 1.474 for glass reflection, we omit n from R and T in the remainder.

As shown in Fig. 2b, the relative strength of each polarized component varies smoothly in θ. In addition, the weak parallel component of reflectance completely disappears only when the angle of incidence equals Brewster's angle (around 56 degrees for glass reflection), while neither transmittance component disappears at any angle of incidence. A polarizer fully eliminates the effect of the reflected light only at Brewster's angle, which is rarely set for image capture.

When taking an image with a polarizer, the amount of light that passes through the polarizer is given by Malus' law [1]:

  L cos²(φ̂),   (2)

where L is the intensity of incoming polarized light and φ̂ is the angle between the polarization direction of the incoming light and the transmission axis of the polarizer. We assume that the incoming light, that is, L_R before reflection and L_B before transmission, is unpolarized, and thus that equal energy is contained in each component, i.e., L_R⊥ = L_R∥ = L_R/2 and L_B⊥ = L_B∥ = L_B/2. Then, the intensity of light at a single pixel of an image sensor after passing through the polarizer is

  I = [R⊥(θ) cos²(φ − φ⊥) + R∥(θ) cos²(φ − φ∥)] L_R/2
    + [T⊥(θ) cos²(φ − φ⊥) + T∥(θ) cos²(φ − φ∥)] L_B/2,   (3)

which is a function of θ, φ, and φ⊥. Thus, we obtain the reflection model in (1) by setting

  α = R⊥(θ) cos²(φ − φ⊥) + R∥(θ) sin²(φ − φ⊥),   (4)


Fig. 3. When a camera is close to a reflection medium, θ and φ⊥ vary spatially, as shown in (a) and (b), respectively.

which is the mixing coefficient for L_R; 1 − α is the mixing coefficient for L_B since T⊥(θ) = 1 − R⊥(θ) and T∥(θ) = 1 − R∥(θ) (see Section 3.2).

Now, let us take a closer look at the reflection model in (3). For an input image captured with a polarizer, φ is constant while θ and φ⊥ are spatially varying over the image, as illustrated in Fig. 3. In addition, the variations of the last two quantities are spatially smooth over a planar glass surface or a glass surface with small curvature. The reflectance and transmittance are also smooth with respect to θ, as shown in Fig. 2b. Consequently, the mixing coefficient α is a function of θ and φ⊥ that is spatially smooth over an image. In this paper, we explicitly deal with the spatially varying mixing coefficient to achieve high-quality reflection separation.
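To make the dependence of α on θ and φ − φ⊥ concrete, the sketch below evaluates (4) using the single- and double-surface Fresnel expressions detailed in Section 3.2 (Eqs. (5) and (7)). The function names are ours and the code is an illustrative assumption, not the authors' implementation; only n = 1.474 follows the text:

```python
import numpy as np

def fresnel_single(theta, n=1.474):
    """Single-surface reflectance components, Eq. (5), with Snell's law."""
    theta_t = np.arcsin(np.sin(theta) / n)
    Rs_perp = np.sin(theta - theta_t) ** 2 / np.sin(theta + theta_t) ** 2
    Rs_par = np.tan(theta - theta_t) ** 2 / np.tan(theta + theta_t) ** 2
    return Rs_perp, Rs_par

def double_surface(theta, n=1.474):
    """Double-surface reflectance components, Eq. (7)."""
    Rs_perp, Rs_par = fresnel_single(theta, n)
    return 2 * Rs_perp / (1 + Rs_perp), 2 * Rs_par / (1 + Rs_par)

def mixing_coefficient(theta, phi_rel):
    """Mixing coefficient of Eq. (4); phi_rel stands for phi - phi_perp."""
    R_perp, R_par = double_surface(theta)
    return R_perp * np.cos(phi_rel) ** 2 + R_par * np.sin(phi_rel) ** 2
```

At Brewster's angle (arctan n, about 56 degrees for glass) the parallel component vanishes, so α reduces to R⊥ cos²(φ − φ⊥), consistent with Fig. 2b.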

3.2 Reflectance and Transmittance
This section describes double-surface reflectance and transmittance, represented as functions of single-surface reflectance and transmittance, and their physical properties in detail. For a single-surfaced medium with refractive index n, the two polarized components of reflectance perpendicular and parallel to the plane of incidence are defined as

  Rˢ⊥(θ, n) = sin²(θ − θ_t(θ, n)) / sin²(θ + θ_t(θ, n))  and  Rˢ∥(θ, n) = tan²(θ − θ_t(θ, n)) / tan²(θ + θ_t(θ, n)),   (5)

respectively, according to the Fresnel equations [1]. Here, θ is the angle of incidence and θ_t(θ, n) = arcsin(n⁻¹ sin θ) from

Fig. 4. Reflectance and transmittance depending on the varying angle of incidence for a single- or double-surfaced medium with the same refractive index. The relative strength of each polarized component is shown as a curve in a different color. (a) Single-surface reflectance/transmittance for glass, where Rˢ⊥ and Rˢ∥ denote the two orthogonal components of reflectance, and Tˢ⊥ and Tˢ∥ denote those of transmittance. (b) Double-surface reflectance/transmittance for glass, where R⊥ and R∥ denote the two orthogonal components of reflectance, and T⊥ and T∥ denote those of transmittance. The perpendicular and parallel polarized components are represented by the subscripts ⊥ and ∥, respectively.

Fig. 5. Reflection and transmission for a double-surfaced medium. Black arrow: incoming light ray, red and blue dotted arrows: rays produced by single-surface reflection and single-surface transmission, respectively, shaded area: internal part of the medium.

Snell's law [1]. For a given θ with n fixed, the reflectance and transmittance for each polarized component always sum to 1. Therefore, the two polarized components of transmittance perpendicular and parallel to the plane of incidence are

  Tˢ⊥(θ, n) = 1 − Rˢ⊥(θ, n)  and  Tˢ∥(θ, n) = 1 − Rˢ∥(θ, n),   (6)

respectively. The values of Rˢ⊥, Rˢ∥, Tˢ⊥, and Tˢ∥ with respect to θ are plotted in Fig. 4a. The plotted graph shows the partial polarization caused by the relationship between reflectance and transmittance, where the amount of polarization varies smoothly with respect to the angle of incidence from 0 to 90 degrees and reaches its maximum at Brewster's angle (around 56 degrees for glass reflection).

For a double-surfaced medium, light transmitted through the medium undergoes a series of internal (single-surface) reflections between the front and back surfaces of the medium, as illustrated in Fig. 5. Each of the internal reflections produces a ray transmitted outside the medium. As a result, the observed light outside the front surface is the sum of the first reflected ray at the front surface and the internal reflection rays transmitted through the front surface. Similarly, the observed light outside the back surface is the sum of all internal reflection rays transmitted through the back surface. Let θ and θ_t be the angle of incidence and the angle of transmittance for the first reflection outside the front surface, respectively. According to the law of reflection [1], θ_t is the angle of incidence and θ is the angle of transmittance for internal reflection. Moreover, θ = arcsin(n sin θ_t) by Snell's law. It is easy to show that (5) and (6) remain unchanged for internal reflection, so they can still be applied to it. Schechner et al. [21] assumed that only the first few observed rays contribute significantly to the intensity of the observed reflected or transmitted light, while the others have negligibly weak intensities. In addition, assuming that the medium is thin enough, the spatial shift between those significant observed rays can be ignored.
Thus, the double-surface reflectance R and the double-surface transmittance T can be approximated as functions of the single-surface reflectance and transmittance in (5) and (6). The polarized components perpendicular and parallel to the plane of incidence in the double-surface reflectance are

  R⊥(θ, n) = 2Rˢ⊥(θ, n) / (1 + Rˢ⊥(θ, n))  and  R∥(θ, n) = 2Rˢ∥(θ, n) / (1 + Rˢ∥(θ, n)),   (7)

respectively. The polarized components perpendicular and parallel to the plane of incidence in the double-surface transmittance are

  T⊥(θ, n) = (1 − Rˢ⊥(θ, n)) / (1 + Rˢ⊥(θ, n)) = 1 − R⊥(θ, n)  and
  T∥(θ, n) = (1 − Rˢ∥(θ, n)) / (1 + Rˢ∥(θ, n)) = 1 − R∥(θ, n),   (8)

respectively, which implies that the double-surface reflectance and transmittance for each polarized component sum to 1 under the same θ and n. The values of R⊥, R∥, T⊥, and T∥ with respect to θ are plotted in Fig. 4b. Note that, comparing this plot to that in Fig. 4a on single-surface reflectance and transmittance, only the shapes of the graphs differ while the properties of polarization remain identical.

4 REFLECTION SEPARATION METHOD

In this section, two versions of our reflection separation method are presented. We begin with the basic algorithm in Section 4.1 and then provide its multiscale extension in Section 4.2.

4.1 Basic Algorithm
The basic algorithm consists of four steps as illustrated in Fig. 6: orthogonal image extraction, image separation, reflection refinement, and weak-edge suppression. The first step extracts a pair of orthogonal images from three polarized images, each captured with a polarizer angle separated by 45 degrees. The next step separates the reflection and background layers by estimating the spatially varying angle of incidence. The third step refines these layers by employing constrained optimization. Finally, the last step postprocesses the results with edge suppression [14] to remove remaining weak edges. We further illustrate the behavior of our algorithm by providing results of each step for the synthetic polarized images shown in Fig. 11.

Fig. 6. Basic algorithm overview: Step 1 extracts a pair of orthogonal images from three input polarized images. Step 2 separates the reflection and background layers by estimating the spatially varying angle of incidence. Step 3 refines these layers by employing constrained optimization. Step 4 postprocesses the refined layers with edge suppression.

4.1.1 Orthogonal Image Extraction
Consider three polarized images I_i(x), i = 1, 2, 3, each captured with a polarizer angle separated by 45 degrees, i.e., φ_i = φ_1, φ_1 + 45°, and φ_1 + 90°. From these three images, we compute φ_1 − φ⊥(x) instead of the explicit values of φ_1 and φ⊥(x). We then compute two orthogonal images I⊥(x) and I∥(x). I⊥(x) is the sum of intensities for the perpendicular components in the reflected and transmitted light, and I∥(x) is defined similarly:

  I⊥(x) = R⊥(θ(x)) L_R(x)/2 + T⊥(θ(x)) L_B(x)/2,   (9)
  I∥(x) = R∥(θ(x)) L_R(x)/2 + T∥(θ(x)) L_B(x)/2.   (10)

By substituting the above equations into (3), we can express an input image I_i(x), i = 1, 2, 3, in terms of I⊥(x), I∥(x), and φ_i − φ⊥(x):

  I_i(x) = I⊥(x) cos²[φ_i − φ⊥(x)] + I∥(x) sin²[φ_i − φ⊥(x)]
         = [I⊥(x) + I∥(x)]/2 + {[I⊥(x) − I∥(x)]/2} cos 2[φ_i − φ⊥(x)].   (11)

Now, we can derive the following three equations from (11) by substituting (I_1, φ_1), (I_2, φ_1 + 45°), and (I_3, φ_1 + 90°) for (I_i, φ_i), respectively [23]:

  I_1(x) + I_3(x) = I⊥(x) + I∥(x),   (12)
  I_1(x) − I_3(x) = [I⊥(x) − I∥(x)] cos 2[φ_1 − φ⊥(x)],   (13)
  I_1(x) + I_3(x) − 2I_2(x) = [I⊥(x) − I∥(x)] sin 2[φ_1 − φ⊥(x)].   (14)

By solving these equations for φ_1 − φ⊥(x), I⊥, and I∥, we get

  φ_1 − φ⊥(x) = (1/2) arctan[ (I_1(x) + I_3(x) − 2I_2(x)) / (I_1(x) − I_3(x)) ],   (15)
  I⊥(x) = [I_1(x) + I_3(x)]/2 + [I_1(x) − I_3(x)] / (2 cos 2[φ_1 − φ⊥(x)]),   (16)
  I∥(x) = [I_1(x) + I_3(x)]/2 − [I_1(x) − I_3(x)] / (2 cos 2[φ_1 − φ⊥(x)]).   (17)

We assume φ_1 − φ⊥(x) lies within [−45°, 45°] for the unique solution of (1/2) arctan(·). A similar assumption was made in [22]. If φ_1 − φ⊥(x) is smaller than −45° or larger than 45°, the computed value differs by 90° from the true value. In this case, we simply exchange I⊥ and I∥ because the sign of cos 2(·) is reversed. For justification of this assumption, we refer the readers to [24, Appendix]. Results of this step are shown in Fig. 7.
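Step 1 admits a direct vectorized implementation. The sketch below is our own (assuming I₁ ≠ I₃ so the plain arctangent of (15) is defined; the ±45° branch handling that exchanges I⊥ and I∥ is omitted for brevity):

```python
import numpy as np

def extract_orthogonal_images(I1, I2, I3):
    """Eqs. (15)-(17): recover phi1 - phi_perp and the orthogonal images
    I_perp, I_par from three images taken at polarizer angles
    phi1, phi1 + 45 deg, and phi1 + 90 deg."""
    phi_rel = 0.5 * np.arctan((I1 + I3 - 2.0 * I2) / (I1 - I3))  # Eq. (15)
    c = np.cos(2.0 * phi_rel)
    I_perp = (I1 + I3) / 2.0 + (I1 - I3) / (2.0 * c)             # Eq. (16)
    I_par = (I1 + I3) / 2.0 - (I1 - I3) / (2.0 * c)              # Eq. (17)
    return phi_rel, I_perp, I_par

# Round trip on synthetic data generated with Eq. (11).
rel = np.deg2rad(np.array([20.0, 65.0, 110.0]))   # phi_i - phi_perp
I1, I2, I3 = 0.7 * np.cos(rel) ** 2 + 0.3 * np.sin(rel) ** 2
phi_rel, I_perp, I_par = extract_orthogonal_images(I1, I2, I3)
```

With I⊥ = 0.7, I∥ = 0.3, and φ₁ − φ⊥ = 20°, the round trip recovers all three quantities exactly, which confirms that (12)-(14) determine them uniquely inside the assumed branch.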

Fig. 7. Results of Step 1 from the input images in Fig. 11. (a), (b) Images of the parallel and perpendicular components, (c) the physical quantity φ_1 − φ⊥, where φ_1 = 0°, hence it is −φ⊥.

4.1.2 Image Separation
In this step, we present how to separate an image into L_R and L_B by estimating the angle of incidence θ(x). We also show how to compute an alpha matte α(x) for each input image, which will be used in the next step. Since T⊥(θ) = 1 − R⊥(θ) and T∥(θ) = 1 − R∥(θ), L_R and L_B can be derived from (9) and (10) as follows [21], [22]:

  L_R(x) = 2[(1 − R∥(θ(x))) I⊥(x) − (1 − R⊥(θ(x))) I∥(x)] / [R⊥(θ(x)) − R∥(θ(x))],   (18)
  L_B(x) = 2[R∥(θ(x)) I⊥(x) − R⊥(θ(x)) I∥(x)] / [R∥(θ(x)) − R⊥(θ(x))],   (19)

where I⊥ and I∥ have been computed in the previous step. The range of θ(x) is limited to between 5° and 85° to avoid division by zero, because R⊥ = R∥ when θ = 0° or 90°. In the above equations, L_R(x) and L_B(x) are functions of the unknown θ(x). The mutual information between two images L_R and L_B is defined as follows:

  I(L_R, L_B) = Σ_{l_R, l_B ∈ ℒ} P(l_R, l_B) log [ P(l_R, l_B) / (P(l_R) P(l_B)) ],   (20)

where P(l_R) and P(l_B) are the probabilities for intensity levels l_R and l_B in L_R and L_B, respectively, P(l_R, l_B) is the joint probability for l_R and l_B, and ℒ is the set of all possible intensity levels in the image histograms. To reduce the sensitivity to the contrast of images, Schechner et al. [22] introduced a new measure:

  Iⁿ(L_R, L_B) = I(L_R, L_B) / K_RB,   (21)

where

  K_RB = −(1/2) [ Σ_{l_R ∈ ℒ} P(l_R) log P(l_R) + Σ_{l_B ∈ ℒ} P(l_B) log P(l_B) ].   (22)

K_RB is the mean self-information of L_R and L_B. Therefore, Iⁿ(L_R, L_B) is the ratio of the mutual information to the mean self-information, which we call the "normalized" mutual information. For convenience, however, Iⁿ(L_R, L_B) will be referred to as the mutual information in what follows unless explicitly stated otherwise.

To estimate θ(x), we make the assumption that there is no correlation between the image contents in the reflected scene and the background scene. Hence, the best θ(x) tends to minimize the mutual information between L_R and L_B. We prepare all candidate pairs of L_R(x) and L_B(x) evaluated at a sequence of regularly sampled values of θ(x) by using (18) and (19). Since θ(x) varies smoothly over an image, we use belief propagation [25] to choose the best θ(x) at each pixel over the image by solving the following minimization problem:

  arg min_{θ(x)}  Σ_x E_d(θ(x)) + Σ_x Σ_{y ∈ N(x)} E_s(θ(x), θ(y)),   (23)

where E_d(θ(x)) is the data cost function, E_s(θ(x), θ(y)) is the neighborhood cost function, and N(x) is the set of first-order neighbors of x. E_d(θ(x)) measures the cost of assigning θ(x) to pixel x, and E_s(θ(x), θ(y)) measures the cost of assigning θ(x) and θ(y) to a pair of neighboring pixels x and y, respectively.

Schechner et al. [22] defined a data cost function in terms of the mutual information between L_R(x) and L_B(x) at each angle θ(x). Under the assumptions that θ is static over the image and that the image contents in L_R and L_B are statistically uncorrelated, they showed that the image pair (L_R, L_B) at the best θ has the smallest mutual information. We adopt this function, but it cannot be directly applied to our setting, in which θ is spatially varying. Instead, we assume that θ is locally static, i.e., spatially invariant within a small patch around each pixel, based on its smoothness in the spatial domain. Then, our data cost function can be defined in terms of the patch-wise mutual information:

  E_d(θ(x)) = Iⁿ(L_R^[x], L_B^[x]),   (24)

where Iⁿ(L_R^[x], L_B^[x]) is the mutual information for image patches L_R^[x] and L_B^[x] around x. We have also tested the data cost function proposed by Levin et al. [12], which chooses the image pair with the minimum number of corners and edges. However, we found that the mutual information generally produces better results for our problem. Our neighborhood cost function is defined as

  E_s(θ(x), θ(y)) = λ |θ(x) − θ(y)|,   (25)

where λ = 0.5 is a weight. Note that, if we simply minimized the data cost for each pixel without using the neighborhood cost function, the estimated values of θ(x) over the image would be noisy and inaccurate. After choosing θ(x), we obtain two separate layers L_R(x) and L_B(x) by evaluating (18) and (19) with θ(x). We also compute α(x) for each input image based on its definition in (4), by setting θ(x) to the value obtained from belief propagation and φ_1 − φ⊥(x) to the value obtained in the previous step. Results of this step are shown in Fig. 8, where blocking artifacts are apparent in Fig. 8b.
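The per-pixel inversion in (18)-(19) and the normalized mutual information of (20)-(22) can be sketched as follows. This is our own illustrative code: the reflectance components R⊥(θ) and R∥(θ) are assumed to be precomputed from Section 3.2, and the histogram bin count is an arbitrary choice:

```python
import numpy as np

def separate_layers(I_perp, I_par, R_perp, R_par):
    """Eqs. (18)-(19): recover L_R and L_B given the double-surface
    reflectance components at the (candidate) angle of incidence."""
    L_R = 2 * ((1 - R_par) * I_perp - (1 - R_perp) * I_par) / (R_perp - R_par)
    L_B = 2 * (R_par * I_perp - R_perp * I_par) / (R_par - R_perp)
    return L_R, L_B

def normalized_mutual_information(A, B, bins=32):
    """Eqs. (20)-(22): mutual information over mean self-information."""
    hist, _, _ = np.histogram2d(np.ravel(A), np.ravel(B), bins=bins)
    P = hist / hist.sum()                 # joint probability P(l_R, l_B)
    Pa, Pb = P.sum(axis=1), P.sum(axis=0) # marginals P(l_R), P(l_B)
    nz = P > 0
    mi = np.sum(P[nz] * np.log(P[nz] / (Pa[:, None] * Pb[None, :])[nz]))
    K = -0.5 * (np.sum(Pa[Pa > 0] * np.log(Pa[Pa > 0]))
                + np.sum(Pb[Pb > 0] * np.log(Pb[Pb > 0])))
    return mi / K
```

A layer pair composed via (9)-(10) is recovered exactly by `separate_layers`, and an image scored against itself has normalized mutual information 1, the maximum the cost in (24) penalizes.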

4.1.3 Reflection Refinement
The separation results include blocking artifacts due to the patch-wise data cost function defined in (24). This section describes how to further refine the separation results to improve their quality. We formulate a minimization problem with the objective function given in (26) to refine α_i and (L_R, L_B):

  Σ_{i=1}^{3} Σ_x ‖2I_i(x) − α_i(x) L_R(x) − (1 − α_i(x)) L_B(x)‖²
            + λ_c (α_i(x) − α_i^0(x))² + λ_s ‖∇α_i(x)‖².   (26)

The objective function consists of three terms: The first is the data term derived from (1) for the input images. The second is a soft constraint that encourages α_i(x) to be similar to α_i^0(x), the mixing coefficient estimated in Step 2; this term enforces consistent estimation of α_i(x) over each input image. The third is the smoothness term that minimizes the variation of α_i(x). λ_c and λ_s are weighting parameters.


Fig. 8. Results of Step 2. (a), (b) Initial background and reflection layers, (c) initial angle of incidence, (d)-(f) initial alpha mattes. We adjusted contrast of the layers for better visual presentation.

Fig. 9. Results of Step 3. (a), (b) Refined background and reflection layers, (c)-(e) refined alpha mattes. We adjusted contrast of the layers for better visual presentation.

Inspired by the work in [7], our optimization scheme first initializes α_i^0, L_R, and L_B with the estimates α̂_i, L̂_R, and L̂_B obtained in the image separation step (Section 4.1.2), and then minimizes the objective function by alternately solving two convex subproblems: solving for α_i while fixing L_R and L_B in one iteration, and solving for L_R and L_B while fixing α_i in the next iteration. Our algorithm is guaranteed to converge to a local minimum, as the objective function value is strictly decreasing in every iteration. Results of this step are shown in Fig. 9. The refined layers are now very close to the ground truth, but the reflection layer still contains some weak false edges, as observed in Fig. 9b.
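Both subproblems of this alternating scheme have closed-form per-pixel updates. The sketch below is our own simplification: it handles only the data and consistency terms of (26), omitting the smoothness term (so each update is pixel-wise independent) and any boundary handling:

```python
import numpy as np

def refine(I, alpha0, L_R, L_B, lam_c=0.1, n_iter=50):
    """Alternating minimization sketch for the data + consistency terms
    of Eq. (26).  I, alpha0: shape (3, H, W); L_R, L_B: shape (H, W)."""
    alpha = alpha0.copy()
    for _ in range(n_iter):
        # Subproblem 1: alpha_i with (L_R, L_B) fixed -- per-pixel closed form.
        d = L_R - L_B
        alpha = ((2 * I - L_B) * d + lam_c * alpha0) / (d ** 2 + lam_c)
        alpha = np.clip(alpha, 0.0, 1.0)
        # Subproblem 2: (L_R, L_B) with alpha fixed -- 2x2 normal equations.
        a11 = np.sum(alpha ** 2, axis=0)
        a12 = np.sum(alpha * (1 - alpha), axis=0)
        a22 = np.sum((1 - alpha) ** 2, axis=0)
        b1 = np.sum(alpha * 2 * I, axis=0)
        b2 = np.sum((1 - alpha) * 2 * I, axis=0)
        det = a11 * a22 - a12 ** 2
        det = np.where(np.abs(det) < 1e-12, 1e-12, det)
        L_R = (a22 * b1 - a12 * b2) / det
        L_B = (a11 * b2 - a12 * b1) / det
    return alpha, L_R, L_B
```

By construction, a ground-truth triple satisfying (1) exactly is a fixed point of both updates, which is one way to sanity-check the derivation.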

4.1.4 Weak Edge Suppression

Either of L_R and L_B after reflection refinement may still contain very weak false edges that belong to the other layer. We adopt cross-projection tensors [14] to suppress the false edges in L_R and L_B. An edge at x in L can be detected with its smoothed structure tensor G, defined as follows:

G = (∇L ∇L^T) * K_σ,   (27)

where L is obtained by converting the color space from RGB to YUV and taking the Y channel values, ∇L denotes the gradient vector at x, * denotes convolution, and K_σ is a normalized 2D Gaussian kernel of variance σ. The matrix G can be decomposed as follows:

G = [u1 u2] [μ1 0; 0 μ2] [u1^T; u2^T],   (28)

where u1 and u2 are the eigenvectors of G corresponding to the eigenvalues μ1 and μ2, respectively, and μ1 ≥ μ2. Using this decomposition, the structure of L can be characterized locally: for a homogeneous region, μ1 = μ2 = 0; if μ1 > 0 and μ2 = 0, then an edge appears at x and its direction is given by u1. Based on this, we can detect an edge at x in each of L_R and L_B, if any.

To suppress a weak false edge, the cross-projection tensor at x in L_R is defined as

D_R(x) = [v1 v2] [λ1 0; 0 λ2] [v1^T; v2^T].   (29)

Here, v1 and v2 are the major and minor eigenvectors, respectively, of the smoothed structure tensor derived from the gradient of L_B at x. We assume that a true edge in L_R or L_B has a larger gradient magnitude than its corresponding false edge in the other layer. In [14], the edge in L_R is suppressed whenever an edge appears at the same location in L_B. In our approach, however, we selectively suppress the edge in L_R by setting the values λ1 and λ2 of the cross-projection tensor D_R according to our assumption on weak false edges. There are two cases:

Case 1: There is an edge at x in L_R. Here there are two subcases, depending on whether or not there is an edge at x in L_B. If there is no edge in L_B, then the edge in L_R is trivially a true edge. Suppose instead that there is an edge in L_B. In this subcase, the edge in L_R is a true edge if the gradient magnitude at x in L_R is greater than that in L_B; otherwise, the edge in L_R is a false edge. Our method sets λ1 = 1 and λ2 = 1 to retain the edge if it is a true edge. Otherwise, the method sets λ1 = 0 and λ2 = 1 to suppress the edge by projecting its gradient onto the vector orthogonal to the edge in L_B.

Case 2: There is no edge at x in L_R. In this case, our method sets λ1 = 0 and λ2 = 0 to set the gradient at x in L_R to the zero vector (see (30)).

Symmetrically, the cross-projection tensor D_B(x) at x in L_B can be constructed. To suppress a false edge and retain a true edge, the gradients ∇L_R(x) and ∇L_B(x) are modified as follows:

∇L'_R(x) = D_R(x) · ∇L_R(x),   ∇L'_B(x) = D_B(x) · ∇L_B(x).   (30)

Then, the modified gradients ∇L'_R(x) and ∇L'_B(x) are integrated, and the results are converted back to the RGB space to obtain the final separation results. As shown in


Fig. 10. Results of Step 4. (a), (b) Images with false edges suppressed. We adjusted contrast for better visual presentation.

Fig. 10, the weak false edges are well suppressed in the final separation.
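The suppression procedure can be sketched as follows. This is a simplified single-channel illustration of the steps around Eqs. (27)-(30), not the authors' implementation: the edge test uses a plain gradient-magnitude threshold rather than the eigenvalue conditions, and the final re-integration of the modified gradients is omitted. The function names and the threshold value are our assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_tensor(L, sigma=1.0):
    # Smoothed structure tensor G = K_sigma * (grad L grad L^T).
    gy, gx = np.gradient(L)
    Gxx = gaussian_filter(gx * gx, sigma)
    Gxy = gaussian_filter(gx * gy, sigma)
    Gyy = gaussian_filter(gy * gy, sigma)
    return Gxx, Gxy, Gyy

def suppress_false_edges(LR, LB, sigma=1.0, edge_thresh=1e-3):
    """Sketch of weak-edge suppression in LR: keep an edge in LR only
    if its gradient magnitude exceeds that of LB at the same pixel;
    otherwise project LR's gradient onto the direction orthogonal to
    the edge in LB, using the eigenvectors of LB's structure tensor.
    Returns the modified gradient fields of LR."""
    gyR, gxR = np.gradient(LR)
    gyB, gxB = np.gradient(LB)
    magR = np.hypot(gxR, gyR)
    magB = np.hypot(gxB, gyB)
    Gxx, Gxy, Gyy = structure_tensor(LB, sigma)
    out_x, out_y = np.zeros_like(gxR), np.zeros_like(gyR)
    H, W = LR.shape
    for y in range(H):
        for x in range(W):
            G = np.array([[Gxx[y, x], Gxy[y, x]],
                          [Gxy[y, x], Gyy[y, x]]])
            w, V = np.linalg.eigh(G)      # eigenvalues in ascending order
            v1, v2 = V[:, 1], V[:, 0]     # major / minor eigenvectors
            edge_R = magR[y, x] > edge_thresh
            edge_B = magB[y, x] > edge_thresh
            if edge_R and (not edge_B or magR[y, x] > magB[y, x]):
                l1, l2 = 1.0, 1.0         # true edge in LR: keep it
            elif edge_R:
                l1, l2 = 0.0, 1.0         # false edge: project onto v2
            else:
                l1, l2 = 0.0, 0.0         # no edge: zero the gradient
            D = l1 * np.outer(v1, v1) + l2 * np.outer(v2, v2)
            out_x[y, x], out_y[y, x] = D @ [gxR[y, x], gyR[y, x]]
    return out_x, out_y
```

A production version would vectorize the per-pixel loop and re-integrate the gradient fields (e.g., by solving a Poisson equation) to recover the final layers.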

4.2 Multiscale Extension

Our basic algorithm (Section 4.1) works well for small images such as 64 × 64. However, its computation time is demanding: it takes a couple of hours on a PC (2.6 GHz processor, 12.0 GB RAM) even for a 256 × 256 image. The bottleneck is Step 2 (image separation), due to image histogram preparation and belief propagation (Section 4.1.2).

To get around this difficulty, we adopt a multiscale reflection separation scheme based on Gaussian pyramids of the input images with a scale factor of 2. Our multiscale scheme performs Step 3 (Section 4.1.3) in a hierarchical fashion from the coarsest scale, after initializing α_i^0, L_R, and L_B at this scale with the values α̂_i, L̂_R, and L̂_B computed by Steps 1 and 2 (Sections 4.1.1 and 4.1.2). In Step 2, we set the patch size to (⌊w/16⌋ × 2 + 1, ⌊h/16⌋ × 2 + 1), where (w, h) specifies the image size at the coarsest scale. At each scale, the two convex subproblems are alternatingly solved for α_i and (L_R, L_B), respectively, as described in Section 4.1.3. The solutions α̂_i, L̂_R, and L̂_B at the current scale are upsampled to initialize α_i^0, L_R, and L_B at the next finer scale, respectively. Here, we used fixed values λ_c = 2 and λ_s = 100 at all scales for the parameters of (26). After completing Step 3 at the finest scale, Step 4 (Section 4.1.4) is performed to postprocess the separation results.

Exploiting physical properties of polarization, Steps 1 and 2 compute α̂_i, L̂_R, and L̂_B close to their ground-truth values at the coarsest scale. With good initial values of α_i^0, L_R, and L_B, our multiscale scheme converges quickly to a solution at each scale. Fig. 12 illustrates each step of our multiscale scheme with the input images in Fig. 11. The multiscale scheme accelerates reflection separation greatly: for a 256 × 256 image, reflection separation completes in about 5 minutes.
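The coarse-to-fine control flow can be sketched as a short driver. This is a schematic under our own simplifications, not the authors' code: a box filter stands in for a proper Gaussian pyramid, upsampling is nearest-neighbor, a constant initialization stands in for Steps 1-2, and `refine` is a placeholder for the alternating optimization of Step 3.

```python
import numpy as np

def downsample(img):
    # 2x box-filter downsample (a stand-in for a Gaussian pyramid level).
    return 0.25 * (img[::2, ::2] + img[1::2, ::2] +
                   img[::2, 1::2] + img[1::2, 1::2])

def upsample(img):
    # Nearest-neighbor upsampling of the coarse solution to the finer scale.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def multiscale_refine(I, refine, n_levels=3):
    """Coarse-to-fine driver (a sketch): build a pyramid of the input,
    initialize only at the coarsest level, run `refine(level, init)` at
    each scale, and upsample each result to seed the next finer scale."""
    pyramid = [I]
    for _ in range(n_levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    sol = np.full_like(pyramid[-1], 0.5)   # stand-in for Steps 1-2 init
    for level in reversed(pyramid):        # coarsest scale first
        sol = refine(level, sol)
        if sol.shape != I.shape:
            sol = upsample(sol)
    return sol
```

The expensive initialization is paid only once, at the smallest image, which is where the speedup over the single-scale algorithm comes from.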
In our experiments, we found that the solution of the multiscale scheme does not necessarily converge to that of the basic (single-scale) algorithm. However, both single- and multiscale solutions are close enough to show good visual quality and small RMSEs. In our supplemental materials, which can be found in the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPAMI.2013.45, we included results from a test

Fig. 11. Input images and ground-truth values for Figs. 7, 8, 9, and 10, and Figs. 12 and 13. (a)-(c) Three polarized images synthetically created from the ground-truth values in (d)-(j) with polarizer angles φ1, φ1 + 45°, and φ1 + 90°, respectively, where φ1 = 0°; (d), (e) ground-truth background and reflection layers; (f) ground-truth angle of incidence; (g) ground-truth angle for the orientation of the line perpendicular to the intersection line between the polarizer and the plane of incidence; (h)-(j) ground-truth alpha mattes.

which compares single- and multiscale solutions qualitatively and quantitatively with our two synthetic examples used in Section 5.

5 EXPERIMENTS

In this section, we present our experimental results. The experiments were performed on an Intel i7 PC (2.6 GHz processor, 12.0 GB RAM) with a C++ and Matlab implementation. It takes about 5 minutes for a 256 × 256 image to complete reflection separation with our multiscale scheme. In the remainder of this paper, by our method we refer to the multiscale scheme (Section 4.2). We compared our method with those in [4], [11], [6], and [22]. For the method in [11], the maximum intensity image was used as the input image, and large gradient pixels of our projected gradients ∇L'_R and ∇L'_B (Section 4.1.4) were used to provide the gradient locations on the image. For the methods in [11] and [6], we employed the source codes


Fig. 12. Results of our multiscale extension. From left to right: (a)-(c) results of Step 1 (the coarsest scale); (a), (b) images of the parallel and perpendicular components; (c) physical quantity φ1 − φ⊥, where φ1 = 0°, hence it is −φ⊥; (d)-(i) results of Step 2 (the coarsest scale); (d), (e) initial background and reflection layers; (f) initial angle of incidence; (g)-(i) initial alpha mattes; (j)-(n) results after multiscale reflection refinement in Step 3 (the finest scale); (o), (p) results after edge suppression in Step 4 (the finest scale).

Fig. 13. Results for a synthetic example in Fig. 11. (a) Our results, (b) results of [4], (c) results of [11], (d) results of [6], (e) results of [22]. The first row shows the background layers, and the second row shows the reflection layers.

available on the web¹ to generate their results. We admit that the comparisons are not entirely fair, because the input requirements of our method differ from those of the compared methods. We show their results obtained with the best parameter settings as a reference for evaluating the quality of our results.

Figs. 13 and 14 show the results for synthetic examples. Based on (9), (10), and (11), the input images were synthesized from their ground-truth layers by setting the values of θ(x) and φ⊥(x) as visualized in Figs. 11f and 11g, respectively, and φ1 = 0°. Our method generated separation results close to the ground-truth layers: we quantitatively compared our method with the others by measuring the root-mean-square error (RMSE) of each layer, which quantifies the difference between an estimated layer and its ground truth. Our results showed lower RMSEs than those of the others. Comparing visual quality, our separation results looked almost identical to their ground-truth layers. On the other hand, the results of [4], [6], and [22] exhibited weak edges from different layers, and unnatural discontinuities were observed in the results of [11].

For the synthetic example in Fig. 13, we tested a modified multiscale scheme by performing Step 3 (Section 4.1.3) only at the finest scale, while initializing the optimization with up-sampled alpha mattes from Step 2 (Section 4.1.2) performed at the coarsest scale. In Step 2, we compute an initial alpha matte for each input image (denoted by α̂_i) based on (4) by setting θ to the value obtained from belief propagation at the coarsest scale, and φ1 − φ⊥ to the value from Step 1 (Section 4.1.1) performed at the coarsest scale. Then, the value of α̂_i is up-sampled to the finest scale. Optimization in Step 3 is performed only at the finest scale, after initializing α_i^0 at this scale with the value of α̂_i, and (L_R, L_B) at this scale with the up-sampled

1. http://www.wisdom.weizmann.ac.il/ evina/papers/reflections.zip; http://visl.technion.ac.il/bron/spica/.


Fig. 14. Results for a synthetic example. (a) Our results, (b) results of [4], (c) results of [11], (d) results of [6], (e) results of [22], (f) input images, (g) ground truth LB and LR from left to right. In (a)-(e), the first row shows the background layers, and the second row shows the reflection layers.

Fig. 15. α_i at the finest scale from Step 3 based on different reflection separation schemes. (a) α_i from Step 3 performed only at the finest scale by initializing the optimization with up-sampled alpha mattes, (b) α_i estimated with our multiscale scheme, (c) α_i estimated with our single-scale scheme.

Fig. 16. Results for a real example. (a) Our results, (b) results of [4], (c) results of [11], (d) results of [6], (e) results of [22], (f) Input images. In (a)-(e), the first row shows the background layers, and the second row shows the reflection layers.

values estimated by minimizing the objective function Σ_{i=1}^{3} Σ_x ‖2I_i(x) − α_i^0(x) L_R(x) − (1 − α_i^0(x)) L_B(x)‖². This objective function was derived from the data term in (26) by substituting α_i^0 for the variable α_i.

As we have discussed earlier, Step 2 generates alpha mattes with blocking artifacts due to belief propagation. If

we simply up-sample alpha mattes from belief propagation performed at the coarsest scale, the blocking artifacts will be propagated and accumulated to the finest scale. Fig. 15 illustrates the final alpha mattes from Step 3 based on different reflection separation schemes. As illustrated in Fig. 15a, the blocking artifacts in the alpha mattes still


Fig. 17. Results for a real example. (a) Our results, (b) results of [4], (c) results of [11], (d) results of [6], (e) results of [22], (f) input images. In (a)-(e), the first row shows the background layers, and the second row shows the reflection layers.

remained strong with the modified multiscale scheme, while they almost disappeared with our multiscale scheme, as shown in Fig. 15b. The alpha mattes estimated with our single-scale scheme (Fig. 15c) showed the most accurate values among the three, but its computation time was too long (a couple of hours). Therefore, we perform the physically-based refinement (Step 3) in a multiscale fashion to avoid the blocking artifacts efficiently.

Figs. 16, 17, and 18 show the results on real-world examples; more can be found in our supplemental materials, available online. Similar to the synthetic examples, our method consistently produced better results in terms of visual quality than those of the others. Although only very weak reflection was captured in all input images, as shown in Fig. 17, our method still achieved good reflection separation.
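For reference, the RMSE figure of merit used in the quantitative comparisons above is a direct implementation of the standard definition (the function name is ours):

```python
import numpy as np

def rmse(estimated, ground_truth):
    # Root-mean-square error between an estimated layer and its ground truth.
    diff = np.asarray(estimated, dtype=float) - np.asarray(ground_truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```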

Next, we compared the results of the image-based method in [7] with those of our physically-based method. Fig. 19 shows this comparison using the real example in Fig. 16. The method in [7] is based on a sparse gradient assumption, that is, a large image gradient comes from either the background layer or the reflection layer but not both, which may not always be satisfied in practice. Therefore, some of the image features were smoothed out by the method in [7], while they were relatively well recovered with our method.

Finally, we tested the robustness of our method to errors in the polarizer angle and the refractive index. Our results from the robustness tests showed very small variances in RMSEs and little visual difference compared to the ground-truth layers: we performed a sensitivity test to polarizer angle errors for two synthetic examples, one in Fig. 13 and the other in Fig. 14. The input images were

Fig. 18. Results for a real example. (a) Our results, (b) results of [4], (c) results of [11], (d) results of [6], (e) results of [22], (f) input images. In (a)-(e), the first row shows the background layers, and the second row shows the reflection layers. We employed gamma correction and contrast adjustment for better visual presentation.


Fig. 19. Comparison to the method in [7] for the real example in Fig. 16. (a), (b) Estimated L_B, (c), (d) estimated L_R.


Fig. 21. Failure example. (a)-(c) Input images, (d), (e) LB and LR estimated with our method. Our method failed to properly separate reflection around specular highlights of glossy surfaces in the background scene. Nevertheless, our method still produced reasonable LB .

Fig. 22. Failure example. (a)-(c) Input images, (d), (e) LB and LR estimated with our method. Lots of air bubbles are contained between the glass surface and a film which coats the surface. Since reflected and transmitted light is diffused by the air bubbles and the film affects the refractive index of the glass surface, our separated layers contain errors.

Fig. 20. Robustness tests for two synthetic examples (one in Fig. 13 and the other in Fig. 14). (a)-(c) Average RMSEs for each of L_B and L_R over the two examples with respect to an angular perturbation of φ2 (represented by ε2) and that of φ3 (represented by ε3), both ranging in [−10°, +10°], (d) results with the largest angular perturbations (ε2 = ε3 = +10°) for the example in Fig. 13, (e) average RMSEs for L_B and L_R over the two examples with respect to the refractive index ranging in [1.4, 1.6], (f) results with index 1.6 for the example in Fig. 13.

regenerated by adding angular perturbations ε2 and ε3 in [−10°, +10°] to φ2 and φ3, respectively. Figs. 20a, 20b, and 20c plot the average RMSEs of each layer over the two examples by varying the angular perturbations. Fig. 20d shows the estimated layers for the largest angular perturbations (ε2 = ε3 = +10°). For the same examples, we also performed a sensitivity test to an error in the refractive index. Fig. 20e plots the average RMSEs of each layer estimated by varying the refractive index from 1.4 to 1.6, where the input images were generated by setting the refractive index to 1.471. Fig. 20f shows the results for the refractive index with the largest error (1.6). These results support that our method is robust to small variations in the polarizer angles and the refractive index.
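The shape of such a sensitivity sweep can be sketched generically. Here `synthesize` and `separate` are hypothetical stand-ins (our names, not the paper's API) for the image-formation model of (9)-(11) and the full separation pipeline; the sweep re-synthesizes the inputs under each perturbation and records per-layer RMSEs.

```python
import numpy as np

def robustness_sweep(separate, synthesize, truth_B, truth_R, perturbations):
    """Sketch of the sensitivity test: for each perturbation eps,
    re-synthesize the polarized inputs with perturbed parameters and
    record the RMSE of each separated layer against ground truth.
    `separate` and `synthesize` are hypothetical stand-ins."""
    def rmse(a, b):
        return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))
    results = []
    for eps in perturbations:
        inputs = synthesize(eps)          # inputs under the perturbed setting
        est_B, est_R = separate(inputs)
        results.append((eps, rmse(est_B, truth_B), rmse(est_R, truth_R)))
    return results
```

Plotting the returned (eps, RMSE_B, RMSE_R) triples gives curves of the kind shown in Figs. 20a-20c.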

6 DISCUSSIONS AND CONCLUSIONS

We have proposed a reflection separation method based on physical properties of polarization. Given a series of three polarized images, each captured with a different polarizer angle separated by 45 degrees, our method produces

high-quality separation of the reflection and background layers. We have derived a physically-based reflection model to estimate the spatially varying mixing coefficients (that is, the alpha mattes) of the two layers for an input image. On top of this model, we have proposed a multiscale scheme for reflection separation. Our scheme works fully automatically in a hierarchical manner.

In the remainder of this paper, we discuss limitations of our approach as well as future work. Our reflection model assumes that the incoming light to the glass medium is unpolarized (see Section 3). When there are polarized specular highlights in the background or reflection layer, our result contains errors, as shown in Fig. 21. Our approach also cannot model the effect of contaminants on the glass surface such as dust, water droplets, cracks, or air bubbles. When the reflected or transmitted light is diffused by such contaminants, the physical equations for polarization may no longer hold. Fig. 22 shows a failure example in which the glass surface contains many air bubbles. Finally, our approach assumes that the camera and the captured scene are static. However, as discussed in Section 2, moving objects provide an alternative hint for reflection separation. As future work, we will study how to incorporate the additional information from moving objects into our approach to handle dynamic scenes.

ACKNOWLEDGMENTS This research was supported by the MCST and KOCCA in the Culture Technology (CT) Research & Development Program 2012 (R2010050008_00000003), and the National Research Foundation (NRF) of Korea (No. 2012-0003359).

REFERENCES
[1] E. Hecht, Optics, fourth ed. Pearson Education, 2002.
[2] Y.Y. Chuang, B. Curless, D.H. Salesin, and R. Szeliski, "A Bayesian Approach to Digital Matting," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR '01), 2001.
[3] N. Kong, Y.-W. Tai, and S.Y. Shin, "A Physically-Based Approach to Reflection Separation," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR '12), 2012.
[4] N. Ohnishi, K. Kumaki, T. Yamamura, and T. Tanaka, "Separating Real and Virtual Objects from Their Overlapping Images," Proc. European Conf. Computer Vision (ECCV '96), vol. 1065, pp. 636-646, 1996.
[5] H. Farid and E. Adelson, "Separating Reflections from Images by Use of Independent Components Analysis," J. Optical Soc. Am., vol. 16, pp. 2136-2145, 1999.
[6] A.M. Bronstein, M.M. Bronstein, M. Zibulevsky, and Y.Y. Zeevi, "Sparse ICA for Blind Separation of Transmitted and Reflected Images," Int'l J. Imaging Systems and Technology, vol. 15, no. 1, pp. 84-91, 2005.
[7] N. Kong, Y.-W. Tai, and S.Y. Shin, "High-Quality Reflection Separation Using Polarized Images," IEEE Trans. Image Processing, vol. 20, no. 12, pp. 3393-3405, Dec. 2011.
[8] B. Sarel and M. Irani, "Separating Transparent Layers through Layer Information Exchange," Proc. European Conf. Computer Vision (ECCV '04), vol. 4, pp. 328-341, 2004.
[9] Q. Yan, E.E. Kuruoglu, X. Yang, Y. Xu, and K. Kayabol, "Separating Reflections from a Single Image Using Spatial Smoothness and Structure Information," Proc. Ninth Int'l Conf. Latent Variable Analysis and Signal Separation (LVA/ICA '10), pp. 637-644, 2010.
[10] A. Levin and Y. Weiss, "User Assisted Separation of Reflections from a Single Image Using a Sparsity Prior," Proc. European Conf. Computer Vision (ECCV), vol. 3021, pp. 602-613, 2004.
[11] A. Levin and Y. Weiss, "User Assisted Separation of Reflections from a Single Image Using a Sparsity Prior," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 9, pp. 1647-1654, Sept. 2007.
[12] A. Levin, A. Zomet, and Y. Weiss, "Separating Reflections from a Single Image Using Local Features," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR '04), 2004.
[13] A. Agrawal, R. Raskar, S. Nayar, and Y. Li, "Removing Photography Artifacts Using Gradient Projection and Flash-Exposure Sampling," ACM Trans. Graphics, vol. 24, pp. 828-835, July 2005.
[14] A. Agrawal, R. Raskar, and R. Chellappa, "Edge Suppression by Gradient Field Transformation Using Cross-Projection Tensors," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR '06), vol. 2, pp. 2301-2308, 2006.
[15] M. Irani, B. Rousso, and S. Peleg, "Computing Occluding and Transparent Motions," Int'l J. Computer Vision, vol. 12, no. 1, pp. 5-16, Feb. 1994.
[16] R. Szeliski, S. Avidan, and P. Anandan, "Layer Extraction from Multiple Images Containing Reflections and Transparency," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR '00), 2000.
[17] K. Gai, Z. Shi, and C. Zhang, "Blindly Separating Mixtures of Multiple Layers with Spatial Shifts," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR '08), pp. 1-8, 2008.
[18] K. Gai, Z.W. Shi, and C.S. Zhang, "Blind Separation of Superimposed Images with Unknown Motions," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR '09), pp. 1881-1888, 2009.
[19] Y.Y. Schechner, N. Kiryati, and J. Shamir, "Blind Recovery of Transparent and Semireflected Scenes," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR '00), vol. 1, pp. 38-43, 2000.
[20] Y. Tsin, S. Kang, and R. Szeliski, "Stereo Matching with Reflections and Translucency," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR '03), pp. 702-709, 2003.
[21] Y.Y. Schechner, J. Shamir, and N. Kiryati, "Polarization-Based Decorrelation of Transparent Layers: The Inclination Angle of an Invisible Surface," Proc. IEEE Int'l Conf. Computer Vision (ICCV '99), pp. 814-819, 1999.
[22] Y.Y. Schechner, J. Shamir, and N. Kiryati, "Polarization and Statistical Analysis of Scenes Containing a Semireflector," J. Optical Soc. Am., vol. 17, no. 2, pp. 276-284, Feb. 2000.
[23] L.B. Wolff, "Polarization Camera for Computer Vision with a Beam Splitter," J. Optical Soc. Am., vol. 11, no. 11, pp. 2935-2945, Nov. 1994.
[24] N. Kong, "Physically-Based Reflection Separation Using Polarized Images," PhD dissertation, KAIST, 2012.


[25] P.F. Felzenszwalb and D.P. Huttenlocher, "Efficient Belief Propagation for Early Vision," Int'l J. Computer Vision, vol. 70, no. 1, pp. 41-54, Oct. 2006.

Naejin Kong received the BS degree in computer science from Sogang University, Seoul, Korea, in 2005, and the PhD degree in computer science from the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, in 2012. He is currently a postdoctoral researcher at the Max Planck Institute (MPI) for Intelligent Systems in Tübingen, Germany. His research interests include computer vision, image processing, computational photography, and computer graphics.

Yu-Wing Tai received the BEng (first class honors) and MS degrees in computer science from the Hong Kong University of Science and Technology (HKUST) in 2003 and 2005, respectively, and the PhD degree from the National University of Singapore (NUS) in June 2009. He joined the Korea Advanced Institute of Science and Technology (KAIST) as an assistant professor in fall 2009. He regularly serves on the program committees for the major computer vision conferences (ICCV, CVPR, and ECCV). His research interests include computer vision and image/video processing. He is a member of the IEEE.

Joseph S. Shin (formerly Sung Yong Shin) received the BS degree in industrial engineering from Hanyang University, Seoul, in 1970 and the MS and PhD degrees in industrial engineering from the University of Michigan in 1983 and 1986, respectively. Since 1987, he has been with the Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, where he is currently a professor of computer graphics and computational geometry. He also leads a computer graphics research group that has been nominated as a national research laboratory by the Government of Korea. His recent research interests include data-driven computer animation and geometric algorithms. He is a member of the IEEE.

