
Multiresolution Imaging

Xiaoqiang Lu and Xuelong Li, Fellow, IEEE

Abstract—Imaging resolution is a core parameter in many vision applications. High resolutions are desirable or essential in many of them, e.g., most remote sensing systems, and much work has therefore been devoted to obtaining a higher-resolution image from one or a series of images of relatively lower resolution. On the other hand, lower resolutions are preferred in some cases, e.g., for displaying images on a very small screen or interface. Accordingly, algorithms for image upsampling and downsampling have been proposed. In these algorithms, downsampled or upsampled (super-resolution) versions of the original image are often taken as test images to evaluate performance. However, one important question is left unanswered: can the downsampled or upsampled versions of an original image represent the low-resolution or high-resolution images actually produced by a camera? To tackle this question, the following work is carried out: 1) a multiresolution camera is designed to simultaneously capture images at three different resolutions; 2) at a given resolution (i.e., image size), the relationship between a pair of images is studied, one obtained via downsampling or super-resolution and the other captured directly at that resolution by an imaging device; and 3) the performance of super-resolution and image downsampling algorithms is evaluated using the resulting image pairs. The key reason these issues can be tackled effectively is that the designed multiresolution camera provides real images at different resolutions, which builds a solid foundation for evaluating various algorithms and for analyzing images across resolutions, both of which are important for vision.

Index Terms—Multiresolution camera, multiresolution imaging, optimal imagery, super-resolution, downsampling.

I. Introduction

Since the 1970s, imaging sensors such as the charge-coupled device (CCD) and CMOS sensors have been widely used to obtain digital images in many applications, and achieving a high-resolution (HR) imaging system by exploiting different techniques is an important goal. High resolutions are desirable or essential for many applications, e.g., remote sensing systems, and much has been done to achieve higher-resolution imaging based on one or a series of images of relatively lower resolution.


In order to obtain an HR image, the traditional diffraction limit must be overcome by special methods [1]; for example, evanescent components can be conveyed to the far field to achieve high-resolution far-field focalization and imaging [1]. The most direct way to increase imaging resolution is to improve the resolution of the imaging unit. In hardware, this means transcending the aforementioned limit by reducing the unit pixel size or increasing the number of pixels per unit area. For example, SPOT5 captures two images concurrently using a double CCD array interleaved with one another; the spatial offset of the two images is fixed at half a pixel in both directions, and the spatial resolution of SPOT5 is between 2.5 m and 5 m. However, improving a sensor's spatial resolution is difficult because of the limit on sensor array density, and the hardware approach has several drawbacks. First, reducing the sensor element size increases the array density and captures more scene detail, but it also increases the cost of the camera. Second, when the pixel size decreases to its limit, the light intensity received by each pixel declines, and the images obtained from the sensor are seriously contaminated by noise. Finally, the resolution of a digital camera suffers from low yields and is plagued by signal-to-noise ratio limitations due to the shifting and detection of analog charge packets. Thus, the cost of an imaging system for capturing HR images becomes prohibitively high, and hardware alone is a limited way to improve image resolution.

Image super-resolution (SR) is the technique generally exploited to overcome the limitations of optical imaging systems by means of image processing algorithms. Single-image SR is relatively inexpensive to implement and can be used in any situation where the hardware approach is too expensive. The task of SR is to reconstruct an HR image from one or more low-quality images of the same scene, i.e., to obtain HR images from low-resolution (LR) images produced by a less expensive imaging system. Although example-based methods have been reported to be more effective than other state-of-the-art methods in recent years, they must learn the correspondence between LR and HR images, and it is difficult for example-based methods to obtain true pairs of LR images and corresponding HR versions with a real camera. In most cases, HR images are blurred and subsampled to construct LR counterparts, which are regarded as simulated test images; the reconstructed HR image is then obtained by magnifying the test images with the SR technique.



Fig. 1. Two corresponding LR images generated by different measures. (a) LR image captured by a real camera. (b) Simulated image obtained by downsampling.

However, it is questionable whether such simulated test images truly represent the LR images produced by a real camera. If not, the quality of SR reconstruction can hardly be evaluated using simulated images, which do not describe the real scene. Although the demand for HR images has increased, LR images are still preferred in some cases, e.g., for display on a very small screen or interface. In fact, LR images give users the ability to scan quickly in image retrieval and browsing systems. With the proliferation of digital images in multimedia technology, it is necessary to share and exchange information on low-resolution mobile devices, and LR images are expected to be adapted to low-resolution displays [2]. The general way to obtain an LR image is therefore to blur and subsample the HR version according to a model of the image degradation process, which is called downsampling. The degradation model can be described as

Y = DBMX + n    (1)

where Y is the measured data (a single image or a collection of images), X is the unknown HR image or images, D denotes the sampling effect of the image sensor, B represents the blurring operation due to the optical point spread function, M is the geometric warp operation capturing image motion, and n is random noise.
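For concreteness, the degradation model (1) can be sketched in Python as follows. This is a minimal illustration, not the exact pipeline of any particular SR method: the warp M is omitted (a static scene is assumed), B is modeled as a Gaussian point spread function, and the blur width, decimation factor, and noise level are assumed values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(hr, factor=2, blur_sigma=1.0, noise_std=2.0):
    """Simulate an LR observation Y = DBX + n from an HR image X.

    The warp M of (1) is omitted (static scene). B is modeled as a
    Gaussian point spread function and D as integer decimation; the
    kernel width and noise level are illustrative assumptions.
    """
    blurred = gaussian_filter(hr.astype(np.float64), sigma=blur_sigma)   # B: optical blur
    decimated = blurred[::factor, ::factor]                              # D: sensor sampling
    noisy = decimated + np.random.normal(0.0, noise_std, decimated.shape)  # n: additive noise
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Example: a 1280x1024 HR frame degraded to 640x512 and 320x256.
hr = np.random.randint(0, 256, (1024, 1280), dtype=np.uint8)  # stand-in for a real capture
lr2 = degrade(hr, factor=2)   # 512x640
lr4 = degrade(hr, factor=4)   # 256x320
```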

However, the LR images generated by a real camera differ from LR images produced by the degradation model in (1), so it is difficult to describe their relationship using this model. To illustrate this point, two images generated by different measures are shown in Fig. 1: the LR image produced by the degradation model contains jagged edges and blurred texture compared with the one captured by the real camera. Recently, many researchers have adopted image resizing techniques that scale and crop the HR image to obtain the corresponding LR one, which is regarded as image downsampling. This raises the question of whether there is any consistency between images captured by a camera and the corresponding simulated images produced by downsampling.

SR and image downsampling are two important tasks that adaptively magnify or reduce image resolution for optimal display under different conditions. For both tasks, one question is left unanswered: can the downsampled or upsampled versions of an original image represent the true LR or HR image from a camera? This question is difficult to answer. The main reason is that the LR or HR images used to evaluate the performance of algorithms or to investigate intrinsic links do not represent the true corresponding images; they may only represent downsampled or upsampled versions of the original image. To answer this question, it is necessary to capture the HR and LR counterparts of a scene simultaneously.

In fact, it can be shown that simulated images generated by downsampling HR images differ from the LR images of a real camera. To illustrate this point, an HR image and two corresponding LR images generated by different measures are shown in Fig. 2. As can be seen, there is a large discrepancy, under subjective evaluation, between the simulated downsampled images and the true image from the LR camera. Mathematically, the downsampling operation is equivalent to filtering the HR image with a low-pass filter to produce a simulated LR image; hence, some fine image structures are lost and the edges of the simulated image are blurred compared with the true images. To further confirm this observation, we compared the profiles along the central lines of the images in the horizontal and vertical directions, as shown in Fig. 3. The true LR images exhibit sudden intensity changes, consistent with the true HR image, that the simulated image lacks; these sudden changes show that high-frequency content present in the true LR image is missing from the simulated one. Taking the simulated LR image as the test image is therefore not a proper practice for SR, and it makes the quality of images reconstructed by SR or image downsampling difficult to evaluate.

The usual quantitative criteria, peak signal-to-noise ratio (PSNR) and mean squared error, are exploited to evaluate reconstruction results. In most cases, an HR or LR image generated by upsampling or downsampling is chosen for comparison to verify the effectiveness of SR or image resizing methods. However, these criteria are reliable only if an original image of the true scene is available, so results should be evaluated against original images captured by a high-end or low-cost camera. We point out that true LR images are not generated by downsampling the corresponding HR images, as is done in most SR and resizing methods; similarly, true HR images are not constructed by upsampling the corresponding LR images. Thus, it is important to produce true LR images and corresponding HR images in order to analyze the intrinsic difference between simulated images generated by upsampling or downsampling and the true images.

Recently, Gajjar and Joshi [3] constructed a training database composed of HR and LR images captured using different resolution settings of a camera to reduce the error in SR estimation. They stated that the proposed method depends on a training database and assumed that each LR image and its corresponding HR image are collected under the same lighting condition [3].
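The central-line comparison of Fig. 3 corresponds to a simple computation, sketched below; the arrays stand in for a registered pair consisting of a true LR capture and its simulated counterpart (random placeholders are used here).

```python
import numpy as np

def center_profiles(image):
    """Return the intensity profiles along the central row and column."""
    h, w = image.shape
    return image[h // 2, :], image[:, w // 2]

# true_lr and simulated_lr are assumed to be registered 320x256 grayscale
# arrays (a real capture and a downsampled HR image of the same scene).
true_lr = np.random.randint(0, 256, (256, 320))
simulated_lr = np.random.randint(0, 256, (256, 320))

row_true, col_true = center_profiles(true_lr)
row_sim, col_sim = center_profiles(simulated_lr)

# Sharper scene edges show up as larger step changes in the true profile.
print("max horizontal step (true):     ", np.abs(np.diff(row_true)).max())
print("max horizontal step (simulated):", np.abs(np.diff(row_sim)).max())
```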


Fig. 2. HR image and two corresponding LR images generated by different measures. (a) 1280 × 1024 true image. (b) 1280 × 1024 simulated image using upsampling. (c) 640 × 512 true image. (d) 640 × 512 simulated image using downsampling. (e) 320 × 256 true image. (f) 320 × 256 simulated image using downsampling.

Fig. 3. Top row: image profiles along the centers of the 1280 × 1024 simulated image (red line) and the true image from a real camera (blue line) in the horizontal and vertical directions; the display gray scale is [0, 255]. Middle row: the same profiles for the 640 × 512 simulated and true images. Bottom row: the same profiles for the 320 × 256 simulated and true images.

However, LR and HR images captured by a camera at different times are subject to spatial variations in the lighting spectra, and the scene itself may change over time. An optical image is often affected by spatial variations in the lighting spectra, and the spectrum of the illuminance throughout the scene is not uniform. Moreover, the captured LR and HR images cannot represent the same scene when moving objects, such as people, are present. This means that the LR and HR images must be collected from the same scene under the same lighting condition. The drawback of [3] is that the LR image and the HR image cannot be collected at the same time, so changes in lighting cannot be prevented; the mapping between an LR image and its HR version is then affected by spatial variations in the lighting spectra, which are random and difficult to determine. Thus, the camera in [3] cannot effectively capture the true correspondence between LR and HR images.

Researchers have adopted several methods to obtain images of different resolutions using different imaging hardware. Belay et al. [4] designed a three-channel multiresolution smart imaging system in which different image processing methods can be applied at different segments of the image sensor, and multispectral cameras and multispectral displays have been proposed to meet various demands in image processing. However, these imaging systems are not robust to changes in light intensity. To tackle this drawback, a new device called the multiresolution camera is reported in this paper; it ensures that true images of different resolutions are collected from the same scene.


Fig. 4. (a) and (b) Images obtained by a Unique 930 camera at different times. (c) and (d) Image profiles along the centers of images (a) and (b) in the horizontal and vertical directions. (e) Absolute value of the difference between images (a) and (b).

TABLE I
Mirror Parameters of the Multiresolution Imaging System

Fig. 5. Optical principle of the flat mirror M.

In this paper, we present the design details of a new imaging device, the multiresolution camera. The camera adopts a three-path spectrophotometric optical system and obtains images of different resolutions under the same lighting condition. It exploits spectroscopic technology so that each optical path receives the same intensity distribution from the surrounding illumination. The different-resolution images generated by the multiresolution camera can therefore be taken as test images to precisely evaluate the performance of super-resolution algorithms.

The rest of this paper is organized as follows. Section II describes the principle of the multiresolution camera. Our experimental results are reported in Section III. Finally, Section IV concludes this paper.

II. Multiresolution Camera

Obtaining true LR images and their corresponding HR versions is challenging because of the construction principles of ordinary cameras: images of different resolutions are acquired under different spatial variations of lighting, since an ordinary camera cannot capture the same scene at the same time. Two measures are commonly adopted to generate pairs of true


LR images and corresponding HR versions. In the first, one starts from a collection of HR images and degrades each of them to obtain LR images: typically, an HR image is blurred and subsampled according to the degradation model. However, SR reconstruction is a computationally complex and numerically ill-posed problem, and there is often a one-to-many correspondence between an LR image and its possible HR counterparts, so it is difficult to describe their relationship with the degradation model. In the second, cameras of different resolutions are used to photograph the same scene at different times. This measure also has drawbacks. First, there is (small) relative motion between the cameras at different times, which shifts pixels from one image to the others. Moreover, images taken by the same camera at different times differ because of changes in the lighting conditions. To illustrate this, we used a Unique 930 camera to take two frames at a 0.03-s interval, shown in Fig. 4(a) and (b). As shown in Fig. 4(c) and (d), the difference between images of the same resolution generated


Fig. 6. Three-path optical system.

Fig. 8. Different-resolution images captured by the multiresolution camera under the same light intensity. (a) 1280 × 1024 true image. (b) 640 × 512 true image. (c) 320 × 256 true image.

Fig. 7. Multiresolution camera.

by the same camera is obvious; the difference between Fig. 4(a) and (b) is highlighted by the red circle in Fig. 4(e). This is mainly because changes in lighting intensity alter the illumination of the image. Therefore, it is necessary to consider the relationship between an LR image and the corresponding HR image under the same light intensity. Currently, to capture images of different resolutions, one can only shoot the same scene from the same location using cameras of different resolutions. The frames obtained this way cannot represent true different-resolution images because of the varying lighting intensity; they are only approximate versions of the true images. It is crucial for SR methods to obtain true LR/HR image pairs for comparison: with true pairs, the intrinsic links between images of different resolutions can be investigated precisely. Traditionally, the relationships between different images obtained by an ordinary camera are affected by changes in light intensity or by motion of the subject. To overcome this problem and obtain true different-resolution image pairs, we designed a multiresolution camera imaging system, which captures the different

resolution images of a scene simultaneously rather than at different times. The multiresolution imaging system can play an essential role in answering some fundamental questions about the relationships between images of different resolutions.

A. Optical Configuration of the Multiresolution Camera System

The optical design of the multiresolution imaging system is driven by several constraints. The camera must observe different-resolution images of the same scene at the same time. This implies, first, that a spectrophotometric optical system is needed to capture different-resolution images simultaneously; as shown in Fig. 5, the flat mirror M can be exploited to split the visible light. Second, suitable transmittance and reflectance must be chosen for the flat mirror so that the same illuminance is captured on the different detectors; otherwise, images of the same scene would differ because of spatial variations in the lighting spectra. Finally, detectors of different resolutions are needed to observe the characteristic features of different-resolution images. The configuration obtained by applying these considerations is shown in Fig. 6: the designed multiresolution camera adopts a three-path spectrophotometric optical system consisting of a coaxial section of a three-mirror concentric system. The parameters


Fig. 9. Images of the same resolution generated by different methods. (a) 1280 × 1024 CAR image captured by the multiresolution camera, regarded as the HR image. (b) 640 × 512 CAR image captured by the multiresolution camera, regarded as the middle-resolution image. (c) 320 × 256 CAR image captured by the multiresolution camera, regarded as the LR image. (d) Image expanded from the LR version in (c) using the learning-based method. (e) Image obtained by downsampling the HR image in (a).

Fig. 10. Histograms of different images at a given resolution. (a) Histogram of Fig. 9(b). (b) Histogram of Fig. 9(d). (c) Histogram of Fig. 9(e).

of all the mirrors are listed in Table I. The design principle of the multiresolution camera system is as follows. First, as shown in Fig. 6, the light coming from the left is split into three components by the spectroscopes L1, L2, and L3. A ray of light is reflected by L1 and reaches the CCD1 detector, while the remaining light passes through toward L2. Of the light reaching L2, a portion is reflected and captured on the CCD2 detector; similarly, the remaining light reaches the CCD3 detector. In this way, different images of the same scene are captured by the multiresolution camera system.

To make the captured images robust to changes in light intensity or to motion of the subject, the illuminance of the light captured on the three CCDs must be equal. When the pixel size of the image is larger than one pixel of the sensor, the illuminance of the scene image captured on a detector can be formulated as

Ei = (πB/4) · (D/fi)² · k1 k2 × 10⁴    (2)

where Ei denotes the illuminance of the scene image on the ith CCD detector, B represents the luminance of the scene, fi is the focal length of the ith optical path, D/fi denotes the relative aperture of the ith optical system, D denotes the diameter of the optical system, and k1 and k2 denote the transmittance of the optical system and of the atmosphere, respectively. When the light reflected within the scene image is ignored, the scene luminance is

B = ρE / (10⁴ π)    (3)

where ρ denotes the reflection ratio of the object and E denotes the illumination on the object. According to (2), when the luminance of an object is given, the illuminance of the scene image captured on a CCD depends on D²/fi² and k1. Thus, for the illuminance on the three CCDs to be equal, (4) is obtained by analyzing (2):

τ1 (D1/f1)² = τ2 (D2/f2)² = τ3 (D3/f3)²    (4)
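As a sketch of how (2)–(4) fit together, the following Python snippet evaluates the detector illuminance for each path and checks the balance condition; every numerical value below is an illustrative assumption, not a design value of the system.

```python
import math

def scene_luminance(rho, E):
    """Scene luminance from (3): B = rho * E / (1e4 * pi)."""
    return rho * E / (1e4 * math.pi)

def detector_illuminance(tau, B, rel_aperture, k1=0.9, k2=0.8):
    """Illuminance on one detector from (2), weighted by the path
    transmittance tau of (4). k1, k2 are assumed transmittances of
    the optics and the atmosphere."""
    return tau * (math.pi * B / 4.0) * rel_aperture ** 2 * k1 * k2 * 1e4

B = scene_luminance(rho=0.3, E=500.0)   # assumed object reflectance and illumination
taus = [0.21, 0.21, 0.57]               # assumed effective path transmittances
apertures = [0.71, 0.71, 0.43]          # assumed relative apertures D_i / f_i

# Condition (4): tau_i * (D_i/f_i)^2 equal across paths gives (approximately)
# equal illuminance E_i on the three detectors.
for tau, a in zip(taus, apertures):
    print(round(detector_illuminance(tau, B, a), 3))
```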


where D1/f1, D2/f2, and D3/f3 denote the three relative apertures of the multiresolution camera system, and τ1, τ2, and τ3 denote the transmittances of the three spectroscopes L1, L2, and L3 in Fig. 6, respectively. When the relative apertures of the imaging system are given, the illuminance of the image captured on the three CCD detectors can be made equal by choosing suitable parameters τ1, τ2, and τ3.

B. Design of the Multiresolution Camera System

In this section, we discuss the design of the multiresolution camera system, which is based on the three-path light-splitting system illustrated in Fig. 6. Five components are considered: focal length, selection of CCD detectors, field of view, relative aperture, and overall design parameters.

1) Focal Length: The focal lengths must be chosen suitably. On the one hand, the focal length of the multiresolution camera system cannot be too large for practical applications. On the other hand, with the detectors fixed, a small focal length makes the field of view very large and increases the design cost, so the focal length cannot be too small either. Hence, the focal lengths of the three-path optical system are set to 12.5, 25, and 50 mm.

2) Selection of CCD Detectors: Three CCD detectors with the same configuration, MVC1000MF USB, are chosen, because CCD detectors provide good overall image quality at low cost. For each sensor, the pixel size is 5.2 μm × 5.2 μm and the pixel circumcircle diameter is 7.35 μm.

3) Field of View: The field of view (FOV) can be formulated as

ω = arctan(T / (2f))    (5)

where ω denotes half of the field of view, T denotes the size of the target surface of the camera, and f represents the focal length. The multiresolution imaging system is mainly designed to observe features of an object at different resolutions, and even the faint structure of the object should remain visible in the low-resolution case; a rather large FOV is therefore necessary. As a consequence, the fields of view are set to 28.8° × 28.8°, 14.7° × 12°, and 7.33° × 5.5°, respectively.

4) Relative Aperture: To make the image resolution captured on the CCD detector independent of the lens, the diameter of the diffraction spot must be less than the circumcircle diameter of each pixel. Hence, according to the diffraction theorem, the relative aperture satisfies

F ≤ d / (2.44λ) = 0.00735 / (2.44 × 0.00055) ≈ 5.5    (6)

where d denotes the circumcircle diameter of each pixel, 7.35 μm. In our multiresolution camera system, the relative apertures of the three paths are set to 0.7143, 0.7143, and 0.4348, respectively.
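Formulas (5) and (6) can be evaluated directly, as in the sketch below; the 6.66-mm target width used for the FOV example is an assumed value, while the 7.35-μm circumcircle and 0.55-μm wavelength reproduce the bound F ≤ 5.5 quoted above.

```python
import math

def half_fov_deg(target_size_mm, focal_mm):
    """Half field of view from (5): omega = arctan(T / (2f))."""
    return math.degrees(math.atan(target_size_mm / (2.0 * focal_mm)))

def max_f_number(pixel_circumcircle_mm=0.00735, wavelength_mm=0.00055):
    """Diffraction bound from (6): F <= d / (2.44 * lambda)."""
    return pixel_circumcircle_mm / (2.44 * wavelength_mm)

print(round(max_f_number(), 2))   # ~5.48, i.e., the F <= 5.5 bound in the text
for f in (12.5, 25.0, 50.0):
    # 6.66 mm is an assumed sensor target width, not a value from the paper.
    print(f, round(2 * half_fov_deg(6.66, f), 1), "deg full FOV")
```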

Fig. 11. Five of the 50 groups of different-resolution images generated by the multiresolution camera. (a) HR images. (b) Middle-resolution images. (c) LR images.

5) Design Parameters of the Camera System: The requirements for the camera design are defined as follows: the focal lengths of the three light-splitting systems are set to 12.5, 25, and 50 mm, respectively; the CCD target size is 5.2 mm × 5.2 mm; the object spans 960 × 1280 pixels; and the distance between the object and the optical lens is 10 m. When the distance from the object to the lens and the size of the object are given, the size of the image is proportional to the focal length according to the mirror formula. Hence, with the CCD size, the object size, and the object distance fixed, the image resolution can be adjusted by choosing the proper focal length, and different resolutions are obtained by setting different focal lengths, as sketched below. The main parameters of the multiresolution camera system are as follows: the total length of the system (to the detectors) is 180 mm, the caliber of the lens cone is 100 mm, and the size of each of the three spectroscopes is 110 mm × 64 mm.
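A quick similar-triangles check of the claim that image size is proportional to focal length, for an object 10 m away; the 1.28-m object width is an assumed value chosen only to make the pixel counts easy to compare.

```python
def image_width_mm(object_width_m, distance_m, focal_mm):
    """Approximate image-plane width via similar triangles:
    image/object ~= f / (L - f) for an object at distance L."""
    focal_m = focal_mm / 1000.0
    magnification = focal_m / (distance_m - focal_m)
    return object_width_m * magnification * 1000.0

PIXEL_MM = 0.0052  # 5.2 um pixel pitch
for focal in (12.5, 25.0, 50.0):
    width = image_width_mm(1.28, 10.0, focal)  # assumed 1.28-m-wide object
    print(focal, "mm ->", round(width, 2), "mm ->", int(width / PIXEL_MM), "pixels")
# The image width doubles with each doubling of focal length, which is why
# the three paths yield resolutions in the ratio 1:2:4 per dimension.
```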


Fig. 12. Intensity histograms of different-resolution images generated by the multiresolution camera. (a) Averaged over 50 HR images. (b) Averaged over 50 middle-resolution images. (c) Averaged over 50 LR images.

Fig. 13. Average intensity histograms of different-resolution images obtained by downsampling. (a) Averaged over 50 HR images. (b) Averaged over 50 middle-resolution simulated images. (c) Averaged over 50 LR simulated images.

Fig. 14. Average logarithmic density of image gradients for the 50 groups of different-resolution images generated by the multiresolution camera. (a) Averaged over 50 HR images. (b) Averaged over 50 middle-resolution images. (c) Averaged over 50 LR images.

When the focal lengths of the three-path optical system are set to 12.5, 25, and 50 mm, respectively, the maximum relative apertures of the three paths are 0.7143, 0.7143, and 0.4348, respectively. According to (4), the spectroscopic transmittances can then be chosen such that the illuminance of the images captured on the three CCD detectors is equal.

III. Experimental Results

In this section, we address two problems using the proposed multiresolution camera: 1) are the downsampled or upsampled versions of an original image consistent with the real LR or HR image from a camera? and 2) how can the performance of super-resolution algorithms be evaluated precisely? To tackle these two problems, we conduct three sets of studies.

a) In the first set of studies, a scene is captured at three different resolutions by the multiresolution camera. b) In the second set, the relationship between two images of the same resolution is studied: one is obtained via downsampling or super-resolution, and the other is captured directly at that resolution. c) In the third set, the performance of super-resolution algorithms is investigated.

A. Experimental Setting

The multiresolution camera used in our experiments is shown in Fig. 7, including the spectroscopes L1, L2, and L3 and three CCD detectors of the same configuration (MVC1000MF USB). The imaging


parameters of the multiresolution camera system are defined as follows. 1) The spectroscopic transmittances of the spectroscopes L1, L2, and L3 are τ1, τ2, and τ3 = 0, respectively. 2) The focal lengths of the three optical paths are set to 12.5, 25, and 50 mm, respectively. 3) The distance from the object to the lens is 10 m. 4) The pixel size of the detectors is 5.2 μm × 5.2 μm. For the 12.5-, 25-, and 50-mm focal lengths, the maximum relative apertures are 0.7143, 0.7143, and 0.4348, respectively. According to (4), setting the spectroscopic transmittances to τ1 = 0.7872 and τ2 = 0.7297 makes the illuminance of the images captured on the three CCD detectors the same.

Alignment of the different-resolution images is critical for exposing the differences among the three types of images. Capturing registered images requires precise positioning of the optical components so that there is no subpixel shift between the captured images. To prevent subpixel shifts, an alignment system is provided to align the optical axes of the multiresolution system. The alignment process has three steps. First, a collimator provides an infinite target for the alignment system, and an off-axis parabolic total-reflection optical system is exploited to test the alignment of the optical axis. Second, a beam splitter provides a pointing reference for the visible light spot. Finally, a Gauss ocular with a temperature-controlled crosshair offers a pointing reference for other visible light. In this way, the optical axes of the camera lenses and the CCDs are designed to lie along a straight line.

According to the object-image formula and the imaging parameters of the multiresolution camera system, the corresponding image-plane sizes are 1.6 mm × 1.2 mm, 3.2 mm × 2.4 mm, and 6.4 mm × 4.8 mm, and the corresponding sizes of the three images captured on the three CCDs are 320 × 256 pixels, 640 × 512 pixels, and 1280 × 1024 pixels, respectively. In this way, images of different resolutions are obtained from the multiresolution camera. Fig. 8(a)-(c) shows different-resolution images captured by the multiresolution camera from a real scene under the same light intensity; their sizes are 1280 × 1024, 640 × 512, and 320 × 256 pixels, respectively.
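The reported transmittances can be checked against balance condition (4). The sketch below assumes a cascade reading of Fig. 6, in which CCD1 receives the light reflected by L1, CCD2 the light transmitted by L1 and reflected by L2, and CCD3 the remainder; this interpretation is ours and is not stated explicitly in the text.

```python
# Effective fraction of the incoming light delivered to each detector,
# assuming a cascade of beam splitters (our interpretation of Fig. 6).
tau1, tau2, tau3 = 0.7872, 0.7297, 0.0
fractions = [
    1.0 - tau1,                  # reflected by L1 -> CCD1
    tau1 * (1.0 - tau2),         # through L1, reflected by L2 -> CCD2
    tau1 * tau2 * (1.0 - tau3),  # through L1 and L2 -> CCD3 (L3 fully reflective)
]

rel_apertures = [0.7143, 0.7143, 0.4348]  # maximum relative apertures D_i / f_i

# Condition (4): fraction_i * (D_i/f_i)^2 should be equal across the paths.
for frac, ap in zip(fractions, rel_apertures):
    print(round(frac * ap ** 2, 4))  # all three print ~0.1086
```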


Fig. 15. Kurtosis values at different frequencies for the 50 groups of different-resolution images.

B. Simulated Images and True Images at the Same Resolution

In this set of experiments, we mainly consider the differences among images of the same resolution generated by the learning-based SR method, the multiresolution camera, and downsampling. The images generated from a scene at the same resolution are obtained as follows. First, three different-resolution images of the same scene are taken by the multiresolution camera, with a resolution ratio of 16:4:1; these are regarded as the HR image, the middle-resolution version, and the LR version, as shown in Fig. 9(a)-(c). Second, using the learning-based SR method [2], the LR image in Fig. 9(c) is upsampled to a middle-resolution version, shown in Fig. 9(d). Third, the HR image is degraded to a middle-resolution version by downsampling, shown in Fig. 9(e). It can be seen in Fig. 9 that Fig. 9(b) has higher contrast and is clearer than Fig. 9(d) and (e). To further confirm this observation, Fig. 10 shows that the histogram distributions of the three same-resolution images differ, and that the effective width of the histogram of Fig. 9(b) exceeds those of Fig. 9(d) and (e). The different shapes of the histograms also show that same-resolution images generated by different measures are different.

We then established a dataset of 50 groups of images captured by the multiresolution camera, five of which are shown in Fig. 11. Each group consists of three different-resolution images of the same scene; the scenes include outdoor, indoor, dynamic, and static types. The average intensity histograms of the 50 groups are shown in Fig. 12, with intensities normalized to the range 0 to 31. Fig. 13 shows the corresponding average intensity histograms for the downsampled versions: the HR images, middle-resolution images generated from the HR images with a decimation factor of 2, and LR images generated with a decimation factor of 4. The average histogram distributions of the true LR images differ from those of the corresponding HR versions, whereas in Fig. 13 the difference between the average histograms of the simulated LR images and those of the HR images is small. This shows that the information in LR images of a real scene differs from that of the corresponding HR versions, and that a same-resolution image obtained by downsampling or upsampling differs from an image generated by a real camera. Hence, LR test images for evaluating SR methods should not simply be constructed by downsampling HR images.

The curves of the average logarithmic density of image gradients are shown in Fig. 14; each is peaked at zero and has heavy tails, a well-known property of natural images.
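The statistics of Figs. 12–14 correspond to simple computations, sketched below: the averaged intensity histogram uses 32 bins, as in the text, and the gradient log-density follows the same averaging scheme. The random arrays are placeholders for the 50 captured LR images and their downsampled counterparts.

```python
import numpy as np

def average_histogram(images, bins=32):
    """Average normalized intensity histogram over a set of images,
    with intensities quantized to `bins` levels (0..31 in the text)."""
    hists = []
    for img in images:
        levels = (img.astype(np.float64) / 256.0 * bins).astype(int)
        counts = np.bincount(levels.ravel(), minlength=bins)
        hists.append(counts / counts.sum())
    return np.mean(hists, axis=0)

def gradient_log_density(image, bins=101):
    """Log-density of horizontal image gradients (cf. Fig. 14)."""
    grad = np.diff(image.astype(np.float64), axis=1).ravel()
    counts, _ = np.histogram(grad, bins=bins, range=(-255, 255))
    return np.log(counts / counts.sum() + 1e-12)

# Placeholders for the 50 true LR captures and their simulated counterparts.
true_lr_set = [np.random.randint(0, 256, (256, 320)) for _ in range(50)]
simulated_lr_set = [np.random.randint(0, 256, (256, 320)) for _ in range(50)]
print(average_histogram(true_lr_set)[:5])
print(average_histogram(simulated_lr_set)[:5])
```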


Fig. 16. Comparison of reconstruction results on different-resolution images. (a) Randomly selected images captured by the multiresolution camera. (b) SR images produced by the learning-based method from the true LR image with upsampling factors of 2 and 4. (c) SR images produced by the learning-based method from the simulated LR image with upsampling factors of 2 and 4.

Fig. 17. Comparison of reconstruction results on different-resolution images. (a) Randomly selected images captured by the multiresolution camera. (b) SR images produced by the learning-based method from the true LR image with upsampling factors of 2 and 4. (c) SR images produced by the learning-based method from the simulated LR image with upsampling factors of 2 and 4.

Natural images are known to have scale-invariant statistics, and it has been reported that kurtosis values are lower for high-frequency filters than for low-frequency ones [5]. Fig. 15 shows the average kurtosis values of the 50 groups of different-resolution images. In Fig. 15, HR, MR, and LR denote the 1280 × 1024 high-resolution images, the 640 × 512 middle-resolution images, and the 320 × 256 LR images from the multiresolution system, respectively; simulated MR and simulated LR denote the 640 × 512 and 320 × 256 images obtained by downsampling. The kurtosis results of HR, MR, and LR are mutually consistent; moreover, compared with simulated MR and simulated LR, their low-frequency responses have higher peaks than the high-frequency ones, i.e., they are more kurtotic. Hence, HR, MR, and LR better satisfy the statistics of natural images than simulated MR and simulated LR do. That is, the downsampled versions of HR images cannot represent the real LR images of a camera.
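The kurtosis statistic of Fig. 15 can be approximated by filtering each image at different spatial frequencies and measuring the kurtosis of the responses. The sketch below uses difference-of-Gaussians bandpass filters as an assumed stand-in for the filters of [5]; the σ values are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.stats import kurtosis

def bandpass_kurtosis(image, sigma_low, sigma_high):
    """Kurtosis of a difference-of-Gaussians bandpass response.
    Natural images typically give more kurtotic (heavier-tailed)
    responses at low frequencies than at high ones [5]."""
    img = image.astype(np.float64)
    response = gaussian_filter(img, sigma_low) - gaussian_filter(img, sigma_high)
    return kurtosis(response.ravel())

image = np.random.randint(0, 256, (1024, 1280))  # placeholder for a capture
print("low frequency: ", bandpass_kurtosis(image, 4.0, 8.0))
print("high frequency:", bandpass_kurtosis(image, 0.5, 1.0))
```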

TABLE II
PSNR and SSIM for Scale Factors of 2 and 4. Each Image Has Two Rows: the First Row Is PSNR, the Second Row Is SSIM

C. Performance Evaluation of Super-Resolution Algorithms

The preceding analysis has shown that it is inappropriate to use LR images generated by a downsampling (or upsampling) operation as test images simulating true LR images; for low-level vision tasks, such LR images cannot represent the real scene. We therefore compare the reconstruction results obtained from LR images captured by the multiresolution camera with those obtained from LR images produced by downsampling. The process in which the observed LR images from the multiresolution camera are taken as test images for evaluating the reconstruction algorithm is called MR; similarly, the process in which


Fig. 18. Comparison of reconstruction results on different-resolution images. (a) Randomly selected images captured by the multiresolution camera. (b) SR images produced by the learning-based method from the true LR image with upsampling factors of 2 and 4. (c) SR images produced by the learning-based method from the simulated LR image with upsampling factors of 2 and 4.

Fig. 19. Comparison of reconstruction results on different-resolution images. (a) Randomly selected images captured by the multiresolution camera. (b) SR images produced by the learning-based method from the true LR image with upsampling factors of 2 and 4. (c) SR images produced by the learning-based method from the simulated LR image with upsampling factors of 2 and 4.

LR images of the same resolution produced by the downsampling operation are taken as test images is called DO. Column (a) of Figs. 16–19 shows randomly selected different-resolution images captured by the designed multiresolution camera; the ratio of their resolutions is 1:4:16, and they are regarded as the 1280 × 1024 HR images, the 640 × 512 middle-resolution images, and the 320 × 256 LR versions, respectively. These ground-truth images from the multiresolution camera are available for comparison. The reconstruction results of the learning-based method [2] applied to the LR images from the multiresolution camera are shown in column (b) of Figs. 16–19: the middle and bottom rows present the results with upsampling factors of 2 (called MR2) and 4 (called MR4), respectively. Similarly, column (c) of Figs. 16–19 shows

the reconstructions of the LR images produced by the downsampling operation, with upsampling factors of 2 (called DO2) and 4 (called DO4). The reconstruction results in column (b) of Figs. 16–19 are clearly visible and show better detail than those in column (c). This is because the test LR images generated by the multiresolution camera are closer to the real scene and carry more high-frequency information than those generated by the downsampling operation. For objective evaluation, peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are employed to measure the difference between the original images in column (a) of Figs. 16–19 and the reconstructed results in columns (b) and (c). Table II shows the PSNR and SSIM of the learning-based method for the different test images. The objective indices in Table II are inconsistent with the subjective evaluations of columns (b) and (c) in Figs. 16–19.
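The objective scores in Table II correspond to standard PSNR and SSIM computations, e.g., via scikit-image; the arrays below are placeholders for a true 1280 × 1024 capture and the MR4/DO4 reconstructions.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(reference, reconstructed):
    """PSNR and SSIM between a ground-truth image from the multiresolution
    camera and an SR reconstruction of the same size."""
    psnr = peak_signal_noise_ratio(reference, reconstructed, data_range=255)
    ssim = structural_similarity(reference, reconstructed, data_range=255)
    return psnr, ssim

# `reference` stands for a true 1280x1024 capture; `mr4` and `do4` stand for
# reconstructions from the true LR image and the downsampled LR image.
reference = np.random.randint(0, 256, (1024, 1280), dtype=np.uint8)
mr4 = np.random.randint(0, 256, (1024, 1280), dtype=np.uint8)
do4 = np.random.randint(0, 256, (1024, 1280), dtype=np.uint8)
print("MR4:", evaluate(reference, mr4))
print("DO4:", evaluate(reference, do4))
```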


In Table II, the PSNR and SSIM values of DO4 and DO2 are superior to those of MR4 and MR2, respectively, yet MR4 and MR2 produce sharper edges and better texture than DO4 and DO2 in subjective visual comparison. This is because the optical axes of the three CCD detectors are not perfectly collinear in the assembled multiresolution system, so there is a small discrepancy between the HR images and the corresponding versions from the multiresolution camera. Nevertheless, the results do show that reconstructions using test images from the multiresolution camera are subjectively better than those using test images from the downsampling operation. This matters for image quality assessment: objective quality assessment approaches should take the distortion parameters of the imaging system into account in future work.

IV. Conclusion

This paper reported a new design of a multiresolution camera, which is exploited to study the relationship between images of different resolutions. By adopting three-path light-splitting technology, we ensure that the illuminance of the images captured on the three CCDs is equal. This means that LR and HR images can be collected from the same scene at the same time, and the random error caused by spatial variations in the lighting spectra can be neglected. A large number of experimental results demonstrate that the proposed multiresolution camera has advantages over an ordinary camera. Although an alignment system is adopted to register the captured images, a subpixel shift remains between the three images of different resolutions because of hardware imprecision. Hence, in future work it will be necessary to apply image registration as post-processing to eliminate subpixel shifts in the captured images.

Acknowledgment

The authors would like to thank Prof. P. Yan and Prof. Y. Yuan for proofreading.

References

[1] J. Pendry, "Negative refraction makes a perfect lens," Phys. Rev. Lett., vol. 85, no. 18, pp. 3966–3969, 2000.
[2] W. Freeman, T. Jones, and E. Pasztor, "Example-based super-resolution," IEEE Comput. Graph. Appl., vol. 22, no. 2, pp. 56–65, 2002.
[3] P. Gajjar and M. Joshi, "New learning based super-resolution: Use of DWT and IGMRF prior," IEEE Trans. Image Process., vol. 19, no. 5, pp. 1201–1213, May 2010.
[4] G. Belay, Y. Meuret, H. Ottevaere, P. Veelaert, and H. Thienpont, "Design of a multichannel, multiresolution smart imaging system," Appl. Opt., vol. 51, no. 20, pp. 4810–4817, 2012.
[5] M. Bethge, "Factorial coding of natural images: How effective are linear models in removing higher-order dependencies?" J. Opt. Soc. Am. A, vol. 23, no. 6, pp. 1253–1268, 2006.

Xiaoqiang Lu is currently an Associate Professor with the Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an, China. His current research interests include pattern recognition, machine learning, hyperspectral image analysis, cellular automata, and medical imaging.

Xuelong Li (M’02–SM’07–F’12) is currently a Full Professor with the Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an, China.
