

A Novel Classification Method of Halftone Image via Statistics Matrices

Zhi-Qiang Wen, Yong-Xiang Hu, and Wen-Qiu Zhu

Abstract— Existing classification methods tend not to work well on the various error diffusion patterns. We therefore propose a novel classification method for halftone images based on statistics matrices. A statistics matrix descriptor of a halftone image is constructed according to the characteristics of error diffusion filters, and on this basis an extraction algorithm based on halftone image patches is developed. Feature modeling is formulated as an optimization problem, and a gradient descent method is used to seek the optimum class feature matrices by minimizing the total square error. A maximum likelihood method is then proposed that exploits prior knowledge of the training samples. In the experiments, a performance evaluation method is provided and the performance of our method is compared with that of seven similar methods; the influence of parameters, the performance under various attacks, the computational time complexity and the limitations are also discussed. The experimental study shows that our method has a lower classification error rate than similar methods and is robust against common attacks.

Index Terms— Classification, halftone image, statistics matrices, error diffusion, maximum likelihood.

I. INTRODUCTION

DIGITAL halftoning, also known as spatial dithering, is the process of converting a continuous tone image into a form suitable for display on binary or multilevel devices, such as newspaper printers, laser printers and some computer screens with several levels. These binary or multilevel hard copies look similar to the original images when viewed from a distance, owing to the low-pass filtering characteristic of the human visual system. Generally, digital halftoning algorithms can be grouped into three categories: screening, error diffusion (ED) and iteration-based methods [1]. Screening periodically tiles a threshold matrix over the image and compares each pixel with the corresponding threshold value to binarize a continuous tone image. Depending on the screen matrix,

Manuscript received June 13, 2013; revised February 16, 2014; accepted August 4, 2014. Date of publication August 18, 2014; date of current version September 25, 2014. This work was supported in part by the National Natural Science Foundation of China under Grant 61170102, Grant 61350011, and Grant 11301170, in part by the Natural Science Fund of Hunan province in China under Grant 14JJ2115 and Grant 12JJ2036, and in part by the Education Department Fund of Hunan province in China under Grant 12A039. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Andrew Lumsdaine. The authors are with the School of Computer and Communication, Hunan University of Technology, Zhuzhou 412007, China (e-mail: [email protected]; [email protected]; [email protected]). This paper has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the authors. The material includes datasets and Matlab code for the classification of halftone images. The total size of the material is 60.6 MB. Contact [email protected] for further questions about this work. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2014.2348862

Fig. 1. Schematic diagram of digital halftoning and the reconstruction of the original image. Digital halftoning produces the binary image (right); inverse halftoning reconstructs the original image (left). In practice, we may need the original image while only its halftone image (or a hard copy of it) is available; in this situation, inverse halftoning is urgently needed.

screening comprises many algorithms, such as the dispersed dot screen [2] and dot diffusion [3]. ED also compares each pixel with a fixed threshold, but the binarization error is diffused ahead, in fixed proportions, to neighboring pixels that have not yet been visited. There are many popular ED algorithms, such as the Floyd-Steinberg, Stucki and Stevenson filters [22]. Iteration-based methods, such as direct binary search (DBS) [4], minimize a metric of perceived error between the continuous tone image and the halftone image according to a human visual system (HVS) model. Even though DBS can create high quality halftone images, it is computationally expensive and is hence only used as a benchmark [5]. Both screening and ED are low-complexity algorithms and the visual quality of the binary images they produce is fairly good, so they have been used in many applications.

Inverse halftoning, which belongs to image restoration, is the inverse process of digital halftoning, as shown in Fig. 1. It has a wide range of applications, e.g. paper image digitization, digital publishing systems, image compression, and printed image processing. Inverse halftoning has two notable properties: 1) It is an ill-posed problem, because the binarization operation of digital halftoning is many-to-one whereas the inverse operation is one-to-many; and 2) the noise introduced by digital halftoning methods is dispersed over the high- and medium-frequency components of the halftone image and hinders the solution of the inverse halftoning problem. In practice, it is desirable to design an effective denoising strategy according to prior knowledge about halftone images, such as the periodicity, distribution, correlation, and dispersion of the dot pattern. Most inverse halftoning methods describe the inverse process of a specific halftone pattern, such as binary permutation filters [6], probabilistic modeling [7], maximum



Fig. 2. Halftone images produced by applying six ED filters (see TABLE I) to a 256-level gray image (pepper, 256 × 256). Each halftone image displays its own halftone dot pattern, and the differences between the textures of these halftone images are not obvious except for (f). (a) Bur. (b) Sie. (c) Stu. (d) Jar. (e) Flo. (f) Ste.

a posteriori (MAP) estimation [8], projection onto convex sets [9], optimized inverse halftoning [10], fast inverse halftoning for ordered dithered images [11], and a fast inverse halftoning algorithm for error diffused halftones [12]. However, it is difficult to obtain the optimal reconstruction quality when the halftoning pattern is unknown, as is the case in practical applications. To solve this problem, it is desirable to develop a classification method for halftone images: a general inverse halftoning process should contain two parts, classification and reconstruction. This paper focuses on the classification of halftone images.

A. Related Work

There have been several attempts to solve the problem of halftone pattern classification. Chang and Yu [13] classified halftone images into four types by using a three-layer back propagation neural network; their data set is limited to halftone images produced by dispersed-dot ordered dithering, clustered-dot ordered dithering, constrained average and ED. Kong et al. [14], [15] used an enhanced 1D correlation function and the gray level co-occurrence matrix to extract features of halftone images, and then employed a decision tree to split the halftone images into nine categories. Liu et al. [16] divided halftone images into four types by using support regions and the least mean square (LMS) algorithm. Soon after, they extracted features from the Fourier spectra of nine types of halftone images by LMS and used a naive Bayes classifier to classify them [17]. However, all of these methods try to avoid the classification problem of ED patterns: three ED filters are included in [14], [15], and [17], while only one is involved in [13] and [16]. There are six popular ED filters, which employ a similar approach with different ED kernels. It is difficult to classify these halftone patterns because the differences among the various halftone pattern textures (see Fig. 2) and among the various halftone image spectrum textures (see Fig. 3) are almost inconspicuous. Currently, ED is one of the most popular approaches, as it provides good image quality at a reasonable cost [18]. It is therefore necessary to develop a classification mechanism for the various ED patterns for the many graphics applications of inverse halftoning.


Fig. 3. Fourier spectrums of the six ED images (see Fig. 2). The differences between these spectrum textures are not obvious except (f). (a) Bur. (b) Sie. (c) Stu. (d) Jar. (e) Flo. (f) Ste.

Fig. 4. Schematic diagram of the Floyd-Steinberg method.

B. Main Contributions

We focus on the classification mechanism for six ED patterns. The main contributions of this paper are: 1) We propose a feature extraction method based on statistics matrices, derived from an analysis of the dot patterns of six ED filters. 2) We formulate feature modeling as an optimization problem and seek the optimal class feature matrices by minimizing the total square error between the expected output and the actual output of the matrix dot product; the convergence is discussed theoretically and experimentally. 3) We propose a maximum likelihood method to classify halftone images under a Gaussian distribution assumption.

The rest of the paper is organized as follows. Section II reviews the concept of ED. Section III details the feature extraction based on statistics matrices, and Section IV explains feature modeling and the maximum likelihood method. Section V presents the experimental results and Section VI concludes the paper.

II. ERROR DIFFUSION

The ED algorithm [19] has attracted much attention in printing applications since its publication in 1976. In ED, the quantization error of the current input pixel is linearly filtered and transferred to future input pixels so as to diffuse the quantization error among neighboring pixels. Let J(x, y) denote the pixel located at (x, y) in the original gray image J and let F(x, y) denote the pixel located at (x, y) in the halftone image F. In the ED algorithm, the value of J(x, y) is normalized to [0, 1] and the value of F(x, y) is quantized to 0 or 1. In addition, W(x, y) denotes a weight of an ED filter. Fig. 4 shows the schematic diagram of the Floyd-Steinberg filter. The fixed weights of the Floyd-Steinberg filter are shown in Fig. 6(a), where the pixel marked by a dark dot denotes the current pixel (x, y) and the neighboring weights are W(x, y+1) = 7/16, W(x+1, y−1) = 3/16, W(x+1, y) = 5/16 and W(x+1, y+1) = 1/16. The Floyd-Steinberg method contains three steps, as illustrated in Fig. 5. Its advantages are its simplicity and the good visual quality of the binary images it produces; for this reason, it has become very popular in the printing industry.
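To make the diffusion step concrete, the following Python sketch implements Floyd-Steinberg error diffusion with the weights of Fig. 6(a). It is an illustrative reimplementation rather than the authors' code: the plain left-to-right raster scan, the threshold T = 0.5, and the standard sign convention for the diffused error (input value minus quantized value) are assumptions.

```python
import numpy as np

def floyd_steinberg(gray, T=0.5):
    """Illustrative Floyd-Steinberg error diffusion (not the authors' code).

    gray: 2-D array with values normalized to [0, 1].
    T:    binarization threshold (assumed to be 0.5 here).
    Returns a binary halftone image F with values 0 or 1.
    """
    J = np.asarray(gray, dtype=np.float64).copy()
    rows, cols = J.shape
    F = np.zeros_like(J)
    for x in range(rows):
        for y in range(cols):
            F[x, y] = 1.0 if J[x, y] >= T else 0.0
            e = J[x, y] - F[x, y]          # quantization error of the current pixel
            # Diffuse the error to unvisited neighbors with the weights of Fig. 6(a).
            if y + 1 < cols:
                J[x, y + 1] += e * 7 / 16
            if x + 1 < rows:
                if y > 0:
                    J[x + 1, y - 1] += e * 3 / 16
                J[x + 1, y] += e * 5 / 16
                if y + 1 < cols:
                    J[x + 1, y + 1] += e * 1 / 16
    return F
```

The same loop structure applies to the other five filters of TABLE I; only the weight template and its normalizing constant change.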



Fig. 5. Pseudo code of the Floyd-Steinberg method.

TABLE I: SIX ED FILTERS AND THEIR ABBREVIATIONS

Six popular ED methods are listed in TABLE I; all of these filters are summarized in [20]–[22]. In practice, each filter defines its own template matrix, as shown in Fig. 6. Among these template matrices, the maximum neighbourhood size is 4 × 7 (see Fig. 6(f)) and the minimum size is 2 × 3 (see Fig. 6(a)).

III. FEATURE EXTRACTION BASED ON STATISTICS MATRICES

For convenience, we define two co-occurring neighboring pixels in a halftone image as a pixel pair; for example, OA, OB, OC and OD form four pixel pairs in Fig. 7. Since each pixel of a halftone image takes only the value 0 or 1, there are four kinds of pixel pairs: 0-0, 0-1, 1-0 and 1-1. Because 0-1 and 1-0 are commutative, we merge them and refer to both as the 1-0 pixel pair. A halftone image therefore contains three kinds of pixel pairs: 0-0, 1-0 and 1-1. Based on an analysis of the characteristics of the six ED filters, this section presents three statistics matrices of a halftone image as the texture descriptor of its halftone pattern.

A. Statistics Matrix Descriptor

To better understand the presented method, we first analyze the characteristics of ED filters. Fig. 7 shows the template matrix of the Stucki filter, where each number signifies a diffusion proportion. Point A accepts more diffusion error than point B in Fig. 7; that is to say, OB has a larger probability than OA of becoming a 1-0 pixel pair. The reason is given below.

Let b_O, b_A and b_B (0 ≤ b_O, b_A, b_B ≤ 1) be the pixel values of points O, A and B, respectively. According to Fig. 5, the binarization of point O is

$$q_O = \begin{cases} 1, & b_O \ge T \\ 0, & \text{otherwise} \end{cases}$$

and the binarization error is e = q_O − b_O. After accepting the diffusion error according to the diffusion proportions, the new pixel value of point A is b′_A = b_A + 8e/42 and the new pixel value of point B is b′_B = b_B + e/42. There are two cases. 1) b_O ≥ T: we obtain q_O = 1 and e ≥ 0. Since b_A and b_B have the same probability of possessing a given pixel value, we let b_A = b_B under the ideal condition, which gives b′_A ≥ b′_B. Thus point B has a larger probability than point A of acquiring the value 0 in the subsequent binarization operation, namely P_OA1 < P_OB1, where P_OA1 (P_OB1) denotes the probability of OA (OB) becoming a 1-0 pixel pair in this case. 2) b_O < T: we obtain q_O = 0 and e < 0. Similarly, letting b_A = b_B under the ideal condition yields b′_A < b′_B, so point B has a larger probability than point A of acquiring the value 1, namely P_OB2 > P_OA2, where P_OA2 (P_OB2) denotes the probability of OA (OB) becoming a 1-0 pixel pair in this case. From the above we conclude that the probabilities of the 1-0 pixel pair satisfy P_OA1 + P_OA2 < P_OB1 + P_OB2. By the same approach, it is easily observed that point C has nearly the same probability as point D of obtaining the value 1 (or 0).

Based on the above analysis of ED pattern characteristics, we define three matrices (called statistics matrices) to store the counts of the three kinds of pixel pairs at different neighboring distances and different directions; they are referred to as M10, M11 and M00, respectively. To ensure consistency in each pixel direction, M10, M11 and M00 are square matrices of the same size L × L, where L is an odd number satisfying L = 2R + 1 and R indicates the maximum neighboring distance. We let the center entry of a statistics matrix overlap one of the pixels (marked as "O"), as shown in Fig. 8(a), and let the other entries of the statistics matrix overlap the neighboring pixels. We then count the 1-0, 1-1 and 0-0 pixel pairs within the scope of this statistics matrix and, after normalization, obtain the statistics matrices as the statistical feature descriptor. Fig. 8(b) and Fig. 8(a) show, respectively, a 5 × 5 statistics matrix M and a schematic diagram of the overlapping neighborhood pixels, where the center entry M(3, 3) of the statistics matrix covers the pixel O of the halftone image. In this way, we count the three kinds of pixel pairs at different neighboring distances; by moving the position of O over the image, we obtain the three statistics matrices. Let F(i, j) denote the pixel value at position (i, j) of a halftone image of size W × H. With M10, M11 and M00 initialized to zero matrices, the statistics matrices (SM) are obtained from (1):

$$\begin{cases} M_{11}(x,y) = M_{11}(x,y)+1, & \text{if } F(i,j)=1 \text{ and } F(u,v)=1 \\ M_{00}(x,y) = M_{00}(x,y)+1, & \text{if } F(i,j)=0 \text{ and } F(u,v)=0 \\ M_{10}(x,y) = M_{10}(x,y)+1, & \text{otherwise} \end{cases} \qquad (1)$$

where u = i − R + x − 1, v = j − R + y − 1, 1 ≤ i, u ≤ W, 1 ≤ j, v ≤ H, 1 ≤ x ≤ L, and 1 ≤ y ≤ L.
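The counting rule (1) translates directly into code. The sketch below mirrors the O(WHL²) loop of Fig. 9 (whose listing is not reproduced here); skipping neighbors that fall outside the image and omitting the final normalization, whose exact scheme is not specified in the text, are assumptions.

```python
import numpy as np

def statistics_matrices(F, R):
    """Statistics matrices M10, M11, M00 of a binary image F, following (1).

    F: 2-D array of 0/1 values (a halftone image or patch).
    R: maximum neighboring distance; the matrices are L x L with L = 2R + 1.
    """
    L = 2 * R + 1
    M10, M11, M00 = np.zeros((L, L)), np.zeros((L, L)), np.zeros((L, L))
    rows, cols = F.shape
    for i in range(rows):
        for j in range(cols):
            for x in range(L):
                for y in range(L):
                    if x == R and y == R:
                        continue                  # the center entry overlaps O itself
                    u, v = i - R + x, j - R + y   # 0-based version of u, v in (1)
                    if 0 <= u < rows and 0 <= v < cols:
                        if F[i, j] == 1 and F[u, v] == 1:
                            M11[x, y] += 1
                        elif F[i, j] == 0 and F[u, v] == 0:
                            M00[x, y] += 1
                        else:
                            M10[x, y] += 1        # 0-1 and 1-0 pairs are merged
    # The paper normalizes the matrices afterwards; the normalization scheme is
    # not specified in the text, so it is omitted here.
    return M10, M11, M00
```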



Fig. 6. (a) Floyd-Steinberg's ED; (b) Stucki's ED; (c) Sierra's ED; (d) Burkes's ED; (e) Jarvis's ED; (f) Stevenson's ED. The left of each subfigure is a filter and the right is its template matrix. The value in each template indicates the diffusion proportion; 0 denotes not receiving any binarization error. The underlined entry denotes the pixel being processed.

Fig. 7. The template matrix of Stucki filter. O denotes the pixel being processed. A, B, C and D indicate the four neighborhood pixels.

Fig. 9. Pseudo code of the statistics matrix method.

Fig. 8. Statistics matrix method. (a) Schematic diagram of the matrix M overlapping a halftone image; (b) a statistics matrix M when L = 5.

Fig. 9 shows the pseudo code of the statistics matrix method; its time complexity is O(WHL²). The statistics matrices M10, M11 and M00 have the property stated in Theorem 1; furthermore, the center entries of M10, M11 and M00 are 0. According to Theorem 1, any one of the three statistics matrices can be described by the other two, which means that two of these statistics matrices suffice to describe the halftone image texture. In the experiments, M10 and M11 are each used to classify the halftone images.

Theorem 1: Let M10, M11 and M00 be the statistics matrices of a halftone image obtained according to Fig. 9. Then M10 + M11 + M00 is a constant matrix.

Fig. 10. An example of the statistics matrix method. Deep black grid denotes the image region overlapped by the matrix M when L = 5.

Proof: Let O be the pixel overlapped by the center entry of a statistics matrix and let A be any pixel in the region covered by the statistics matrix, except O, as shown in Fig. 10; (x, y) denotes the position of A in the statistics matrix. OA then forms a pixel pair, which may be 1-0, 1-1 or 0-0.



Algorithm 1 Extraction Algorithm Based on Statistics Matrices

There are two cases. 1) Ignoring the boundary: the count of OA pixel pairs is WH, since the center entry O visits WH pixel positions. Consequently, M10(x, y) + M11(x, y) + M00(x, y) = WH. 2) Considering the boundary constraint: since all halftone image samples have the same size and the relative position of the two neighboring pixels is fixed, the count of OA pixel pairs is also a constant, denoted Const(x, y), with Const(x, y) ≤ WH. Hence M10(x, y) + M11(x, y) + M00(x, y) = Const(x, y). In both cases, M10 + M11 + M00 is a constant matrix for any halftone image. □

B. Extraction of Statistics Matrices

The dot pattern of a halftone image has important local properties: in the ED filters, binarization errors are spread to the neighborhood in fixed proportions. A halftone image is therefore split into many non-overlapping patches of the same size to capture the texture features of the halftone pattern. Let B_i be the ith patch, of size K × K, and let M_i (M10 or M11) denote the statistics matrix extracted from B_i. The extraction algorithm based on statistics matrices (EASM) of a halftone image is given in Algorithm 1. One key choice in Algorithm 1 is the parameter K. As mentioned in Section II, the fixed template matrices of the error diffusion methods act on the continuous tone image to produce the halftone pattern, so K should exceed the template matrix sizes T_i, 1 ≤ i ≤ 6, of the ED filters; from Fig. 6, we obtain K > 7. In the experiments, K = 32 guarantees that the statistics matrices cover at least one ED kernel. An overlapping strategy for computing the statistics matrices could also be used in Algorithm 1, but it leads to excessive computing time: it takes 407.74 s per 256 × 256 image, whereas the non-overlapping strategy takes only 0.51 s. Moreover, the classification error rate with the overlapping strategy always exceeds 30%.
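A possible realization of the patch-based extraction of Algorithm 1 (whose listing is not reproduced in the text) is sketched below, reusing the statistics_matrices routine from the previous sketch. Averaging the per-patch matrices and dividing by the patch pixel count are assumptions: the paper only states that the image is split into non-overlapping K × K patches and that a statistics matrix M_i is extracted from each patch B_i.

```python
import numpy as np

def easm(F, K=32, R=7, kind="M10"):
    """Illustrative patch-based extraction of a statistics-matrix feature.

    F:    binary halftone image (2-D array of 0/1 values).
    K:    patch size (K = 32 in the paper's experiments).
    R:    maximum neighboring distance, so L = 2R + 1 (R = 7 gives L = 15).
    kind: which statistics matrix to use as the feature ("M10" or "M11").
    """
    rows, cols = F.shape
    feats = []
    for r0 in range(0, rows - K + 1, K):                   # non-overlapping patches
        for c0 in range(0, cols - K + 1, K):
            patch = F[r0:r0 + K, c0:c0 + K]
            M10, M11, M00 = statistics_matrices(patch, R)  # routine from the Section III-A sketch
            M = M10 if kind == "M10" else M11
            feats.append(M / float(K * K))                 # per-pixel normalization (assumed)
    return np.mean(feats, axis=0)                          # aggregation by averaging (assumed)
```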

IV. CLASSIFICATION OF HALFTONE IMAGES

A. Class Feature Matrix

The statistics matrix M (M10 or M11) extracted by Algorithm 1 could be used directly as the input of a neural network, support vector machine, etc. However, this would result in the so-called "curse of dimensionality." In this paper, we instead design matrices W_1, ..., W_Nc to represent the halftone patterns produced by the ED filters, where Nc denotes the number of ED kernels; we call these matrices class feature matrices. Next, we formulate feature modeling as an optimization problem and use a gradient descent method to seek the optimum class feature matrices. The resulting class feature matrices have their own characteristics, capturing specific features of the halftone patterns.

Let the matrices A and B have the same size W × H. The matrix dot product, denoted by A • B, is

$$A \bullet B = \sum_{u=1}^{W}\sum_{v=1}^{H} A(u,v) \times B(u,v) \qquad (2)$$

The matrix dot product satisfies the commutative and associative laws, namely A • B = B • A and (A • B) • C = A • (B • C). From n original images, the Nc ED filters produce N = Nc × n halftone image samples. We use Algorithm 1 to extract N statistics matrices M (M10 or M11) of size L × L as training samples from the N halftone images, denoted M_1, ..., M_N. The class labels label(M_1), ..., label(M_N) denote the types of the ED filters (numbered 1, 2, ..., Nc). Given the ith sample M_i as input, the target output vector v_i = [v_i1, ..., v_iNc], i = 1, ..., N, and the Nc class feature matrices W_1, ..., W_Nc (of size L × L), the square error between the actual output and the target output is

$$e_i = \sum_{j=1}^{N_c} \left(v_{ij} - M_i \bullet W_j\right)^2 \qquad (3)$$

where

$$v_{ij} = \begin{cases} 1, & j = \mathrm{label}(M_i) \\ 0, & \text{otherwise.} \end{cases}$$

For the N samples, the total square error is

$$e = \sum_{i=1}^{N} e_i \qquad (4)$$

The purpose of learning is to seek the optimal matrices W_j, j = 1, ..., Nc, by minimizing the total square error e. Computing the derivative of (3) with respect to W_j(x, y) yields

$$\frac{\partial e_i}{\partial W_j(x,y)} = -2\left(v_{ij} - M_i \bullet W_j\right) M_i(x,y) \qquad (5)$$

where 1 ≤ x ≤ L and 1 ≤ y ≤ L. Applying the gradient descent method gives the iteration formula (6), where η is a learning factor and k denotes the kth iteration:

$$W_j^{k+1}(x,y) = W_j^{k}(x,y) + 2\eta\left(v_{ij} - M_i \bullet W_j\right) M_i(x,y) \qquad (6)$$

Algorithm 2 shows a method for minimizing (4). Since the gradient descent method is used to solve the optimization problem [23], Algorithm 2 is convergent, as shown in Fig. 11. Its convergence rate is, however, known to be slow; in practice, more advanced methods such as Newton's method, the conjugate gradient method or quasi-Newton methods can be used.


Algorithm 2 Gradient Descent Algorithm for Class Feature Matrix


However, the slow convergence rate is acceptable in our study, because seeking the optimal class feature matrices is a training process that can be carried out offline. Some parameters still need to be initialized. For each j = 1, ..., Nc, the class feature matrix W_j is initialized as a matrix whose entries are drawn from a uniform distribution on the unit interval. In our experimental study, the learning factor η is chosen in the range 0.1 ∼ 1.0; for η = 0.1, ε1 = 10^-3 and ε2 = 0.1, Algorithm 2 converges to a fixed point quickly. To reduce the total number of iterations, n_in and n_out should not be too large; we let n_in = 10 and n_out = 200. To apply Algorithm 2 in our experimental study, each kind of halftone image pattern contributes 3000 samples (M10) as input. The resulting class feature matrices are shown in Fig. 12, where Nc = 6. These class feature matrices have the following properties: 1) The absolute value of an entry in any class feature matrix differs from that in any other class feature matrix; that is, each subfigure in Fig. 12 has its own characteristics. 2) An entry with a positive value in a class feature matrix indicates more 1-0 pixel pairs, while an entry with a negative value indicates fewer 1-0 pixel pairs.

Fig. 11. Examples of the convergence process when η = 0.1. (a) The total square error e over the iterations; (b), (c) TACER and ACERV, respectively, over the iterations. TACER and ACERV are defined in Section V.

B. Maximum Likelihood Method

From Algorithm 2, a simple classification rule can be observed: for a given test sample M, compute the matrix dot products in (7) as the actual output,

$$y_j = M \bullet W_j, \quad j = 1, \ldots, N_c \qquad (7)$$

We obtain the classification rule

$$g_1(\mathbf{y}) = \arg\max_{j}\, y_j \qquad (8)$$

where y = [y_1, ..., y_Nc]. For convenience, this method is referred to as the maximum scalar (MS) method. Using MS, the average classification error rate is 4.02% when L = 15 and 2.78% when L = 29. This method can be improved by utilizing prior knowledge of the training samples. The actual output values of the ith sample of the lth class are denoted by y_li1, ..., y_liNc, where i = 1, ..., T_l and T_l is the size of the lth class sample set; in the experiments, T_l = 3000 for l = 1, ..., Nc. According to TABLE I, Nc = 6; consequently, we can plot the 36 distribution curves shown in Fig. 13 using the actual outputs y_lij, l = 1, ..., Nc, j = 1, ..., Nc, produced by the ith sample of the lth class acting on the jth class feature matrix. The distribution curve of the values of y_lj is the statistical histogram of y_lij, i = 1, ..., T_l. These distribution curves have the following properties: 1) There are Nc × Nc curves, each of which approximately obeys a Gaussian distribution. 2) The curve for l = j lies in the vicinity of 1, while the curves for the other cases are around 0; TABLE II shows the means and variances of the Nc × Nc curves. 3) There are some cross areas between the curve for l = j and the other curves in each subfigure, and these cross areas cause classification errors. In practice, we can use these properties to improve the classification performance. A classification method based on probability theory is an effective way to fuse these properties, so we next propose a maximum likelihood method. After obtaining the actual matrix-dot-product output vector y = [y_1, ..., y_Nc] of the test sample matrix M with the class feature matrices W_1, ..., W_Nc, the probability that the halftone image pattern belongs to the lth class can be expressed as p(l|y_1, ..., y_Nc). The decision rule is g_2(y) = arg max_l p(l|y_1, ..., y_Nc). According to the Bayes rule, we obtain

$$p(l\,|\,y_1,\ldots,y_{N_c}) = \frac{p(y_1,\ldots,y_{N_c}\,|\,l)\,P(l)}{p(y_1,\ldots,y_{N_c})},$$

where the evidence factor p(y_1, ..., y_Nc) can be viewed as a scale factor that guarantees the posterior probabilities sum to one. The prior probability P(l) represents the prior knowledge of class l and is assumed to be equal for all classes. The prior probability is transformed into a posterior probability by the above Bayes formula, and the decision rule can be written as g_2(y) = arg max_l p(y_1, ..., y_Nc|l). The actual outputs y_1, ..., y_Nc are assumed to be mutually independent, so $p(y_1,\ldots,y_{N_c}|l) = \prod_{j=1}^{N_c} p(y_j|l)$. The decision rule is then modified to $g_2(\mathbf{y}) = \arg\max_l \prod_{j=1}^{N_c} p(y_j|l)$, which can be expressed in log form as $g_2(\mathbf{y}) = \arg\max_l \sum_{j=1}^{N_c} \log p(y_j|l)$. We assume that p(y_j|l) obeys a Gaussian distribution with mean μ_lj



Fig. 12. 3D surfaces of the class feature matrices W j , j = 1, . . . , 6 and L = 15. The central entry of each matrix is set as the average value of all entries in these matrices. (a) j = 1. (b) j = 2. (c) j = 3. (d) j = 4. (e) j = 5. (f) j = 6.

Fig. 13. The distribution curves of the values of y_lj, 1 ≤ l ≤ 6, 1 ≤ j ≤ 6, and L = 15. Each curve approximately obeys a Gaussian distribution. (a) j = 1. (b) j = 2. (c) j = 3. (d) j = 4. (e) j = 5. (f) j = 6.

TABLE II: MEAN AND VARIANCE OF THE 36 DISTRIBUTION CURVES (OBTAINED BY MAXIMUM LIKELIHOOD ESTIMATION, L = 15)

and variance σ_lj:

$$p(y_j\,|\,l) = \frac{1}{\sqrt{2\pi}\,\sigma_{lj}} \exp\!\left(-\frac{1}{2}\left(\frac{y_j-\mu_{lj}}{\sigma_{lj}}\right)^{2}\right)$$

TABLE II presents an example showing the means and variances of the 36 distribution curves: l = j denotes that μ_lj is near 1, and l ≠ j indicates that μ_lj is in the vicinity of zero. We have

$$\sum_{j=1}^{N_c} \log p(y_j\,|\,l) = \sum_{j=1}^{N_c} \log\!\left[\frac{1}{\sqrt{2\pi}\,\sigma_{lj}} \exp\!\left(-\frac{1}{2}\left(\frac{y_j-\mu_{lj}}{\sigma_{lj}}\right)^{2}\right)\right].$$

Expanding this expression, formula (9) can be derived:

$$\sum_{j=1}^{N_c} \log p(y_j\,|\,l) = -N_c \log\!\left(\sqrt{2\pi}\right) + \sum_{j=1}^{N_c}\left(-\log \sigma_{lj} - \frac{\mu_{lj}^{2}}{2\sigma_{lj}^{2}}\right) + \frac{1}{2}\sum_{j=1}^{N_c} \frac{2 y_j \mu_{lj} - y_j^{2}}{\sigma_{lj}^{2}} \qquad (9)$$



Fig. 14. Sketch map of the distributions of values of yl1 and Gaussian distributions. M10 acts as the input vector, Nc = 6 and L = 11. (a) l = 1. (b) l = 2. (c) l = 3. (d) l = 4. (e) l = 5. (f) l = 6.

The first term on the right-hand side of (9) is a constant independent of the class l and is hence ignored. The second term is a constant closely related to the class l; it is denoted by A_l. We thus obtain

$$\sum_{j=1}^{N_c} \log p(y_j\,|\,l) \;\propto\; A_l + \frac{1}{2}\sum_{j=1}^{N_c} \frac{2 y_j \mu_{lj} - y_j^{2}}{\sigma_{lj}^{2}}.$$

The decision rule obtained above can therefore be converted to

$$g_2(\mathbf{y}) = \arg\max_{l}\left[\, A_l + \frac{1}{2}\sum_{j=1}^{N_c} \frac{2 y_j \mu_{lj} - y_j^{2}}{\sigma_{lj}^{2}} \right] \qquad (10)$$

The first term A_l in the rule g_2(y) describes the prior knowledge of class l and can be computed from the training samples before decision. The second term is closely related to the matrix dot products between the statistics matrix and the class feature matrices. Since a likelihood probability function is used in the rule g_2(y), this method is referred to as the maximum likelihood (ML) method. Ideally, the optimal solution is obtained if the minimum total square error approaches zero in Algorithm 2. In that case, the mean and variance of the y_lj distribution are

$$\mu_{lj} = \begin{cases} 1, & l = j \\ 0, & \text{otherwise} \end{cases} \qquad \text{and} \qquad \sigma_{lj} = 0.$$

As a result, the curves degenerate into Nc straight lines at different positions. In this case, the above rule does not work, but the decision rule g_1(y) is still applicable. The proposed classification method is therefore: if σ_lj = 0 for l = 1, ..., Nc and j = 1, ..., Nc, we use the rule g_1(y); otherwise, we use the rule g_2(y). In practice, however, the case σ_lj = 0 almost never occurs.
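Putting the two rules together, a sketch of the resulting classifier could look as follows: μ_lj and σ_lj are estimated from training outputs by maximum likelihood, as in TABLE II, and the small floor eps added to the variances is an assumption introduced to avoid division by zero.

```python
import numpy as np

def fit_output_gaussians(Ms, labels, W):
    """Estimate mu[l, j] and sigma[l, j] of the training outputs y_lj = M . W_j."""
    Nc = W.shape[0]
    Y = np.array([[np.sum(M * W[j]) for j in range(Nc)] for M in Ms])
    labels = np.asarray(labels)
    mu = np.array([Y[labels == l].mean(axis=0) for l in range(Nc)])
    sigma = np.array([Y[labels == l].std(axis=0) for l in range(Nc)])
    return mu, sigma

def classify(M, W, mu=None, sigma=None, eps=1e-12):
    """Predict the (0-based) class index of a test statistics matrix M."""
    Nc = W.shape[0]
    y = np.array([np.sum(M * W[j]) for j in range(Nc)])
    if mu is None or sigma is None or np.all(sigma < eps):
        return int(np.argmax(y))                  # MS rule g1 in (8)
    sigma = np.maximum(sigma, eps)
    var = sigma ** 2
    # A_l gathers the y-independent, class-dependent terms of (9);
    # the constant -Nc*log(sqrt(2*pi)) is dropped.
    A = np.sum(-np.log(sigma) - mu ** 2 / (2.0 * var), axis=1)
    score = A + 0.5 * np.sum((2.0 * y * mu - y ** 2) / var, axis=1)
    return int(np.argmax(score))                  # ML rule g2 in (10)
```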

C. Error Analysis

A key assumption underlying the decision rule g_2(y) is that the actual output value y_lj obeys a Gaussian distribution. Does it really obey a Gaussian distribution in practice, and how large are the deviations? The following experiments answer these questions. Fig. 14 shows the distributions of the values of y_l1, l = 1, ..., 6, together with Gaussian distributions of the same mean and variance. The distribution curves of the values of y_l1 have the following properties: 1) We obtain six distribution curves after the six class samples operate on the class feature matrix W_1 (j = 1), respectively. The mean of the first distribution curve is near 1 (i.e., l = 1 and j = 1), while the means of the other five distribution curves are around 0; this also holds for the samples applied to the other class feature matrices (j ≠ 1). 2) The curves in Fig. 14(e) and 14(f) are closer to Gaussian distributions than the other curves are. 3) Compared with the Gaussian distribution, the errors appear mainly near the peak, but the distribution curves of the values of y_l1, l = 1, ..., 6, are almost consistent with Gaussian distributions in the intervals (−∞, μ_l1 − 3σ_l1] and [μ_l1 + 3σ_l1, +∞). 4) Compared with Fig. 13(a), the cross area between the distribution curve for l = j and the other five distribution curves is less than 4.6%; thus, the classification error should be less than 4.6%. Furthermore, although the curves in Fig. 13 might seem to obey a Laplace distribution, Fig. 15 shows that the distribution of the values of y_lj is closer to a Gaussian distribution than to a Laplace distribution; the Kullback-Leibler (KL) divergence reaches 0.047 when N = 18000. This is the reason we choose the Gaussian distribution. The aforementioned assumption is thus reasonable so long as there are enough training samples.
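The Gaussian-versus-Laplace check behind Fig. 15 can be reproduced along the following lines: histogram the training outputs, fit both densities by maximum likelihood, and compare discrete KL divergences. The bin count and the discrete approximation of the KL divergence are assumptions of this sketch.

```python
import numpy as np

def kl_to_fits(samples, bins=100, eps=1e-12):
    """Discrete KL divergences from the empirical distribution of `samples`
    to maximum-likelihood Gaussian and Laplace fits."""
    samples = np.asarray(samples, dtype=float)
    hist, edges = np.histogram(samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    p = hist * width                                   # empirical bin probabilities
    mu, sigma = samples.mean(), samples.std()          # ML Gaussian parameters
    loc = np.median(samples)                           # ML location of the Laplace fit
    b = np.mean(np.abs(samples - loc))                 # ML scale of the Laplace fit
    q_gauss = width * np.exp(-0.5 * ((centers - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    q_laplace = width * np.exp(-np.abs(centers - loc) / b) / (2 * b)

    def kl(p, q):
        p, q = p + eps, q + eps
        return float(np.sum(p * np.log(p / q)))

    return kl(p, q_gauss), kl(p, q_laplace)
```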



TABLE III: PERFORMANCE OF FOUR CLASSIFICATION METHODS

Fig. 15. KL divergence between the distribution of the values of y_lj and the Gaussian (Laplace) distribution over (−∞, +∞) for different sample sizes N. M10 acts as the input vector and L = 11.

V. EXPERIMENT RESULTS

We test the proposed method on image sets drawn from two public datasets¹ that include 4000 original images (256 × 256). Each original image is transformed into six halftone images by the six ED filters of TABLE I, yielding 24000 halftone images. We randomly choose 12000 samples as labeled training samples, and the remaining samples form the test sample set A. Two further test sample sets B and C are also used: set B contains 50% of sample set A, and set C consists of the 12000 training samples. A large number of experiments are performed to evaluate the proposed method, covering the classification error rate, the selection of parameters, various image attacks and the computing time. All experiments are conducted under Windows XP with Matlab 7.0 on an Intel Core i5-2500 CPU (3.3 GHz) with 4 GB of RAM.

¹http://decsai.ugr.es/cvg/dbimagenes/ and http://msp.ee.ntust.edu.tw/

A. Performance Evaluation Method

To evaluate the performance of the proposed method, we compare it with several similar methods on the same data set. The classification error rate is used as the performance measure of a classifier. In a multi-class problem there are two quantities: the average classification error rate and the classification error rate variance. Let n_j denote the test sample size of the jth class and let m_j denote the number of correctly classified test samples of that class. The classification error rate of the jth class is e_j = (n_j − m_j)/n_j × 100%, the average classification error rate (ACER) of the Nc-class problem is $e = \frac{1}{N_c}\sum_{j=1}^{N_c} e_j$, and the classification error rate variance

(CERV) is $\sigma = \frac{1}{N_c}\sum_{j=1}^{N_c}\left(e_j - e\right)^2$. A smaller e indicates a better classifier; generally, e ≥ 0. The variance σ measures the fluctuation of the Nc classification error rates e_j, j = 1, ..., Nc: a large σ implies an uneven distribution of the Nc classification error rates and hence unstable classifier performance. In addition, different training and test data sets lead to different classification performance. Therefore, to evaluate a classification method, we run the classifier N_r times and use the total average classification error rate (TACER) as the performance metric, $\mathrm{TACER} = \frac{1}{N_r}\sum_{i=1}^{N_r} e_i$, where e_i is the ACER obtained at the ith run. We also calculate the average classification error rate variance (ACERV) over the N_r runs, $\mathrm{ACERV} = \frac{1}{N_r}\sum_{i=1}^{N_r} \sigma_i$, where σ_i is the CERV obtained at the ith run. N_r = 20 in the experiments.
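These metrics reduce to a few lines of arithmetic. The sketch below assumes that, for each run, the per-class test counts n_j and correct-classification counts m_j are available.

```python
import numpy as np

def acer_cerv(n, m):
    """Per-run metrics from per-class test counts n[j] and correct counts m[j]."""
    n, m = np.asarray(n, dtype=float), np.asarray(m, dtype=float)
    e = (n - m) / n * 100.0                # per-class error rates e_j (percent)
    acer = e.mean()                        # average classification error rate
    cerv = np.mean((e - acer) ** 2)        # classification error rate variance
    return acer, cerv

def tacer_acerv(runs):
    """runs: iterable of (n, m) pairs, one per repetition (N_r = 20 in the paper)."""
    per_run = [acer_cerv(n, m) for n, m in runs]
    acers, cervs = zip(*per_run)
    return float(np.mean(acers)), float(np.mean(cervs))   # TACER, ACERV
```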

B. Comparison With Similar Methods

We compare the proposed method (M10 (M11) + ML, L = 15) with two similar methods that have been used to address the classification of halftone patterns. The first is the method based on least mean square (LMS) filtering and a Bayes classifier (LMS + Bayes), in which the Fourier spectrums of the halftone images are used as the input of LMS [17]. The second is the method based on the enhanced correlation function (ECF, L = 15) and a BP neural network (ECF + BP) [13]. TABLE III shows both TACER and ACERV on sample set A. We make three observations from the results in TABLE III: a) ECF + BP and LMS + Bayes yield similar performance on sample set A, which indicates that there is no significant difference between correlation-based and spectrum-based feature representations for the classification of ED patterns; the inconspicuous differences among the spatial-domain or frequency-domain textures of the halftone images frequently lead to wrong judgments. b) Both M10 + ML and M11 + ML produce smaller TACER and ACERV on sample set A than the other two methods, which implies that the statistics matrices have good discriminative power for classification; our method thus improves the classification performance. c) The TACER and ACERV of M10 + ML (1.68% and 1.67%, respectively) are smaller than those of M11 + ML, which shows that the feature M10 has better discriminative power than M11 for classification.

Another comparison of several popular classification methods with the same input vector (M10) is also conducted in our experiments. These classification methods are MS, ML, back propagation (BP) neural network, radial basis function (RBF) neural network, support vector machine (SVM) and the nearest neighbour (NN) method. We implemented these methods ourselves and tuned the parameters of each method for a fair comparison. To feed BP, RBF, SVM and NN, the matrix M10 is unfolded into a vector of L × L elements. We use the BP, RBF, SVM and NN implementations provided by Matlab 7.0. The related parameter settings of BP, RBF, SVM and NN are given below: 1) The BP network is designed as a three-layer network with L × L input nodes, L × L hidden-layer nodes and six output nodes.


Fig. 16. Performance comparison between six classification methods (A). (a) The performance comparison between MS and ML under different values of L. (b) The performance comparisons between MS, ML, BP, RBF, NN and SVM when L = 9, 11 or 13, respectively.


Fig. 18. Classification performance under different image sizes (A). W1, W2, W3 and W4 indicate that the size of both the training and test images is 64 × 64, 128 × 128, 256 × 256 and 512 × 512, respectively.

Fig. 19. Classification performance under different image scaling (A, K = 32 and L = 11). The image scaling ratio is limited to 0.9 ∼ 1.1; a ratio < 1 (> 1) means that the image is reduced (enlarged). Bilinear interpolation over 2 × 2 neighborhoods is used to scale all images down or up.

Fig. 17. Classification performance on set A, B and C for different values of L (K = 32). (a)(b) TACER and ACERV curves when M10 acts as the input vector; (c)(d) TACER and ACERV curves when M11 acts as the input vector.

Other parameters include the number of training epochs (200), the expected error (0.0001), the linear function (Purelin) as the transfer function of each layer and the L-M optimization algorithm (trainlm) as the training function. 2) As with BP, the RBF network uses a three-layer structure with L × L input nodes, L × L hidden-layer nodes and six output nodes, and the spread of the RBF is 1. 3) We extend SVM to multi-class classification by the "one against one" strategy, which is more suitable for practical use than other strategies [24]; the other parameters are a customized RBF kernel function with gamma = 0.00078125 and the parameter "boxconstraint" set to 128. 4) The Euclidean distance is used to find the nearest neighbor sample in NN.

Fig. 16(a) shows that the performance of ML is better than that of MS when 9 < L < 23, and Fig. 16(b) shows that the TACER of ML is lower than that of MS, BP, RBF, NN and SVM when L = 9, 11 or 13. The main reason is that ML improves the classification accuracy by utilizing prior knowledge of the training samples.

C. Influence of Parameter

We investigate the influence of the parameter L on the proposed method. The experiments are carried out on the three sets A, B and C. Fig. 17 shows the classification performance of ML for different values of L. We observe that: 1) No matter which kind of statistics matrix is used as the input, the TACER curves on the different data sets are almost the same, with the minimum TACER attained for L = 11 ∼ 19,

but on the whole, the TACER in Fig. 17(a) is smaller than that in Fig. 17(c). 2) On the different data sets A, B and C, the TACER curves almost coincide when L > 7, whereas they differ considerably when L ≤ 7. 3) The ACERV curves in Fig. 17(b) behave similarly to the TACER curves, while the ACERV curves in Fig. 17(d) change sharply. These properties further demonstrate that, as the input of ML, M10 has better discriminating ability than M11. Hence, it is easy to select an appropriate parameter L for our method so as to obtain good performance.

D. Performance Under Various Attacks

In many practical applications, a halftone image is subjected to various operations, such as changes of image size, image scaling, rotation and noise, before the inverse transformation. We therefore examine the classification performance of our method under these attacks. Fig. 18 shows that both TACER and ACERV of M10 + ML decrease as the image size increases, implying better classification performance for larger images. Fig. 19 compares three methods (ECF + BP, LMS + Bayes, M10 + ML) under different image scaling ratios; our method resists image scaling better than the other two methods for scaling ratios within the range 0.90 ∼ 1.1. Fig. 20 demonstrates that the TACER increases with the rotation angle α. The TACER of the proposed method is lower than that of the other two methods for rotation angles α within 0 ∼ 10 degrees. Our method reaches TACER = 66.4% when α = 20°, at which point the classification performance is very poor; thus our method is resistant to rotation when α < 10°, where the TACER stays below 8%. The experimental results of the three methods under Gaussian noise attacks are shown in Fig. 21; they indicate that a larger Gaussian noise variance v leads to a larger TACER. The result of our method is better than that of the other two methods when 0 < v ≤ 1.



TABLE IV: COMPUTATIONAL TIME (SECONDS) OF SM AND ECF (A, L = 11)

Fig. 20. Classification performance under image rotation attacks (A, K = 32 and L = 11). (a) Sketch map of image rotation attack; (b) The TACER under α ∈ (0, 10) degree.

Fig. 23. Computational time of SM under different values of L.

TABLE V: COMPUTATIONAL TIME (SECONDS) OF FIVE METHODS (A)

Fig. 21. The TACER under the Gaussian noise attacks with mean 0 and variance v (A, K = 32 and L = 11).

Fig. 22. (a) Classification performance of our method under different data bit error rates. (b) Classification performance of our method under various scanning resolutions at a fixed printing resolution of 150 dpi.

In particular, when v = 1 we get TACER = 35.6%, indicating very poor classification performance. Our method can resist Gaussian noise with zero mean and variance v below 0.1, for which the TACER remains below 4%. A halftone image may also suffer print-and-scan distortion or transmission errors. Fig. 22(a) shows the TACER under different data bit error rates (BER); the TACER is largest when BER = 50%. As in [17], our study considers only one printing resolution (150 dpi) and four scanning resolutions (150, 300, 600 and 1200 dpi) for testing. The classification results under these print-and-scan attacks are shown in Fig. 22(b); they indicate that a higher scanning resolution yields a worse classification error rate.

E. Computational Time Complexity

We now briefly analyze the computational time complexity of the statistics matrix method of Fig. 9 and of Algorithm 2. The time complexity of both the statistics matrix (SM) method in this paper and the enhanced correlation function (ECF) [13] is O(WHL²), but ECF requires WHL² multiplication operations, whereas our method requires WHL² comparison operations and WHL² addition operations. The computational time of our method is therefore less than that of ECF, as shown in TABLE IV.

Fig. 23 indicates that the computational time of SM increases with L. Algorithm 2 involves n_in inner loops, n_out outer loops and N addition operations of (4) in each iteration, so the computational time complexity of Algorithm 2 is O(n_in n_out N L²). TABLE V lists the computational time of five methods (ML, RBF, BP, SVM and NN) in the training and testing phases; here sample set A is used. From TABLE V we can see that the training time of ML is generally larger than that of RBF and SVM but less than that of BP. In practical applications, however, training is an offline process.

F. Discussion and Limitation

Our SM descriptor is somewhat similar to the grey level co-occurrence matrix (GLCM): both construct matrices by counting the occurrences of pixel pairs at a given displacement d in a direction θ. There are, however, some differences. 1) GLCM applied to a multilevel gray image produces a great variety of pixel pairs; for example, there are 65536 (256²) kinds of pixel pairs for a 256-level gray image, so it is necessary to compute texture features as the description of the image texture. Fourteen texture feature measures were given in [25], and five common texture features were later suggested in [26]. For discrete images, one chooses the four angles (directions) θ = 0°, 45°, 90° and 135°, and typically d = 1, 2, 3, 4 and 8, where d = 1 means neighboring pixels, d = 2 means pixels separated by one pixel, and so forth. With θ = 0°, 45°, 90° and 135°, d = 1, ..., R and five GLCM features, the dimension of the feature vector is 20R. 2) The value ranges of θ and d in SM are larger than those in GLCM.


TABLE VI: EXAMPLES OF θ AND d IN SM (R = 2)

Fig. 24. Experimental comparison between EASM and GLCM (A). GLCM2, GLCM4, GLCM8 and GLCM16 mean that GLCM is applied to the image quantized to 2, 4, 8 and 16 gray levels, respectively.

TABLE VI gives an example of θ and d in SM when R = 2. Furthermore, the statistics matrices are fed into a classifier directly as training or test samples, instead of computing various measures; the dimension of the feature vector is (2R + 1)(2R + 1), which for R ≥ 4 is larger than that of GLCM. Because a halftone image has only two gray levels, only three kinds of pixel pairs exist and they can be described by three matrices; we have shown, however, that two matrices suffice to describe the texture of a halftone image, and our experimental results show that the discrimination power of M10 is better than that of M11. Of course, we can merge M10 and M11 into a new feature as the input of the classifier, e.g. M = a M10 + b M11 with a > 0, b > 0 and a + b = 1, but our experimental results show that this feature is worse than M10. We can also apply binning of the statistics matrices as the feature descriptor, but this leads to a TACER exceeding 31% when L = 11 and the bin number is 50. Fig. 24 shows that our method has a lower TACER (ACERV) than GLCM under 2, 4, 8 and 16 gray levels, respectively. To make a fair comparison, in GLCM we similarly divide the halftone image into many patches of the same size K × K, let θ = 0°, 45°, 90° and 135° and d = 1, ..., R with 2R < K, and use five classic texture features (angular second moment, contrast, correlation, inverse difference moment and entropy) to capture the halftone image textures. The mean GLCM feature over all patches is used as the input of the same classifier. In our experiments, the halftone images are transformed into multilevel gray images (2, 4, 8 and 16 levels, respectively) according to the neighborhood pixels.
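For reference, the GLCM baseline of Fig. 24 can be reproduced generically as follows: build a co-occurrence matrix for a given pixel offset and derive the five Haralick features named above. This is a generic sketch rather than the authors' implementation; the input image is assumed to be already quantized to `levels` integer gray levels, and the symmetrization and normalization choices are assumptions.

```python
import numpy as np

def glcm(img, dr, dc, levels):
    """Gray level co-occurrence matrix for the pixel offset (dr, dc).
    For distance d, the usual angles map to offsets
    (0, d) = 0 deg, (-d, d) = 45 deg, (-d, 0) = 90 deg, (-d, -d) = 135 deg."""
    P = np.zeros((levels, levels))
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                P[int(img[r, c]), int(img[r2, c2])] += 1
    P = P + P.T                                  # symmetric counts (an assumption)
    return P / P.sum()

def haralick_features(P, eps=1e-12):
    """Angular second moment, contrast, correlation, inverse difference moment, entropy."""
    levels = P.shape[0]
    i, j = np.meshgrid(np.arange(levels), np.arange(levels), indexing="ij")
    asm = np.sum(P ** 2)
    contrast = np.sum((i - j) ** 2 * P)
    mu_i, mu_j = np.sum(i * P), np.sum(j * P)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * P))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * P))
    correlation = (np.sum(i * j * P) - mu_i * mu_j) / (sd_i * sd_j + eps)
    idm = np.sum(P / (1.0 + (i - j) ** 2))
    entropy = -np.sum(P * np.log(P + eps))
    return np.array([asm, contrast, correlation, idm, entropy])
```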


SM also differs from the gray level run-length matrix (GLRLM) presented by Galloway [27]. SM counts the total number of co-occurring pixel pairs according to a fixed matrix pattern; that is, the fixed matrix encodes both the distance d and the direction θ between the two co-occurring pixels, and the feature dimension is L × L. GLRLM, in contrast, is designed to compute the number of gray level runs of various lengths: the element M(g, l|θ) of GLRLM gives the total number of runs of length l at gray level g in a given direction θ. As with GLCM, the direction θ is limited to the four angles 0°, 45°, 90° and 135°, and five features (short run emphasis, long run emphasis, gray level nonuniformity, run length nonuniformity and run percentage) are computed, so the feature dimension is a constant 20. The lowest TACER we obtain in our experiments exceeds 40% when GLRLM is applied to halftone image classification with the ML or RBF method. As described in [28], both GLCM and GLRLM are powerful tools for texture analysis, but unfortunately they are not suited to halftone image classification.

Although the proposed method is based on ED kernels, it can also be extended to the classification of some other halftoning patterns, such as clustered dither with an 8 × 8 matrix [29], dispersed dither with Bayer's 8 × 8 matrix [29], dot diffusion with Knuth's 8 × 8 matrix [30], dot diffusion with Mese-Vaidyanathan's 8 × 8 and 16 × 16 class matrices [31], and DBS [4]. In our experiments, the TACER for these six kinds of halftone images is no more than 3% (L = 11), which is close to the classification performance on ED images. The proposed method is suitable for the classification of halftone images, but not for multilevel gray images, even when binarization is applied to those images: when our method is used to classify multilevel image textures, TACER and ACERV reach 37.7% and 43.6%, respectively (Kylberg texture dataset) in our experiments.

VI. CONCLUSIONS

Classification of halftone images is crucial for solving the optimal reconstruction problem of halftone images. This paper focused on spatial-domain feature extraction and the classification of ED patterns. We proposed a feature descriptor based on statistics matrices and then sought the optimal class feature matrices by minimizing an objective function. To improve the classification performance, we converted the prior probability into a posterior probability and employed a maximum likelihood method to divide the test samples into six categories. Many experiments were conducted to validate our method, including performance comparisons with similar methods, robustness tests on three common attack schemes, performance tests with different parameter values and a comparison with GLCM. All these experimental results indicate that the proposed method has a lower classification error rate than similar methods when L is limited to 11 ∼ 19, and that it is robust, to some extent, against image rotation, Gaussian white noise attacks, image scaling and transmission bit errors.



Our method can also be extended to the classification of other halftone patterns. In the future, we plan to pursue two research directions: first, a theoretical error analysis of ML; second, effective learning algorithms and new applications of inverse halftoning.

REFERENCES

[1] J.-M. Guo, Y.-F. Liu, J.-Y. Chang, and J.-D. Lee, "Efficient halftoning based on multiple look-up tables," IEEE Trans. Image Process., vol. 22, no. 11, pp. 4522–4531, Nov. 2013.
[2] B. E. Bayer, "An optimum method for two-level rendition of continuous-tone pictures," in Proc. IEEE Int. Conf. Commun., vol. 26, Jun. 1973, pp. 11–15.
[3] Y. F. Liu and J. M. Guo, "New class tiling design for dot-diffused halftoning," IEEE Trans. Image Process., vol. 22, no. 3, pp. 1199–1208, Mar. 2013.
[4] F. A. Baqai and J. P. Allebach, "Halftoning via direct binary search using analytical and stochastic printer models," IEEE Trans. Image Process., vol. 12, no. 1, pp. 1–15, Jan. 2003.
[5] M. Mese and P. P. Vaidyanathan, "Recent advances in digital halftoning and inverse halftoning methods," IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 49, no. 6, pp. 790–805, Jun. 2002.
[6] Y.-T. Kim, G. R. Arce, and N. Grabowski, "Inverse halftoning using binary permutation filters," IEEE Trans. Image Process., vol. 4, no. 9, pp. 1296–1311, Sep. 1995.
[7] Y. Saika, K. Okamoto, and F. Matsubara, "Probabilistic modeling to inverse halftoning based on super resolution," in Proc. Int. Conf. Control, Autom. Syst., Gyeonggi-do, Korea, Oct. 2010, pp. 162–167.
[8] R. Stevenson, "Inverse halftoning via MAP estimation," IEEE Trans. Image Process., vol. 6, no. 4, pp. 574–583, Apr. 1997.
[9] G. B. Unal and A. E. Cetin, "Restoration of error-diffused images using projection onto convex sets," IEEE Trans. Image Process., vol. 10, no. 12, pp. 1836–1841, Dec. 2001.
[10] J. M. Guo and Y. F. Liu, "Improved dot diffusion by diffused matrix and class matrix co-optimization," IEEE Trans. Image Process., vol. 18, no. 8, pp. 1804–1816, Aug. 2009.
[11] P. G. Freitas, M. C. Q. Farias, and A. P. F. de Araujo, "Fast inverse halftoning algorithm for ordered dithered images," in Proc. 24th SIBGRAPI Conf. Graph., Patterns Images, Aug. 2011, pp. 250–257.
[12] T. D. Kite, N. Damera-Venkata, B. L. Evans, and A. C. Bovik, "A fast, high-quality inverse halftoning algorithm for error diffused halftones," IEEE Trans. Image Process., vol. 9, no. 9, pp. 1583–1592, Sep. 2000.
[13] P. C. Chang and C. S. Yu, "Neural net classification and LMS reconstruction to halftone images," Proc. SPIE, vol. 3309, no. 2, pp. 592–602, Jan. 1997.
[14] Y. Kong, P. Zeng, and Y. Zhang, "Classification and recognition algorithm for the halftone image," (in Chinese), J. Xidian Univ., vol. 38, no. 5, pp. 62–69, 2011.
[15] Y. Kong, "A study of inverse halftoning and quality assessment schemes," Ph.D. dissertation, School Comput. Sci. Technol., Xidian Univ., Xi'an, China, 2008.
[16] Y.-F. Liu, J.-M. Guo, and J.-D. Lee, "Inverse halftoning based on the Bayesian theorem," IEEE Trans. Image Process., vol. 20, no. 4, pp. 1077–1084, Apr. 2011.
[17] Y.-F. Liu, J.-M. Guo, and J.-D. Lee, "Halftone image classification using LMS algorithm and naive Bayes," IEEE Trans. Image Process., vol. 20, no. 10, pp. 2837–2847, Oct. 2011.
[18] Y.-H. Fung and Y.-H. Chan, "Embedding halftones of different resolutions in a full-scale halftone," IEEE Signal Process. Lett., vol. 13, no. 3, pp. 153–156, Mar. 2006.
[19] R. W. Floyd and L. Steinberg, "An adaptive algorithm for spatial grayscale," Proc. Soc. Image Display, vol. 17, no. 2, pp. 75–77, 1976.
[20] D. L. Lau and G. R. Arce, Modern Digital Halftoning, 2nd ed. Boca Raton, FL, USA: CRC Press, 2008, pp. 83–89.
[21] Image Dithering: Eleven Algorithms and Source Code. [Online]. Available: http://www.tannerhelland.com/4660/dithering-eleven-algorithms-source-code/, accessed Dec. 28, 2012.
[22] R. A. Ulichney, "Dithering with blue noise," Proc. IEEE, vol. 76, no. 1, pp. 56–79, Jan. 1988.
[23] W. Sun and Y.-X. Yuan, Optimization Theory and Methods: Nonlinear Programming. New York, NY, USA: Springer-Verlag, 2006.

[24] C. W. Hsu and C. J. Lin, "A comparison of methods for multiclass support vector machines," IEEE Trans. Neural Netw., vol. 13, no. 2, pp. 415–425, Mar. 2002.
[25] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Trans. Syst., Man, Cybern., vol. 3, no. 6, pp. 610–621, Nov. 1973.
[26] S. Aksoy and R. M. Haralick, "Feature normalization and likelihood-based similarity measures for image retrieval," Pattern Recognit. Lett., vol. 22, no. 5, pp. 563–582, 2001.
[27] M. M. Galloway, "Texture analysis using gray level run lengths," Comput. Graph. Image Process., vol. 45, no. 4, pp. 172–179, 1975.
[28] R. W. Conners and C. A. Harlow, "A theoretical comparison of texture algorithms," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-2, no. 3, pp. 204–222, May 1980.
[29] R. Ulichney, "Digital halftoning and the physical reconstruction function," Ph.D. dissertation, Dept. Elect. Eng. Comput. Sci., MIT, Cambridge, MA, USA, 1986.
[30] D. E. Knuth, "Digital halftones by dot diffusion," ACM Trans. Graph., vol. 6, no. 4, pp. 245–273, 1987.
[31] M. Mese and P. P. Vaidyanathan, "Optimized halftoning using dot diffusion and methods for inverse halftoning," IEEE Trans. Image Process., vol. 9, no. 4, pp. 691–709, Apr. 2000.

Zhi-Qiang Wen is currently an Associate Professor with the School of Computer and Communication, Hunan University of Technology, Zhuzhou, China. He received the B.E. degree from the Central South University of Technology, Changsha, China, in 1997, the M.S. degree in computer application technology from Guangxi University, Nanning, China, in 2004, and the Ph.D. degree in computer application technology from Central South University, Changsha, in 2008. He is a member of the China Computer Federation. His research interests include computer vision, digital image processing, and robotics.

Yong-Xiang Hu is currently an Associate Professor with the School of Computer and Communication, Hunan University of Technology, Zhuzhou, China. He received the B.E. degree in computer application technology from Nanhua University, Hengyang, China, in 1997, the M.S. degree in computer application technology from Huanan Normal University, Guangzhou, China, in 2005, and the Ph.D. degree in biomedical engineering from Central South University, Changsha, China, in 2012. His research interests include digital image processing and pattern recognition.

Wen-Qiu Zhu is currently a Professor with the School of Computer and Communication, Hunan University of Technology, Zhuzhou, China. He received the B.E. degree in information system engineering from the National University of Defense Technology, Changsha, China, in 1991, and the M.S. degree in software engineering from Central South University, Changsha, in 2006. His research interests include digital image processing and pattern recognition.
