Analytica Chimica Acta 816 (2014) 8–17


A new kernel discriminant analysis framework for electronic nose recognition

Lei Zhang a,b,∗, Feng-Chun Tian a

a College of Communication Engineering, Chongqing University, 174 ShaZheng Street, ShaPingBa District, Chongqing 400044, China
b Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong

Highlights

• This paper proposes a new discriminant analysis framework for feature extraction and recognition.
• The principle of the proposed NDA is derived mathematically.
• The NDA framework is coupled with kernel PCA for classification.
• The proposed KNDA is compared with state-of-the-art e-Nose recognition methods.
• The proposed KNDA shows the best performance in e-Nose experiments.

Article history: Received 25 November 2013; Accepted 23 January 2014; Available online 3 February 2014

Keywords: Electronic nose; Discriminant analysis; Dimension reduction; Feature extraction; Multi-class recognition

Graphical abstract: flow diagram of the proposed KNDA framework, from e-Nose training and testing data through kernel matrix computation, regularization, kernel PCA projection, and NDA training/testing (projection basis W) to the recognition results.

Abstract

Electronic nose (e-Nose) technology based on a metal oxide semiconductor gas sensor array is widely studied for the detection of gas components. This paper proposes a new discriminant analysis framework (NDA) for dimension reduction and e-Nose recognition. In NDA, the between-class and within-class Laplacian scatter matrices are built sample-to-sample to characterize the between-class separability and the within-class compactness, and a discriminant matrix is sought that simultaneously maximizes the between-class Laplacian scatter and minimizes the within-class Laplacian scatter. Exploiting the linear separability obtained in a high-dimensional kernel mapping space and the dimension reduction of principal component analysis (PCA), an effective kernel PCA plus NDA method (KNDA) is proposed for rapid detection of gas mixture components by an e-Nose. This paper derives the NDA framework and gives the specific implementation of the proposed KNDA method in the training and recognition processes. The KNDA is examined on e-Nose datasets of six kinds of gas components and compared with state-of-the-art e-Nose classification methods. Experimental results demonstrate that the proposed KNDA method shows the best performance, with an average recognition rate of 94.14% and a total recognition rate of 95.06%, which makes it a promising feature extraction and multi-class recognition approach for the e-Nose.

© 2014 Elsevier B.V. All rights reserved.

∗ Corresponding author at: Chongqing University, College of Computer Science, No. 174 Shazhengjie, ShaPingBa District, Chongqing, China. Tel.: +86 13629788369; fax: +86 23 65103544. E-mail address: [email protected] (L. Zhang).
http://dx.doi.org/10.1016/j.aca.2014.01.049

1. Introduction

Electronic nose (e-Nose) is an instrument comprising a chemical sensor array with partial specificity and an appropriate pattern recognition algorithm for the detection of chemical analytes [1,2]. In recent years, a number of e-Nose studies have been presented for classification applications in many fields, e.g. evaluation of tea and food quality [3–8], disease diagnosis [9–11] and environmental monitoring [12–15]. In an e-Nose, the pattern recognition module, together with an appropriate sensor array, plays an important role in practice, because it decides the accuracy and robustness of e-Nose detection. Classification methodologies have been widely studied in e-Nose applications.


First, artificial neural networks (ANNs) were widely used for qualitative and quantitative analysis in the initial stage of e-Nose development, such as the multilayer perceptron (MLP) neural network with the backpropagation (BP) algorithm [4,8,15,16], RBF neural networks [4,6,17], and ARTMAP neural networks [18,19]. The decision tree method has also been proposed for classification [20]. Since the introduction of the support vector machine (SVM), with its complete theoretical foundation, SVM has been widely studied in e-Nose [9,12,14,21,22] and has been shown to be superior to ANN in accuracy and robustness, because SVM is built on structural risk minimization rather than only the empirical risk minimization considered in ANN. The nonlinear mapping with different kernel functions (e.g. polynomial, Gaussian) in SVM can make a classification problem that is linearly inseparable in the original data space linearly separable in a high-dimensional feature space, and realize the classification through a hyperplane. Besides, SVM solves a convex quadratic programming problem and can guarantee a global optimum, which cannot be achieved by ANN on a small sample of events. Although SVM seems to be the best selection for binary classification, the algorithmic complexity of SVM for a general multi-class problem may make an actual on-line application difficult in the development of our e-Nose.

Besides the classification methodologies, data preprocessing methods such as feature extraction and dimension reduction [23–26], including principal component analysis (PCA), independent component analysis (ICA), kernel PCA (KPCA), linear discriminant analysis (LDA), and singular value decomposition (SVD), have also been combined with ANNs or SVMs to improve the prediction accuracy of e-Nose. Both feature extraction and dimension reduction aim to obtain useful features for classification. Dimension reduction can remove redundant information, like denoising, but may lose some useful information in the original data; in addition, a classification method can itself suppress useless components during sample learning. PCA, KPCA, LDA, and the combination of KPCA and LDA are widely applied as feature extraction and dimension reduction in many fields, such as time series forecasting, novelty detection, scene recognition, and face recognition. Cao et al. compared PCA, KPCA and ICA combined with SVM for time series forecasting, and found that KPCA performs best for feature extraction [27]. Xiao et al. proposed an L1-norm based KPCA algorithm for novelty detection and obtained satisfactory results on a simulated data set [28]. Hotta proposed a local feature acquisition method using KPCA for scene classification, whose performance is superior to conventional methods based on local correlation features [29]. Lu et al. [30] and Yang et al. [31] proposed kernel direct discriminant analysis and KPCA plus LDA algorithms for face recognition, and gave a complete kernel Fisher discriminant framework for feature extraction and recognition. Dixon et al. [32] presented a PLS-DA method used in gas chromatography mass spectrometry, and a kernel PLS algorithm was also discussed in [33].

Inspired by these works, the 2-norm between each pair of sample vectors x_i and x_j, between classes and within each class, is considered with a similarity matrix calculated by the Gaussian function exp(−‖x_i − x_j‖²/t). Through the construction of the between-class Laplacian scatter matrix and the within-class Laplacian scatter matrix, the new discriminant analysis (NDA) framework is realized by solving an optimization problem that makes the samples between classes more separable and the samples within each class more compact. Considering that patterns tend to become linearly separable in a high-dimensional kernel space, the Gaussian kernel function is introduced to map the original data space into a high-dimensional space.


To make the within-class Laplacian scatter matrix nonsingular in the eigenvalue problem, in which an inverse of the within-class Laplacian scatter matrix is required, PCA is used to reduce the dimension of the kernel space. The contribution of this paper is the proposed new discriminant analysis framework based on KPCA (KNDA) and its application in an electronic nose for rapid detection of multiple kinds of pollutant gas components.

It is worthwhile to highlight several aspects of the proposed KNDA framework. First, in the NDA framework, every sample vector, between classes and within each class, is used to build the Laplacian scatter matrices, while in LDA only the centroids of the between-class and within-class sets are used to calculate the scatter matrices, so the information of each sample cannot be well represented. Second, a similarity matrix obtained from a Gaussian function is used to measure the importance of each pair of samples x_i and x_j with respect to their distance ‖x_i − x_j‖_2, which is not considered in LDA. Third, the projection vector can be obtained by maximizing the between-class Laplacian scatter and minimizing the within-class Laplacian scatter. Fourth, the NDA is a supervised discriminant analysis framework, and KNDA is the combined learning framework of unsupervised KPCA and supervised NDA for feature extraction and recognition. Fifth, the recognition in this paper is an intuitive Euclidean distance based method, which promises the stability and reliability of the results.

The organization of this paper is as follows. In Section 2, we briefly review related work, namely PCA, KPCA and LDA. In Section 3, we describe the proposed approach, including the proposed new discriminant analysis framework, the proposed kernel discriminant analysis learning framework and the multi-class recognition. In Section 4, the electronic nose experiments and the experimental data are presented. Section 5 presents the experimental results and discussion. Conclusions are drawn in Section 6.

2. Related work

2.1. PCA

PCA [34] is an unsupervised dimension reduction method that projects correlated variables into another orthogonal feature space, yielding a group of new variables with the largest variance (global variance maximization). The PC coefficients can be obtained by calculating the eigenvectors of the covariance matrix of the original data set.
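As a quick illustration of this eigendecomposition view of PCA, a minimal sketch is given below. It is illustrative NumPy code, not taken from the paper; the function and variable names are ours.

import numpy as np

def pca_coefficients(X, n_components):
    # X: (n_samples, n_features) data matrix; rows are observations.
    Xc = X - X.mean(axis=0)                   # center each variable
    C = np.cov(Xc, rowvar=False)              # covariance matrix of the features
    eigvals, eigvecs = np.linalg.eigh(C)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # reorder by decreasing variance
    W = eigvecs[:, order[:n_components]]      # PC coefficients (loadings)
    scores = Xc @ W                           # PC scores (projected data)
    return W, scores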

2.2. KPCA

KPCA performs the PCA process in a kernel space, which adds the advantage of a high-dimensional mapping of the original data through the kernel trick: the original input vectors are mapped into a high-dimensional feature space F. The mapping from the original data space to the high-dimensional feature space can be represented by calculating the symmetrical kernel matrix of the input training pattern vectors using a Gaussian kernel function, shown as

K(x, x_i) = \exp\left( -\frac{\| x - x_i \|^2}{\sigma^2} \right)    (1)

where x and x_i denote observation vectors and σ² denotes the width of the Gaussian, commonly called the kernel parameter.
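A minimal sketch of how such a kernel matrix can be computed for a set of training vectors is given below; it is illustrative NumPy code, not from the paper, and the names are ours.

import numpy as np

def gaussian_kernel_matrix(X, Y, sigma2):
    # K[i, j] = exp(-||X[i] - Y[j]||^2 / sigma2), as in Eq. (1).
    # X: (m, d) vectors, Y: (n, d) vectors, sigma2: kernel width.
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(d2, 0.0) / sigma2)

# Symmetric training kernel matrix: K = gaussian_kernel_matrix(X_train, X_train, sigma2)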


In general, KPCA performs the PCA algorithm in the high-dimensional feature space and extracts nonlinear features. The size of the dimension depends on the number of training vectors. The PCA process on the kernel matrix K is to perform the following eigenvalue operation

K α = λ α    (2)

where α denotes the set of eigenvectors corresponding to d eigenvalues and λ denotes the diagonal matrix of eigenvalues (λ_ii is the ith eigenvalue, i = 1, ..., d). The set of eigenvectors {α | α_i, i = 1, ..., r} corresponding to the first r largest eigenvalues, ordered such that λ_1 > λ_2 > ··· > λ_r, gives the kernel principal component coefficients (projection vectors). Therefore, the kernel PC scores can be obtained by multiplying the kernel matrix K by the PC coefficients.

2.3. LDA

LDA aims to maximize the ratio of the between-class variance to the within-class variance in any particular data set through a transformation vector w, and therefore promises maximum separability. Finally, a linear decision boundary between two classes is produced for classification. For a binary classification (two classes), assume the data sets of the two classes to be X_1 and X_2, respectively. We write X_1 = {x_1^1, x_1^2, ..., x_1^{N_1}} and X_2 = {x_2^1, x_2^2, ..., x_2^{N_2}}, where N_1 and N_2 denote the numbers of column vectors of X_1 and X_2, and x_i^j, i = 1, 2; j = 1, ..., N_i denotes a column vector (observation sample). Then we set the total data set Z in R^d as Z = {X_1, X_2}. The within-class scatter matrix S_w and the between-class scatter matrix S_b can be represented as

S_w = \sum_{i=1}^{2} \sum_{j=1}^{N_i} (x_i^j − μ_i)(x_i^j − μ_i)^T    (3)

S_b = \sum_{i=1}^{2} N_i (μ_i − \bar{Z})(μ_i − \bar{Z})^T    (4)

where μ_i denotes the centroid of the ith class, \bar{Z} denotes the centroid of the total data set Z, and the superscript T denotes transpose. If S_w is nonsingular, the optimal projection matrix w can be obtained by solving the following maximization problem

w = \arg\max_{w} \frac{w^T S_b w}{w^T S_w w}    (5)

The set {w | w_i, i = 1, ..., m} of eigenvectors of S_w^{−1} S_b corresponding to the m largest eigenvalues is the solution of Eq. (5).
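For concreteness, a minimal two-class LDA sketch following Eqs. (3)–(5) might look as follows; this is illustrative NumPy code under our own naming, not the authors' implementation, with rows of X1 and X2 as observation samples.

import numpy as np

def lda_direction(X1, X2):
    # Two-class LDA: find w maximizing (w^T Sb w) / (w^T Sw w), Eqs. (3)-(5).
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    mu = np.vstack([X1, X2]).mean(axis=0)            # centroid of the total set Z
    Sw = ((X1 - mu1).T @ (X1 - mu1)
          + (X2 - mu2).T @ (X2 - mu2))               # Eq. (3)
    Sb = (len(X1) * np.outer(mu1 - mu, mu1 - mu)
          + len(X2) * np.outer(mu2 - mu, mu2 - mu))  # Eq. (4)
    evals, evecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)   # Eq. (5)
    return np.real(evecs[:, np.argmax(np.real(evals))])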

3. The proposed approach

3.1. The proposed NDA framework

Let x_1 = {x_1^1, x_1^2, ..., x_1^{N_1}} be the training set of class 1 with N_1 samples, and x_2 = {x_2^1, x_2^2, ..., x_2^{N_2}} be the training set of class 2 with N_2 samples. The proposed NDA framework aims to find the best projection basis W from the training sets of class 1 and class 2 to transform the training set into a low-dimensional feature space. To our knowledge, the two classes become more separable when the between-class scatter is larger and the within-class scatter becomes more compact. The proposed NDA is a supervised dimension reduction algorithm operating from sample to sample; thus the similarity matrices A and B are introduced to construct the between-class Laplacian scatter and the within-class Laplacian scatter. The similarity matrices A and B can be calculated as follows

A^{ij} = \exp\left( -\frac{\| x_1^i − x_2^j \|^2}{t} \right),   i = 1, ..., N_1;  j = 1, ..., N_2    (6)

B_k^{ij} = \exp\left( -\frac{\| x_k^i − x_k^j \|^2}{t} \right),   k = 1, ..., c;  i = 1, ..., N_k;  j = 1, ..., N_k    (7)

where t represents the width of the Gaussian, which is an empirical parameter; in this paper, t = 100. Therefore, from the viewpoint of classification, we aim to maximize the ratio of the between-class Laplacian scatter J_1(W) to the within-class Laplacian scatter J_2(W). The specific derivation of the proposed NDA framework is as follows. The between-class Laplacian scatter matrix can be represented as

J_1(W) = \frac{1}{N_1 × N_2} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \| W^T x_1^i − W^T x_2^j \|^2 A^{ij}
       = \frac{1}{N_1 × N_2} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} tr\left( W^T (x_1^i − x_2^j)(x_1^i − x_2^j)^T A^{ij} W \right)
       = tr\left( W^T \left[ \frac{1}{N_1 × N_2} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} (x_1^i − x_2^j)(x_1^i − x_2^j)^T A^{ij} \right] W \right)    (8)

Now, we let

H_1 = \frac{1}{N_1 × N_2} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} (x_1^i − x_2^j)(x_1^i − x_2^j)^T A^{ij}    (9)

Then, we get J_1(W) = tr(W^T H_1 W). Similarly, the within-class Laplacian scatter matrix can be represented as

J_2(W) = \sum_{k=1}^{c} \frac{1}{N_k^2} \sum_{i=1}^{N_k} \sum_{j=1}^{N_k} \| W^T x_k^i − W^T x_k^j \|^2 B_k^{ij}
       = tr\left( W^T \left[ \sum_{k=1}^{c} \frac{1}{N_k^2} \sum_{i=1}^{N_k} \sum_{j=1}^{N_k} (x_k^i − x_k^j)(x_k^i − x_k^j)^T B_k^{ij} \right] W \right)    (10)

We let

H_2 = \sum_{k=1}^{c} \frac{1}{N_k^2} \sum_{i=1}^{N_k} \sum_{j=1}^{N_k} (x_k^i − x_k^j)(x_k^i − x_k^j)^T B_k^{ij}    (11)

Then, we have J_2(W) = tr(W^T H_2 W).

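As an illustration of Eqs. (6)–(11), the two scatter matrices H1 and H2 for a pair of classes could be built as in the following sketch. This is illustrative NumPy code; the vectorized formulation and the names are ours, not the authors'.

import numpy as np

def nda_scatter_matrices(X1, X2, t=100.0):
    # X1: (N1, d), X2: (N2, d); rows are training samples of class 1 and class 2.
    def sqdist(A, B):
        return (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T)

    # Between-class similarity A (Eq. (6)) and scatter H1 (Eq. (9))
    A = np.exp(-sqdist(X1, X2) / t)
    D = X1[:, None, :] - X2[None, :, :]                       # pairwise differences
    H1 = np.einsum('ijd,ije,ij->de', D, D, A) / (len(X1) * len(X2))

    # Within-class similarities B_k (Eq. (7)) and scatter H2 (Eq. (11)), c = 2
    H2 = np.zeros((X1.shape[1], X1.shape[1]))
    for Xk in (X1, X2):
        Bk = np.exp(-sqdist(Xk, Xk) / t)
        Dk = Xk[:, None, :] - Xk[None, :, :]
        H2 += np.einsum('ijd,ije,ij->de', Dk, Dk, Bk) / (len(Xk) ** 2)
    return H1, H2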

In this paper, the algorithm aims to solve a two-class problem, that is, c = 2. Therefore, we can rewrite H_2 as

H_2 = \frac{1}{N_1^2} \sum_{i=1}^{N_1} \sum_{j=1}^{N_1} (x_1^i − x_1^j)(x_1^i − x_1^j)^T B_1^{ij} + \frac{1}{N_2^2} \sum_{i=1}^{N_2} \sum_{j=1}^{N_2} (x_2^i − x_2^j)(x_2^i − x_2^j)^T B_2^{ij}    (12)

From the angle of classification, to make class 1 and class 2 more separable, we formulate the discriminant analysis model as the following optimization problem

\max J(W) = \max \frac{J_1(W)}{J_2(W)} = \max \frac{W^T H_1 W}{W^T H_2 W}    (13)

H_1 and H_2 have been derived above; thus, we can find the projection basis W by solving the following eigenvalue problem

H_1 φ = λ H_2 φ    (14)

Then, the optimization problem (13) can be transformed into the following maximization problem

\max_{φ_i} \frac{φ_i^T H_1 φ_i}{φ_i^T H_2 φ_i}    (15)

According to H_1 φ = λ H_2 φ, we have

H_1 φ_1 = λ_1 H_2 φ_1,  H_1 φ_2 = λ_2 H_2 φ_2,  ...,  H_1 φ_i = λ_i H_2 φ_i    (16)

Then, the maximization problem (15) can be solved as

\max \frac{φ_i^T H_1 φ_i}{φ_i^T H_2 φ_i} = \max \frac{λ_i φ_i^T H_2 φ_i}{φ_i^T H_2 φ_i} = \max λ_i    (17)

Let φ_1 be the eigenvector corresponding to the largest eigenvalue λ_1 (λ_1 > λ_2 > ··· > λ_d); then the optimal projection basis W between class 1 and class 2 can be represented as W = φ_1.

3.2. The KPCA plus NDA algorithm (KNDA)

In this paper, the KPCA method is combined with the proposed NDA framework for feature extraction and recognition in the e-Nose application. It is worthwhile to highlight the two reasons for using KPCA in this work. First, the kernel function mapping is introduced on the consideration that, in a high-dimensional kernel space, the patterns become more linearly separable than in the original data space. Second, PCA is used for dimension reduction of the kernel data space, making the number of variables smaller than the number of training samples, so that the within-class Laplacian scatter matrix H_2 in Eq. (12) is guaranteed to be nonsingular in the NDA framework. The pseudocode of the KNDA training algorithm is described in Table 1.

The proposed NDA framework considers the two-class condition. For a multi-class (k classes, k > 2) problem, the NDA can also be used by decomposing the multi-class problem into multiple two-class (binary) problems. Generally, the "one-against-all (OAA)" and "one-against-one (OAO)" strategies are often used in classification [35]. The study in [36] demonstrates that the OAO strategy is the better choice in the case of k ≤ 10, and this paper studies the discrimination of k = 6 kinds of pollutant gases. Therefore, k × (k − 1)/2 = 15 NDA models are designed in this paper.

3.3. Multi-class recognition

Given a test sample vector z, we first transform the sample vector z through the optimal projection vector W_i obtained using the proposed NDA method as follows

z' = W_i^T z    (18)

The recognition of z in a two-class (class 1 and class 2) problem can be performed in terms of the Euclidean distance (2-norm), calculated as follows

if ‖z' − μ_1^i‖_2 < ‖z' − μ_2^i‖_2,  z ∈ {class 1};  else,  z ∈ {class 2}    (19)

where ‖·‖_2 denotes the 2-norm, and μ_1^i and μ_2^i denote the centroids of class 1 and class 2 in the ith model, respectively. For multi-class recognition, a majority voting mechanism at the decision level is used based on the OAO strategy. The vote number V_j for class j can be computed as

V_j = \sum_{i=1}^{k×(k−1)/2} I(c_i = T_j),   j = 1, ..., k    (20)

where I(·) denotes the binary indicator function, c_i denotes the predicted label of the ith sub-classifier, and T_j denotes the true label of class j. The class label j with the largest vote number max V_j is the discriminated class of the test sample vector. The pseudocode of the proposed KNDA recognition (testing) process is described in Table 2.

The diagram of the proposed KNDA electronic nose recognition method is illustrated in Fig. 1, in which two parts are included: KNDA training and KNDA recognition. The specific implementations of KNDA training and KNDA recognition are given in Tables 1 and 2, respectively. All the algorithms in this paper are implemented in Matlab (version 7.8) on a laptop with an Intel Core i5 CPU and 2 GB RAM.
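To make the training and decision steps of Eqs. (13)–(19) concrete, a small sketch is given below. It is illustrative code under our own naming (the paper itself works in Matlab), using SciPy's generalized symmetric eigensolver with a small ridge as a numerical safeguard. It trains one pairwise NDA model and classifies a projected test sample by the nearest projected centroid.

import numpy as np
from scipy.linalg import eigh

def nda_train_pair(H1, H2, X1, X2, reg=1e-8):
    # Solve H1*phi = lambda*H2*phi (Eq. (14)); W is the eigenvector of the
    # largest eigenvalue (Eq. (17)). The tiny ridge keeps H2 invertible.
    evals, evecs = eigh(H1, H2 + reg * np.eye(H2.shape[0]))
    W = evecs[:, np.argmax(evals)]
    mu1, mu2 = (X1 @ W).mean(), (X2 @ W).mean()     # projected class centroids
    return W, mu1, mu2

def nda_decide_pair(z, W, mu1, mu2):
    zp = W @ z                                       # Eq. (18): z' = W^T z
    return 1 if abs(zp - mu1) < abs(zp - mu2) else 2 # Eq. (19): nearest centroid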

4. Experiments

4.1. Sensor array based e-Nose

The sensor array based e-Nose system with a Field Programmable Gate Array (FPGA) processor has been introduced in [14]. Considering the selectivity, stability, reproducibility, sensitivity and low cost of metal oxide semiconductor (MOS) gas sensors, the sensor array in our e-Nose system consists of four MOS gas sensors: TGS2602, TGS2620, TGS2201A and TGS2201B. Moreover, a module with two auxiliary sensors for temperature and humidity measurement is also used, considering that MOS gas sensors are sensitive to environmental temperature and humidity. Therefore, 6 variables are contained in each observation. A 12-bit analog-digital converter is used as the interface between the FPGA processor and the sensor array for convenient digital signal processing. The e-Nose system can be connected to a PC via a Joint Test Action Group (JTAG) port for data storage and program debugging. The e-Nose system and the experimental platform are illustrated in Fig. 2, in which the typical response of the gas sensors in the sampling process ((1) baseline, (2) transient response, (3) steady state response, (4) recovery process) is also presented.

4.2. Experimental data

In this paper, six common indoor chemical contaminants, namely formaldehyde, benzene, toluene, carbon monoxide, ammonia and nitrogen dioxide, are studied with an e-Nose.


Table 1
The pseudocode of the KNDA algorithm in the training stage.

Input: The training sets x = [x1, x2, ..., xm] of k classes, the parameter t, the kernel parameter σ² of the kernel space mapping, and the threshold of the accumulated contribution rate of the kernel principal components
Output: Projection bases Wi, i = 1, ..., k(k − 1)/2, and the centroid set μ of the k × (k − 1)/2 models

Step 1 (Kernel function mapping of the training sets)
1.1. Computation of the symmetrical kernel matrix K (m × m) of the training sets x using Eq. (1)
1.2. Regularization of the kernel matrix K
  1.2.1. Centralization of K by using K = K − 1/m·I × K − 1/m·K × I + 1/m·I × K × I
  1.2.2. Normalization by using K = K/m
Step 2 (Kernel principal component analysis)
2.1. Eigenvalue decomposition of K by the equation K × V = λ × V, where λ and V denote the eigenvalues and eigenvectors, respectively
2.2. Sort the eigenvalues in descending order λ1 > λ2 > ··· > λm, and obtain the correspondingly sorted new eigenvectors
2.3. Calculate the accumulated contribution rate (ACR), given by ACRj = (Σ_{k=1}^{j} λk / Σ_{k=1}^{m} λk) × 100, j = 1, ..., m
2.4. Determine the number j of kernel principal components by using j = arg min_j {ACRj ≥ threshold}
2.5. The KPC projection vectors for the kernel principal component projection are determined as KPCprojectionvectors = {new eigenvectors_i, i = 1, ..., j}
2.6. Calculate the kernel principal components KernelPCtraining of the training sets by using KernelPCtraining = K × KPCprojectionvectors
Step 3 (NDA framework)
For i = 1, 2, ..., k(k − 1)/2, repeat
  3.1. Take the training vectors of the ith pair of classes from KernelPCtraining, and calculate the between-class similarity matrix A and the within-class similarity matrix B according to Eqs. (6) and (7), respectively
  3.2. Calculate the between-class Laplacian scatter matrix H1 and the within-class Laplacian scatter matrix H2 as shown in Eqs. (9) and (11), respectively
  3.3. Solve the eigenvalue problem (14) and get the eigenvector φ1 corresponding to the largest eigenvalue
  3.4. Obtain the projection basis Wi = φ1
  3.5. Calculate the ith centroid pair μi = {μ1i, μ2i} of class 1 and class 2 in the ith model
End for
Step 4 (Output the low dimensional projection basis matrix W)
Output the projection basis matrix W = {Wi, i = 1, ..., k(k − 1)/2} and the centroid set μ = {μi, i = 1, ..., k(k − 1)/2}
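A compact sketch of Steps 1–2 of Table 1 (kernel regularization and ACR-based selection of kernel principal components) is given below. It is illustrative NumPy code under our own naming; note that for the centralization we use the standard KPCA centering with the 1/m all-ones matrix, whereas Table 1 writes its own regularization formula, so this is a sketch under that assumption rather than the authors' exact implementation.

import numpy as np

def kpca_train(K, acr_threshold=95.0):
    # K: (m, m) symmetric training kernel matrix from Eq. (1).
    m = K.shape[0]
    one_m = np.full((m, m), 1.0 / m)
    Kc = K - one_m @ K - K @ one_m + one_m @ K @ one_m    # kernel centering (assumption)
    Kc = Kc / m                                           # normalization as in Step 1.2
    evals, evecs = np.linalg.eigh(Kc)
    order = np.argsort(evals)[::-1]                       # descending eigenvalues
    evals, evecs = evals[order], evecs[:, order]
    acr = np.cumsum(evals) / np.sum(evals) * 100.0        # accumulated contribution rate
    n_kpc = int(np.searchsorted(acr, acr_threshold) + 1)  # smallest j with ACR_j >= threshold
    kpc_vectors = evecs[:, :n_kpc]                        # KPC projection vectors
    kpc_scores = Kc @ kpc_vectors                         # KernelPC_training
    return kpc_vectors, kpc_scores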

Table 2
The pseudocode of the KNDA recognition.

Input: The testing sets z = [z1, z2, ..., zn] of k classes, the kernel parameter σ² of the kernel space mapping, the KPCprojectionvectors, the projection basis matrix W and the centroid set μ = {μ1i, μ2i}, i = 1, ..., k(k − 1)/2, obtained in the KNDA training process
Output: The predicted labels of the testing samples

Step 1 (Kernel function mapping of the testing sets)
1.1. Computation of the symmetrical kernel matrix Kt (n × n) of the testing sets z using Eq. (1)
1.2. Regularization of the kernel matrix Kt
  1.2.1. Centralization of Kt by using Kt = Kt − 1/n·I × Kt − 1/n·Kt × I + 1/n·I × Kt × I
  1.2.2. Normalization by using Kt = Kt/n
Step 2 (Kernel principal component projection of the testing sets)
Calculate the kernel principal components KernelPCtesting of the testing vectors: KernelPCtesting = Kt × KPCprojectionvectors
Step 3 (Multi-class recognition)
For p = 1, 2, ..., n, repeat
  For i = 1, ..., k(k − 1)/2, repeat
    3.1. Low dimensional projection of NDA: LowDimprojection(i,p) = Wi^T × KernelPCtesting(p)
    3.2. Calculate the Euclidean distance between LowDimprojection(i,p) and the centroid μi, and discriminate the label ci of the pth sample in the ith classifier according to Eq. (19)
  End for
  3.3. Compute the vote number Vj for class j of the pth sample according to Eq. (20)
  3.4. Predict the label of the pth sample as j = arg max Vj
End for
Step 4 (Output the predicted labels of the testing samples)
Output the predicted labels of the n testing samples
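As a sketch of the pairwise voting in Step 3 of Table 2 (illustrative Python under our own naming; `models` is assumed to hold, for each of the k(k − 1)/2 class pairs, the projection vector W_i, the two projected centroids, and the pair of class labels):

import numpy as np
from collections import Counter

def knda_predict(kpc_test_row, models, n_classes):
    # kpc_test_row: kernel principal components of one test sample (1-D array).
    votes = Counter()
    for W, mu1, mu2, (label1, label2) in models:
        zp = W @ kpc_test_row                        # Step 3.1: low-dimensional projection
        # Step 3.2 / Eq. (19): the nearest projected centroid decides the pairwise label
        votes[label1 if abs(zp - mu1) < abs(zp - mu2) else label2] += 1
    # Steps 3.3-3.4 / Eq. (20): majority vote over the k(k-1)/2 sub-classifiers
    return max(range(1, n_classes + 1), key=lambda j: votes[j])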

[Fig. 1 shows the e-Nose datasets split into training and testing data. KNDA training: computation of kernel matrix K → regularization of K → principal component analysis of K → KPC training and KPC projection vectors → NDA training (W). KNDA recognition: computation of kernel matrix Kt → regularization of Kt → KPC testing → NDA testing → recognition results.]

Fig. 1. Diagram of the proposed classification method.


Fig. 2. Portable e-Nose, the experimental platform and the typical sensor response in this paper.

The experiments were carried out in a constant temperature and humidity chamber under room-temperature conditions (15–35 °C). The experimental process for each gas is similar and includes three main steps. First, set the target temperature and humidity and collect the sensor baseline for 2 min. Second, inject the target gas using a flowmeter under time control, and collect the steady state response of the sensors for 8 min. Third, clean the chamber by air exhaust for 10 min and read the data of the sample with a laptop connected to the electronic nose through the JTAG port. For more information about all the samples, the experimental temperature, relative humidity, and concentration for each sample of each gas are described in the supplementary data. The numbers of formaldehyde, benzene, toluene, carbon monoxide, ammonia and nitrogen dioxide samples are 188, 72, 66, 58, 60, and 38, respectively. Each sample contains 6 variables (from the 6 sensing units). All the experimental samples were obtained within two months of continuous e-Nose experiments. To determine the training sample index, we introduce the Kennard–Stone sequential (KSS) algorithm [37] based on Euclidean distance to select the most representative samples in the whole sample space for each gas. The selection starts by taking the pair of sample vectors (p1, p2) with the largest distance d(p1, p2) among the samples of each gas. KSS then follows a stepwise procedure in which each new selection is the sample farthest from those already selected, until the required number of training samples for each gas is reached. In this way, the most representative samples for each gas can be selected as training samples, which guarantees the reliability of the learned model.
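A minimal sketch of this max–min selection rule is given below; it is illustrative NumPy code, not the authors' implementation, and the names are ours.

import numpy as np

def kennard_stone(X, n_train):
    # X: (n_samples, d) samples of one gas; returns indices of n_train selected samples.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise Euclidean distances
    i, j = np.unravel_index(np.argmax(d), d.shape)              # start with the farthest pair
    selected = [i, j]
    while len(selected) < n_train:
        remaining = [p for p in range(len(X)) if p not in selected]
        # pick the remaining sample whose nearest selected sample is farthest away
        nxt = max(remaining, key=lambda p: d[p, selected].min())
        selected.append(nxt)
    return selected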

The merit of KSS for training sample selection is that it reduces the complexity of cross validation in performance evaluation. The remaining, unselected samples are used for model testing. The specific numbers of training and testing samples after KSS selection for each gas are given in Table 3.

5. Results and discussion

5.1. Contribution rate analysis

In the classification model, the kernel parameter σ² and the accumulated contribution rate (ACR) in KPCA are related to the actual classification performance. In the experiments, six values {5, 6, 7, 8, 9, 10} of σ² and five values {95%, 96%, 97%, 98%, 99%} of the CR (the threshold of the ACR) are selected for study and comparison, because these values have more positive effects on classification than other values. Therefore, we do not use special optimizations to search for the best parameters of KPCA. In total, 30 combinations of (σ², CR) are studied in classification. For the KPCA analysis, we perform the KPCA algorithm on the total training samples (320 samples) of all gases, so the size of the kernel matrix K is 320 × 320. Table 4 presents the contribution rate analysis of KPCA, including the number of kernel principal components (KPCs) whose ACR is lower than the threshold CR. We can see that the number of KPCs (dimension) with ACR < 95%,

Table 3
Statistics of the experimental data in this paper.

Data       Formaldehyde   Benzene   Toluene   Carbon monoxide   Ammonia   Nitrogen dioxide
Training   125            48        44        38                40        25
Testing    63             24        22        20                20        13
Total      188            72        66        58                60        38

