2nd Reading February 3, 2015 12:37 1550003

International Journal of Neural Systems, Vol. 25, No. 2 (2015) 1550003 (13 pages)
© World Scientific Publishing Company
DOI: 10.1142/S0129065715500033

Int. J. Neur. Syst. 2015.25. Downloaded from www.worldscientific.com by UNIVERSITY OF WATERLOO on 03/07/15. For personal use only.

Kernel Collaborative Representation-Based Automatic Seizure Detection in Intracranial EEG

Shasha Yuan, Weidong Zhou,∗ Qi Yuan and Xueli Li
School of Information Science and Engineering, Shandong University, Jinan 250100, P. R. China
Suzhou Institute of Shandong University, Suzhou 215123, P. R. China
∗ [email protected]

Qi Wu, Xiuhe Zhao and Jiwen Wang
Qilu Hospital, Shandong University, Jinan 250100, P. R. China

Accepted 11 December 2014
Published Online 5 February 2015

Automatic seizure detection is of great significance in the monitoring and diagnosis of epilepsy. In this study, a novel method is proposed for automatic seizure detection in intracranial electroencephalogram (iEEG) recordings based on kernel collaborative representation (KCR). Firstly, the EEG recordings are divided into 4-s epochs, and then wavelet decomposition with five scales is performed. After that, detail signals at scales 3, 4 and 5 are selected to be sparsely coded over the training sets using KCR. In KCR, l2-minimization replaces l1-minimization, the sparse coefficients are computed with regularized least squares (RLS), and a kernel function is utilized to improve the separability between seizure and nonseizure signals. The reconstructed residuals of each EEG epoch associated with seizure and nonseizure training samples are compared, and EEG epochs are assigned to the class that minimizes the reconstructed residual. Finally, a multi-decision rule is applied to obtain the final detection decision. In total, 595 h of iEEG recordings from 21 patients with 87 seizures are employed to evaluate the system. An average sensitivity of 94.41%, specificity of 96.97%, and false detection rate of 0.26/h are achieved. The seizure detection system based on KCR yields both a high sensitivity and a low false detection rate for long-term EEG.

Keywords: Seizure detection; EEG; wavelet; kernel collaborative representation.

1. Introduction
Epilepsy is one of the most prevalent neurological diseases and is characterized by abnormal and excessive electrical discharges of neurons in the brain.1–4 It can affect any person at any age and at any time. Episodes may occur as rarely as once a year or as often as several times per day. Worldwide, more than 50 million people are diagnosed with epilepsy.5,6 Current knowledge about the brain is still insufficient to understand the pathogenetic mechanism of an epileptic brain. Hence, the study of epilepsy remains a critically important issue in the biomedical field. The electroencephalogram (EEG) reflects the electrical activities of nerve cells in the brain7–13 and has been widely used to investigate brain disorders.14–22 Currently, in spite of rapid advances in neuro-imaging techniques, EEG is still one of the most important diagnostic tools in

∗ Corresponding author.


S. Yuan et al.

the epilepsy monitoring and seizure diagnosis.23–28 Intracranial EEG (iEEG) is recorded with special electrodes implanted in the brain during surgery. It is a meaningful method to measure the electrical activity of the brain, and a wealth of physiological and pathological information can be found in EEGs.29–34 However, if the analysis of EEG depends only on visual inspection by neurologists, it is a very time-consuming and error-prone process because of the massive amount of EEG data. Therefore, automatic seizure detection using computer-aided diagnosis technology is valuable for assisting the diagnosis of epilepsy and relieving the workload of medical staff.

To date, various methods have been proposed for epileptic seizure detection, with different degrees of success. One of the earliest seizure detectors was proposed by Gotman.35 In that method, the EEG was decomposed into half-waves, and features such as the amplitude of waves relative to the background, the duration and the rhythmicity were extracted for seizure detection. Murro et al.36 developed a method that applied three features, namely relative amplitude, dominant frequency and rhythmicity of EEGs recorded from intracranial electrodes, to discriminant analysis. With the development of nonlinear dynamics theory, nonlinear dynamics algorithms have also been used for the analysis of EEG data.37–39 Furthermore, various nonlinear features have been estimated to characterize the behavior of EEG signals for seizure detection, such as approximate entropy,40 largest Lyapunov exponent,41 higher order spectra,42 and blanket dimension and fractal intercept.43

As a time-frequency analysis method, the wavelet transform (WT) has proved to be an effective tool for analyzing EEG signals. The WT employs long time windows for low-frequency information and short time windows for high-frequency information to localize abrupt changes in both the time and frequency domains, which makes it suitable for the analysis of nonstationary signals such as EEG. Moreover, the WT can decompose signals into different scales corresponding to different frequency bands and thus provides more flexibility for analyzing EEG signals. Adeli et al.44 used the WT to analyze and characterize epileptiform discharges in the form of 3-Hz spike and wave complexes in patients with absence seizures. Khan and Gotman45 used the discrete wavelet transform (DWT) to decompose intracerebral EEG signals into time-frequency representations and computed features such as energy, coefficient of variation and relative amplitude for the chosen scales. Liu et al.46 conducted wavelet decomposition with five scales and extracted effective features, such as relative energy, relative amplitude, coefficient of variation and fluctuation index, with the support vector machine (SVM) as the classifier for seizure detection.

In recent years, sparse representation (SR) has shown strong ability in pattern classification. SR was originally derived from compressed sensing theory, which was developed for reconstructing a sparse signal by exploiting its sparsity structure.47,48 Afterwards, Wright et al.49 proposed the sparse representation-based classification (SRC) algorithm for face recognition. In SRC, a test sample is sparsely coded over all the training samples, its sparse coefficient vector is obtained by solving an l1-minimization problem, and it is then classified by comparing the reconstruction errors. Yang et al.50 combined sparse coding with linear spatial pyramid matching for image classification. Yuan et al.51 recently combined SR with the kernel trick to distinguish ictal EEGs. Zhang et al.52 proposed the collaborative representation-based classification (CRC) method, which uses l2-minimization instead of l1-minimization to compute the coefficients and is much more efficient than the SRC method. In this study, we apply the CRC method to detect seizure events.

The kernel trick is a well-known technique in machine learning, which maps the samples from the original space into a high dimensional inner product space using a nonlinear mapping.53,54 Many linear learning methods have been generalized to corresponding nonlinear ones by using the kernel trick, such as SVM,55,56 KPCA,57 and KFD.58 Since EEG samples are typically not linearly separable in the original sample space, the kernel trick can help us represent EEG samples more accurately in the high dimensional space, which benefits the classification ability of CRC. The kernel trick is therefore employed in this study to improve the performance of the CRC algorithm for seizure detection.

In this work, we propose a novel method using WT and kernel collaborative representation-based classification (KCRC) for seizure detection. The rest


KCRC for Seizure Detection in iEEG

of this paper is organized as follows. A brief introduction of the iEEG dataset is given in Sec. 2. In Sec. 3, the method used in our system is described in detail, including pre-processing, KCRC and post-processing. The performance of the system is presented in Sec. 4. Section 5 discusses the results, and a conclusion follows in Sec. 6.

2. EEG Database
The long-term iEEG database used in this study comes from the Epilepsy Center of the University Hospital of Freiburg, Germany. It contains EEG recordings of 21 patients suffering from medically intractable focal epilepsy, recorded during pre-surgical epilepsy monitoring with invasive electrodes.59 All EEG signals were acquired using a Neurofile NT digital video EEG system with 128 channels and a 16-bit analogue-to-digital converter. The sampling rate is 256 Hz. The onset and offset of the seizures were determined based on identification of epileptic patterns preceding clinical manifestation of seizures in EEG recordings, and three in-focal and three extra-focal channels were previously chosen by certified epileptologists for all the patients. In our study, the three in-focus electrodes are used for seizure detection. In total, there are 87 seizures in this database, and two to five hours of recordings containing seizures are available for each patient. The seizures range from less than 12 s to more than 15 min in duration. In addition, approximately 24 h of interictal EEG data without seizure events are employed for each of 20 patients; the remaining patient (Patient 12) has 46 h of interictal data available. In total, 595 h of EEG data, including ictal and interictal data, are used in our study. The details of this EEG database for each patient are summarized in Table 1.

2.1. Training data
For every patient, one or two seizures are selected randomly, and the amount of nonseizure data used for training is triple that of seizure data. The diversity of seizures can improve detection performance. In total, 331 segments (4 s per segment) of seizure data and 993 segments of nonseizure data of 21 patients are used as the training data.

2.2. Testing data
In our work, 593.38 h of EEG recordings (1.67 h of seizure data and 591.71 h of nonseizure data) containing 60 seizures of 21 patients are selected as the test data to evaluate the performance of our proposed algorithm. For each patient, the training data and testing data are independent of each other and have no overlap.

Table 1. Details of the database we used in this study. The acronyms used in the table are SP: simple partial seizure, CP: complex partial seizure, GTC: generalized tonic-clonic seizure.
Patient | Seizure origin | Seizure type | Number of used seizures | Mean seizure duration (s)
1 | Temporal | SP, CP | 4 | 13.1
2 | Frontal | SP, CP, GTC | 3 | 118.2
3 | Temporal | SP, CP | 5 | 92.7
4 | Temporal, Occipital | SP, CP, GTC | 5 | 87.4
5 | Frontal | SP, CP, GTC | 5 | 44.9
6 | Temporal | CP, GTC | 3 | 66.9
7 | Temporal | SP, CP, GTC | 3 | 153.5
8 | Temporal | SP, CP | 2 | 163.7
9 | Frontal | CP, GTC | 5 | 114.7
10 | Temporal | SP, CP, GTC | 5 | 411.0
11 | Frontal | SP, CP, GTC | 4 | 157.3
12 | Frontal | SP, CP, GTC | 4 | 55.1
13 | Temporal, Occipital | SP, CP, GTC | 2 | 158.3
14 | Temporal | CP, GTC | 4 | 216.4
15 | Frontal, Temporal | SP, CP, GTC | 4 | 145.4
16 | Temporal | SP, CP, GTC | 5 | 121.0
17 | Temporal | SP, CP, GTC | 5 | 86.2
18 | Temporal, Occipital | SP, CP | 5 | 13.7
19 | Frontal | SP, CP, GTC | 4 | 12.5
20 | Temporal | SP, CP, GTC | 5 | 85.7
21 | Temporal | SP, CP | 5 | 83.1
Total | | | 87 | 114.3

3. Methods
The outline of the system is shown in Fig. 1. It illustrates the stages of the algorithm, including pre-processing, KCRC and post-processing. Each stage is introduced in detail in the following sections.

3.1. Pre-processing
The EEG data of each channel are segmented separately into 4-s (1024 points) epochs using a moving window without overlap, which is long enough to capture the main stationary characteristics of the EEG and short enough to capture seizure bursts. Then, the DWT is applied to each epoch.

In our study, we select the Daubechies-4 wavelet to conduct wavelet decomposition on the original EEG epochs. Many studies and our previous work have confirmed that the Daubechies-4 wavelet is well suited to the characteristics of EEG signals.45,60–63 Moreover, the number of decomposition levels is chosen to be 5. Since the EEG signals are sampled at 256 Hz, every epoch is decomposed into five sets of detail coefficients representing 64–128 Hz (D1), 32–64 Hz (D2), 16–32 Hz (D3), 8–16 Hz (D4) and 4–8 Hz (D5), and one set of approximation coefficients representing 0–4 Hz (A5). After that, the three detail sub-bands D3, D4 and D5 are selected for the following processing, to capture the major seizure activities and eliminate the high-frequency noise in raw EEG signals. Figure 2 illustrates the details (D3–D5) of one epoch of EEG data after the WT; most of the noise is filtered out and suppressed.

Seizure events generally include spikes and sharp waves. A differential operator can enhance the seizure activities and suppress the background, which facilitates automatic seizure detection.64 Suppose t1, t2 and t3 are successive time points. If a spike occurs at t2, the derivative results f(t2) − f(t1) and f(t3) − f(t2) will both have large absolute values. Conversely, if the three points belong to normal background

Fig. 1. (a) Schematic diagram of the KCR-based seizure detection system. Various stages of the algorithm such as pre-processing, KCRC and post-processing are schematically shown. (b) The detailed structure of Kernel CRC.
[Panel (a): Raw EEG → DWT & differential operator → sub-bands D3, D4 and D5 → KCRC and MAF & threshold for each sub-band → discrimination rule → collar → class label. Panel (b): testing sample and training set → kernel trick → compute projection matrix → sparse coefficient → residual with respect to the seizure training data and residual with respect to nonseizure training data → subtraction → output.]


Fig. 2. One epoch of EEG data and its details (D3–D5) after the WT.

signal, the derivative results will have small values. Hence, a transformation based on the differential operator is performed on the sub-bands of EEG epochs after wavelet decomposition in our study, which is defined as:

$$F(t) = \exp\left(\frac{1}{w}\,|Df(t)|\right), \tag{1}$$

where $f(t)$ is an EEG epoch, $D$ denotes the derivative with respect to $t$, and $w$ is a positive constant; in this work $w = 100{,}000$. Figure 3 shows the differential operator procedure for 20 min of EEG data containing one seizure event from patient 17.

After the pre-processing procedure, the three transformed sub-signals D3, D4 and D5 of each epoch, which are obtained through the wavelet decomposition, are used for seizure detection. For the training set $X$, we build three sub-training sets $X_{D3}$, $X_{D4}$ and $X_{D5}$ based on these levels, respectively. Every testing signal is likewise broken into three sub-signals, each of which is classified separately by the following approach.

3.2. Kernel collaborative representation-based classification
3.2.1. Collaborative representation-based classification
SRC was employed for face recognition by Wright et al.49 In SR, any test sample can be represented as a sparse linear combination of training samples.

Fig. 3. The differential operator procedure of 20-min EEG data containing one seizure event from patient 17. (a) The EEG sub-signal on scale 3 after WT. (b) The preprocessed signal after differential operator transformation. The seizure event marked by the EEG experts is between the two vertical lines.
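As a rough illustration of the pre-processing described in Sec. 3.1, the sketch below splits a channel into non-overlapping 4-s epochs, lists the nominal frequency range of each DWT detail level at 256 Hz, and applies the differential transform to a sub-band. It assumes that Eq. (1) reads F(t) = exp(|Df(t)|/w) and approximates D by a first difference; both are assumptions, as are the function names, which are illustrative and not taken from the paper. The wavelet decomposition itself (Daubechies-4, five levels) is left to a wavelet library.

```python
import math

FS = 256         # sampling rate in Hz (from the text)
EPOCH_SEC = 4    # epoch length in seconds (from the text)
W = 100_000      # constant w of Eq. (1) (from the text)


def split_epochs(signal, fs=FS, epoch_sec=EPOCH_SEC):
    """Cut a 1-D signal into non-overlapping epochs of fs * epoch_sec samples."""
    n = fs * epoch_sec
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def detail_band(level, fs=FS):
    """Nominal (low, high) frequency range in Hz of DWT detail level `level`."""
    return fs / 2 ** (level + 1), fs / 2 ** level


def differential_transform(sub_band, w=W):
    """Assumed form of Eq. (1), with D approximated by a first difference."""
    return [math.exp(abs(b - a) / w) for a, b in zip(sub_band, sub_band[1:])]


epochs = split_epochs([0.0] * 2500)                      # 2 full epochs of 1024 samples
bands = {f"D{k}": detail_band(k) for k in (3, 4, 5)}     # D3: 16-32, D4: 8-16, D5: 4-8 Hz
transformed = differential_transform([0.0, 0.0, 100000.0])
```

A jump of size w between neighbouring samples maps to exp(1), while a flat segment maps to exp(0) = 1, which is how the transform emphasizes spikes over background.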

Suppose the training samples of all $K$ classes compose a training set $X = [X_1, X_2, \ldots, X_K]$, where $X_i = [x_i^1, x_i^2, \ldots, x_i^{n_i}] \in \mathbb{R}^{m \times n_i}$ denotes the training dataset from the $i$th class. For any testing sample $y \in \mathbb{R}^m$, we could code it as

$$y = X\alpha, \tag{2}$$

where $\alpha = [\alpha_1^T, \alpha_2^T, \ldots, \alpha_K^T]^T$ and $\alpha_i$ is the coefficient vector associated with the $i$th class. If $y$ belongs to the $i$th class, it can be well represented by the training set of the same class, $y \approx X_i\alpha_i$, which implies that only the coefficients of $\alpha_i$ have remarkable entries and $\alpha_j$ ($j \neq i$) are equal or close to zero. We can obtain the sparse coefficient vector $\hat{\alpha}$ by solving an $l_1$-norm minimization problem as follows:

$$\hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad \|X\alpha - y\|_2 \le \varepsilon, \tag{3}$$

where $\varepsilon$ is a small constant which denotes the tolerance of reconstruction error.

Afterwards, Zhang et al.52 showed that it is the collaborative representation (CR) mechanism, i.e. using the samples of all classes in the training set to represent the test sample collaboratively, rather than the $l_1$-norm sparsity constraint, that truly improves classification accuracy. Thus, they presented the CRC scheme using the $l_2$-norm instead of the $l_1$-norm sparsity constraint, which has significantly lower complexity. The vector $\hat{\alpha}$ can be obtained through the following formula:

$$\hat{\alpha} = \arg\min_{\alpha} \{\|X\alpha - y\|_2^2 + \lambda\|\alpha\|_2^2\}, \tag{4}$$

where $\lambda$ is a regularization parameter. The solution of Eq. (4) can be obtained via regularized least squares (RLS) as:

$$\hat{\alpha} = (X^T X + \lambda I)^{-1} X^T y, \tag{5}$$

where $I$ denotes the identity matrix. After the coefficient vector $\hat{\alpha}$ is calculated, the class of $y$ could be obtained by computing the minimal residual, $\mathrm{identity}(y) = \arg\min_i \|y - X_i\hat{\alpha}_i\|_2$. In addition, $\|\hat{\alpha}_i\|_2$ also carries useful information, so this $l_2$-norm term is added for the classification. The identity of $y$ is given by:

$$\mathrm{identity}(y) = \arg\min_i \left(\|y - X_i\hat{\alpha}_i\|_2 / \|\hat{\alpha}_i\|_2\right). \tag{6}$$
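The closed-form coding of Eq. (5) and the decision rule of Eq. (6) can be sketched in a few lines of NumPy. This is a minimal illustration on made-up two-class data, not the authors' implementation; the function names, the toy samples and the value of λ are assumptions, and the small `eps` only guards against division by zero.

```python
import numpy as np


def crc_projection(X, lam):
    """P = (X^T X + lam * I)^{-1} X^T from Eq. (5); columns of X are training samples."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T)


def crc_classify(X, labels, P, y, eps=1e-12):
    """Return (predicted class, per-class scores) following Eq. (6)."""
    alpha = P @ y                                   # coefficient vector, Eq. (5)
    scores = {}
    for c in np.unique(labels):
        mask = labels == c
        residual = np.linalg.norm(y - X[:, mask] @ alpha[mask])
        # eps is a numerical guard only; it is not part of Eq. (6)
        scores[int(c)] = residual / max(np.linalg.norm(alpha[mask]), eps)
    return min(scores, key=scores.get), scores


# Toy data (made up for illustration): two classes in R^2, two samples each.
X = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9]])
labels = np.array([0, 0, 1, 1])
P = crc_projection(X, lam=0.01)                     # computed once from the training set
pred_a, scores_a = crc_classify(X, labels, P, np.array([0.95, 0.05]))
pred_b, scores_b = crc_classify(X, labels, P, np.array([0.05, 0.95]))
```

Because P depends only on the training set, it can be computed once and reused for every test sample, which is the source of the efficiency gain over iterative l1 solvers.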

3.2.2. Kernel CRC
The kernel trick is widely applied in machine learning algorithms by mapping samples into a high dimensional feature space. In this way, a testing sample can be represented as a linear combination of training samples from the same class more accurately in the high dimensional space than in the original one.51,54,65 This characteristic benefits the classification ability of CRC. For KCRC, samples are first mapped into a high dimensional space and then CRC is performed in this new feature space.

Given a nonlinear mapping $\Phi : \mathbb{R}^m \to \mathbb{R}^F$ from the original data space to an inner product space, a vector $x \in \mathbb{R}^m$ becomes $\Phi(x) \in \mathbb{R}^F$ through this mapping. According to the kernel trick, we do not have to know the function $\Phi$ explicitly; the inner product of two mapped samples in the high dimensional space $\mathbb{R}^F$ can be represented by the kernel function of these two samples in the original space $\mathbb{R}^m$ as follows:

$$\langle \Phi(x), \Phi(x') \rangle_{\mathbb{R}^F} = \Phi(x)^T \Phi(x') = K(x, x'), \tag{7}$$

where $K(\cdot, \cdot)$ is the kernel function in the space $\mathbb{R}^m$. When we map the training samples and testing samples into the high dimensional space, Eq. (2) becomes

$$\Phi(y) = \Phi(X)\beta, \tag{8}$$

where $\beta$ is the new sparse coefficient vector in the $\mathbb{R}^F$ space. By multiplying both sides of Eq. (8) by a matrix, we get

$$\Phi(C)^T \Phi(y) = \Phi(C)^T \Phi(X)\beta. \tag{9}$$

According to the definition of the inner product and the kernel trick, we obtain

$$K(C, y) = K(C, X)\beta, \tag{10}$$

where the matrix $C$ is generated from the training samples through the following procedure.66 Firstly, for the training set of each class $X_i = [x_i^1, x_i^2, \ldots, x_i^{n_i}]$, we calculate the mean vector $\mu_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_i^j$. Then we sort all training samples of the class in ascending order of their distance to the mean vector. After that, the half of the samples nearest to $\mu_i$ are chosen to construct the matrix $C_i = [\mu_i, {x'}_i^{1}, {x'}_i^{2}, \ldots, {x'}_i^{(n_i/2)}]$. By combining the $C_i$ of each class, we get the matrix $C = [C_1, C_2, \ldots, C_K]$. Hence, the sparse vector $\beta$ can be computed by solving the $l_2$-minimization problem according to the CRC method, in analogy with Eqs. (4) and (5), and we obtain:

$$\hat{\beta} = \arg\min_{\beta} \{\|K(C, X)\beta - K(C, y)\|_2^2 + \lambda\|\beta\|_2^2\}, \tag{11}$$

$$\hat{\beta} = (K(C, X)^T K(C, X) + \lambda I)^{-1} K(C, X)^T K(C, y). \tag{12}$$

In summary, the algorithm of KCRC is as follows:
(1) Construct the center samples matrix $C$ based on the training set.
(2) Map $X$ to the high dimensional space, obtaining $K(C, X)$, and normalize each column to unit $l_2$-norm. In our study, the kernel function used is the Gaussian RBF kernel
$$K(X_1, X_2) = \exp(-\|X_1 - X_2\|^2 / 2p^2). \tag{13}$$
(3) Compute the projection matrix
$$P_k = (K(C, X)^T K(C, X) + \lambda I)^{-1} K(C, X)^T. \tag{14}$$
(4) For a new testing sample $y$, calculate $K(C, y)$ and normalize it to unit $l_2$-norm.
(5) Calculate the sparse coefficient vector according to Eq. (12), i.e. $\hat{\beta} = P_k K(C, y)$.
(6) The reconstruction residual of $y$ on every class is obtained as:
$$r_i(y) = \exp(n_i/N)\,\|K(C, y) - K(C, X_i)\hat{\beta}_i\|_2^2 \,/\, \|\hat{\beta}_i\|_2^2. \tag{15}$$
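Steps (1)–(6), together with a minimum-residual decision on the resulting r_i(y), can be sketched as follows. This is a minimal NumPy illustration under stated assumptions rather than the authors' code: N is taken to be the total number of training samples, the mean vector is the arithmetic mean of each class, the kernel width p and λ are arbitrary toy values, the data are made up, and `eps` is only a numerical guard for the denominator of Eq. (15).

```python
import numpy as np


def rbf_kernel(A, B, p):
    """K[i, j] = exp(-||a_i - b_j||^2 / (2 p^2)); columns of A and B are samples (Eq. (13))."""
    sq = (A ** 2).sum(axis=0)[:, None] + (B ** 2).sum(axis=0)[None, :] - 2.0 * (A.T @ B)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * p ** 2))


def center_matrix(X, labels):
    """Step (1): per class, the mean vector followed by the half of the samples nearest to it."""
    blocks = []
    for c in np.unique(labels):
        Xc = X[:, labels == c]
        mu = Xc.mean(axis=1, keepdims=True)
        order = np.argsort(np.linalg.norm(Xc - mu, axis=0))
        blocks.append(np.hstack([mu, Xc[:, order[: Xc.shape[1] // 2]]]))
    return np.hstack(blocks)


def unit_columns(M):
    """Scale every column of M to unit l2-norm."""
    return M / np.linalg.norm(M, axis=0, keepdims=True)


def kcrc_train(X, labels, lam, p):
    C = center_matrix(X, labels)
    K = unit_columns(rbf_kernel(C, X, p))                            # step (2)
    P = np.linalg.solve(K.T @ K + lam * np.eye(K.shape[1]), K.T)     # step (3), Eq. (14)
    return C, K, P


def kcrc_classify(model, labels, y, p, eps=1e-12):
    C, K, P = model
    ky = unit_columns(rbf_kernel(C, y[:, None], p))[:, 0]            # step (4)
    beta = P @ ky                                                    # step (5)
    N = labels.size                                                  # assumption: total training size
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        num = np.linalg.norm(ky - K[:, mask] @ beta[mask]) ** 2
        den = max(np.linalg.norm(beta[mask]) ** 2, eps)              # eps: numerical guard only
        residuals[int(c)] = np.exp(mask.sum() / N) * num / den       # step (6), Eq. (15)
    return min(residuals, key=residuals.get), residuals


# Toy data (made up): two well-separated clusters in R^2, four samples each.
X0 = np.array([[0.0, 0.2, -0.2, 0.1],
               [0.0, 0.1, 0.2, -0.2]])
X = np.hstack([X0, X0 + 5.0])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
model = kcrc_train(X, labels, lam=0.01, p=1.0)
pred_near_0, res_0 = kcrc_classify(model, labels, np.array([0.1, 0.0]), p=1.0)
pred_near_1, res_1 = kcrc_classify(model, labels, np.array([5.1, 4.9]), p=1.0)
```

Only `kcrc_classify` runs per test epoch; the center matrix, the kernel matrix and the projection matrix are all fixed after training.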

Step (7) then estimates the label of the testing sample by $\mathrm{identity}(y) = \arg\min_i r_i(y)$.

In step 6, considering the different numbers of training samples from each class, the residual is multiplied by a weight based on the ratio of the sample numbers. For a seizure epoch, the sparse coefficients associated with the seizure training samples should have larger values than the coefficients associated with the nonseizure training samples, and the residual with respect to the seizure class should be smaller than that with respect to the nonseizure class. Figure 4 illustrates a successful case in which a seizure epoch is detected correctly. It can be seen that the difference in sparse coefficients and residuals between the two classes is obvious.

Fig. 4. An example of a detected seizure epoch. (a) The sparse coefficients obtained from the KCR. The coefficients to the left of the vertical line are associated with the seizure training samples, while those to the right are associated with the nonseizure training samples. (b) The residuals of this test EEG epoch with respect to the class of seizure and the class of nonseizure.

3.3. Post-processing
For any testing sample, we compute the two residuals with respect to the seizure and nonseizure training sets by the KCRC method. To make a binary decision, the difference of the two residuals is used: we subtract the residual with respect to the seizure training samples from the residual with respect to the nonseizure training samples. If the testing sample belongs to the seizure class, its difference value should be greater than that of a sample which belongs to the nonseizure class. So, the difference value is compared to a threshold. After comparison, we obtain the binary decisions: 1 for seizure and 0 for nonseizure. The value of the threshold is different for each patient. Before threshold comparison, a central linear moving average filter (MAF) is applied to remove burrs and reduce random noise, and some short-lived false decisions can be corrected by this smoothing process. It is defined as:

$$x_k = \frac{1}{2N+1}\sum_{i=-N}^{N} \hat{x}_{k+i}, \tag{16}$$

where $\hat{x}$ is the input signal, $x$ is the filtered signal, and $2N + 1$ denotes the smoothing length of the MAF. The value of $N$ is patient specific. For each patient, 1 h of continuous EEG data containing seizures from the training set was used to determine $N$ in the training stage. We adjusted $N$ to get the best classification result for this long-term EEG data. Then, $N$ was fixed in the testing stage for this patient.

In this work, three channels are available in the multichannel EEG recordings, so a multi-decision rule is necessary. In addition, as mentioned in Sec. 3.1, one testing epoch is broken into three sub-signals, each of which is represented over the corresponding sub-training set. So, there are three decisions for one testing epoch. Hence, the discrimination rule is defined as follows. First, for each sub-training set, the decisions of the three channels are combined: if a seizure is detected in at least two channels simultaneously, the epoch is marked as 'seizure' for that sub-training set. This yields three results, one per sub-training set. If at least one of the three results is '1', the epoch is labeled 'seizure'; otherwise, it is marked as 'nonseizure'.

Seizure onset changes gradually, and the smoothing procedure may obscure the beginning and ending of seizures. A collar operation is therefore used in the last step to compensate for the missing part of a seizure.67 In this process, each binary decision is extended by x epochs on both sides.

4. Results

In this work, all the experiments were carried out in the MATLAB 7.0 environment running on an AMD Sempron processor at 2.40 GHz. We implemented wavelet decomposition on EEG epochs with five scales and chose scales 3, 4 and 5 to conduct seizure detection using the KCRC method described in Sec. 3.2. The training data were selected randomly from the iEEG recordings of each patient, and three sub-training sets were built with respect to the different scales. Afterwards, for each testing EEG epoch, the KCRC method was employed to represent it over the sub-training sets, and the post-processing was performed to get the final result.

After post-processing, the performance of the proposed seizure detection system was evaluated at two levels: the segment-based level and the event-based level. At the segment-based level, the EEG segments classified by our algorithm are compared with those marked by experts. There are three statistical measures, sensitivity, specificity and recognition accuracy, which are defined as follows:
• Sensitivity: Number of true positives/the total number of seizure segments marked by the EEG experts. A seizure segment identified by both our detector and the EEG experts is defined as a true positive (TP).
• Specificity: Number of true negatives/the total number of nonseizure segments marked by the EEG experts. A true negative (TN) is a nonseizure segment identified as such by both our method and the experts.
• Recognition accuracy: Number of correctly classified EEG epochs/the total number of EEG epochs.

The results for each patient according to the three statistical measures are given in Table 2. The mean sensitivity over all 21 patients is 94.41%, and the mean specificity and recognition accuracy are both over 96.5%. Moreover, a best sensitivity of 100%, specificity of 100% and recognition accuracy of 99.99% are achieved for different patients. Furthermore, 11 patients (patients 1, 2, 3, 4, 6, 7, 8, 11, 12, 13, and 18), over half of all the patients, have a sensitivity of 100%. There are 15 patients with specificities higher than 99.00%, and all patients have specificities and recognition accuracies above 94% except two (patients 5 and 10).

Table 2. The results of our detection method for each patient on segment-based level.
Patient | Sensitivity (%) | Specificity (%) | Recognition accuracy (%)
1 | 100 | 99.93 | 99.93
2 | 100 | 99.80 | 99.80
3 | 100 | 99.85 | 99.85
4 | 100 | 99.50 | 99.50
5 | 73.91 | 67.68 | 67.69
6 | 100 | 99.99 | 99.99
7 | 100 | 99.98 | 99.98
8 | 100 | 97.19 | 97.20
9 | 97.33 | 94.49 | 94.48
10 | 79.78 | 83.16 | 83.14
11 | 100 | 99.97 | 99.97
12 | 100 | 99.97 | 99.97
13 | 96.67 | 99.05 | 99.04
14 | 98.51 | 99.85 | 99.84
15 | 89.13 | 99.09 | 99.02
16 | 95.46 | 98.09 | 98.07
17 | 100 | 99.95 | 99.95
18 | 100 | 99.67 | 99.67
19 | 70 | 100 | 99.99
20 | 93 | 99.51 | 97.44
21 | 88.76 | 99.72 | 99.68
Mean | 94.41 | 96.97 | 96.87

In addition, the event-based approach is much closer to the way the system would be assessed in clinical application when considering the feasibility of the proposed method. Hence, two statistical measures, the number of truly detected seizures and the false detection rate, are calculated. Any event detected by our system that overlaps a seizure event labeled by the EEG experts is defined as a true seizure detection. A sequence of consecutive false positives which is not adjacent to a seizure is defined as a false detection. A false positive (FP) is a segment identified as seizure by our algorithm but not by the experts.

The system performance at the event-based level is shown in Table 3. In our study, 60 seizures in total are used to test our method and 58 of them are detected correctly; for 19 patients, all seizure events are detected. The two missed seizures come from patients 5 and 19, and they are missed because the seizure events are too short. Besides, the average false detection rate is 0.26/h over all 21 patients, and approximately half of the patients have a false detection rate lower than 0.1/h. For patient 10, the results are unsatisfactory because of electrode box disconnection and reconnection.
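The segment-based and event-based measures defined above can be sketched as follows for one channel of binary epoch labels. This is an illustrative reading of the definitions, not the authors' evaluation code: the function names and the toy label sequences are made up, a detected run counts as a false detection only if it neither overlaps nor touches an expert-marked seizure, and the false detection rate is expressed per hour of recording using the 4-s epoch length.

```python
def segment_measures(pred, truth):
    """Sensitivity, specificity and recognition accuracy (in %) from binary epoch labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    return {
        "sensitivity": 100.0 * tp / n_pos,
        "specificity": 100.0 * tn / n_neg,
        "accuracy": 100.0 * (tp + tn) / len(truth),
    }


def runs(labels):
    """Return [start, end) index pairs of consecutive 1s in a binary sequence."""
    out, start = [], None
    for i, v in enumerate(labels):
        if v == 1 and start is None:
            start = i
        elif v != 1 and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(labels)))
    return out


def event_measures(pred, truth, epoch_sec=4):
    """Count true seizure detections and false detections per hour."""
    def overlap(a, b):
        return a[0] < b[1] and b[0] < a[1]

    def touch(a, b):            # overlapping or directly adjacent
        return a[0] <= b[1] and b[0] <= a[1]

    detected, marked = runs(pred), runs(truth)
    true_det = sum(1 for m in marked if any(overlap(m, d) for d in detected))
    false_det = sum(1 for d in detected if not any(touch(d, m) for m in marked))
    hours = len(pred) * epoch_sec / 3600.0
    return {"marked": len(marked), "true_detections": true_det,
            "false_detections": false_det, "false_rate_per_h": false_det / hours}


# Toy labels (made up): one marked seizure, one overlapping detection, one isolated detection.
truth = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
pred = [0, 0, 0, 1, 1, 1, 0, 0, 1, 0]
seg = segment_measures(pred, truth)
evt = event_measures(pred, truth)
```

In the toy example the detection that runs one epoch past the seizure offset is not counted as a false detection, whereas the isolated single-epoch detection is.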


Table 3. The results of our detection method for each patient on event-based level.
Patient | Number of experts-marked seizures | Number of true detections | Sensitivity (%) | False detection rate (/h)
1 | 2 | 2 | 100 | 0.07
2 | 2 | 2 | 100 | 0.11
3 | 4 | 4 | 100 | 0.04
4 | 4 | 4 | 100 | 0.04
5 | 3 | 2 | 66.67 | 1.11
6 | 2 | 2 | 100 | 0
7 | 2 | 2 | 100 | 0
8 | 1 | 1 | 100 | 0.79
9 | 4 | 4 | 100 | 0.11
10 | 4 | 4 | 100 | 1.96
11 | 2 | 2 | 100 | 0
12 | 3 | 3 | 100 | 0
13 | 1 | 1 | 100 | 0.28
14 | 3 | 3 | 100 | 0
15 | 3 | 3 | 100 | 0.07
16 | 3 | 3 | 100 | 0.25
17 | 4 | 4 | 100 | 0
18 | 3 | 3 | 100 | 0.43
19 | 2 | 1 | 50 | 0
20 | 4 | 4 | 100 | 0.14
21 | 4 | 4 | 100 | 0.07
Total | 60 | 58 | 96.03 | 0.26
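The last row of Table 3 mixes totals and averages, which a short check makes explicit. The sketch below simply re-computes the aggregates from the per-patient values listed in the table; it suggests that the reported 96.03% is the mean of the per-patient sensitivities rather than the pooled ratio 58/60, and that 0.26/h is the mean of the per-patient false detection rates. The variable names are illustrative.

```python
# Per-patient values copied from Table 3 (patients 1 to 21).
marked = [2, 2, 4, 4, 3, 2, 2, 1, 4, 4, 2, 3, 1, 3, 3, 3, 4, 3, 2, 4, 4]
detected = [2, 2, 4, 4, 2, 2, 2, 1, 4, 4, 2, 3, 1, 3, 3, 3, 4, 3, 1, 4, 4]
false_rate = [0.07, 0.11, 0.04, 0.04, 1.11, 0, 0, 0.79, 0.11, 1.96, 0,
              0, 0.28, 0, 0.07, 0.25, 0, 0.43, 0, 0.14, 0.07]

per_patient = [100.0 * d / m for d, m in zip(detected, marked)]
mean_sensitivity = sum(per_patient) / len(per_patient)        # average over patients
pooled_sensitivity = 100.0 * sum(detected) / sum(marked)      # 58 of 60 seizures
mean_false_rate = sum(false_rate) / len(false_rate)

print(sum(marked), sum(detected))
print(round(mean_sensitivity, 2), round(pooled_sensitivity, 2), round(mean_false_rate, 2))
```

The two sensitivity figures differ because patients with few test seizures (for example patient 19, with one of two seizures missed) weigh as much as any other patient in the per-patient mean.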

5. Discussion
Our goal is to propose an automatic system that detects seizures rapidly and accurately, with high sensitivity as well as a low false detection rate. SR is introduced as a novel tool and shows its powerful capability in seizure detection. In traditional seizure detection methods, most algorithms need to extract features and choose suitable ones to feed into a classifier. The calculation of EEG features is time consuming and the selection of classifiers is difficult. In contrast, the method proposed in this study does not follow this framework: there is no need to select and extract features or to use a separate classifier. Instead, the testing sample is represented by the training samples, and the residuals associated with each class are compared to classify it as seizure or nonseizure. This provides a fresh scheme for seizure detection that differs from the conventional approach.

KCR is built on SR. In SR, the sparse coefficient vector is obtained by l1-minimization, which needs time-consuming iterative operations. Although many fast algorithms have been proposed to speed up the l1-minimization process, it is still difficult to use for real-time detection. For KCR, however, l2-minimization is applied, the coefficient vector can be computed analytically by the RLS algorithm, and the projection matrix Pk can be computed beforehand in the training stage. At the testing stage, we only need to calculate Eq. (12), which greatly reduces the amount of calculation and increases the speed of the algorithm. The time consumed in the training stage is about 15 s. For 1 h of EEG data, the time consumed in the testing stage is about 1 min, which is very short compared with the length of the data. Therefore, the computational burden of the proposed method is low, and its speed makes it feasible to develop it into an online seizure detection system.

In addition, the pre-processing and post-processing procedures also play an important role in this method. In the pre-processing stage, the WT decomposes a signal into different scales with different wavelet coefficients, which captures the diversity of seizures in different frequency bands and provides flexibility in analyzing signals. Even though differentiation is rather sensitive to noise, the wavelet decomposition filters out the high-frequency artifacts in EEG signals, which mitigates this disadvantage and avoids some artifacts being incorrectly classified as epileptic seizures. The first-order derivative of the EEG signal can highlight the spikes and suppress

Fig. 5. The receiver operating characteristic (ROC) curve of patient 16.

1550003-9

2nd Reading February 3, 2015 12:37 1550003

S. Yuan et al.

Table 4. Method

Int. J. Neur. Syst. 2015.25. Downloaded from www.worldscientific.com by UNIVERSITY OF WATERLOO on 03/07/15. For personal use only.

Improved patient specific seizure detection68 Multistage seizure detection69 Differential operator and windowed variance64 A multistage fuzzy logic algorithm70 Our proposed algorithm

Comparison of the performance for different methods.

Event-based sensitivity (%)

Number of patients used

Number of seizures tested

15

63

529



5

24

146

3

91.525

15

59

428

3

95.8

20

56

112.45

6

96.03

21

60

593.38

3

78 87.5

the background and may help to identify seizure which is not so prominent in the raw signal. In the post-processing part, smoothing and collar technique enable us to handle the burr and the imprecise boundaries between nonseizure and seizure EEGs which help to reduce misjudgment and locate the seizure more accurately. Figure 5 shows the receiver operating characteristic (ROC) curve of patient 16. The area under this ROC curve (AUC) is 0.9943, which indicates that our proposed method has a good performance for seizure detection. The Freiburg EEG database used in this study has been applied in many other seizure detection systems. Chua et al.68 developed an improved patient specific seizure detection applying a subjectindependent quadratic discriminant classifier. It was tested on 63 seizures of 15 patients and obtained a sensitivity of 78%. Raghunathan et al.69 proposed a multistage seizure detection which used a DWTbased filtering block and a “feature extraction block” containing coastline and variance energy features. The sensitivity of this method was 87.5% from five patients with 24 seizures. Majumdar and Vardhan64 employed differential operator and windowed variance to seizure detection. A sensitivity of 91.525% was obtained by their method with only 15 patients used, which is lower than our system. Moreover, our method detected all patients in the Freiburg EEG database and the testing dataset was larger than that used in the three methods mentioned above. Rabbi and Fazel-Rezai70 developed a multistage fuzzy logic algorithm using features of amplitude, frequency and entropy-based for epileptic seizure onset detection. The method was evaluated on the same iEEG database from 20 patients with 56 seizures

Total duration of data (h)

Number of channels used

and their system detected 54 seizures and 2 seizures were missed. The data from patient 10 was discarded due to the presence of electrode movement artifacts. Compared to this system, the total duration of testing data in our method is much longer and our method evaluated with 60 seizures from all 21 patients yields a comparable sensitivity. In addition, our method has lower computational complexity compared with this algorithm which used four EEG features. Table 4 depicts a comparison on the results between their methods and ours. Compared to other seizure detection systems, the proposed algorithm yields a higher sensitivity and better performance.
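The smoothing and collar post-processing discussed in this section can be illustrated with a small sketch. It is not the implementation used in this study: the per-epoch score (assumed here to be positive for seizure-like epochs, for example the nonseizure residual minus the seizure residual), the moving-average window, the threshold and the collar length are all hypothetical choices made for illustration, and the function name `smooth_and_collar` is ours.

```python
import numpy as np

def smooth_and_collar(scores, window=3, threshold=0.0, collar=1):
    """Illustrative post-processing of per-epoch decision scores.

    scores    : 1-D sequence, one value per epoch (assumed > 0 for seizure-like epochs)
    window    : moving-average length in epochs (assumed value)
    threshold : decision threshold applied after smoothing (assumed value)
    collar    : number of epochs by which each detection is extended on both sides
    """
    scores = np.asarray(scores, dtype=float)

    # Smoothing: a moving average suppresses isolated "burrs" in the score sequence.
    kernel = np.ones(window) / window
    smoothed = np.convolve(scores, kernel, mode="same")

    # Thresholding gives a binary seizure / nonseizure decision per epoch.
    decisions = smoothed > threshold

    # Collar: extend every detected epoch by `collar` epochs in both directions
    # to compensate for the boundary shrinkage caused by smoothing.
    extended = decisions.copy()
    for k in range(1, collar + 1):
        extended[k:] |= decisions[:-k]
        extended[:-k] |= decisions[k:]
    return extended

# Toy example: one isolated burr at epoch 3 and a sustained event at epochs 10..14.
toy_scores = -np.ones(20)
toy_scores[3] = 1.0
toy_scores[10:15] = 1.0
toy_detections = smooth_and_collar(toy_scores, window=3, threshold=0.0, collar=1)
```

In this toy sequence the isolated burr is removed by the smoothing step, while the sustained event is kept and widened by one epoch on each side by the collar.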

6. Conclusion

In this work, we presented a novel method of seizure detection in iEEG recordings using the wavelet transform, CR and the kernel trick. Unlike conventional seizure detection methods, the proposed algorithm avoids the calculation and choice of EEG features. The testing samples are sparsely represented over the training dataset, and the residuals between the raw signal and its reconstruction with each class are compared to classify the sample as seizure or nonseizure. Applying l2-minimization in KCR greatly reduces the computational burden, and the kernel trick makes the method more effective by improving the separability of EEG signals in a high-dimensional space. The experimental results show that the proposed method achieves a high segment-based sensitivity of 94.41% and an event-based sensitivity of 96.03%. The satisfying results and the speed of the algorithm make it possible to use it for real-time seizure detection in clinical applications.
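The classification rule summarized above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the code used in this study: it assumes a Gaussian kernel, uses the standard regularized least squares closed form for the coefficients, and compares plain class-wise residuals in the kernel feature space. The class name `KCRSketch`, the parameters `lam` and `gamma`, and the toy data are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

class KCRSketch:
    def __init__(self, X_train, labels, lam=0.01, gamma=0.5):
        self.X = np.asarray(X_train, dtype=float)
        self.labels = np.asarray(labels)
        self.gamma = gamma
        # Training stage: the kernel Gram matrix and the projection matrix
        # (K + lam * I)^(-1) are computed once and reused for every test sample.
        self.K = gaussian_kernel(self.X, self.X, gamma)
        self.P = np.linalg.inv(self.K + lam * np.eye(len(self.X)))

    def residuals(self, y):
        """Class-wise reconstruction residuals of one test sample y."""
        y = np.asarray(y, dtype=float)[None, :]
        k_y = gaussian_kernel(self.X, y, self.gamma)[:, 0]
        # Testing stage: the l2-regularized coefficients are a single
        # matrix-vector product, with no iterative optimization.
        alpha = self.P @ k_y
        out = {}
        for c in np.unique(self.labels):
            idx = self.labels == c
            a = alpha[idx]
            # Squared feature-space residual: k(y, y) - 2 a^T k_c + a^T K_cc a,
            # where k(y, y) = 1 for the Gaussian kernel.
            r2 = 1.0 - 2.0 * (a @ k_y[idx]) + a @ self.K[np.ix_(idx, idx)] @ a
            out[c.item()] = float(np.sqrt(max(r2, 0.0)))
        return out

    def predict(self, y):
        """Assign y to the class with the smallest reconstruction residual."""
        r = self.residuals(y)
        return min(r, key=r.get)

# Toy data: two well-separated clusters standing in for the two classes.
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(0.0, 0.1, (10, 4)), rng.normal(3.0, 0.1, (10, 4))])
labels_toy = np.array([0] * 10 + [1] * 10)
model = KCRSketch(X_toy, labels_toy)
```

The design point the sketch is meant to show is that the matrix inversion happens only in the constructor, so classifying a new epoch costs one kernel evaluation against the training set and one matrix-vector product.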


Acknowledgments

The support of the Key Program of Natural Science Foundation of Shandong Province (No. ZR2013FZ002), the Program of Science and Technology of Suzhou (No. ZXY2013030), the Development Program of Science and Technology of Shandong (No. 2014GSF118171), and the Fundamental Research Funds of Shandong University (No. 2012DX008, 11170074611102) is gratefully acknowledged.


References

1. W. A. Hauser, J. F. Annegers and W. A. Rocca, Descriptive epidemiology of epilepsy: Contributions of population-based studies from Rochester, Minnesota, Mayo Clin. Proc. 71(6) (1996) 578–586.
2. S. Sanei and J. A. Chambers, EEG Signal Processing (John Wiley & Sons Ltd, Chichester, 2007), p. 161.
3. R. A. B. Badawy, G. D. Jackson, S. F. Berkovic and R. A. L. Macdonell, Cortical excitability and refractory epilepsy: A three-year longitudinal transcranial magnetic stimulation study, Int. J. Neural Syst. 23(1) (2013) 1250030.
4. R. A. B. Badawy, S. J. Vogrin, A. Lai and M. J. Cook, On the midway to epilepsy: Is cortical excitability normal in patients with isolated seizures? Int. J. Neural Syst. 24(2) (2014) 1430002.
5. T. Gandhi, B. K. Panigrahi, M. Bhatia and S. Anand, Expert model for detection of epileptic activity in EEG signature, Expert Syst. Appl. 37(4) (2010) 3513–3520.
6. K. Vonck, M. Sprengers, E. Carrette, I. Dauwe, M. Miatton, A. Meurs, L. Goossens, V. D. Herdt, R. Achten, E. Thiery, R. Raedt, D. V. Roost and P. Boon, A decade of experience with deep brain stimulation for patients with refractory medial temporal lobe epilepsy, Int. J. Neural Syst. 23(1) (2013) 1250034.
7. H. Adeli and S. Ghosh-Dastidar, Automated EEG-Based Diagnosis of Neurological Disorders: Inventing the Future of Neurology (CRC Press, Florida, 2010), p. 72.
8. M. Ahmadlou and H. Adeli, Wavelet-Synchronization Methodology: A new approach for EEG-based diagnosis of ADHD, Clin. EEG Neurosci. 41(1) (2010) 1–10.
9. M. Ahmadlou and H. Adeli, Functional community analysis of brain: A new approach for EEG-based investigation of the brain pathology, Neuroimage 58(2) (2011) 401–408.
10. M. Ahmadlou, H. Adeli and A. Adeli, Fractality and a wavelet-chaos-neural network methodology for EEG-based diagnosis of autistic spectrum disorder, J. Clin. Neurophysiol. 27(5) (2010) 328–333.
11. Z. Sankari and H. Adeli, Probabilistic neural networks for EEG-based diagnosis of Alzheimer’s disease using conventional and wavelet coherence, J. Neurosci. Methods 197(1) (2011) 165–170.
12. G. Lee, M. Kwon, S. Kavuri and M. Lee, Action-perception cycle learning for incremental emotion recognition in a movie clip using 3D fuzzy GIST based on visual and EEG signals, Integr. Comput.-Aided Eng. 21(3) (2014) 295–310.
13. C. Zhang, H. Wang, H. Wang and M. Wu, EEG-based expert system using complexity measures and probability density function control in alpha subband, Integr. Comput.-Aided Eng. 20(4) (2013) 391–405.
14. L. B. Good, S. Sabesan, S. T. Marsh, K. S. Tsakalis and L. D. Iasemidis, Control of synchronization of brain dynamics leads to control of epileptic seizures in rodents, Int. J. Neural Syst. 19(3) (2009) 173–196.
15. Z. Sankari, H. Adeli and A. Adeli, Intrahemispheric, interhemispheric and distal EEG coherence in Alzheimer’s disease, Clin. Neurophysiol. 122(5) (2011) 897–906.
16. M. Ahmadlou and H. Adeli, Fuzzy synchronization likelihood with application to attention-deficit/hyperactivity disorder, Clin. EEG Neurosci. 42(1) (2011) 6–13.
17. M. Ahmadlou, H. Adeli and A. Adeli, Spatiotemporal analysis of relative convergence (STARC) of EEGs reveals differences between brain dynamics of depressive women and men, Clin. EEG Neurosci. 44(3) (2013) 175–181.
18. A. Shoeb, J. Guttag, T. Pang and S. Schachter, Noninvasive computerized system for automatically initiating vagus nerve stimulation following patient specific detection of seizures or epileptiform discharges, Int. J. Neural Syst. 19(3) (2009) 157–172.
19. M. Ahmadlou, H. Adeli and A. Amir, Graph theoretical analysis of organization of functional brain networks in ADHD, Clin. EEG Neurosci. 43(1) (2012) 5–13.
20. G. Rodríguez-Bermúdez, P. J. García-Laencina and J. Roca-Dorda, Efficient automatic selection and combination of EEG features in least squares classifiers for motor-imagery brain computer interfaces, Int. J. Neural Syst. 23(4) (2013) 1350015.
21. F. Cong, A. H. Phan, P. Astikainen, Q. Zhao, Q. Wu, J. K. Hietanen, T. Ristaniemi and A. Cichocki, Multi-domain feature extraction for small event-related potentials through non-negative multi-way array decomposition from low dense array EEG, Int. J. Neural Syst. 23(2) (2013) 1350006.
22. L. J. Herrera, C. M. Fernandes, A. M. Mora, D. Migotina, R. Largo, A. Guillen and A. C. Rosa, Combination of heterogeneous EEG feature extraction methods and stacked sequential learning for sleep stage classification, Int. J. Neural Syst. 23(3) (2014) 1350012.


23. K. Lehnertz, Non-linear time series analysis of intracranial EEG recordings in patients with epilepsy - An overview, Int. J. Psychophysiol. 34(1) (1999) 45–52.
24. U. R. Acharya, S. V. Sree, S. Chattopadhyay, W. Yu and P. C. A. Ang, Application of recurrence quantification analysis for the automated identification of epileptic EEG signals, Int. J. Neural Syst. 21(3) (2011) 199–211.
25. T. S. Nelson, C. L. Suhr, D. R. Freestone, A. Lai, A. J. Halliday, K. J. McLean, A. N. Burkitt and M. J. Cook, Closed-loop seizure control with very high frequency electrical stimulation at seizure onset in the GAERS model of absence epilepsy, Int. J. Neural Syst. 21(2) (2011) 163–173.
26. S. Ghosh-Dastidar, H. Adeli and N. Dadmehr, Mixed-band wavelet-chaos-neural network methodology for epilepsy and epileptic seizure detection, IEEE Trans. Biomed. Eng. 54(9) (2007) 1545–1551.
27. S. Ghosh-Dastidar and H. Adeli, Improved spiking neural networks for EEG classification and epilepsy and seizure detection, Integr. Comput.-Aided Eng. 14(3) (2007) 187–212.
28. A. Temko, G. Boylan, W. Marnane and G. Lightbody, Robust neonatal EEG classification through adaptive background modeling, Int. J. Neural Syst. 23(4) (2013) 1350018.
29. U. R. Acharya, R. Yanti, Z. J. Wei, M. M. Ramakrishnan, T. J. Hong, R. J. Martis and L. C. Min, Automated diagnosis of epilepsy using CWT, HOS and Texture parameters, Int. J. Neural Syst. 23(3) (2013) 1350009.
30. R. J. Martis, U. R. Acharya, J. H. Tan, A. Petznick, L. Tong, C. K. Chua and E. Y. Ng, Application of intrinsic time-scale decomposition (ITD) to EEG signals for automated seizure prediction, Int. J. Neural Syst. 23(5) (2013) 1350023.
31. D. Sherman, N. Zhang, S. Garg, N. V. Thakor, M. A. Mirski, M. A. White and M. J. Hinich, Detection of nonlinear interactions of EEG alpha waves in the brain by a new coherence measure and its application to epilepsy and anti-epileptic drug therapy, Int. J. Neural Syst. 21(2) (2011) 115–126.
32. Z. Sankari, H. Adeli and A. Adeli, Wavelet coherence model for diagnosis of Alzheimer’s disease, Clin. EEG Neurosci. 43(3) (2012) 268–278.
33. S. Ghosh-Dastidar, H. Adeli and N. Dadmehr, Principal component analysis-enhanced cosine radial basis function neural network for robust epilepsy and seizure detection, IEEE Trans. Biomed. Eng. 55(2) (2008) 512–518.
34. S. Ghosh-Dastidar and H. Adeli, A new supervised learning algorithm for multiple spiking neural networks with application in epilepsy and seizure detection, Neural Netw. 22 (2009) 1419–1431.
35. J. Gotman, Automatic recognition of epileptic seizures in the EEG, Electroencephalogr. Clin. Neurophysiol. 54(5) (1982) 530–540.
36. A. M. Murro, D. W. King, J. R. Smith, B. B. Gallagher, H. F. Flanigin and K. Meador, Computerized seizure detection of complex partial seizures, Electroencephalogr. Clin. Neurophysiol. 79(4) (1991) 330–333.
37. H. Adeli, S. Ghosh-Dastidar and N. Dadmehr, Alzheimer’s disease and models of computation: Imaging, classification, and neural models, J. Alzheimers Dis. 7(3) (2005) 187–199.
38. H. Adeli, S. Ghosh-Dastidar and N. Dadmehr, A spatio-temporal wavelet-chaos methodology for EEG-based diagnosis of Alzheimer’s disease, Neurosci. Lett. 444(2) (2008) 190–194.
39. U. R. Acharya, S. V. Sree, P. C. A. Ang, R. Yanti and S. Jasjit, Application of non-linear and wavelet based features for the automated identification of epileptic EEG signals, Int. J. Neural Syst. 22(2) (2012) 1250002.
40. V. Srinivasan, C. Eswaran and N. Sriraam, Approximate entropy based epileptic EEG detection using artificial neural networks, IEEE Trans. Inf. Technol. Biomed. 11(3) (2007) 288–295.
41. E. D. Übeyli, Lyapunov exponents/probabilistic neural networks for analysis of EEG signals, Expert Syst. Appl. 37(2) (2010) 985–992.
42. U. R. Acharya, S. V. Sree and J. S. Suri, Automatic detection of epileptic EEG signals using higher order cumulant features, Int. J. Neural Syst. 21(5) (2011) 403–414.
43. Y. Wang, W. Zhou, Q. Yuan, X. Li, Q. Meng, X. Zhao and J. Wang, Comparison of ictal and interictal EEG signals using fractal features, Int. J. Neural Syst. 23(6) (2013) 1350028.
44. H. Adeli, Z. Zhou and N. Dadmehr, Analysis of EEG records in an epileptic patient using wavelet transform, J. Neurosci. Methods 123(1) (2003) 69–87.
45. Y. U. Khan and J. Gotman, Wavelet based automatic seizure detection in intracerebral electroencephalogram, Clin. Neurophysiol. 114(5) (2003) 898–908.
46. Y. Liu, W. Zhou, Q. Yuan and S. Chen, Automatic seizure detection using wavelet transform and SVM in long-term intracranial EEG, IEEE Trans. Neural Syst. Rehabil. Eng. 20(6) (2012) 749–755.
47. D. L. Donoho, Compressed sensing, IEEE Trans. Inform. Theory 52(4) (2006) 1289–1306.
48. E. J. Candes and M. B. E. Wakin, An introduction to compressive sampling, IEEE Signal Process. Mag. 25(2) (2008) 21–30.
49. J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry and Y. Ma, Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. 31(2) (2009) 210–227.


50. J. Yang, K. Yu, Y. Gong and T. Huang, Linear spatial pyramid matching using sparse coding for image classification, in Proc. CVPR (2009), pp. 1794–1801.
51. Q. Yuan, W. Zhou, S. Yuan, X. Li, J. Wang and G. Jia, Epileptic EEG Classification Based On Kernel Sparse Representation, Int. J. Neural Syst. 24(4) (2014) 1450015.
52. L. Zhang, M. Yang and X. Feng, Sparse representation or collaborative representation: Which helps face recognition? in Proc. IEEE Int. Conf. Computer Vision (2011), pp. 471–478.
53. K. R. Müller, S. Mika, G. Rätsch, K. Tsuda and B. Schölkopf, An introduction to kernel-based learning algorithms, IEEE Trans. Neural. Netw. 12(2) (2001) 181–201.
54. L. Zhang, W. Zhou, P. Chang, J. Liu, Z. Yan, T. Wang and F. Li, Kernel sparse representation-based classifier, IEEE Trans. Signal Process. 60(4) (2012) 1684–1695.
55. C. J. C. Burges, A tutorial on support vector machines for pattern recognition, Data Min. Knowl. Discov. 2(2) (1998) 121–167.
56. D. Li, L. Xu, E. D. Goodman, Y. Xu and Y. Wu, Integrating a statistical background-foreground extraction algorithm and SVM classifier for pedestrian detection and tracking, Integr. Comput.-Aided Eng. 20(3) (2013) 201–216.
57. B. Schölkopf, A. Smola and K. R. Müller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput. 10(5) (1998) 1299–1319.
58. S. Mika, G. Rätsch, J. Weston, B. Schölkopf and K. R. Müller, Fisher discriminant analysis with kernels, in Neural Networks for Signal Processing IX (1999), pp. 41–48.
59. Freiburg Seizure Prediction Project, Freiburg, Germany, Available at: http://epilepsy.uni-freburg.de/freiburg-seizure-prediction-project/eeg-database.
60. T. Kalayci and O. Ozdamar, Wavelet preprocessing for automated neural network detection of EEG spikes, IEEE Eng. Med. Biol. 14(2) (1995) 160–166.
61. A. Subasi, EEG signal classification using wavelet feature extraction and a mixture of expert model, Expert Syst. Appl. 32(4) (2007) 1084–1093.
62. S. Yuan, W. Zhou, Q. Yuan, Y. Zhang and Q. Meng, Automatic seizure detection using diffusion distance and BLDA in intracranial EEG, Epilepsy Behav. 31 (2014) 339–345.
63. W. Zhou, Y. Liu, Q. Yuan and X. Li, Epileptic Seizure Detection Using Lacunarity and Bayesian Linear Discriminant Analysis in Intracranial EEG, IEEE Trans. Biomed. Eng. 60(12) (2013) 3375–3381.
64. K. K. Majumdar and P. Vardhan, Automatic seizure detection in ECoG by differential operator and windowed variance, IEEE Trans. Neural Syst. Rehabil. Eng. 19(4) (2011) 356–365.
65. J. Yin, Z. Liu, Z. Jin and W. Yang, Kernel sparse representation based classification, Neurocomputing 77(1) (2012) 120–128.
66. S. Yang, Y. Han and X. R. Zhang, A sparse kernel representation method for image classification, IJCNN (2012), pp. 1–7.
67. A. Temko, E. Thomas, W. Marnane, G. Lightbody and G. Boylan, EEG-based neonatal seizure detection with Support Vector Machines, Clin. Neurophysiol. 122(3) (2011) 464–473.
68. E. C. Chua, K. Patel, M. Fitzsimons and C. J. Bleakley, Improved patient specific seizure detection during pre-surgical evaluation, Clin. Neurophysiol. 122(4) (2011) 672–679.
69. S. Raghunathan, A. Jaitli and P. P. Irazoqui, Multistage seizure detection techniques optimized for low-power hardware platforms, Epilepsy Behav. 22 (2011) 61–68.
70. A. F. Rabbi and R. Fazel-Rezai, A fuzzy logic system for seizure onset detection in intracranial EEG, Comput. Intell. Neurosci. 2012 (2012) 1.
