Med Biol Eng Comput (2015) 53:609–622 DOI 10.1007/s11517-015-1264-0

ORIGINAL ARTICLE

Support vector machine and fuzzy C‑mean clustering‑based comparative evaluation of changes in motor cortex electroencephalogram under chronic alcoholism

Surendra Kumar¹ · Subhojit Ghosh² · Suhash Tetarway³ · Rakesh Kumar Sinha¹

Received: 1 May 2014 / Accepted: 27 February 2015 / Published online: 13 March 2015 © International Federation for Medical and Biological Engineering 2015

Abstract  In this study, the magnitude and spatial distribution of the frequency spectrum of the resting electroencephalogram (EEG) were examined to address the problem of detecting alcoholism in the cerebral motor cortex. EEG signals were recorded from subjects with chronic alcoholism (n = 20) and from a control group (n = 20). Data were taken from the motor cortex region and divided into five subbands (delta, theta, alpha, beta-1 and beta-2). Three methodologies were adopted for feature extraction: (1) absolute power, (2) relative power and (3) peak power frequency. The dimension of the extracted features was reduced by linear discriminant analysis, and the features were classified by support vector machine (SVM) and fuzzy C-mean clustering. The maximum classification accuracy (88 %) with the SVM classifier was achieved with the EEG spectral features of absolute power on the F4 channel. Among the bands, relatively higher classification accuracy was found for the theta and beta-2 bands in most of the channels when computed with the EEG features of relative power. Electrode-wise, CZ, C3 and P4 showed the largest alterations. Considering the good classification accuracy obtained by SVM with relative band power features in most of the EEG channels of the motor cortex, the results suggest that a noninvasive, automated online diagnostic system for chronic alcoholism can be developed with the help of EEG signals.

Keywords  Alcohol · Cerebral motor cortex · Electroencephalogram · Support vector machine · Fuzzy C-mean clustering

* Correspondence: Rakesh Kumar Sinha
1 Department of Bio‑Engineering, Birla Institute of Technology, Mesra, Ranchi 835215, Jharkhand, India
2 Department of Electrical Engineering, National Institute of Technology, Raipur 492010, Chhattisgarh, India
3 Department of Physiology, Rajendra Institute of Medical Sciences, Ranchi 834009, Jharkhand, India

1 Introduction

Alcoholism is considered a primary chronic disease characterized by impaired control over drinking. Common symptoms associated with alcoholism are craving (a strong need to drink), impaired control (the inability to limit drinking) and physical dependence (withdrawal symptoms such as nausea, sweating, shakiness and anxiety when alcohol use is stopped after a long period of heavy drinking) [4, 10, 11]. Although the short-term effects of alcohol may be limited to slurred speech, nausea, vomiting and disturbed sleep, chronic drinking may permanently damage the vital functions of many important organs, including the brain, kidney and liver [15, 21]. In addition, alcohol dependency is associated with hypertension, coronary heart disease and cancer in different organs [12]. The existing literature also suggests that long-term heavy drinking can lead to shrinking of the brain and cause deficiencies in the fibers that carry information between brain cells, resulting in a wide range of neurological and neuropsychiatric disorders [8, 24, 25, 28]. It is often difficult for clinicians and researchers to identify subjects suffering from alcoholism. The


subjects may complain about their digestion, pain or weakness but, owing to social issues, hardly ever reveal their abuse of alcohol. A doctor who suspects an alcohol problem may ask a series of questions to confirm alcohol dependence, but denial is in general a hallmark of alcoholism [27]. It is also important to note that blood and urine alcohol tests are not useful in diagnosing chronic alcoholism because these tests indicate consumption only within a limited time frame after alcohol intake. On the other hand, information extracted from the cerebral cortical activity of alcoholic subjects may provide a concrete basis for establishing alcoholism [6, 17]. The mood-altering mechanisms in alcoholism and its behavioral effects are not yet completely explored. Understanding the neuropsychological mechanisms underlying alcohol craving is important for the effective analysis and treatment of alcohol dependence [1]. The brain signal, or electroencephalogram (EEG), is an established dynamic index of cortical activation, cognitive function and consciousness and is, therefore, an intermediate phenotype for many behaviors in which arousal is implicated [29, 31]. The constantly changing EEG patterns depend on several factors associated with both the internal and the external environment, and the EEG can also provide long-term insight into the psychophysiological dynamics of many chronic disorders, including alcoholism. Advanced digital signal processing and soft computing tools are important for establishing definite EEG spectral variations as a marker of alcoholism, as has been demonstrated in various psychopathological conditions. Alcohol addiction can be examined through the existing differences in the spectral properties of the EEG. It is established that alcohol affects the motor system most prominently, and thus the cerebral motor cortex region is considered to be the most vulnerable to alcoholism [5].
In this context, the present work examines the EEG spectral changes, if any, in different motor cortex regions using data recorded from the C3, C4, CZ, P3, P4, PZ, F3, F4 and FZ electrodes. The study aims to examine the utility of the EEG for detecting changes in the brain electrical activity of alcoholic persons. Further, it is designed to examine the differences in magnitude and distribution of the EEG frequency bands and to determine whether an alteration in a particular band of the EEG frequency spectrum is a consequence of alcohol use. Furthermore, using these alcoholic EEG data, a procedure based on the combined framework of power spectrum, linear discriminant analysis (LDA), support vector machine (SVM) and fuzzy C-mean (FCM) clustering is proposed for the identification of EEG spectral changes due to alcoholism.


2 Materials and methods

2.1 Subjects

This study considers a total of 40 right-handed male subjects between 32 and 38 years of age, equally divided into two groups: (1) alcoholic and (2) control. The alcoholic subjects had regularly consumed the same variety of alcohol, known as Mahuwa and fermented from the flowers of Madhuca longifolia, for more than 10 years, and it was ensured that they had no other addiction. The average alcohol content of this beverage is 13.45 % (w/v) [2]. These subjects were clinically confirmed for chronic alcoholism and were selected from the local community with the help of an expert clinician. It was ensured that they had not consumed alcohol on the day of the EEG recording before the recording took place, so that a permanent marker of alcohol use could be established. Healthy persons who had never taken any kind of alcohol or tobacco in their lifetime were considered as control subjects.

2.2 Data recording

The research methodology was approved by the Institutional Ethical Committee of Rajendra Institute of Medical Sciences, Ranchi, India. The experiments were conducted following the ‘Ethical guidelines for Biomedical Research on Human Subjects’ of the Indian Council of Medical Research. The EEG data were recorded using an RMS system (Recorders & Medicare Systems Pvt. Ltd., India). EEG electrodes were placed following the 10–20 electrode placement system, and 19 channels of EEG signals were recorded. The signals were digitized at a sampling frequency of 256 Hz. The unipolar reference was linked to the right and left earlobes, and the nasion electrode was used as ground. Recording was carried out in a radiofrequency (RF)-shielded, soundproof room with controlled temperature (24 ± 1 °C) and a relative humidity of 45–50 %. The electrode impedances were maintained below 5 kΩ during signal recording. Each EEG record was visually examined and further preprocessed to remove common artifacts.
The data were acquired with an amplification gain of 10^4. In the present work, the fast Fourier transform (FFT) is applied to 2 s epochs. A fourth-order Butterworth bandpass filter of 0.5–40 Hz was applied to the EEG epochs before calculation of the FFT.

2.3 Feature extraction by FFT

The frequency spectrum approach using the FFT, which transforms a signal from the time domain to the frequency domain, is widely used for extracting features from EEG signals. The spectrum can be considered a harmonic decomposition of the variations in the signal and is, hence, equivalent to the


analysis of the distribution of the time-domain properties of a particular electrophysiological phenomenon. The FFT treats the information contained in a signal as a random process to describe the dominance of various frequency components. The FFT of each time-domain epoch was decomposed into five frequency sub-bands: delta (0.5–3.99 Hz), theta (4–7.99 Hz), alpha (8–11.99 Hz), beta-1 (12–15.99 Hz) and beta-2 (16–30 Hz), to obtain quantitative measures of the contribution of the various frequencies contained in the EEG signal. In the literature, the EEG frequency bands are also denoted by the symbols δ (delta), θ (theta), α (alpha), β-1 (beta-1) and β-2 (beta-2), respectively. The Cooley–Tukey algorithm [23] has been used for the calculation of the FFT. The algorithm provides a faster approach for calculating the discrete Fourier transform (DFT) by breaking the DFT into a number of DFTs of smaller length, based on the following formula:

$$X(k) = \sum_{j=1}^{N} x(j)\,\omega_N^{(j-1)(k-1)} \qquad (1)$$

where $\omega_N = \exp(-2\pi i/N)$ and $x(j)$ is the time sequence to which the transformation is applied. After the evaluation of the DFT sequence, the power is computed as

$$P = \frac{1}{N}\sum_{k=1}^{N} x^{2}(k) \qquad (2)$$

where N is the number of data points in the sequence. Following sub-band power evaluation, the feature vectors are calculated in terms of absolute power, relative power and peak power frequency from their distribution, defined as:

(a) Absolute power  The actual power in the EEG sub-band is computed as

$$\text{Absolute power (Delta)} = \sum_{k=1}^{205} P_x(k) \qquad (3)$$

(b) Relative power  The relative power is the ratio of the total power in a band to the total power in the signal, represented as

$$\text{Relative power (Delta)} = \left(\sum_{k=1}^{205} P_x(k) \div \sum_{k=1}^{1024} P_x(k)\right) \times 10 \qquad (4)$$

(c) Peak power frequency  The peak power frequency (PPF) is the frequency at which the peak power is obtained in a given sub-band. Figure 1 presents the comparison between the raw EEG data from chronic alcoholic and control subjects along with their power spectra.

Fig. 1  A sample diagram of the raw EEG data (2 s epochs) and their power spectrum of control and alcoholic subjects taken from CZ electrode. a EEG of CZ electrode position from control subjects. b Power spectral distribution of CZ electrode position from control subjects. c EEG of CZ electrode position from alcoholic subjects. d Power spectral distribution of CZ electrode position from alcoholic subjects

The high-dimensional feature vector obtained after feature extraction might contain superfluous information that is not useful for classification. Therefore, for detecting alcoholism, the extracted features were conditioned by normalization, and dimension reduction was carried out by LDA. The LDA, SVM and FCM clustering algorithms were used for determining the effects of alcohol on the EEG frequency spectrum.

2.4 Statistical methods

In this study, the feature vectors of absolute power, relative power and peak power frequency of each band of the EEG frequency spectrum in each derivation were used to normalize the distribution of power and coherence values, respectively. To assess the hypothesized differences between the EEG power spectra of the chronic alcoholic subjects and those of the control subjects, the extracted features were analyzed on each frequency band using Student's t test with a grouping factor and a within-subject factor (electrode position).

2.5 Dimension reduction using linear discriminant analysis

When combinations of different frequency bands (31 combinations of five sub-frequency bands) are used, the feature vector has a high dimension. Such features require large memory space as well as computational time to train the classifier, and hence dimension reduction is required. LDA is a popular dimension reduction algorithm that aims at maximizing the covariance between different classes while minimizing the covariance within each class, so that class separability is preserved [7]. LDA searches for the projection axes on which the data points of different classes are far apart, while the data points of the same class are close together. This optimized projection can be computed by eigendecomposition of the scatter matrices of the given data. For a stable solution through LDA, the scatter matrix is required to be nonsingular, which makes LDA infeasible for cases in which the number of features is larger than the number of samples. To overcome this singularity problem, dimension reduction is done in two stages. In the first stage, the dimension is reduced by principal component analysis (PCA) and singular value decomposition (SVD). SVD(X) produces a diagonal matrix S of the same dimension as X, with nonnegative diagonal elements in decreasing order, and unitary matrices U and V, such that $X = U S V^{T}$. Following SVD, the eigenvalues and eigenvectors are calculated for the dataset. In the second stage, LDA is used. The objective function of LDA is given as:

$$A = \arg\max_{a} \frac{a^{T} S_b\, a}{a^{T} S_w\, a} \qquad (5)$$

$$S_b = \sum_{k=1}^{c} m_k\,(\mu_k - \mu)(\mu_k - \mu)^{T} \qquad (6)$$

$$S_w = \sum_{k=1}^{c}\sum_{i=1}^{m_k} (\mu_{ik} - \mu_k)(\mu_{ik} - \mu_k)^{T} \qquad (7)$$

where A is the transformation vector, $\mu$ is the total sample mean vector, $\mu_k$ is the mean vector of the kth class, $m_k$ is the number of samples in the kth class, $\mu_{ik}$ is the ith sample in the kth class, $S_b$ is the inter-class scatter matrix and $S_w$ is the intra-class scatter matrix.

The eigenvector with the highest eigenvalue of the matrix S ($S = S_w^{-1} S_b$) provides the direction of best class separation. LDA reduces the original feature space from M dimensions to N dimensions (M > N). A new dataset y is created as a linear combination of all input features x with weights W based on the following equation:

$$Y = x^{T} W \qquad (8)$$

where $W = [w_1, w_2, \ldots, w_M]$ is the matrix formed from the M eigenvectors of the matrix S that have the highest eigenvalues.

2.6 Support vector machine as classifier

Support vector machine (SVM) is a supervised learning model based on statistical learning theory. In classification by SVM, a set of hyper-planes is constructed in a high-dimensional space. The hyper-plane is constructed by mapping the n-dimensional feature vector into a K-dimensional space via a nonlinear function Φ(x), with the aim of maximizing the margin between the two classes of data. For assigning data to two different classes, the hyper-plane equation is given as:

$$Y(x) = W^{T}\Phi(x) = \sum_{k=1}^{K} W_k \Phi_k(x) + W_0 \qquad (9)$$

where $W = [W_1, W_2, W_3, \ldots, W_K]$ is the weight vector and $W_0$ represents the bias weight. Proper separability is achieved by the hyper-plane that has the largest distance to the neighboring data points of both classes, with the use of a kernel function. Kernel functions are generally used for mapping the feature space to a high-dimensional space in which the classes are linearly separable. Different types of kernels exist, i.e., linear, quadratic, classical and Gaussian. For the present work, the linear kernel has been adopted.

2.7 Fuzzy C‑mean clustering as classifier

Clustering algorithms involve locating well-separated and compact clusters in a population. In conventional algorithms, a partition of the population Z = {(x_k, y_k) : k = 1, 2, …, M} of M elements is generated, so that each member of the population is assigned to a particular cluster. These algorithms use the ‘rigid partition’ derived from classical set theory, in which the degree to which a particular data element belongs to a cluster takes a value of either 0 or 1, with 0 indicating null membership and 1 indicating full membership. FCM is a method of clustering that allows one piece of data to belong to two or more clusters. Unlike conventional clustering, fuzzy clustering involves overlapping clusters, i.e., the membership degree to which a data element belongs to a particular cluster can attain any real value between 0 and 1. For a pre-defined number of clusters C, the elements of the population should fulfill the following conditions:

$$\text{(a)}\quad \mu_{ik} \in [0, 1], \qquad 1 \le i \le C,\ 1 \le k \le M$$

$$\text{(b)}\quad \sum_{i=1}^{C} \mu_{ik} = 1, \qquad 1 \le k \le M \qquad (10)$$

where $\mu_{ik}$ denotes the degree to which the kth data element belongs to cluster i. In the present work, the widely used FCM clustering algorithm [3] is applied to detect alterations in the FFT of motor cortex EEG signals due to chronic alcoholism by partitioning the space consisting of features extracted from individual subjects. The algorithm aims to minimize the distance between data points and cluster prototypes (centers) based on the following objective function [35]:

$$J = \sum_{i=1}^{C}\sum_{k=1}^{M} (\mu_{ik})^{m}\, d_{ik}^{2} \qquad (11)$$

where m > 1 is a weighting exponent that determines the degree of fuzziness of the resulting clusters and $d_{ik}$ is the distance between the data points and the cluster centers, given as:

$$d_{ik}^{2} = (Z_k - V_i)^{T}(Z_k - V_i) \qquad (12)$$

The algorithm is initialized with randomly selected cluster centers, which are iteratively updated to minimize the cost function. The elements of the membership matrix $\mu_{ik}$ and the vector of cluster centers $V_i$ are given as:

$$\mu_{ik} = \left[\sum_{j=1}^{C}\left(\frac{\lVert Z_k - V_i\rVert}{\lVert Z_k - V_j\rVert}\right)^{2/(m-1)}\right]^{-1} \qquad (13)$$

$$V_i = \frac{\sum_{k=1}^{M} (\mu_{ik})^{m} Z_k}{\sum_{k=1}^{M} (\mu_{ik})^{m}} \qquad (14)$$

The absolute power, relative power and peak power frequency of the frequency spectra of EEG epochs have been used to characterize the signal as a function of frequency sub-bands and to separate it into control and alcoholic groups. The data points obtained from the spectrum are normalized to the range [0, 1] before classification. Initially, the two cluster centers representing the alcoholic and control groups were selected randomly. This is followed by iteratively updating the centers, with the aim of minimizing the Euclidean distance between the data points represented in feature space and the cluster centers. The final convergence of fuzzy clustering is heavily dependent on the initially selected cluster centers. To overcome the initialization problem, each clustering process is repeated 100 times, with the cluster center for each group considered as the initial point for the subsequent iteration. A flow chart for the EEG data acquisition, feature extraction and classifier training and testing is shown in Fig. 2.
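As a rough illustration of the membership and center updates in Eqs. (13) and (14), the following minimal Python sketch clusters a handful of made-up, already normalized two-dimensional feature vectors into two groups. The function name `fuzzy_c_means`, its parameters, the synthetic data and the single seeded random initialization are assumptions made for illustration only. It is not the authors' implementation and it does not reproduce the 100 repetitions described in this section.

```python
import math
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy C-means on a list of equal-length feature lists.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which point k belongs to cluster i, so each column sums to 1 (Eq. 10).
    """
    rng = random.Random(seed)
    # Assumption: initial centers are data points drawn at random.
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    exponent = 2.0 / (m - 1.0)
    dim = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership update (Eq. 13).
        memberships = [[0.0] * len(points) for _ in range(n_clusters)]
        for k, z in enumerate(points):
            dists = [math.dist(z, v) for v in centers]
            if min(dists) == 0.0:
                # The point coincides with a center: full membership there.
                memberships[dists.index(0.0)][k] = 1.0
                continue
            for i in range(n_clusters):
                total = sum((dists[i] / dj) ** exponent for dj in dists)
                memberships[i][k] = 1.0 / total
        # Center update (Eq. 14): membership-weighted mean of the points.
        new_centers = []
        for i in range(n_clusters):
            weights = [u ** m for u in memberships[i]]
            norm = sum(weights)
            new_centers.append([
                sum(w * z[d] for w, z in zip(weights, points)) / norm
                for d in range(dim)
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once the centers no longer move
            break
    return centers, memberships


# Synthetic, normalized features in [0, 1]; not data from the study.
control = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.15], [0.12, 0.18]]
alcoholic = [[0.80, 0.85], [0.90, 0.80], [0.85, 0.90], [0.88, 0.82]]
points = control + alcoholic

centers, u = fuzzy_c_means(points)
# Hard label for each point: the cluster with the largest membership.
labels = [max(range(2), key=lambda i: u[i][k]) for k in range(len(points))]
print(labels)
```

Assigning each point to the cluster with the largest membership turns the fuzzy partition into a two-class decision. Which cluster index corresponds to which group depends on the random initialization, so the labels have to be matched to the known groups afterwards.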

3 Results

3.1 Changes in EEG frequency spectrum

The variation in the relative power coefficients between the control and alcoholic groups in the frequency domain for EEG epochs extracted from different electrodes is shown in Fig. 3, and the distribution of absolute power is depicted in Fig. 4. It has been observed from the percentage power distribution of the EEG frequency spectrum in the different subbands at the nine electrode positions that the delta power in chronic alcoholism is reduced with respect to the control subjects (p
