Detection of neuroelectric signals from multiple data channels by optimum linear filter methods.

Electroencephalography and Clinical Neurophysiology, 1975, 38: 191-198. © Elsevier Scientific Publishing Company, Amsterdam. Printed in The Netherlands.

TECHNICAL CONTRIBUTION

DETECTION OF NEUROELECTRIC SIGNALS FROM MULTIPLE DATA CHANNELS BY OPTIMUM LINEAR FILTER METHODS¹

MAHMOOD J. NAHVI, CHARLES D. WOODY, ROBERT UNGAR AND A. R. SHARAFAT

Aria Mehr University, Tehran (Iran) and UCLA Mental Retardation Center, Los Angeles, Calif. 90024 (U.S.A.)

(Accepted for publication: September 27, 1974)

It was recently shown (Woody and Nahvi 1973) that the theory of optimum linear filtering (Wiener 1933; Zadeh and Ragazzini 1952) can be used as a general mathematical descriptor for detection of certain cortical neuroelectric signals in noise. The signals examined were gross potentials recorded from a single locus in the coronal-precruciate cortex of the cat, and their appearance was related to the initiation of conditioned facial movements. Detection of the signal was a useful approximation of detection of the incipient movement. The signal-to-noise ratio at single cortical recording loci provided signal detection at 83% success levels ($P_d = 0.83$) with an associated false alarm probability of $P_f = 5.5 \times 10^{-3}$. However, this was inadequate for the signal detection levels required for the successful operation of a prosthetic device for motor control, the envisioned application of such a detection process. A possible means for improving detection capability might lie in combining optimum filter analyses for comparable data obtained from multiple cortical recording loci. This report examines the outcome of such an approach.

¹ Research supported by Ministry of Sciences and Higher Education, Tehran; U.S. Public Health Service HD-05958, HD-04612; and the Department of Mental Hygiene, State of California.

DETECTION AS A FUNCTION OF SIGNAL AND NOISE CHARACTERISTICS

When a signal, $s_0(t)$, of known shape and of finite duration, $T$, is embedded in stationary additive noise, $n_0(t)$, with zero mean, its presence can in theory be detected best by a linear matched filter (Wiener 1933). The filter is matched to the signal such that its impulse response, $h(t)$, is equal to $s_0(T - t)$. The output of the filter at the end of signal duration, i.e., at $t = T$, is $y_0 = S_0^2 + m_0$, where $S_0^2$ is the total energy of the signal and $m_0$ is the filtered noise output (Davenport and Root 1958). At this moment ($t = T$) the signal-to-noise ratio, $d$, at the output of the filter is maximum and may be computed from signal and noise characteristics (Nahvi 1974).

The output of the filter, $y_0$, is then passed through a threshold discriminator with level set according to a prespecified constraint such as minimization of a cost function (cf. Woody and Nahvi 1973). When the threshold level is exceeded, the signal is recognized. If the associated noise is gaussian, the relationship between probability of detection, $P_d$, probability of false alarm, $P_f$, and the signal-to-noise ratio, $d$, at the filter's output, is given by Fig. 1, A. A particular region or level of detection performance will be defined by a lower bound of $P_d$ together with an upper bound of $P_f$. Such a region is shown in Fig. 1, A, for $P_d \geq 0.999$ and $P_f \leq 10^{-5}$. This region was arbitrarily chosen as representing a satisfactory level of detection of signals for controlling the operation of a motor prosthesis. It is seen that to achieve such a detection level, the signal-to-noise ratio at the output of the matched filter must at the very least be greater than 7.8. Previous investigations in cat (Woody and Nahvi 1973) and in man (Vaughan et al. 1968) indicate that such a signal-to-noise ratio may not be realizable from data from any single cortical recording locus.

POSSIBLE IMPROVEMENT IN DETECTION BY INCORPORATING INFORMATION FROM SEVERAL DATA CHANNELS

Cortical recordings from the cat (Woody 1970) suggest that signals reflecting the occurrence of a particular facial movement may be recorded, simultaneously, at multiple ($K$) channel-loci over motor regions of the cortex. The separate channels might reasonably be assumed to have different noise characteristics and different signal-to-noise ratios. The possible extent of improvement in detection by combining information from multiple data channels and the effect on detection of correlated noise (i.e., dependency) between the channels can be defined by mathematical extension of the single channel detection theory. The characteristic waveshape of the individual signals can vary from locus to locus, as would be expected from biophysical considerations concerning differences in current generators and paths of extracellular current spread (Rall and Shepherd 1968), without affecting the mathematical formulation, since the signal template may be appropriately altered at each filter prior to combination of the filter outputs.

Fig. 1. A: Relationship between probability of signal detection ($P_d$), the probability of false alarm ($P_f$), and filtered output signal-to-noise ratio ($d$) for one single channel. The stippled region represents a useful working area of successful detections vs. errors, or false alarms, for the operation of a prosthetic device for motor control. The surrounding lined region, representing application of less rigorous criteria to definition of the useful working area, illustrates how changes in $P_d$, $P_f$, and $d$ interact in altering the profile of the working area following changes in its definition. B: Probability of detection, $P_d$, as a function of adding channels, while $P_f$ is kept constant ($P_f = 10^{-5}$). Each channel is taken as having $P_{d0} = 0.8$, $P_{f0} = 0.1$ ($d_0 = 2.1$).

The duration ($T$) of each signal, $s_i(t)$, could presumably be defined by physiologic investigations as outlined previously (cf. Woody 1967; Woody and Nahvi 1973). For purposes of the present formulations, the signal duration is taken equal for all signals. The length of signal duration does not have any significant effect on the outcome of the matched filter detector provided it is long enough to contain all signals' energies and provided that all signals occur within a certain time period.

The information contained in the filter output from $K$ data channels may be incorporated either before the decision concerning the presence or absence of a signal in each channel is made, for example by analog combination, or afterwards. The outcome of these decision schemes for $K$ data channels with independent and with dependent noise will now be considered.

1. Data channels with independent noise

a. Analog combination of K data channel outputs

The effect of incorporating the information from $K$ independent data channels by combining the analog outputs of the respective matched filters (Fig. 2, B) may be described as follows:

$$y = \sum_{i=1}^{K} y_i = \sum_{i=1}^{K} S_i^2 + \sum_{i=1}^{K} m_i = S^2 + m \qquad \text{(Eq. 1)}$$

where $y$ is the combined filter output, $S_i^2$ is signal energy in the $i$-th data channel, $S^2$ is the total energy of all signals, $m_i$ is the noise output of the $i$-th matched filter and $m$ is the sum of all noise outputs. If we assume noise of each channel to be independent from that of the other channels, then the noise power, $\sigma^2$, in the combined filter output is:

$$E[m^2] = \sigma^2 = \sum_{i=1}^{K} E[m_i^2] = \sum_{i=1}^{K} \sigma_i^2$$

and the maximum signal-to-noise ratio in $y$ is:

$$d = \frac{S^2}{\sigma} = \frac{\sum_{i=1}^{K} S_i^2}{\sqrt{\sum_{i=1}^{K} \sigma_i^2}}$$

If each channel has the same signal-to-noise ratio, $d_i = d_0$ for all $i$, normalization can be performed to achieve $\sigma_i^2 = \sigma_0^2$ for all $i$. Then:

$$\sum_{i=1}^{K} \sigma_i^2 = K \sigma_0^2$$

and

$$d = \frac{1}{\sqrt{K}\,\sigma_0} \sum_{i=1}^{K} S_i^2 = \frac{1}{\sqrt{K}} \sum_{i=1}^{K} d_i \qquad \text{(Eq. 3)}$$

where

$$d_i = \frac{S_i^2}{\sigma_0}$$


The case where each channel has a different signal-to-noise ratio can also be described by Eq. 3 after finding that combination of channels which, when added, will maximize $d$. Means for doing this are described later. To proceed with the general mathematical formulation, let all data channels be assumed to have similar $P_d$ and similar $P_f$ at some optimum threshold setting such as has been described for the single channel case (Woody and Nahvi 1973). Such an assumption facilitates the analysis, but does not limit application of the outcome, the latter being easily extendible to the case where channels have unequal $P_d$ and unequal $P_f$. Under this assumption

$$d = \sqrt{K}\, d_0 \qquad \text{(Eq. 4)}$$

and, thus, the total output signal-to-noise ratio of the combined channels is improved only by a factor of $\sqrt{K}$, $K$ being the number of channels aggregated. Fig. 1, B shows the resulting effect on $P_d$ of the aggregate, as channels are added, while $P_f$, the probability of a false alarm, is held constant at $10^{-5}$ with $d_0 = 2.1$.

In reality, when signals are recorded from several locations, some data channels will have an inferior signal-to-noise ratio with respect to that of others. The question arises to what degree the output of these "inferior" channels can be incorporated into the total aggregate without decrementing the output of the aggregate. Ideally, only the output of those data channels which would not degrade the resulting output signal-to-noise ratio of the aggregate should be included, that is, one should find that set of $\{\sigma_i\}$ and $\{S_i^2\}$ by which $d$ is maximized. Adding channels in order of their highest signal-to-noise ratios, it is seen that to achieve $d[\mathrm{MAX}]$, the signal-to-noise ratio of each added data channel should be no less than the signal-to-noise ratio of the aggregate divided by $\sqrt{K}$:

$$d = d_K[\mathrm{MAX}] = \frac{1}{\sqrt{K}} \sum_{j=1}^{K} d_j > d_{K-1}[\mathrm{MAX}]$$
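The relationship between $P_d$, $P_f$ and $d$ for gaussian noise, combined with Eq. 4, can be sketched numerically. The following is a minimal illustration, not the authors' procedure: it assumes the filter output is normalized to unit noise variance, so that a threshold detector obeys $d = \Phi^{-1}(1 - P_f) + \Phi^{-1}(P_d)$, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def snr_from_rates(pd, pf):
    """Output SNR d implied by (Pd, Pf) for a threshold detector in gaussian noise."""
    return _STD_NORMAL.inv_cdf(1.0 - pf) + _STD_NORMAL.inv_cdf(pd)


def pd_at_fixed_pf(d, pf):
    """Detection probability at SNR d when the threshold is set for false-alarm rate pf."""
    threshold = _STD_NORMAL.inv_cdf(1.0 - pf)
    return _STD_NORMAL.cdf(d - threshold)


def analog_pd(k, d0, pf):
    """Pd after analog summation of k identical, independent channels (Eq. 4)."""
    return pd_at_fixed_pf(sqrt(k) * d0, pf)


if __name__ == "__main__":
    d0 = snr_from_rates(0.8, 0.1)  # single-channel operating point used in Fig. 1, B
    print(f"d0 = {d0:.2f}")
    for k in (1, 5, 10, 15, 20):
        print(k, analog_pd(k, d0, 1e-5))
```

Under this assumed model, the single-channel operating point $P_{d0} = 0.8$, $P_{f0} = 0.1$ gives a $d_0$ close to the value of 2.1 quoted in the text.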

Consider the following example of addition of two channels with signal-to-noise ratios $d_1$ and $d_2$ respectively.

$$d = \frac{d_1 + d_2}{\sqrt{2}}$$

Assuming $d_1 > d_2$, we need $d_2 > (\sqrt{2} - 1)\, d_1$ if addition of $d_2$ is to result in improvement in the signal-to-noise ratio of the aggregate ($d$) over that of $d_1$. Table I shows values of $d$ for combination of two data channels for which, respectively, $d_1 = 10$ and $d_2$ varies from 1 to 30. It is observed that only for $5 \leq d_2 < 25$ does the combination of the two channels improve the output signal-to-noise ratio.
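The two-channel example and the rule of adding channels in order of decreasing signal-to-noise ratio can be checked with a short sketch. It assumes equal noise power in every channel, as in Eq. 3; `combined_snr` and `best_subset` are hypothetical helper names, and the greedy stopping rule is one simple reading of the text rather than a proven optimum.

```python
from math import sqrt


def combined_snr(snrs):
    """Eq. 3 with equal noise power in every channel: d = sum(d_i) / sqrt(K)."""
    return sum(snrs) / sqrt(len(snrs))


def best_subset(snrs):
    """Add channels in descending SNR order; stop once the aggregate stops improving.

    Assumes a non-empty list of single-channel SNRs.
    """
    ordered = sorted(snrs, reverse=True)
    chosen = [ordered[0]]
    for d in ordered[1:]:
        if combined_snr(chosen + [d]) <= combined_snr(chosen):
            break
        chosen.append(d)
    return chosen, combined_snr(chosen)


if __name__ == "__main__":
    d1 = 10
    for d2 in (1, 4, 5, 10, 24, 25, 30):
        d = combined_snr([d1, d2])
        print(d2, round(d, 2), "improves" if d > max(d1, d2) else "no gain")
```

The comparison against `max(d1, d2)` reflects that a combination is only worthwhile if it beats the better of the two channels taken alone.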

TABLE I

Values of d for analog combination of two data channels.

d1     d2     d = (d1 + d2)/√2
10      1      7.7
10      2      8.4
10      3      9.1
10      4      9.8
10      5     10.5
10      6     11.2
10      7     11.9
10      8     12.6
10      9     13.3
10     10     14.0
10     15     17.7
10     20     21.2
10     25     24.8
10     30     28.2

b. Aggregation of binary outputs of K data channels

Another approach for combining the information from $K$ data channels might be to combine their separate binary outputs derived from making an initial decision on the presence or absence of a signal at each channel. A simple approach to the analysis of this case might be to apply a majority combination decision rule (to be called m-logic) with the outcome of each channel equally weighted. Under such a rule the probability of detection, $P_d$, is simply obtained by the binomial expression:

$$P_d = \mathrm{prob}\,[\text{majority of channels detect the signal} \mid \text{signal present}] = \sum_{j \geq j_0} \binom{K}{j} P_{i,j} \qquad \text{(Eq. 5)}$$

$P_{i,j}$ is the probability of an ordered configuration of $K$ outputs in which $j$ channels have detected the signal and the remaining $i = K - j$ channels have missed it. $j_0$, the decision rule defining the number of channels at which detection must have occurred in order to determine detection by the aggregate, is as follows: for majority logic, $j_0 = (K + 1)/2$, where $K$ is an odd integer; for (m - 1) logic, $j_0 = K/2$, where $K$ is an even integer. If we assume identical data channels with the same $P_{d0}$ and $P_{f0}$, then

$$P_{i,j} = (1 - P_{d0})^{i}\, P_{d0}^{\,j} \qquad \text{(Eq. 6)}$$

The probability of false alarm can be computed similarly:

$$P_f = \mathrm{prob}\,[\text{majority of channels recognize a signal} \mid \text{signal absent}] = \sum_{j \geq j_0} \binom{K}{j} q_{i,j} \qquad \text{(Eq. 7)}$$

where $q_{i,j}$ is the probability of an ordered configuration of $K$ outputs in which $j$ channels have incorrectly recognized a signal and the remaining $i$ channels have correctly recognized its absence:

$$q_{i,j} = (1 - P_{f0})^{i}\, P_{f0}^{\,j} \qquad \text{(Eq. 8)}$$
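For identical independent channels, the binomial expressions for $P_d$ and $P_f$ under m-logic can be evaluated directly. The sketch below is illustrative only; the function names are hypothetical, and it covers just the odd-$K$ majority rule $j_0 = (K + 1)/2$.

```python
from math import comb


def vote_probability(k, p0, j0):
    """Probability that at least j0 of k independent channels fire, each with probability p0."""
    return sum(comb(k, j) * p0**j * (1.0 - p0) ** (k - j) for j in range(j0, k + 1))


def majority_logic(k, pd0, pf0):
    """Aggregate (Pd, Pf) under m-logic with j0 = (K + 1) / 2, for odd K."""
    if k % 2 == 0:
        raise ValueError("m-logic as defined in the text requires an odd K")
    j0 = (k + 1) // 2
    return vote_probability(k, pd0, j0), vote_probability(k, pf0, j0)


if __name__ == "__main__":
    for k in (1, 3, 5, 9, 15):
        pd, pf = majority_logic(k, 0.8, 0.1)
        print(k, pd, pf)
```

With $P_{d0} = 0.8$ and $P_{f0} = 0.1$, both quantities move in the favourable direction as $K$ grows, whereas a per-channel probability below 0.5 is driven toward zero and one above 0.5 toward one.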

Fig. 2. A: Probability of detection, $P_d$, and the probability of false alarm, $P_f$, obtained from aggregation of binary outputs of $K$ independent identical channels ($P_{d0} = 0.8$ and $P_{f0} = 0.1$) under m-logic rule. B: Comparison of $P_d$ obtained from aggregation of binary outputs of $K$ identical channels ($P_{d0} = 0.8$) under different "majority" (m - 1, m, m + 1) logics. C: Plot of $P$ ($P_d$ or $P_f$, see text) vs. $P_0$ ($P_{d0}$ or $P_{f0}$ respectively) of $K$ combined channels. These curves can give probability of detection and probability of false alarm obtained from aggregate of binary outputs of $K$ identical independent channels. Note that for $P_{d0} < 0.5$ aggregation invariably results in deterioration of detection and for $P_{f0} > 0.5$ aggregation invariably results in increase of false alarm rate. D: Probability of detection, $P_d$, obtained from aggregate of binary output of $K$ identical independent channels while each channel's detection level is adjusted to keep the total probability of false alarm constant, i.e., $P_f = 10^{-5}$.

The effect of combining the output of $K$ filter channels on $P_d$ and $P_f$ of the aggregate is shown in Fig. 2, A with the same $d_0 = 2.1$ assumed as in Fig. 1, B. In Fig. 2, B the effect of choosing different decision rules on the output of the aggregate is illustrated. The effect of the choice of logic in combinatorials is essentially a trade-off between improvement in $P_d$ and deterioration in $P_f$. In Fig. 2, C, $P$ (which is equivalent to $P_d$ or $P_f$ in this case) is plotted for combinations of different values of $P_0$ ($P_{d0}$ or $P_{f0}$ common to all channels) under the simple m-logic rule. From this figure one observes that aggregation of $K$ channels with $P_{d0} < 0.5$ or $P_{f0} > 0.5$ deteriorates the detection outcome. Improvement in $P_d$ vs. the number of channels with $d_0 = 2.1$ added by the combinatorial method (m-logic) is shown in Fig. 2, D. $P_f$ is kept constant at $10^{-5}$ by appropriately adjusting the threshold levels on the outputs of the individual channel filters (Woody and Nahvi 1973).

On comparing Fig. 1, B and 2, D, the analog method appears to yield greater improvement in $P_d$ for small $K$'s because some information that is preserved by analog addition is lost by application of the combinatorial logic. However, as $K$ increases, this information is increasingly expressed in the combinatorials of the single channel outputs; hence, as $K$ increases, the output of the combinatorial method approaches that of the analog method.

It should be noted that the combinatorial method may provide a simple way for dealing with latency variations associated with the signals. Detection following analog aggregation depends upon strict synchrony, in time, of the maxima of the outputs of the matched filters which, if not met,

will result in deterioration of the output signal-to-noise ratio. With the binomial combinatorial method, it may be possible simply to allow longer time periods for combinatorial assembly inclusive of variations in latency.

2. Data channels with correlated noise

a. Analog combination of the outputs

We now consider the case when the noise in $K$ data channels is not independent, i.e.,

$$E[m_i m_j] = \rho_{ij}\, \sigma_i \sigma_j, \qquad 0 \leq |\rho_{ij}| \leq 1$$

where $\rho_{ij}$ is the correlation coefficient between $m_i$ and $m_j$. In this case the output noise power, $\sigma^2$, of the aggregate is:

$$E[m^2] = E\Big[\sum_i \sum_j m_i m_j\Big] = \sigma^2$$

If, for simplicity of computation, we assume: (1) that all signals have the same energy, $S_0^2$; (2) that noise at the output of each filter has the same noise power, $\sigma_0^2$; and (3) that the correlation coefficient is the same between all noise outputs, i.e., $\rho_{ij} = \rho$ for all $i \neq j$, then:

$$E[m^2] = \sigma_0^2\, K\, [1 + \rho (K - 1)] = \sigma^2 \qquad \text{(Eq. 9)}$$

and the aggregate output signal-to-noise ratio, $d$, is

$$d = \frac{S^2}{\sigma} = \frac{K S_0^2}{\sigma_0 \sqrt{K [1 + \rho (K - 1)]}} = d_0 \sqrt{\frac{K}{1 + \rho (K - 1)}} \qquad \text{(Eq. 10)}$$

In Fig. 3, A the improvement in signal-to-noise ratio, i.e., $d/d_0$, obtained from Eq. 10 is plotted vs. $K$, the number of channels incorporated, for different values of $\rho$. All curves fall below the curve for $\rho = 0$, and except for $\rho = 0$, all curves approach an asymptotic value, i.e.:

$$\lim_{K \to \infty} \frac{d}{d_0} = \frac{1}{\sqrt{\rho}}$$

This indicates that: (1) the signal-to-noise ratio of the analog sum decreases as the dependency between noise of channels increases; and (2) except for $\rho = 0$, one cannot, in the presence of correlated channel noise, arbitrarily increase $d/d_0$ to any desired level simply by increasing the number of recording electrodes, even if $K \to \infty$. As an example, for $K = 9$, $d/d_0$ is reduced from 3 when $\rho = 0$ to 1.33 when $\rho = 0.5$.

The effect of the above improvement in signal-to-noise ratio on the probability of detection is shown in Fig. 3, B. Channels used in this figure are identical to the ones used in the previous section, i.e., $d_0 = 2.1$, but with correlated noise. Again, $P_f$ is kept constant, $P_f = 10^{-5}$, to facilitate display of the relationship between other variables and the comparison of results with those of Fig. 1, B.

Note that limits in improvement of $d/d_0$ determine access or lack thereof to the "useful portion" of Fig. 1, that is, the portion of the figure at which $P_d$ is sufficiently high and $P_f$ sufficiently low to permit reasonably successful operation of a prosthetic device. An example of $K$ channel aggregates which fail to reach the useful portion of this graph because of dependency is shown in Fig. 4, A. In this example it is assumed, on the basis of previous analysis of neuroelectric data in the cat (Woody and Nahvi 1973), that at a single recording site we can obtain $P_{d0} = 0.8$, $P_{f0} = 0.1$ ($d_0 = 2.1$). If we want to reach the useful area, we need to have a minimum of $d = 7.8$. If $\rho = 0.1$, even with $K = \infty$ we cannot obtain such an improvement because the limit on the obtainable signal-to-noise ratio of the aggregate is:

$$\lim_{K \to \infty} d = \frac{d_0}{\sqrt{\rho}} = 6.7$$

In this figure the maximum obtainable $d$ for other values of $\rho$ is also indicated.

b. Aggregation of binary output

The $P_d$ and $P_f$ produced by aggregation of binary outputs of channels under a majority logic were given by Eq. 5 and Eq. 7, where $\binom{K}{j}$ was the number of possible outcomes of $K$ channels with $j$ = signal detected and $i$ = signal not detected. The data channels were assumed to be identical, with the same $P_{d0}$ and $P_{f0}$. With dependency between channels, $P_{i,j}$ and $q_{i,j}$ cannot be obtained from Eq. 6 and Eq. 8 as before. For the special case when channels are identical and have the same noise correlation coefficient $\rho$, then:

$$P_{i,j} = P_{i,i} \cdot P_{0,j-i} = (P_{1,1})^{i}\, P_{0,j-i}, \qquad j > i \qquad \text{(Eq. 11)}$$

Fig. 3. A: Improvement in output signal-to-noise ratio ($d/d_0$) for $K$ identical channels with correlated noise. $\rho$, the cross-correlation coefficient, is an index of noise dependency between channels. Note asymptote as $K \to \infty$. B: Improvement in $P_d$ by analog summation of output of $K$ matched filters with correlated noise for different values of correlation index, $\rho$, while probability of false alarm is kept constant, $P_f = 10^{-5}$. For each channel $P_{d0} = 0.8$ and $P_{f0} = 0.1$ ($d_0 = 2.1$).
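The behaviour plotted in Fig. 3 follows from Eq. 10 and can be reproduced with a few lines. This is a minimal sketch assuming a single correlation coefficient $\rho$ (with $0 \leq \rho \leq 1$) shared by all channel pairs and a gaussian threshold detector with unit-variance normalized output; the function names are hypothetical.

```python
from math import inf, sqrt
from statistics import NormalDist


def snr_gain(k, rho):
    """d/d0 for k identical channels with common noise correlation rho (Eq. 10)."""
    return sqrt(k / (1.0 + rho * (k - 1)))


def snr_gain_limit(rho):
    """Asymptote of d/d0 as k grows without bound; unbounded only when rho is 0."""
    return inf if rho == 0 else 1.0 / sqrt(rho)


def pd_correlated(k, rho, d0, pf):
    """Pd of the analog sum at fixed false-alarm rate pf, assuming gaussian noise."""
    normal = NormalDist()
    return normal.cdf(d0 * snr_gain(k, rho) - normal.inv_cdf(1.0 - pf))


if __name__ == "__main__":
    print(snr_gain(9, 0.0), snr_gain(9, 0.5))
    for rho in (0.0, 0.1, 0.2, 0.5):
        print(rho, 2.1 * snr_gain_limit(rho), pd_correlated(40, rho, 2.1, 1e-5))
```

For $\rho = 0.1$ and $d_0 = 2.1$ the limiting value of $d$ stays below the 7.8 that the text requires for the useful working area, however many channels are added.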

Fig. 4. A: Limits of improved signal-to-noise ratio for channels with correlated noise and $d_0 = 2.1$ showing lack of access to the useful working area ($P_d \geq 0.999$, $P_f \leq 10^{-5}$) when $\rho \geq 0.1$. B: Plot of $P$ ($P_d$ and $P_f$) given by Eq. 11 for $P_0$ ($P_{d0} = 0.8$ and $P_{f0} = 0.1$) and for various values of $\alpha$. These curves show maximum possible improvement in $P_d$ and $P_f$ obtained by m-logic rule from aggregation of binary outputs of channels with different noise correlation between channels. Correlation shown as $\alpha$, the conditional probability.

To obtain $P_{0,j}$, we note:

$$P_{0,n} = P_{0,n+1} + P_{1,n}, \qquad P_{1,n} = P_{1,1} \cdot P_{0,n-1}$$

Therefore:

$$P_{0,n+2} = P_{0,n+1} - P_{1,1}\, P_{0,n} \qquad \text{(Eq. 12)}$$

with initial conditions $P_{0,0} = 1$ and $P_{0,1} = P_{d0}$, and similarly

$$q_{0,n+2} = q_{0,n+1} - q_{1,1}\, q_{0,n} \qquad \text{(Eq. 13)}$$

with $q_{0,0} = 1$ and $q_{0,1} = P_{f0}$.

Solution of the above equations may be obtained from the authors. In Fig. 4, B values of $P$ ($P_d$ and $P_f$) obtained from aggregation of $K$ dependent channel binary outputs under an m-logic rule are plotted for various degrees of dependencies, $\alpha$. $\alpha$ is the conditional probability of the output of one channel signifying a signal detection contingent upon the output of another channel having concomitantly signified a signal detection. Therefore

$$P_{1,1} = (1 - \alpha)\, P_{d0} \quad \text{and} \quad q_{1,1} = (1 - \alpha)\, P_{f0} \qquad \text{(Eq. 14)}$$

If channels are identical and have $P_{d0} = 0.8$, $P_{f0} = 0.1$ (as in Fig. 4, B) then it is observed that improvement in $P_d$ and $P_f$ is decreased as dependency between channels' noise is increased. As one expects, there is no improvement in $P_d$ and $P_f$ for $\alpha = 1$.

¹ Calculation of $\alpha$ (or $P_{1,1}$ and $q_{1,1}$) from $\rho$ may be obtained from the authors.

EXPECTED DEGREE OF DEPENDENCY BETWEEN BACKGROUND NOISE FROM MULTIPLE CHANNEL CORTICAL RECORDINGS

Cross-correlation of bipolar surface recordings from frontal vs. parietal cortex was performed for data from three humans with histories of temporal lobe epilepsy. Recognizable epileptic activity was present in the electrocorticograms of two of the three patients. Data samples (three sets, each patient) from the recordings were taken at times when such activity was minimal. The zero-lag correlation coefficients, 0~
