Auditory filter shapes derived with noise stimuli*

Roy D. Patterson

M.R.C. Applied Psychology Unit, Cambridge, England, CB2 2EF

(Received 11 August 1975; revised 3 November 1975)

A wide-band noise having a deep notch with sharp edges was used to mask a tone. The notch was centered on the tone, and threshold was measured as the width of the notch was increased from 0.0 to 0.8 times the tone frequency (0.5, 1.0, or 2.0 kHz). The spectrum level of the noise was 40 dB SPL. If it is assumed that the auditory filter is reasonably symmetric at these intensities, then the shape of the filter centered on the tone can be estimated from the first derivative of the curve relating tone threshold to the width of the notch in the noise. The 3-dB bandwidths of the filters obtained were about 0.13 of their center frequency.

In the region of the passband, the Gaussian curve provides a good approximation to the shape of the derived filters. The equivalent rectangular bandwidths of the Gaussian approximations are about 0.20 of their center frequency, which is comparable to the critical-band estimates of E. Zwicker, G. Flottorp, and S. S. Stevens ["Critical bandwidth in loudness summation," J. Acoust. Soc. Am. 29, 548-557 (1957)]. The Gaussian approximation cannot be used outside the passband, because the tails of the derived filters do not fall as fast as the Gaussian curve.

Subject Classification: [43]65.58, [43]65.35, [43]80.50.

INTRODUCTION

Several investigators, using quite distinct techniques, have attempted to measure the attenuation characteristic or shape of the auditory filter. De Boer (1967) estimated the attenuation characteristic of the cat's auditory filter by correlating a broad-band noise stimulus with the auditory nerve impulses recorded in response to the noise. Houtgast (1974) estimated the shape of the human auditory filter from the data obtained in a tone-masking experiment in which the masker was a broad-band noise with a rippled spectrum. Patterson (1974) also derived the filter shape using a tone-in-noise experiment and human observers, but the masker was a low-passed or high-passed noise with a sharp cutoff. An experiment similar to Patterson's was performed by Margolis and Small (1975) and yielded comparable results.

The filter shapes obtained by de Boer and Houtgast both suggested that the auditory filter had a broad, fairly flat top, flanked by rather sharp skirts; the bandwidth of their filters ranged from 15% to 25% of the center frequency. The experiments of Patterson and Margolis and Small also led to the conclusion that the auditory filter had sharp skirts; however, their filters had no appreciable flat top. The passbands of the filters were in the range of 4% to 7% of the filter's center frequency. In this paper, the different techniques and their respective filter shapes are reviewed; it is argued that one of the assumptions made by Patterson and Margolis and Small in their filter derivation is probably incorrect, and this hypothesis is subjected to experimental test.


A. Neurophysiological filter shapes

Perhaps the most ingenious method for measuring filter shape is that of de Boer (de Boer, 1967, 1968, 1969; de Boer and Jongkees, 1968; de Boer and Kuyper, 1968). De Boer (1967) presented a white noise to a cat and, with the aid of a micropipette, recorded the impulses emanating from a single cochlear nerve fiber.

J. Acoust. Soc. Am., Vol. 59, No. 3, March 1976

The train of neural spikes that followed the noise stimulation was cross correlated with a recording of the noise which had been appropriately delayed to allow the stimulus sufficient time to travel down the basilar membrane to the point of transduction. Theoretically, if the input to a filter is a white noise and the filter's output is cross correlated with its input, the result is the impulse response of the filter. De Boer generated a detailed version of the impulse response for a particular fiber by repeatedly stimulating the fiber and correlating the noise input with the neural output. Finally, the impulse response was played repetitively and scanned with a wave analyzer to determine the attenuation characteristic of the auditory filter associated with that particular cochlear fiber.

The attenuation characteristics associated with two cochlear fibers whose best frequencies were about 1.0 and 3.0 kHz are shown in Fig. 1 by the solid curve and the upper dashed curve, respectively. The data are replotted from Figs. 8B and 8C of de Boer (1969). The bandwidths of these filters, 3 dB down from the peak, are 160 and 750 Hz. Correspondingly, the psychophysical experiments reviewed in Zwicker et al. (1957) indicate that the critical bandwidth in man is about 160 and 500 Hz at 1.0 and 3.0 kHz, respectively. Since the cochlea in the cat is only about two-thirds as long as in man, this correspondence in bandwidths should not be overemphasized. On the other hand, as Evans and Wilson (1973) have pointed out, it would appear on the basis of this and similar data that the frequency selectivity available at the output of the cochlea is sufficient to account for the basic psychophysical data summarized in Zwicker's critical-band estimates.
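The cross-correlation principle behind de Boer's method can be illustrated with a minimal sketch. It assumes an idealized linear filter with a short, hypothetical impulse response (`true_h`) in place of a nerve fiber, and it correlates the filter's continuous output, not a spike train, with the white-noise input. The function names and all numerical values are illustrative only.

```python
import random

def fir_filter(x, h):
    """Convolve input samples x with impulse response h (a linear filter)."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]

def cross_correlate(x, y, n_lags):
    """Estimate the mean of y[n] * x[n - k] for lags k = 0 .. n_lags - 1."""
    total = len(x)
    return [
        sum(y[n] * x[n - k] for n in range(k, total)) / (total - k)
        for k in range(n_lags)
    ]

rng = random.Random(0)                       # fixed seed for repeatability
true_h = [0.5, 1.0, -0.3, 0.2]               # hypothetical impulse response
noise = [rng.gauss(0.0, 1.0) for _ in range(50000)]   # unit-variance white noise
response = fir_filter(noise, true_h)

# For unit-variance white noise, the input-output cross-correlation
# at lag k approximates the impulse response sample h[k].
estimated_h = cross_correlate(noise, response, len(true_h))
```

With a long enough noise record, `estimated_h` should approach `true_h`; the residual error shrinks roughly as the inverse square root of the number of samples.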

The frequency tuning curve provides a more traditional measure of the frequency selectivity of the cochlea. The tuning curve is obtained by sweeping a tone through the frequency region that stimulates a particular fiber and recording the amplitude of the tone required to produce a constant rate of firing in the fiber. De Boer was able to hold the units that produced the attenuation characteristics shown in Fig. 1 long enough to determine their frequency tuning curves. These data are presented in Fig. 1 as filled and open diamonds for the fibers with best frequencies of 1.0 and 3.0 kHz, respectively. For comparison purposes, the tuning curves have been normalized and inverted; that is, the diamonds in Fig. 1 show the sensitivity of the fiber at various frequencies relative to the sensitivity at the best frequency. Despite the difference in stimuli, tones for the tuning curves and noise for the impulse responses, the methods yield reasonably comparable measures of the frequency selectivity that is available at this early stage of the neural auditory system.

FIG. 1. Auditory filter shapes in the cat and man. The solid curve and the short-dash curve are the filter shapes associated with 1.0- and 3.0-kHz units in the cat as derived by de Boer (1969). The filled and open diamonds are the inverted tuning curves for the same fibers. The open circles and the long-dash curve show the filter shapes at 1.0 kHz reported by Houtgast (1974) and Patterson (1974), respectively, for human observers. [Abscissa: relative frequency.]

Copyright © 1976 by the Acoustical Society of America

Downloaded 30 Jun 2013 to 132.206.27.25. Redistribution subject to ASA license or copyright; see http://asadl.org/terms

Wilson and Evans (1971) and Evans and Wilson (1973) have taken this comparison one step further, and shown that the tuning curve data obtained from a single unit of a cat can be used to predict the response of that fiber to "rippled" noise. Rippled noise, whose long-term spectrum varies sinusoidally on a linear frequency scale, is produced by adding a white noise to a copy of itself that has been delayed by $T$ sec. When the delayed version of the noise is added to the original in phase, the rippled noise will have peaks at 0 and every multiple of $1/T$ Hz; when the polarity of the delayed noise is reversed, the ripple will show a minimum at 0 and multiples of $1/T$ Hz. The spectrum of the rippled noise, $N(f)$, is

$$N(f) = N_0\,[1 \pm M\cos(2\pi f T)], \tag{1}$$

where $N_0$ is the spectrum level of the original noise and $M$ is the modulation depth determined by the attenuation of the delayed version of the noise.
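As a small illustration of Eq. (1), the sketch below evaluates the rippled spectrum at a few frequencies. The function name `rippled_spectrum`, the delay of 5 msec, and the modulation depth of 0.8 are assumptions chosen for the example, not values from the experiments.

```python
import math

def rippled_spectrum(f, n0, depth, delay, polarity=1):
    """Spectrum level of rippled noise per Eq. (1).

    polarity = +1: delayed noise added in phase (peaks at multiples of 1/delay).
    polarity = -1: delayed noise reversed (minima at multiples of 1/delay).
    """
    return n0 * (1.0 + polarity * depth * math.cos(2.0 * math.pi * f * delay))

delay = 0.005                      # hypothetical delay T in sec, so 1/T = 200 Hz
frequencies = (0.0, 100.0, 200.0)  # Hz

in_phase = [rippled_spectrum(f, 1.0, 0.8, delay, +1) for f in frequencies]
reversed_polarity = [rippled_spectrum(f, 1.0, 0.8, delay, -1) for f in frequencies]
```

With these values the in-phase ripple is maximal at 0 and 200 Hz and minimal at 100 Hz, and reversing the polarity exchanges the maxima and minima.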

Throughout their experiment, Wilson and Evans (1971) chose values of $T$ such that either a maximum or a minimum occurred in the rippled noise at the best frequency of the fiber they were holding. For each value of $T$, they measured the change in the firing rate of the fiber when the polarity of the delayed noise was reversed. When $T$ is small, the maxima and minima of the rippled noise are widely spaced, and changing the polarity of the delayed noise produces its greatest effect; as $T$ is increased, the maxima and minima occur more frequently, and the difference in firing rate following a polarity reversal decreases.
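A minimal numerical sketch of why the reversal effect shrinks as $T$ grows is given below. It assumes a hypothetical Gaussian weighting function centered at 1.0 kHz with an equivalent rectangular bandwidth of 160 Hz, standing in for the frequency selectivity of a fiber, and a full-depth ripple ($M = 1$). None of these choices comes from Wilson and Evans; they only serve to make the trend visible.

```python
import math

def gaussian_filter(f, center, erb):
    """Hypothetical Gaussian weighting with unit peak; its area equals erb."""
    return math.exp(-math.pi * ((f - center) / erb) ** 2)

def filter_output(delay, polarity, center=1000.0, erb=160.0, depth=1.0, step=1.0):
    """Riemann sum of the rippled spectrum (N0 = 1) times the filter weighting."""
    total = 0.0
    f = 0.0
    while f <= 2.0 * center:
        ripple = 1.0 + polarity * depth * math.cos(2.0 * math.pi * f * delay)
        total += ripple * gaussian_filter(f, center, erb) * step
        f += step
    return total

def reversal_effect_db(delay):
    """Change in filter output, in dB, when the delayed noise is inverted."""
    peak_at_center = filter_output(delay, +1)
    trough_at_center = filter_output(delay, -1)
    return 10.0 * math.log10(peak_at_center / trough_at_center)

short_delay_db = reversal_effect_db(0.001)   # ripple peaks 1000 Hz apart
long_delay_db = reversal_effect_db(0.010)    # ripple peaks 100 Hz apart
```

Both delays place a ripple maximum (or, after reversal, a minimum) at the center frequency. Under these assumptions the reversal changes the output by more than 10 dB for the short delay and by a negligible amount for the long one.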

This effect can be explained if the firing rate of the fiber is likened to the output of a filter whose attenuation characteristic has the form of a normalized, inverted tuning curve. The output of this hypothetical auditory filter is largely determined by the stimulus energy near the center frequency of the filter. In those conditions where $T$ is small, the maxima and minima are broad relative to the passband of the filter. When a maximum is positioned at the center frequency of the filter, the output is large; but when the polarity of the delayed noise is reversed and a minimum occurs in the passband, the output of the filter is markedly reduced. In those conditions where $T$ is large, the peaks of the ripple are more densely packed and the passband of the filter encompasses a number of maxima and minima. In this case, when the polarity of the delayed noise is reversed, the output of the filter does not vary appreciably because, although the peaks do shift, the number of peaks in the passband remains roughly constant. But Evans and Wilson (1973) went beyond this qualitative analysis, calculating the filter outputs that would be expected in the various conditions of their experiment when the normalized, inverted tuning curve was used as an estimate of the shape of the auditory filter. And their calculations confirmed that the tuning curve of a particular fiber provides a good basis for predicting that fiber's response to rippled noise.

B. Psychophysical filter shapes

Houtgast (1972, 1974) used a rippled noise stimulus in a psychophysical experiment with human observers. His observers were required to detect a pulsed sinusoid in the presence of a continuous rippled noise background. Masked threshold was determined as the duration and polarity of the delayed noise were varied using a technique similar to Wilson and Evans (1971). Whereas Evans and Wilson (1973) compared the frequency selectivity of tuning curves with the selectivity indicated by their rippled noise data, Houtgast was able to derive the shape of the auditory filter from his data directly. Houtgast also included two additional conditions in which the delay was adjusted so that the tone was midway between a maximum and a minimum; in one case the nearest maximum was at a lower frequency than the nearest minimum, and in the other case the maximum was above and the minimum below the tone. These stimulus conditions, where the noise spectrum has a distinct slope in the immediate vicinity of the tone, can be contrasted to reveal whether the auditory filter is symmetric.

The model of noise masking on which Houtgast's derivation of the filter shape is based has been described many times (Fletcher, 1940; Schafer et al., 1950; Swets, Green, and Tanner, 1962). It is assumed that, in order to improve the detectability of the tone, an auditory filter is centered at the tone frequency. The tone and any components of the noise in its immediate vicinity pass through the filter largely unattenuated, whereas the remaining noise components are progressively attenuated as their distance from the tone increases. Thus the power of the tone at threshold is simply a weighted proportion of the noise power entering the ear, and the weighting function is the auditory filter shape. If the power of the tone at threshold and the spectrum of the masking noise are $P_s$ and $N(f)$, respectively, and if the shape of the auditory filter is $|H(f)|^2$, then the relationship can be expressed as follows:

$$P_s = K\int_{-\infty}^{\infty} N(f)\,|H(f)|^2\,df, \tag{2}$$

where $K$ is a proportionality constant. Houtgast held the level of the tone constant at 45 dB and varied the mean spectrum level of the noise $N_0$ to determine the level at which the tone was just audible. Thus his threshold data describe how the level of the noise must be varied as a function of the polarity and duration of the delay to maintain constant detectability, that is, a constant noise level at the output of the auditory filter.

Houtgast derived the shape of the filter from the threshold data by means of the model summarized in Eq. (2) and an intriguing application of Fourier analysis. $P_s$ and $K$ are constants in this situation and thus do not affect the shape of the filter, and the function $N(f)$ is known from Eq. (1). What is required, then, is a general description of the filter which permits an analytic solution to the integral in Eq. (2) and which, at the same time, is written in terms of useful unknowns. Houtgast employed a trigonometric series of the form

$$\left|H\!\left(\frac{f-f_0}{f_0}\right)\right|^2 = 1 + \sum_{n=1}^{\infty}\left[a_n\cos 2\pi n\!\left(\frac{f-f_0}{f_0}\right) + b_n\sin 2\pi n\!\left(\frac{f-f_0}{f_0}\right)\right] \tag{3}$$

to represent the filter; $a_n$ and $b_n$ are the Fourier coefficients, $n$ is an integer, and $f_0$ is the tone frequency. By and large it will prove more convenient to measure frequency from the tone frequency, and to measure it relative to the tone frequency. This relative frequency variable will be designated $g$, and so $g = (f - f_0)/f_0$ and $df = f_0\,dg$. Now, if Eqs. (1) and (3) are substituted for $N(f)$ and $|H(f)|^2$ in Eq. (2), the result is

$$P_s = K N_0 f_0 \int_{-1/2}^{1/2}\left[1 \pm M\cos(2\pi m g)\right]\left\{1 + \sum_{n=1}^{\infty}\left[a_n\cos(2\pi n g) + b_n\sin(2\pi n g)\right]\right\}dg, \tag{4}$$

where $m$ is an integer. The integral was evaluated over an integer number of cycles, from $f_0/2$ to $3f_0/2$, and, as a result, the only trigonometric term that contributes to the integral is $M\cos(2\pi m g)\,a_m\cos(2\pi m g)$, produced when the index of summation $n$ is equal to $m$. Since $M a_m\cos^2(2\pi m g) = \tfrac{1}{2} M a_m\,[1 + \cos(4\pi m g)]$, Eq. (4) reduces to

$$P_s = K N_0 f_0\left(1 \pm \tfrac{1}{2} M a_m\right). \tag{5}$$

Thus, by representing the filter with a Fourier series and using a masker with a cosinusoidal spectrum, Houtgast was able to extract the cosine Fourier coefficients $a_n$ from the corresponding threshold noise levels. Similarly the sine coefficients $b_n$ can be obtained from the experimental conditions wherein the masker spectrum is sinusoidal, such that the tone is midway between a maximum and a minimum of the masker.

The coefficients derived for a 1.0-kHz tone are presented in Fig. 1 as open circles. The data are from Houtgast (1974, Fig. 9.5, direct masking). In Houtgast (1972), a curve was fitted to a subset of these data. The curve, which has been omitted for clarity, had basically the same shape as the solid line from de Boer in Fig. 1; it intersected a horizontal line 10 dB down from the peak at about 835 and 1180 Hz on the low and high sides of the filter, respectively. Thus Houtgast's findings are in rather good agreement with de Boer's, which is perhaps surprising since de Boer's data were gathered from cochlear fibers in the cat and Houtgast's are from a psychophysical experiment on humans.

Houtgast did not measure the full shape of the auditory filter at other center frequencies. He did, however, use rippled noise, together with the simplifying assumption that the auditory filter could be approximated by a Gaussian curve, to estimate the bandwidth of the filter for five observers at frequencies ranging from 0.25 to 4.0 kHz. The equivalent rectangular bandwidth at 1.0 kHz was found to be about 170 Hz. Although the bandwidth estimates showed considerable spread as a function of observer, the average bandwidth across observers was in extremely good agreement with the critical band curve presented originally by Zwicker, Flottorp, and Stevens (1957) and later reinforced by Scharf (1970).

The experiments on auditory filter shape reviewed to this point give the impression that the frequency selectivity of the auditory system may be largely, if not exclusively, determined in the cochlea, and that succeeding stages of neural processing do little to sharpen auditory filter shape. There are, however, several papers on auditory filter shape which suggest that a much sharper filter is available in some psychophysical experiments.
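Returning to Houtgast's Fourier derivation, the reduction of Eq. (4) to a single coefficient can be checked numerically. The sketch below assumes a hypothetical filter defined by two cosine and two sine coefficients, sets $K N_0 f_0 = 1$, and evaluates the integral in Eq. (4) with a midpoint rule for both polarities of the ripple. The coefficient values and function names are illustrative; Houtgast worked from measured threshold noise levels, not from a known filter.

```python
import math

def filter_weight(g, a, b):
    """Trigonometric-series filter of Eq. (3); a[n-1] and b[n-1] hold a_n and b_n."""
    return 1.0 + sum(
        a[n - 1] * math.cos(2 * math.pi * n * g)
        + b[n - 1] * math.sin(2 * math.pi * n * g)
        for n in range(1, len(a) + 1)
    )

def masked_power(m, polarity, a, b, depth, steps=2000):
    """Midpoint-rule value of the integral in Eq. (4), with K * N0 * f0 = 1."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        g = -0.5 + (i + 0.5) * width
        ripple = 1.0 + polarity * depth * math.cos(2 * math.pi * m * g)
        total += ripple * filter_weight(g, a, b) * width
    return total

def recover_cosine_coefficient(m, a, b, depth):
    """The two polarities differ by depth * a_m, so their difference isolates a_m."""
    difference = masked_power(m, +1, a, b, depth) - masked_power(m, -1, a, b, depth)
    return difference / depth

true_a = [0.8, 0.3]    # hypothetical cosine coefficients a_1, a_2
true_b = [0.2, 0.0]    # hypothetical sine coefficients b_1, b_2
depth = 0.9            # hypothetical modulation depth M

recovered = [recover_cosine_coefficient(m, true_a, true_b, depth) for m in (1, 2)]
```

Because the sine terms and the mismatched cosine terms integrate to zero over whole cycles, `recovered` reproduces `true_a` regardless of the values in `true_b`.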

Patterson (1970, 1971, 1974) measured threshold for a tone of constant frequency as the cutoff of a low-passed noise was increased. As expected, tone threshold rose from absolute threshold when the noise edge was far below the tone to wide-band threshold as the noise edge passed the tone. The experimental paradigm is shown schematically in Fig. 2(a). The shaded area represents the noise that is effective in masking the tone. Care was taken to insure that the power spectrum of the low-passed noise fell abruptly beyond the cutoff frequency so that the noise edge could be approximated by a step function as shown in the diagram.

FIG. 2. Schematic representation of auditory filtering when the masker is a low-passed noise, (a) and (b), and a notched noise, (c). The simplest model, wherein the filter is assumed to be centered on the tone, is portrayed in (a). The area where the noise and the filter overlap represents the noise that is effective in masking the tone. When the masker is a low-passed noise and the filter is shifted, as in (b), detection improves in those cases where the shift produces more noise reduction than signal reduction. When the masker is a notched noise, as in (c), shifting the filter produces little, if any, improvement in detection, because the noise reduction on one side of the tone is accompanied by a noise increase on the other side. [Panel labels: noise, auditory filter, $f_0$; abscissa: frequency.]

Patterson argued, similarly to Evans and Wilson (1973), that threshold rises because the noise at the output of the filter increases as the noise edge approaches the tone. He pointed out that if the auditory filter is centered on the tone, then each threshold provides an estimate of the area under the filter up to the position of the noise edge. Therefore, he concluded, the shape of the filter can be obtained by taking the derivative of the tone threshold curve.

In mathematical terms, Patterson, like Houtgast, isolated $|H(f)|^2$ in Eq. (2) by choosing a noise masker that simplified the integration. In this case Eq. (2) becomes

$$P_s = K N_0 f_0 \int_{-\infty}^{\Delta f/f_0} |H(g)|^2\,dg, \tag{6}$$

where $\Delta f$ is the distance between the tone and the noise edge. The derivative of Eq. (6) with respect to $\Delta f/f_0$ is

$$\frac{dP_s}{d(\Delta f/f_0)} = K N_0 f_0 \left|H\!\left(\frac{\Delta f}{f_0}\right)\right|^2. \tag{7}$$

Therefore, the filter shape is given by the derivative of the threshold curve divided by the constant $K N_0 f_0$.

The experiment was run at five tone frequencies (0.5, 1.0, 2.0, 4.0, and 8.0 kHz) and each of these conditions was replicated using a high-passed noise in place of the low-passed noise, a procedure which leads to a more sensitive measure of the upper skirt of the filter. The curve with the broad dashes in Fig. 1 shows the derived filter shape in the case where the tone was 1.0 kHz. The passband of this filter is clearly much narrower than the passbands of the filters presented earlier. The 3-dB bandwidth of the filter is 59 Hz, whereas the bandwidth estimated by Houtgast for the same center frequency using the Gaussian filter shape approximation was 170 Hz. And, in general, Patterson's estimates were about one-third those reported by Zwicker, Flottorp, and Stevens (1957); that is, they were much more in line with those obtained by Fletcher (1940, 1953) and Hawkins and Stevens (1950), both of whom used the critical ratio method.

Patterson found that the estimated shapes were well approximated by the expression

$$\left|H\!\left(\frac{\Delta f}{f_0}\right)\right|^2 = \frac{1}{\left[1 + (\Delta f/\alpha f_0)^2\right]^{2}}, \tag{8}$$

where $1.29\alpha$ is the 3-dB bandwidth of the filter. This function is symmetric on a linear frequency scale. The data showed a slight asymmetry, with the upper skirt being less sharp; however, the symmetric approximation was considered satisfactory. Finally, he showed that this filter was quite successful in predicting some typical noise masking data of Webster et al. (1952), Egan and Hake (1954), and Greenwood (1961).

Margolis and Small (1975) reported a similar, although independent, determination of the filter shape. They also used a low-passed noise with an abrupt cutoff to mask a tone of constant frequency, and measured the detectability of the tone as a function of the noise cutoff.¹ However, their noise was generated digitally by summing the appropriate values of sinusoids spaced 1 Hz apart from zero to the cutoff frequency. Thus, the edge of the noise was extremely sharp and provided a better approximation to a step function than did Patterson's "multiplied" noise (after Greenwood, 1961). Margolis and Small (1975) measured the shape of the filter for tones at 0.5, 1.0, and 4.0 kHz and obtained filter shapes, like Patterson, by differentiating the curve relating detectability to the noise cutoff. The filters were quite similar to those reported by Patterson, although they were shifted up in frequency by about 15 Hz and thus were slightly less symmetric. The respective bandwidths reported by Margolis and Small (1975) and Patterson (1974) were 30 and 29 Hz at 0.5 kHz, 57 and 59 Hz at 1.0 kHz, and 240 and 200 Hz at 4.0 kHz.

Taken at face value, the experiments just described would suggest that the neural stages of the auditory system subsequent to the cochlea are capable of sharpening the cochlear auditory filter in at least some circumstances. However, one of the assumptions made both by Patterson and Margolis and Small is probably incorrect. Specifically, in the derivation of the filter shape it was assumed that the filter is centered on the tone as shown in Fig. 2(a). If the auditory filter has reasonably steep skirts and a fairly flat section about the center frequency, then the signal-to-noise ratio at the output of the filter is not at its maximum when the filter is centered on the tone. Rather the signal-to-noise ratio can be improved by centering the filter somewhat to the side of the tone, as shown in Fig. 2(b). For in that case, the skirt of the filter markedly reduces the amount of noise leaking through the filter without producing a comparable reduction of the tone, because the attenuation characteristic is flatter in the region of the tone. Therefore, if it is assumed that the filter is not centered on the tone but rather is positioned where it produces the maximum signal-to-noise ratio, then filters with wider passbands than those reported by Patterson or Margolis and Small could account for the thresholds they obtained.

C. Filter shape from notched noise

To test the assumption that the filter was centered on the tone, Patterson's experiment has been replicated using a notched noise, that is, a broad-band noise with a gap in the region of the tone. The notched noise has the advantage of minimizing the improvement in signal-to-noise ratio that can be achieved by shifting the filter. This experiment is the topic of the remainder of this paper. In short, however, the auditory filter shapes derived with the aid of the notched noise have much wider passbands than those reported in Patterson (1974) and, in fact, the bandwidths are close to those of Zwicker, Flottorp, and Stevens (1957) and Houtgast (1974). This result leads once again to the conclusion that the frequency selectivity evident at the output of the cat's cochlea is comparable to that displayed by the auditory filter measured in noise-masking experiments with human observers.

The experiment consisted of centering the notch in the masker on the signal tone and measuring tone threshold as a function of the width of the notch. The paradigm is presented schematically in Fig. 2(c). It was assumed that at these fairly low intensities the auditory filter is reasonably symmetric. Support for this assumption comes from several of the experiments reviewed earlier. De Boer's filter corresponding to the 1.0-kHz fiber (the solid line in Fig. 1) shows 25- and 27-dB attenuation at points 300 Hz above and below the center frequency. The filter for the 3.0-kHz fiber is not as symmetric; however, de Boer's technique is limited to frequencies below 4.0 kHz and, consequently, this filter shape is probably not as reliable. It is also worth noting that the extreme asymmetry in tuning curves that span an amplitude range of 60 dB or more is not as apparent in that portion of the curve within 30 or 40 dB of the tip, that is, at low intensities. This lack of asymmetry was revealed by de Boer, who plotted the tip of his tuning curves on a linear rather than logarithmic frequency scale as in Fig. 1. The experiment of Wilson and Evans (1971) did not include conditions where asymmetry could be assessed. Houtgast (1974) specifically included a test for asymmetry in his design. No pronounced asymmetry was encountered, but the amplitude range of his experiment is limited to about 10 dB and therefore does not provide a strong test for asymmetry. Both Patterson (1974) and Margolis and Small (1975) found a small asymmetry, but in the opposite direction to that of the tuning curve.

If the filter is reasonably symmetric, it can be assumed to be centered at a point near the tone when the masker is a notched noise, because this will be the region where the signal-to-noise ratio at the output of the filter is greatest. The two cases, where the filter is shifted and not shifted, are shown in Fig. 2(c). When the filter is centered on the tone, approximately equal amounts of noise pass under each skirt; when the filter is shifted up in frequency, less noise passes under the lower skirt, but the increase in noise entering via the upper skirt more than compensates for the reduction on the low side. The nature of the tradeoff associated with shifting the filter is perhaps most clearly revealed if one imagines that the area under the filter is divided into thin vertical rectangles, as when an integral is approximated by a summation. Consider the two rectangles adjacent to each noise edge; in both cases the rectangle closer to the tone is larger. If the filter is shifted up in frequency by the width of one rectangle, the noise passing through the filter below the center frequency is reduced by an amount equal to one small rectangle, but the noise coming through above the center frequency is increased by one large rectangle. And thus the total noise power at the output of the filter is increased. At the same time, the power of the tone is slightly decreased. And so, for a symmetric filter, the maximum signal-to-noise ratio is obtained when the filter is centered on the tone.

The edges of the noise were made particularly sharp, as in Patterson (1974), so that they could be approximated by step functions. This, in turn, makes it possible to extract the filter shape from the general masking equation because, on substituting the step functions for $N(f)$, Eq. (2) becomes

$$P_s = K N_0 f_0 \int_{-\infty}^{-\Delta f/f_0} |H(g)|^2\,dg + K N_0 f_0 \int_{\Delta f/f_0}^{\infty} |H(g)|^2\,dg. \tag{9}$$

And if the filter is symmetric, then Eq. (9) reduces to

$$P_s = 2 K N_0 f_0 \int_{\Delta f/f_0}^{\infty} |H(g)|^2\,dg.$$

The derivative of this equation with respect to $\Delta f/f_0$ is

$$\frac{dP_s}{d(\Delta f/f_0)} = -2 K N_0 f_0 \left|H\!\left(\frac{\Delta f}{f_0}\right)\right|^2. \tag{10}$$
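The notched-noise derivation can be checked with a short sketch. It assumes a symmetric filter of the form given in Eq. (8) with a hypothetical $\alpha = 0.1$, sets $K N_0 f_0 = 1$, and computes thresholds from the closed-form area under the filter outside the notch. The filter is then recovered from the slope of the threshold curve, here with a central finite difference standing in for the polynomial fit used in the paper. All names and values are illustrative.

```python
import math

ALPHA = 0.1   # hypothetical width parameter of the Eq. (8) filter shape

def filter_weight(g, alpha=ALPHA):
    """Symmetric filter shape of the Eq. (8) form, in relative frequency g."""
    return 1.0 / (1.0 + (g / alpha) ** 2) ** 2

def notch_threshold(delta, alpha=ALPHA):
    """Tone power at threshold for notch half-width delta, with K * N0 * f0 = 1.

    Equals twice the area under the filter from delta to infinity, using the
    antiderivative x / (2 * (1 + x**2)) + atan(x) / 2 of 1 / (1 + x**2)**2.
    """
    x = delta / alpha
    area_zero_to_x = x / (2.0 * (1.0 + x * x)) + 0.5 * math.atan(x)
    return 2.0 * alpha * (math.pi / 4.0 - area_zero_to_x)

def derived_filter(delta, step=1e-4):
    """Filter weight recovered from the threshold slope: -0.5 * dP/d(delta)."""
    slope = (notch_threshold(delta + step) - notch_threshold(delta - step)) / (2.0 * step)
    return -0.5 * slope

notch_half_widths = (0.05, 0.1, 0.2)
recovered = [derived_filter(d) for d in notch_half_widths]
expected = [filter_weight(d) for d in notch_half_widths]
```

Under these assumptions `recovered` agrees with `expected` to within the finite-difference error, which illustrates that the negative half-slope of the threshold curve traces out the filter shape.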

645

R.D. Patterson:Auditory filter shapes

645

rivative.

ample, when the tone frequency is 1.0 kHz such that feo is 0. 3 kHz, the masker is composed of two noise bands with flat tops 0. 0 kHz wide and skirts that fall 34 dB in the first 100 Hz outside the passband.

I. METHOD

B. The independent variable

Two broad noise bands with sharp edges were positioned symmetrically about a tone f0, and tone threshold

the relative

And so, as before, an analytic expression for the shape of the auditory filter can be produced by fitting a polynomial to the set of tone thresholds and taking its de-

Ps was measured as a function of the distance from the tone to the edge of the noise bands &f. The experiment was replicated at three tone frequencies, 0. 5, 1.0, and

2.0 kHz. The independentvariable &f/fo rangedfrom

The independentvariable in the experiment, zXf/fo, is distance

between

the tone and the noise

edge. For the purpose of defining •Xf, the position of the noise edge is taken to be the half-power point on the shoulder of the noise because this point provides an excellent approximation to the equivalent rectangular

0. 0 to 0. 4; that is, notch width 2•f was varied from

cutoff for these noise bands.

0. 0 to 0. 8 times the tone frequency.

produced with the aid of two low-pass Butterworth filters in series, the half-power point does not occur at the nominal cutoff, fmsñfoo, but rather at a point slightly closer to fins. The half-power points, which can be

A. Spectrum of the masking noise It was important in this experiment that the edges of the noise maskers be sharp, inasmuch as the derivation of the filter shape presented in the introduction includes the assumption that the noise edges can be approximated by step functions. A modulation technique suggested by

Greenwood (1961) and described in detail by Patterson (1974) was used to produce the sharp edges. Each band was prepared by low-pass filtering a broad-band noise

with two Khronhite filters (model 3342) in series, and subsequently multiplying the resultant low-passed noise with a sinusold. The slope of the edge of the multiplied

noise increases with the decibel/octave rating of the

low-passfilters andthe ratio fms/feo,wherefinsis the frequency of the multiplying sinusold and foo is the cutoff frequency of the low-pass filter. The Khronhite model 3342 is an eight-stage Butterworth filter, and so the weighting applied by each filter was

[1+(f/fe o)lø ]ß

-

(11)

Since there were two of these filters in series, the spectrum of the low-passed noise at the output of the filter

foundby settingEq. (12) equalto NO/2 and solvingfor f, occur

at

fao•=fms+feo( x/-•- 1)•/•0 : and thus for

the band below

the tone

(13)

Af=fo- (fins+ 0. 9464feo) and for

the band

above

the tone

(14)

&f= (fins- 0. 9464feo)-f0. C. Threshold procedure

The tonethresholdswere determinedwith a blocked, two-alternative forced-choiceprocedure. The masking noise was on continuouslythroughoutthe experiment. Eachtrial was composedof a 0. 2-sec warninginterval, two 0. 6-sec observation intervals, oneof which contained

1

second

Since the noise bands are

was

No

[ 1+ (f/feo)•ø]•',

the tone, and a 0. 9-sec response interval. The intervals were designated by lights; 0. 3-sec pauses were interposed between the intervals.

The tone was turned on

and off over a comparatively long time (0. 1 see) to minimize the spread of signal energy and so prevent off-frequency listening. The trials were presented in block• of 20 during which the stimulus parameters were not altered. The basic behavioral measure was percent

where N o is the spectrum level of the broad-band noise.

The low-passed noise was then multiplied by a sinusoid to produce the required noise band. The multiplication process shifts the position of the low-passed noise so that those components of the noise between 0 and $f_{co}$ Hz are shifted to between $f_{ms}$ and $f_{ms} + f_{co}$, and a mirror image of the low-passed noise appears in the region $f_{ms}$ to $f_{ms} - f_{co}$. The spectrum of the resulting multiplied noise is given by

$$\frac{N_0}{\left\{1 + [(f - f_{ms})/f_{co}]^{16}\right\}^{2}}. \tag{12}$$

The complete noise masker $N(f)$ is the sum of two such bands, one below and the other above the tone. Separate noise generators were used to insure the independence of the noise bands. For a particular tone frequency,

correct per 20 trials.

To obtain an estimate of threshold, for a particular combination of tone frequency $f_0$ and noise width $\Delta f/f_0$, the observer was presented a run composed of 14 blocks of 20 trials.

All stimulus parameters were held constant during a run except for signal power, which was varied in the region of threshold between blocks. The 14 values of percent correct produced in a run were plotted as a function of signal power and a psychometric function was fitted to the points by eye. Threshold was taken to be that tone power where the psychometric function intersected the line designating 75% correct identification of the observation interval containing the tone.
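The 75%-correct criterion can be illustrated with a small numerical sketch. The paper fitted the psychometric function by eye; the code below swaps in plain linear interpolation between the two blocks that bracket the criterion, which is only a stand-in for that procedure. The function name and the block data are hypothetical.

```python
def threshold_at(levels_db, percent_correct, criterion=75.0):
    """Signal level at which percent correct first crosses the criterion.

    Uses linear interpolation between adjacent points; this is an
    illustrative substitute for a psychometric function fitted by eye.
    """
    points = sorted(zip(levels_db, percent_correct))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= criterion <= y1 and y1 != y0:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("criterion is not bracketed by the data")


if __name__ == "__main__":
    # Hypothetical run: signal level (dB) and percent correct per block.
    levels = [40.0, 42.0, 44.0, 46.0, 48.0]
    scores = [55.0, 65.0, 70.0, 85.0, 95.0]
    print(threshold_at(levels, scores))
```

With these made-up values the criterion is crossed between the 44- and 46-dB blocks, so the estimate falls two-thirds of a decibel above 44 dB.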

the cutoff $f_{co}$ was the same for all four of the Krohn-Hite filters used in the production of the two noise bands; for tone frequencies of 0.5, 1.0, and 2.0 kHz, $f_{co}$ was 0.2, 0.3, and 0.4 kHz, respectively. Thus, for ex-

Data from a second and sometimes a third run were later added and a final psychometric function was fitted to the total set of points. Thus, each threshold is based on 500 to 800 trials.

All of the thresholds associated with a particular tone frequency were gathered before another was introduced;

J. Acoust. Soc. Am., Vol. 59, No. 3, March 1976 Downloaded 30 Jun 2013 to 132.206.27.25. Redistribution subject to ASA license or copyright; see http://asadl.org/terms


the thresholds were not gathered in a systematic order.

D. Observers

A total of 10 observers participated in the experiment; all had audiometrically normal hearing in the range below 4.0 kHz. One observer, MS, completed all of the experimental conditions. The remaining nine observers were assigned three to each tone frequency, and they completed all of the thresholds at that frequency. One of the observers assigned to the 2.0-kHz frequency, BS, was available for a slightly longer period, during which he also completed the thresholds at 0.5 kHz. Thus, there are results for five observers at 0.5 kHz and four observers at 1.0 and 2.0 kHz.

II. RESULTS AND AUDITORY FILTER SHAPES

The average thresholds are shown in Fig. 3(a) for the three tone frequencies, 0.5, 1.0, and 2.0 kHz. For clarity, the data at 1.0 and 2.0 kHz have been displaced down 5 and 10 dB, respectively. The data at 0.5 kHz show that although the relationship between tone threshold and notch width is reasonably linear, there is a "backward S" shape to the curve, with an inflection point occurring in the region $\Delta f/f_0 = 0.2$. The 1.0-kHz data are very similar in form. The inflection point occurs somewhat earlier in the 2.0-kHz data, but the shape is essentially the same.

A. Filter shapes and bandwidths from the average data

A filter shape was extracted from each of the three sets of thresholds in Fig. 3(a) by fitting a polynomial to the appropriate data and taking the derivative of the polynomial. The details of the fitting process are described in Appendix A. The filter shapes produced in this manner are presented in Figs. 3(b) and 3(c); the symbols on the curves do not represent data but rather are provided to identify the center frequency of the filter. In Fig. 3(b) attenuation is plotted on a linear scale, placing the emphasis on that portion of the attenuation characteristic within about 15 dB of the filter maximum. As expected, the plot shows three well-tuned filters with similar shapes.

The half-power point of the 2.0-kHz filter, indicated by the horizontal line in Fig. 3(b), occurs at $\Delta f/f_0 = 0.0518$, and thus the 3-dB bandwidth, BW, of the 2.0-kHz filter is 207 Hz. Similarly, the 3-dB bandwidths of the 0.5- and 1.0-kHz filters are 69.2 and 140 Hz, respectively.
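The route from fitted polynomial to 3-dB bandwidth can be sketched in a few lines. The example below takes the 2.0-kHz coefficients listed in Table A-I, evaluates the normalized filter shape of Eq. (A8), and locates the half-power point by bisection. The bisection search and the function names are assumptions made for illustration (the paper does not say how the half-power point was located), and the filter is taken to be symmetric so that BW is twice the half-power offset times $f_0$, as in the 0.0518 to 207 Hz conversion above. The result should be read as an approximate check rather than a reproduction of the reported value.

```python
import math

# Coefficients a0..a5 for the 2.0-kHz average data, copied from Table A-I.
COEFFS_2KHZ = [59.47, -69.56, -566.2, 1859.0, -575.8, -2253.0]
LN10_OVER_10 = math.log(10.0) / 10.0


def poly(coeffs, x):
    """Q_n(x): fitted threshold (dB) at relative notch width x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


def poly_deriv(coeffs, x):
    """Q_n'(x): derivative of the fitted polynomial."""
    return sum(k * a * x ** (k - 1) for k, a in enumerate(coeffs) if k > 0)


def filter_power(coeffs, x):
    """Normalized |H|^2 from Eq. (A8); equals 1 at x = 0."""
    a0, a1 = coeffs[0], coeffs[1]
    return (poly_deriv(coeffs, x) / a1) * math.exp(
        LN10_OVER_10 * (poly(coeffs, x) - a0)
    )


def half_power_point(coeffs, lo=0.0, hi=0.2, tol=1e-9):
    """Bisection search for |H|^2 = 0.5 on [lo, hi] (illustrative choice)."""
    f_lo = filter_power(coeffs, lo) - 0.5
    f_hi = filter_power(coeffs, hi) - 0.5
    if f_lo * f_hi > 0:
        raise ValueError("half-power point is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = filter_power(coeffs, mid) - 0.5
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    f0 = 2000.0  # Hz
    x_half = half_power_point(COEFFS_2KHZ)
    print("half-power point:", x_half)
    print("3-dB bandwidth (Hz):", 2.0 * x_half * f0)
```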

FIG. 3. Threshold signal power is plotted as a function of relative notch width $\Delta f/f_0$ in (a); $\Delta f$ is the distance from the signal to the edges of the noise, and $f_0$ is the signal frequency. The three curves show the data obtained with signal frequencies of 0.5, 1.0, and 2.0 kHz as represented by the circles, squares, and triangles, respectively. The auditory filter shapes derived from the data in (a) are plotted in (b) using a linear ordinate, and again in (c) using a logarithmic ordinate. The abscissa is the same for all three sections of the figure.

These values are more than double those

reported by Patterson (1974) for the same tone frequencies, which suggests that the auditory filter was not centered on the tone in that experiment. Although the present bandwidths are wider, the rate of growth of bandwidth with center frequency is remarkably similar. In both cases, the growth is found to be close to linear when log BW is plotted as a function of log $f_0$, and in the case of the notched noise experiment the line is

$$10 \log BW = 7.91 \log f_0 - 2.71. \tag{15}$$

Patterson (1974) reported slope and intercept values of 8.34 and -7.37, respectively, for a similar fit. Thus, the two lines have essentially the same slope and differ primarily in overall level, the most recent bandwidths being about 2.2 times the older ones.

The 3-dB bandwidth underestimates the equivalent rectangular bandwidth of these filters by about 14%. The equivalent rectangular bandwidths are 78.5, 160, and 238 Hz for the 0.5-, 1.0-, and 2.0-kHz filters, respectively. The corresponding estimates from Zwicker, Flottorp, and Stevens (1957) are approximately 110, 160, and 300 Hz. As before, lines fitted to these two sets of values are nearly parallel with slopes of 8.01 for the former and 7.24 for the latter. Again the basic difference is in the overall level; the intercepts are -2.43 and +0.679, respectively. But in this case the level difference is noticeably smaller; in the region of interest, 0.5 to 2.0 kHz, the present values are about 0.83 of those reported by Zwicker et al. (1957).

The similarity in the rate of growth of bandwidth is not particularly surprising. Both Zwicker et al. (1957) and more recently Scharf (1970) have pointed out that virtually all bandwidth experiments have found the same rate of growth. The difference, traditionally, has been that experiments which employ the "empirical" method provide bandwidth estimates two to three times those found with the "critical ratio" method. The present experimental technique does not fit into either of these categories particularly well, since it was designed to measure the shape of the auditory filter. Nevertheless, it is more similar to a "critical ratio" experiment and thus it is somewhat surprising to find that the bandwidth estimates are much closer to those of Zwicker et al. (1957) and Houtgast (1974) than to those of Patterson (1974) or Margolis and Small (1975). The fact that these latter estimates now appear excessively narrow brings us back to Evans and Wilson's (1973) suggestion: that the frequency selectivity available at the output of the cochlea would appear to be sufficient to accommodate the results obtained in a wide range of psychophysical experiments.

The abscissa in Fig. 3(b) is a normalized frequency scale, and so the fact that the 0.5- and 1.0-kHz filters are virtually identical indicates that the frequency selectivity of the system is essentially proportional to center frequency in this region. By comparison, the 2.0-kHz

FIG. 4. In (a), threshold signal power is plotted as a function of relative notch width and signal frequency for observer MS. The corresponding filter shapes are shown in (b) and (c) using linear and logarithmic ordinates, respectively.

FIG. 5. In (a), threshold signal power is plotted as a function of relative notch width and observer for the case where the signal frequency is 1.0 kHz. The individual filter shapes are shown in (b) and (c) using linear and logarithmic ordinates, respectively.

filter is somewhat sharper, indicating that relative frequency selectivity improves with center frequency beyond 1.0 kHz. The three filter shapes are replotted in Fig. 3(c), this time using a logarithmic ordinate which emphasizes the tails of the attenuation characteristic at the expense of the central section. The frequency scale is the same in all three sections of the figure. Figure 3(c) shows that the 0.5- and 1.0-kHz characteristics do not diverge in the tails and that the faster rate of attenuation displayed by the 2.0-kHz filter in Fig. 3(b) continues in the frequency region beyond $\Delta f/f_0 = 0.2$. However, the data of MS, the one observer who completed all three center frequency conditions, indicate that the average curves overestimate the size of the selectivity difference.

B. Individual filter shapes

The data and filters of observer MS are presented in Fig. 4 using the same format as that of Fig. 3. As in the average data, the 0.5- and 1.0-kHz threshold curves are virtually parallel, and so the corresponding filters in Figs. 4(b) and 4(c) are almost identical. Similarly, MS's 2.0-kHz threshold curve is steeper than his 0.5- and 1.0-kHz curves, and consequently his 2.0-kHz filter displays relatively more selectivity. But the increase in selectivity between 1.0 and 2.0 kHz is not as great as that shown in Fig. 3(c), indicating that part of the effect seen in the average data is probably due to the chance assignment of more sensitive observers to the 2.0-kHz condition.

The 1.0-kHz average curve in Fig. 3(a) is an average of data from four observers. The individual data points and polynomial fits are presented in Fig. 5(a). When the notch in the noise is narrow, tone threshold shows little variation as a function of observer; however, as the notch widens, consistent and significant differences emerge. In the frequency range 0.3 to 0.4, for example, the best observer, SS, can detect a tone almost an order of magnitude less intense than that audible to the least sensitive observer, RB. The filter shapes obtained from the individual observers' data are presented in Figs. 5(b) and 5(c). As would be expected, the most sensitive observer in Fig. 5(a), SS, produced the filter with the most severe skirt, shown by the circles in Fig. 5(c). Similarly, the threshold data of observer FO show greater sensitivity than those of observer MS in the region above 0.25; correspondingly, in Fig. 5(c), the skirt of FO's filter is seen to apply more attenuation than that of MS.

The correspondence would at first appear to break down with observer RB. His threshold data show that he is the least sensitive observer when the notch is wide, which would seem to conflict with the fact that his filter in Fig. 5(c) is slightly more severe than that of MS. However, RB is also less sensitive when the notch

is narrowest; the polynomial fitted to his threshold data intercepts the ordinate slightly above those of the other observers, and it is this relative sensitivity that is reflected in the filter shapes. Continuing this argument, the first column in Table I(a) shows the relative sensitivity of each observer as measured by his data range, that is, his threshold at $\Delta f/f_0 = 0.0$ minus his threshold at $\Delta f/f_0 = 0.4$. The observers have been grouped by center frequency and then ordered by data range. The second column of Table I(a) gives the range of each filter over the same frequency region. A comparison of these two columns shows that the data-range order predicts the filter-range order well; there are only two reversals in the filter-range order (BS and MM at 0.5 kHz and MS and RB at 1.0 kHz), and in each case the difference in data range is less than 0.5 dB.

And finally it is perhaps worth pointing out that the central portions of Figs. 5(a) and 5(c) also correspond. For example, RB is relatively more sensitive than FO or MS in the region 0.1 to 0.2, as shown in Fig. 5(a) by the convergence of RB's threshold curve with those of FO and MS. This too is reflected in Fig. 5(c), where it can be seen that RB's filter lies below those of FO and MS in the region 0.1 to 0.2.

TABLE I. Filter shape statistics. In (a), the observers are ordered according to data range; in (b), according to bandwidth. The rows marked with asterisks are the means and standard deviations without observer BG. The fitting process that produced these values was restricted to filters that were flat at the center frequency.

                      (a)                                        (b)
f0 (kHz)    Data range (dB)   Filter range (dB)   OBS     OBS   BW (Hz)   K        a0

Average data
0.5         33.58             34.5                              69.2      0.497    55.9
1.0         33.67             34.0                              140.0     0.320    57.1
2.0         36.48             39.9                              207.0     0.404    59.8
2.0*        37.47             41.0                              213.0     0.456    60.5

Individual observers' data
0.5         36.5              37.7                JB      DR    61.2      0.623    56.5
            35.3              35.4                BS      MM    65.8      0.488    55.6
            35.2              36.2                MM      MS    70.5      0.460    55.7
            31.8              32.3                MS      JB    72.8      0.563    56.6
            29.1              30.7                DR      BS    80.8      0.370    55.2
Mean        33.58             34.46                             70.22     0.5008   55.92
std dev                       2.58                              6.63      0.087    0.54

1.0         38.2              39.0                SS      SS    124.0     0.341    56.8
            33.4              33.3                FO      RB    136.0     0.391    57.9
            31.7              31.3                MS      MS    144.0     0.308    57.1
            31.4              32.1                RB      FO    167.0     0.243    56.5
Mean        33.67             33.93                             142.8     0.3208   57.08
std dev                       3.02                              15.7      0.054    0.52

2.0         43.2              51.3                MB      BS    195.0     0.605    61.3
            35.7              39.5                BS      BG    196.0     0.262    57.8
            33.7              36.4                BG      MB    208.0     0.497    60.7
            33.5              34.1                MS      MS    239.0     0.322    59.5
Mean        36.48             40.33                             209.5     0.4225   59.83
std dev                       6.62                              17.8      0.137    1.34
Mean*       37.47             41.63                             214.0     0.4747   60.50
std dev*                      7.18                              18.5      0.117    0.75

C. Individual bandwidths

It might seem only reasonable to expect the consistent and often large differences in the tails of the threshold curves to predict the bandwidth differences among the observers. For example, observer SS, whose threshold curve drops fastest in Fig. 5(a), has the smallest 3-dB bandwidth at the 1.0-kHz center frequency. However, in general, data range does not predict bandwidth at all well. Although the tail of SS's threshold curve is almost 10 dB below that of RB, his 3-dB bandwidth is only marginally narrower than that of RB, as can be seen in Fig. 5(b); their filters are displaced by only 12 Hz as they fall through the gap in the horizontal line.

The 3-dB bandwidths of the observers are shown in Table I(b). This time, after grouping, the observers have been ordered in accordance with bandwidth, and a comparison of the two orderings [the last column in Table I(a) and the first column in Table I(b)] reveals the lack of correspondence between bandwidth and data range or filter range. One observer, DR, at 0.5 kHz, has both the smallest data range and the smallest bandwidth.

Thus it would appear that the frequency selectivity of the central portion of the auditory filter, characterized

by the bandwidth, is reasonably independent of the frequency selectivity exhibited by the skirts of the filter as measured by filter range. This implies that masking experiments which employ a wide-band noise masker are not likely to predict the relative success of individual observers in tone-masking experiments where the signal and masker are widely separated. It also indicates that an auditory filter shape such as that presented in Patterson (1974) must have restricted success, because its 3-dB bandwidth and skirt height were determined by the same parameter.

III. GAUSSIAN APPROXIMATION TO THE FILTER SHAPE

The procedure for extracting the auditory filters from the data was designed to produce an accurate rather than a simple expression for the shape of the filter. To this end, the data were fitted with a fifth-order polynomial so that the derived filters would not be artificially constrained by the fitting process. As a result, the expression for the filter shape has five parameters, which seems excessive when compared with the smooth shapes shown in Figs. 3(b) and 3(c). In an effort to find a mathematically more tractable expression, the filter shape was compared with several common functions. No good match for the entire filter shape was discovered. The skirts of the universal resonance curve do not fall fast enough to provide a satisfactory approximation, and the skirts of the filter suggested in Patterson (1974) fall too rapidly. Although the Gaussian curve does not provide a good fit to the filter over its entire range, it does provide a reasonable approximation in the region of the passband. Since the Gaussian curve is a particularly convenient filter shape and the passband is the important part of the filter function in most cases, the Gaussian approximation to the auditory filter shapes will be described here in detail.

The average filters of Fig. 3(c) have been replotted in Fig. 6(a), with the 1.0- and 2.0-kHz filters displaced down 10 and 20 dB, respectively. The solid curves with the center frequency symbols are the filter shapes. They display an inflection point in the region $\Delta f/f_0 \approx 0.2$, where they change from curving downwards to curving upwards. The solid curves, without symbols, indicate the Gaussian approximations to the filters over the entire frequency range. Since the Gaussian function is a parabola on these coordinates, it cannot accommodate the change in curvature of the filters, with the result that there are major deviations between the filters and the fits in the central section and at the center frequency. The dashed curves present the Gaussian approximations to the filters up to $\Delta f/f_0 = 0.25$. The filters fall 20 dB or more over this frequency range. Thus, for most purposes, this is the important part of the attenuation characteristic. For the 0.5- and 1.0-kHz filters, the effect of the change in curvature is minimal in this region and the Gaussian function provides a good approximation; the discrepancy between the filters and the fits exceeds 1 dB only at the high-frequency ends of the curves. The inflection point occurs earlier for the 2.0-kHz filter, and consequently the deviations between the fit and the filter are larger (1.5, 2.0, and 4.0 dB when $\Delta f/f_0$ is 0.0, 0.15, and 0.25, respectively). But the 2.0-kHz filter is 27 dB down at $\Delta f/f_0 = 0.25$, whereas the 0.5- and 1.0-kHz filters are 21.4 and 20.8 dB down, respectively. The 2.0-kHz filter is 21.8 dB down at $\Delta f/f_0 = 0.20$, and over this frequency range the discrepancy between the 2.0-kHz filter and its Gaussian approximation is similar to those shown for the 0.5- and 1.0-kHz filter.

FIG. 6. Gaussian approximations to the auditory filters. The solid lines with symbols are the average filters from Fig. 3(c); the curves for 1.0 and 2.0 kHz are displaced down 10 and 20 dB for clarity. In each case, the solid line without symbols is the Gaussian fit to the entire filter and the dashed curve is the fit up to $\Delta f/f_0 = 0.25$.

The Gaussian approximation has the form

$$|H(\Delta f/f_0)|^2 = \exp\left[-\pi\left(\Delta f/f_0 BW_{ER}\right)^2\right], \tag{16}$$

where $BW_{ER}$ is the equivalent rectangular bandwidth. The Gaussian curve is flatter than the filter shapes near the center frequency and, as a result, the bandwidths of the approximations are wider and the proportionality constants smaller than those associated with the derived filters. The equivalent rectangular bandwidths of the dashed curves in Fig. 6(a) are 99.5 and 202 Hz for the 0.5- and 1.0-kHz filter shapes. And the corresponding proportionality constants are 0.433 and 0.279. The dashed curve for the 2.0-kHz filter has a bandwidth of 343 Hz; however, the Gaussian approximation that was restricted to the region below $\Delta f/f_0 = 0.20$ is more comparable to those at the lower center frequencies and

the bandwidth of that fit is 316 Hz, with a proportionality constant of 0.200. The differences between these bandwidth estimates and those provided by Houtgast (1974), who also used Gaussian approximations, are negligible. Both sets of estimates are in good agreement with those of Zwicker, Flottorp, and Stevens (1957).

The Gaussian approximation should be reasonably successful in predicting tone-in-noise thresholds whenever the signal and masker occupy the same frequency region. However, it should be noted that the approximation will consistently underestimate threshold when the signal and masker are separated by more than about one-quarter of an octave, because the Gaussian curve falls well below the derived filter in the region above $\Delta f/f_0 = 0.25$.

IV. CONCLUSIONS

The auditory filter shapes derived with the aid of the notched noise masker reveal a well-tuned filter whose skirts fall steadily from about 6 dB down at $\Delta f/f_0 = 0.1$ to around 35 dB down at $\Delta f/f_0 = 0.4$. The 3-dB bandwidths of the 0.5-, 1.0-, and 2.0-kHz filters are 69.2, 140, and 207 Hz, respectively; a reasonable summary of the relationship between the 3-dB bandwidth and the center frequency of the filters is provided by the line

$$10 \log BW = 7.9 \log f_0 - 2.7. \tag{17}$$

Since these filters are more than twice as wide as those reported by Patterson (1974) and Margolis and Small (1975), it would appear that the auditory filter was not centered on the tone in those experiments as assumed, and that they primarily measured the shape of the skirts of the filter. The wider bandwidths emanating from the notched noise experiment also support Evans and Wilson's (1973) contention that the primary neurons of the VIIIth nerve display sufficient frequency selectivity to predict much of the noise-masking data obtained in psychophysical experiments.

The central portion of the filter shape, where the attenuation is less than about 22 dB, is well approximated by a Gaussian curve whose equivalent rectangular bandwidth, $BW_{ER}$, is about 20% of the center frequency. More specifically,

$$|H(\Delta f/f_0)|^2 = \exp\left[-\pi\left(\Delta f/f_0 BW_{ER}\right)^2\right], \tag{18}$$

where $BW_{ER}$ is given by

$$10 \log BW_{ER} = 8.3 \log f_0 - 2.3 \tag{19}$$

and the proportionality constant $K$ is 0.34. The equivalent rectangular bandwidth values are in good agreement with those presented by Zwicker, Flottorp, and Stevens (1957) and by Houtgast (1974), who also used a Gaussian approximation. Outside the passband, the filter shape flattens out while the Gaussian curve falls ever more rapidly.

In the region of the passband, there is little variation in either the shape or bandwidth of the filters across observers; for the three center frequencies used, the mean-to-sigma ratio of the bandwidth was about 10. As $\Delta f/f_0$ increases from 0.2 to 0.4, consistent and significant differences among the observers emerge and grow. Thus, the ability of the average filters of Fig. 3 to predict the masking data of a specific observer will decrease as the distance between the tone and the masker increases.

ACKNOWLEDGMENTS

This research was carried out at the Defence and Civil Institute of Environmental Medicine, Downsview, Ontario, with the assistance of W. Garland, B. Crabtree, and B. Rodden. I would also like to thank M. M. Taylor for numerous helpful discussions.

APPENDIX A

I. Derivation of the auditory filter shape and the proportionality constant

This section of Appendix A describes how the polynomial was fitted to a specific set of tone thresholds, and how the filter shape and proportionality constant were subsequently derived in terms of the polynomial coefficients produced by the fitting process.

At the end of the Introduction, it was concluded that the shape of the auditory filter is proportional to the derivative of the tone threshold curve obtained when the masker is a notched noise. The first step, then, was to fit a polynomial, $Q_n(\Delta f/f_0)$, to the data. The fit had the form

$$10 \log P_s = Q_n(\Delta f/f_0), \tag{A1}$$

where

$$Q_n(\Delta f/f_0) = a_0 + a_1(\Delta f/f_0) + a_2(\Delta f/f_0)^2 + \cdots + a_n(\Delta f/f_0)^n. \tag{A2}$$

The polynomial was fitted to a logarithmic rather than a linear measure of threshold because the variability of the threshold estimates was more uniform on the logarithmic scale, as indicated by the fact that the slopes of the psychometric functions did not vary appreciably as the notch in the noise was widened. Taking the antilogarithm of both sides of Eq. (A2) gives

$$P_s = \exp\left[\frac{\ln 10}{10}\,Q_n(\Delta f/f_0)\right]. \tag{A3}$$

The derivative of the tone threshold curve is, then,

$$\frac{dP_s}{d(\Delta f/f_0)} = \frac{\ln 10}{10}\,Q_n'(\Delta f/f_0)\exp\left[\frac{\ln 10}{10}\,Q_n(\Delta f/f_0)\right], \tag{A4}$$

where $Q_n'(\Delta f/f_0)$ is the derivative of $Q_n(\Delta f/f_0)$.

The precise relationship between the filter shape and the tone threshold curve is detailed in Eq. (10). Thus the desired expression for the filter shape is obtained by substituting Eq. (A4) into Eq. (10) as follows:

$$|H(\Delta f/f_0)|^2 = -\frac{1}{2KN_0f_0}\,\frac{\ln 10}{10}\,Q_n'(\Delta f/f_0)\exp\left[\frac{\ln 10}{10}\,Q_n(\Delta f/f_0)\right]. \tag{A5}$$

Since the slopes of the curves in Fig. 3(a) are always negative, $|H(\Delta f/f_0)|^2$ is invariably positive.

The only unknown on the right-hand side of Eq. (A5) is the proportionality constant $K$. And $K$ can be eliminated from the equation if the attenuation applied by the filter is expressed as a proportion of the attenuation

applied at the center frequency of the filter, that is, if $|H(0)|^2$ is set equal to 1. For, in that case, when Eq. (A5) is evaluated at 0 as follows,

$$|H(0)|^2 = -\frac{1}{2KN_0f_0}\,\frac{\ln 10}{10}\,Q_n'(0)\exp\left[\frac{\ln 10}{10}\,Q_n(0)\right], \tag{A6}$$

and the known values of $|H(0)|^2$, $Q_n'(0)$, and $Q_n(0)$ (1, $a_1$, and $a_0$, respectively) are substituted into Eq. (A6) to produce

$$1 = -\frac{1}{2KN_0f_0}\,\frac{\ln 10}{10}\,a_1\exp\left[\frac{\ln 10}{10}\,a_0\right], \tag{A7}$$

then the normalized filter shape is found by dividing Eq. (A5) by Eq. (A7). And thus,

$$|H(\Delta f/f_0)|^2 = \frac{1}{a_1}\,Q_n'(\Delta f/f_0)\exp\left\{\frac{\ln 10}{10}\left[Q_n(\Delta f/f_0) - a_0\right]\right\}. \tag{A8}$$

This method of splitting the attenuation applied by a filter into a constant part $K$ and a variable part $|H(\Delta f/f_0)|^2$ also makes it possible to derive an expression for $K$. In this case, it is only a matter of multiplying both sides of Eq. (A7) by $K$ to produce

$$K = -\frac{1}{2N_0f_0}\,\frac{\ln 10}{10}\,a_1\exp\left[\frac{\ln 10}{10}\,a_0\right]. \tag{A9}$$

Since $a_1$ is invariably negative, $K$ is always positive.

In fitting the data, the degree of the polynomial was varied from 1 to 5. The first-order polynomial, a line, accounted for better than 90% of the variability in the data in every case; however, the data were consistently above the line in the frequency region 0.05 to 0.15, consistently below the line in the region 0.25 to 0.35 and above the line again at 0.4, indicating that a polynomial of at least third order was warranted. Third- and fourth-order polynomials both provided excellent approximations to the data, but the coefficient of the highest-order term was often positive in these fits. In this case, Eq. (20) becomes very large as $\Delta f/f_0$ rises above 0.4 and it is not possible to extrapolate even a short distance beyond the data. No general solution to this problem was found; however, when a fifth-degree polynomial was fitted to the average data, the coefficient of the highest-order term was always negative and the curve obtained from Eq. (20) proceeded toward zero in an orderly fashion in the region above the data. Consequently, the fifth-degree fits were used in producing the filter shapes even though the fifth term is not always negative for individual observers. The coefficients for the average data are presented in Table A-I.

TABLE A-I. Polynomial coefficients. These values summarize the fits to the average threshold data shown in Fig. 3(a). The lines in Fig. 3(a) are produced by the coefficients in conjunction with Eq. (A1). The filters in Figs. 3(b) and 3(c) are produced by the same coefficients in conjunction with Eq. (A8).

f0 (kHz)    a0       a1        a2        a3        a4        a5
0.5         55.80    -51.26    -308.2    322.0     1772.0    -2946.0
1.0         56.79    -49.65    -295.3    289.1     1589.0    -2492.0
2.0         59.47    -69.56    -566.2    1859.0    -575.8    -2253.0

II. Smoothing the individual data

Initially, the polynomials were fitted to the raw data. This procedure proved quite satisfactory in the case of the average data, and indeed the threshold curves and filter shapes shown in Fig. 3 were produced this way. In the region above 0.2, however, where the data points are more widely spaced, small undulations appeared in the filters of individual observers when the polynomial was fitted to their raw data. The deviations in the threshold data of a particular observer are relatively small, as can be seen in Fig. 5(a). Nevertheless, they lead to irregularities in the filter because the polynomial fit is sufficiently accurate to follow the threshold variations; and the differentiation process, whereby the filter is obtained from the threshold curve, amplifies any irregularities in the fit. In all probability, the threshold variations simply represent sampling variability and, consequently, with the individual observers, a smooth curve was drawn through the data by eye and the polynomial was fitted to points taken from the smooth curve, instead of fitting the polynomial to the raw data directly. In the majority of applications, the shape of the filter's passband was more important than the shape of the tails, and so more thresholds were measured in the region of the passband. This emphasis was preserved when fitting the polynomial to the smooth curves by choosing one smooth-curve point per data point.

III. Filter shape near the center frequency and bandwidth variability

It would seem reasonable to assume that the auditory filter exhibits a maximum at its center frequency and is continuous at that point; or, in other words, that the filter is flat at $\Delta f/f_0 = 0.0$. The filter shape derivation described in Secs. I and II of Appendix A does not, of necessity, produce filters that are flat at zero, and when the filter shapes were first produced using this procedure some of them displayed peaks or notches about the center frequency. The peaks and notches would appear to arise because there are no data points at negative values of $\Delta f/f_0$ to restrict the polynomial, and because the threshold curves do not include a data point at $\Delta f/f_0 = 0.0$. The importance of measuring threshold at zero, as opposed to 0.02, was not recognized until the filters were derived, by which time the observers were no longer available. The peaks and notches do not extend beyond about $\Delta f/f_0 = 0.03$ and thus are not particularly important as far as the overall shape of the filters is concerned. However, the attenuation applied by the filter is measured relative to the attenuation at the center frequency and, consequently, the presence or absence of a peak or notch affects the position of the half-power point and thus the 3-dB bandwidth. It was also evident that this apparent artifact of the fitting process was contributing to bandwidth variability, so a method for restricting the fitting process to flatter filters was incorporated into the procedure.

The shape of the filter in a particular frequency region is primarily determined by the relative level of the thresholds in the same region, and it was found that the size of the peaks and notches could be controlled, without affecting the shape of the filter in the region above $\Delta f/f_0 = 0.03$, by adding a point to the data at $\Delta f/f_0 = 0.0$ and adjusting its level over a small range. Two criteria for positioning the added point were evaluated in terms of the variability of the resulting BW and K values:

(2) Criterion II:

In addition, a closer examination of

Table

tions, small as they are, may still be excessive.

(1) Criterion I: In the first, the added point was adjusted so that the filter was between 0.1 and 0.11 dB down at $\Delta f/f_0 = 0.01$, a procedure that essentially eliminated the peaks and notches. All of the curves in Figs. 3-6 and the values in Tables I and A-I were produced with the aid of this criterion. Bandwidth variability is not commonly discussed in papers dealing with the critical band or auditory filter shape, perhaps because the number of observers is typically three or less, but what information there is suggests that the variability is high. Houtgast (1974) reported that the bandwidths of his five observers ranged from 70 to 200 Hz at 0.5 kHz, from 140 to 200 Hz at 1.0 kHz, and from 280 to 440 Hz at 2.0 kHz. The corresponding bandwidth ranges presented in Table I are substantially smaller; in each

case, the range is less than 5%of the center frequency. 8wets, Green, and Tanner (1062) estimated filter bandwidth in seven different but highly similar experimental conditions. For each observer they reported a meanto-standard-deviation

ratio

calculated

across

I leads to the conclusion

that the standard

devia-

A

comparison of the column of bandwidths with the column of proportionality constants reveals that at each center

frequency there is a tendency for K to decrease as BW

increases across observers. To understand the implications of this relationship, consider what happens when

a point is addedto the data at/•f/fo = 0. 0. As the point is adjusted upwards from a low position, the notch in the corresponding filter

becomes smaller

and smaller

and is eventually replaced by a peak; at the same time, K increases and BW decreases. This suggests that if the criterion for positioning the added point were relaxed slightly, it would be possible to reduce the variability in BW and the variability in K simultaneously. Before proceeding to illustrate this argument, it is perhaps worth pointing out that the results of this analysis will differ from those in Table I only inasmuch as the variability of BW and K will be reduced. And since the analysis involves an ad hoc restriction and the deletion

of one observer's data, it is presented here as an addendum rather than as the main analysis.

the seven

experimental conditions. The ratios were 4.4, 5. 1, and 6.6 for their three observers. No strictly comparable measure is available in the present experiment, but for each center frequency, a mean-to-standard-de-

There is one major exception to the general tendency for K to decrease as bandwidth increases, and that is the data of observer BG, who has both a narrow band-

viation

tensive experience in psychoacoustic experiments, whereas the other observers had no previous experience. In addition, it is likely that BG was better motivated than the others because he did the testing. Both of these factors might be expected to increase his efficiency and so reduce his K, and since this discussion depends on the K values being comparable, BG's data

ratio

can be calculated

across

observers.

These ratios might be expected to be lower than those

of Swets, Green, and Tanner, inasmuch as the standard deviation includes between-subject variability; however, the ratio values are in fact nearer 10, which suggests that population bandwidth may not be as variable as previously suspected.

width

and the smallest

K at 2.0 kHz.

TABLE A-II. Filter shape statistics. The observers are ordered according to data range in (a) and bandwidth in (b). The fitting process that produced these statistics was restricted to specific values of K such that the variability of K was reduced by one-half relative to Table I(b). (a)

f0

0.5

Mean

(b)

Data

Filter

range

range

OBSDR

OBSBw

K

a0

36.5

37.4

JB

MM

65.0

0.494

55.7

35.3

36.2

BS

DR

66.8

0. 561

56.4

35.2

36.3

MM

MS

68.0

0.480

55.8

31.8

32.5

MS

BS

68.9

0.435

55.4

29.1

30.2

DR

JB

76.3

0. 532

56.5

33.58

34.52

69.0

0. 5008

55.96

ski dev

Individual

observers'

1.0

data

Mean

2.72

Mean

3.88

0.043

0.42

38.2

38.9

SS

SS

128.0

0. 331

56.8

33.4

34.0

FO

MS

141.0

0. 314

57.1

31.7

31.4

MS

FO

145.0

0. 282

56.7

31.4

31.7

RB

RB

147.0

0. 356

57.8

33.67

34.00

140.3

0. 3208

57.10

std dev

2.0

BW

3.00

7.40

0. 027

0.43

43.2

51.5

MB

MS

198.0

0. 398

59.8

35.7

38.9

BS

BS

212.0

0. 540

61.2

33.5

35.0

MS

MB

213.0

0.486

60.6

207.7

0.4747

60.53

0. 058

0.57

37.47

41.70

ski dev

6.83

6.85

,

J. Acoust. Soc. Am., Vol. 59, No. 3, March 1976

Downloaded 30 Jun 2013 to 132.206.27.25. Redistribution subject to ASA license or copyright; see http://asadl.org/terms

BG had had ex-

654

were

R.D. Patterson: Auditory filter shapes

654

omitted.
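The bandwidth summary rows of Table A-II, and the mean-to-standard-deviation ratios discussed above, can be checked with a few lines of Python. This is only a sketch: the bandwidths are copied from Table A-II(b), the function names are hypothetical, and the use of the population (divide-by-N) standard deviation is an assumption inferred from the tabulated values rather than something the paper states. The second helper illustrates one simple way of reducing variability by one-half, as the table caption describes for K: each value is moved halfway toward the group mean.

```python
from statistics import mean, pstdev

# Bandwidths (Hz) copied from Table A-II(b), keyed by center frequency in kHz.
BANDWIDTHS_HZ = {
    0.5: [65.0, 66.8, 68.0, 68.9, 76.3],
    1.0: [128.0, 141.0, 145.0, 147.0],
    2.0: [198.0, 212.0, 213.0],
}


def summarize(values):
    """Return (mean, population std dev, mean-to-std-dev ratio)."""
    m = mean(values)
    s = pstdev(values)  # divide-by-N form; an assumption inferred from the table
    return m, s, m / s


def halve_deviations(values):
    """Move every value halfway toward the group mean.

    The mean is unchanged and the standard deviation is exactly halved.
    """
    m = mean(values)
    return [m + 0.5 * (v - m) for v in values]


if __name__ == "__main__":
    for f0, bandwidths in BANDWIDTHS_HZ.items():
        m, s, ratio = summarize(bandwidths)
        print(f"f0 = {f0} kHz: mean = {m:.1f} Hz, "
              f"std dev = {s:.2f} Hz, ratio = {ratio:.1f}")
```

Under these assumptions the computed means and standard deviations agree with the tabulated bandwidth entries, and the ratios come out at roughly 18, 19, and 30 for the three center frequencies.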

For each center frequency, the standard deviation of K was halved by halving the deviation of each observer's K from the mean value; the new values are shown in Table A-II. Then each observer's data were refitted; that is, the point added at zero was adjusted until that observer's revised K value was produced. The statistics of the filter shapes that follow from this modified fitting process are also presented in Table A-II and, as predicted, the variation in bandwidth is substantially reduced; the mean-to-standard-deviation ratios have increased to around 20. The observers are ordered according to bandwidth, but the order is quite different from that in Table I and the proportionality constant no longer varies systematically with bandwidth. There is also a reduction in the variability of a0, the estimate of wide-band threshold. But aside from these effects, which are all directly related to the adjusting of K, there are no important differences in the summary statistics, that is, in mean bandwidth, mean a0, mean filter range, or filter-range variability. The largest shift of the point added to zero was 0.49 dB for observer MS at 2.0 kHz, and the average shift was just 0.18 dB. Since the individual thresholds were estimated only to the nearest 0.5 dB, the magnitude of the adjustment does not seem unreasonable, and it is not surprising to find that there are virtually no changes in the shapes of the filters produced by the modified fitting process. Finally, it should perhaps be pointed out that the correspondence between filter range and bandwidth is not improved by the modified fitting process.

*The paper also appears as DCIEM Report 75 RP X6, and portions of it were presented at the 88th Meeting of the Acoustical Society of America, J. Acoust. Soc. Am. 56, S36(A) (1974).

¹To be precise, Margolis and Small held the tone power constant and measured d' as a function of the position of the noise edge. While this procedure avoids the criticism that filter shape may vary with tone power, it does not provide accurate estimates of filter shape for frequencies that are remote from the tone. Neither of these differences would be expected to alter the overall correspondence in the results of Patterson and Margolis and Small.

²The author is indebted to both M. M. Taylor and M. R. Schroeder for this suggestion.

³If the 3-dB bandwidth BW is used instead of the equivalent rectangular bandwidth, the approximation is $\exp[-4 \ln 2\,(\Delta f / f_0\,\mathrm{BW})^2]$. The 3-dB bandwidths for the 0.5-, 1.0-, and 2.0-kHz approximations over the first 20 dB of attenuation are 93.5, 190, and 297 Hz.

de Boer, E. (1967). "Correlation studies applied to the frequency resolution in the cochlea," J. Aud. Res. 7, 209-217.

de Boer, E. (1968). "Reverse correlation I," Proc. K. Ned. Akad. Wet. 71, 472-487.

de Boer, E. (1969). "Reverse correlation II," Proc. K. Ned. Akad. Wet. 72, 129-151.

de Boer, E., and Jongkees, L. B. W. (1968). "On cochlear sharpening and cross-correlation methods," Acta Oto-Laryngol. Stockholm 65, 97-104.

de Boer, E., and Kuyper, P. (1968). "Triggered correlation," IEEE Trans. Biomed. Eng. BME-15 (3), 169-179.

Egan, J. P., and Hake, H. W. (1950). "On the masking pattern of a simple auditory stimulus," J. Acoust. Soc. Am. 22, 622-630.

Evans, E. F., and Wilson, J. P. (1973). "Frequency selectivity of the cochlea," in Basic Mechanisms of Hearing, edited by A. R. Møller (Academic, New York), pp. 519-551.

Fletcher, H. (1940). "Auditory patterns," Rev. Mod. Phys. 12, 47-61.

Fletcher, H. (1953). Speech and Hearing in Communication (Van Nostrand, New York), 2nd ed.

Greenwood, D. D. (1961). "Auditory masking and the critical band," J. Acoust. Soc. Am. 33, 484-502.

Hawkins, J. E., and Stevens, S. S. (1950). "The masking of pure tones and of speech by white noise," J. Acoust. Soc. Am. 22, 6-13.

Houtgast, T. (1972). "Psychophysical experiments on grating acuity," paper presented at the Symposium on Hearing Theory, IPO, Eindhoven, The Netherlands.

Houtgast, T. (1974). "Lateral suppression in hearing," doctoral dissertation (University of Amsterdam, Amsterdam, The Netherlands).

Margolis, R. H. (1974). "The measurement of critical masking bands," doctoral dissertation (University of Iowa).

Margolis, R. H., and Small, A. M. (1975). "The measurement of critical masking bands," J. Speech Hear. Res. 18, 571-587.

Patterson, R. D. (1970). "Experimental determination of auditory filter shape for tones masked by bands of noise," J. Acoust. Soc. Am. 47, 107(A).

Patterson, R. D. (1971). "Effect of amplitude on auditory filter shape," J. Acoust. Soc. Am. 49, 81(A).

Patterson, R. D. (1974). "Auditory filter shape," J. Acoust. Soc. Am. 55, 802-809.

Schafer, T. H., Gales, R. S., Shewmaker, C. A., and Thompson, P. O. (1950). "The frequency selectivity of the ear as determined by masking experiments," J. Acoust. Soc. Am. 22, 490-496.

Scharf, B. (1970). "Critical bands," in Foundations of Modern Auditory Theory, edited by J. Tobias (Academic, New York), Vol. 1.

Swets, J. A., Green, D. M., and Tanner, W. P., Jr. (1962). "On the width of critical bands," J. Acoust. Soc. Am. 34, 108-113.

Webster, J. C., Miller, P. H., Thompson, P. O., and Davenport, E. W. (1952). "The masking and pitch shifts of pure tones near abrupt changes in a thermal noise spectrum," J. Acoust. Soc. Am. 24, 147-152.

Wilson, J. P., and Evans, E. F. (1971). "Grating acuity of the ear: psychophysical and neurophysiological measures of frequency resolving power," Proceedings of the 7th International Congress on Acoustics 3, Akademiai Kiado, Budapest, pp. 397-400.

Zwicker, R., Flottorp, G., and Stevens, S. S. (1957). "Critical bandwidth in loudness summation," J. Acoust. Soc. Am. 29, 548-557.
