Auditory filter shapes derived with noise stimuli*
Roy D. Patterson
M. R. C. Applied Psychology Unit, Cambridge, England, CB2 2EF
(Received 11 August 1975; revised 3 November 1975)
A wide-band noise having a deep notch with sharp edges was used to mask a tone. The notch was centered on the tone, and threshold was measured as the width of the notch was increased from 0.0 to 0.8 times the tone frequency (0.5, 1.0, or 2.0 kHz). The spectrum level of the noise was 40 dB SPL. If it is assumed that the auditory filter is reasonably symmetric at these intensities, then the shape of the filter centered on the tone can be estimated from the first derivative of the curve relating tone threshold to the width of the notch in the noise. The 3-dB bandwidths of the filters obtained were about 0.13 of their center frequency.
In the region of the passband, the Gaussian curve provides a good approximation to the shape of the derived filters. The equivalent rectangular bandwidths of the Gaussian approximations are about 0.20 of their center frequency, which is comparable to the critical-band estimates of E. Zwicker, G. Flottorp, and S. S. Stevens ["Critical bandwidth in loudness summation," J. Acoust. Soc. Am. 29, 548-557 (1957)]. The Gaussian approximation cannot be used outside the passband, because the tails of the derived filters do not fall as fast as the Gaussian curve.
Subject Classification: [43]65.58, [43]65.35, [43]80.50.
INTRODUCTION
Several investigators, using quite distinct techniques, have attempted to measure the attenuation characteristic or shape of the auditory filter. De Boer (1967) estimated the attenuation characteristic of the cat's auditory filter by correlating a broad-band noise stimulus with the auditory nerve impulses recorded in response to the noise. Houtgast (1974) estimated the shape of the human auditory filter from the data obtained in a tone-masking experiment in which the masker was a broad-band noise with a rippled spectrum. Patterson (1974) also derived the filter shape using a tone-in-noise experiment and human observers, but the masker was a low-passed or high-passed noise with a sharp cutoff. An experiment similar to Patterson's was performed by Margolis and Small (1975) and yielded comparable results.
The filter shapes obtained by de Boer and Houtgast both suggested that the auditory filter had a broad, fairly flat top, flanked by rather sharp skirts; the bandwidth of their filters ranged from 15% to 25% of the center frequency. The experiments of Patterson and Margolis and Small also led to the conclusion that the auditory filter had sharp skirts; however, their filters had no appreciable flat top. The passbands of the filters were in the range of 4% to 7% of the filter's center frequency. In this paper, the different techniques and their respective filter shapes are reviewed; it is argued that one of the assumptions made by Patterson and Margolis and Small in their filter derivation is probably incorrect, and this hypothesis is subjected to experimental test.
J. Acoust. Soc. Am., Vol. 59, No. 3, March 1976

A. Neurophysiological filter shapes

Perhaps the most ingenious method for measuring filter shape is that of de Boer (de Boer, 1967, 1968, 1969; de Boer and Jongkees, 1968; de Boer and Kuyper, 1968). De Boer (1967) presented a white noise to a cat and, with the aid of a micropipette, recorded the impulses emanating from a single cochlear nerve fiber. The train of neural spikes that followed the noise stimulation was cross correlated with a recording of the noise which had been appropriately delayed to allow the stimulus sufficient time to travel down the basilar membrane to the point of transduction. Theoretically, if the input to a filter is a white noise and the filter's output is cross correlated with its input, the result is the impulse response of the filter. De Boer generated a detailed version of the impulse response for a particular fiber by repeatedly stimulating the fiber and correlating the noise input with the neural output. Finally, the impulse response was played repetitively and scanned with a wave analyzer to determine the attenuation characteristic of the auditory filter associated with that particular cochlear fiber.

The attenuation characteristics associated with two cochlear fibers whose best frequencies were about 1.0 and 3.0 kHz are shown in Fig. 1 by the solid curve and the upper dashed curve, respectively. The data are replotted from Figs. 8B and 8C of de Boer (1969). The bandwidths of these filters, 3 dB down from the peak, are 160 and 750 Hz. Correspondingly, the psychophysical experiments reviewed in Zwicker et al. (1957) indicate that the critical bandwidth in man is about 160 and 500 Hz at 1.0 and 3.0 kHz, respectively. Since the cochlea in the cat is only about two-thirds as long as in man, this correspondence in bandwidths should not be overemphasized. On the other hand, as Evans and Wilson (1973) have pointed out, it would appear on the basis of this and similar data that the frequency selectivity available at the output of the cochlea is sufficient to account for the basic psychophysical data summarized in Zwicker's critical-band estimates.
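The cross-correlation principle behind de Boer's method can be illustrated with a short simulation. The sketch below is not de Boer's procedure: it assumes a hypothetical 16-tap impulse response (`h_true`), unit-variance Gaussian white noise, and a noiseless linear filter standing in for the nerve fiber. All function names are illustrative.

```python
import math
import random

def fir_filter(x, h):
    """Convolve signal x with a causal FIR impulse response h."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

def cross_correlate(x, y, n_lags):
    """Estimate R_xy[k] = mean of x[n] * y[n + k] for k = 0 .. n_lags - 1."""
    n = len(x) - n_lags
    return [sum(x[i] * y[i + k] for i in range(n)) / n for k in range(n_lags)]

def estimate_impulse_response(h, n_samples=20000, seed=0):
    """Drive the filter with white noise and cross correlate input with output."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]  # unit-variance white noise
    y = fir_filter(x, h)
    return cross_correlate(x, y, len(h))

# Hypothetical impulse response: a decaying cosine (an assumption, not fiber data).
h_true = [math.exp(-k / 4.0) * math.cos(2.0 * math.pi * k / 8.0) for k in range(16)]
h_est = estimate_impulse_response(h_true)
max_error = max(abs(a - b) for a, b in zip(h_true, h_est))
```

With unit-variance noise the cross-correlation at lag k approximates the k-th sample of the impulse response, and the residual error shrinks roughly as the inverse square root of the number of samples, which is presumably why repeated stimulation was needed to obtain a detailed estimate.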
The frequency tuning curve provides a more traditional measure of the frequency selectivity of the cochlea. The tuning curve is obtained by sweeping a tone through the frequency region that stimulates a particular fiber and recording the amplitude of the tone required to produce a constant rate of firing in the fiber. De Boer was able to hold the units that produced the attenuation characteristics shown in Fig. 1 long enough to determine their frequency tuning curves. These data are presented in Fig. 1 as filled and open diamonds for the fibers with best frequencies of 1.0 and 3.0 kHz, respectively. For comparison purposes, the tuning curves have been normalized and inverted; that is, the diamonds in Fig. 1 show the sensitivity of the fiber at various frequencies relative to the sensitivity at the best frequency. Despite the difference in stimuli, tones for the tuning curves and noise for the impulse responses, the methods yield reasonably comparable measures of the frequency selectivity that is available at this early stage of the neural auditory system.

FIG. 1. Auditory filter shapes in the cat and man. The solid curve and the short-dash curve are the filter shapes associated with 1.0- and 3.0-kHz units in the cat as derived by de Boer (1969). The filled and open diamonds are the inverted tuning curves for the same fibers. The open circles and the long-dash curve show the filter shapes at 1.0 kHz reported by Houtgast (1974) and Patterson (1974), respectively, for human observers.
[Figure 1: abscissa RELATIVE FREQUENCY; ordinate tick labels 10, 20, 30.]

Copyright © 1976 by the Acoustical Society of America
Downloaded 30 Jun 2013 to 132.206.27.25. Redistribution subject to ASA license or copyright; see http://asadl.org/terms
Wilson and Evans (1971) and Evans and Wilson (1973) have taken this comparison one step further, and shown that the tuning curve data obtained from a single unit of a cat can be used to predict the response of that fiber to "rippled" noise. Rippled noise, whose long-term spectrum varies sinusoidally on a linear frequency scale, is produced by adding a white noise to a copy of itself that has been delayed by T sec. When the delayed version of the noise is added to the original in phase, the rippled noise will have peaks at 0 and every multiple of 1/T Hz; when the polarity of the delayed noise is reversed, the ripple will show a minimum at 0 and multiples of 1/T Hz. The spectrum of the rippled noise, N(f), is

$N(f) = N_0\,(1 \pm M \cos 2\pi f T)$,    (1)

where $N_0$ is the spectrum level of the original noise and $M$ is the modulation depth determined by the attenuation of the delayed version of the noise.
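The cosine form of the rippled spectrum follows directly from the delay-and-add operation, as the sketch below checks numerically. It assumes the delayed copy is scaled by a hypothetical factor `a` (negative for reversed polarity). Under that assumption elementary Fourier analysis gives a modulation depth M = 2a/(1 + a^2) and an overall level factor 1 + a^2; these two relations are derived here for illustration and are not quoted from the paper.

```python
import cmath
import math

def delay_and_add_gain(f, delay, a):
    """Power gain at frequency f (Hz) of y(t) = x(t) + a * x(t - delay)."""
    return abs(1 + a * cmath.exp(-2j * math.pi * f * delay)) ** 2

def rippled_spectrum(f, delay, n0, m, sign=+1):
    """Cosine ripple: N(f) = N0 * (1 +/- M cos(2 pi f T))."""
    return n0 * (1 + sign * m * math.cos(2 * math.pi * f * delay))

a = 0.5                    # hypothetical relative amplitude of the delayed copy
delay = 0.002              # T = 2 ms, so the peaks are spaced 1/T = 500 Hz apart
m = 2 * a / (1 + a * a)    # modulation depth implied by a
n0 = 1 + a * a             # level factor for a unit-level input noise

freqs = [10.0 * k for k in range(201)]  # 0 .. 2000 Hz
max_diff = max(
    abs(delay_and_add_gain(f, delay, a) - rippled_spectrum(f, delay, n0, m))
    for f in freqs
)
```

Passing `-a` to `delay_and_add_gain` models the polarity reversal: the gain at 0 Hz then becomes a minimum instead of a maximum, which matches the description of the inverted ripple.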
Throughout their experiment, Wilson and Evans (1971) chose values of T such that either a maximum or a minimum occurred in the rippled noise at the best frequency of the fiber they were holding. For each value of T, they measured the change in the firing rate of the fiber when the polarity of the delayed noise was reversed. When T is small, the maxima and minima of the rippled noise are widely spaced, and changing the polarity of the delayed noise produces its greatest effect; as T is increased, the maxima and minima occur more frequently, and the difference in firing rate following a polarity reversal decreases.
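The dependence of the polarity effect on T can be reproduced with a simple filter model. The sketch below is an illustration only: it assumes a hypothetical Gaussian power weighting centered on 1.0 kHz with a standard deviation of 80 Hz and a modulation depth of 0.9, none of which come from the Wilson and Evans data, and it treats the change in filter output power as a stand-in for the change in firing rate.

```python
import math

def gaussian_filter_weight(f, f0, sigma):
    """Hypothetical Gaussian power weighting centered on f0 (Hz)."""
    return math.exp(-0.5 * ((f - f0) / sigma) ** 2)

def filter_output(f0, sigma, delay, m_depth, sign, df=1.0):
    """Riemann-sum estimate of the rippled-noise power passed by the filter."""
    total = 0.0
    f = 0.0
    while f <= 2.0 * f0:
        ripple = 1.0 + sign * m_depth * math.cos(2.0 * math.pi * f * delay)
        total += ripple * gaussian_filter_weight(f, f0, sigma) * df
        f += df
    return total

def polarity_effect_db(f0, sigma, cycles, m_depth=0.9):
    """Output change (dB) when the delayed noise is inverted, with T = cycles / f0."""
    delay = cycles / f0  # places a ripple maximum (or minimum) exactly at f0
    peak = filter_output(f0, sigma, delay, m_depth, +1)
    dip = filter_output(f0, sigma, delay, m_depth, -1)
    return 10.0 * math.log10(peak / dip)

# Effect of a polarity reversal for increasingly long delays (1, 2, 4, 8 ms).
effects = [polarity_effect_db(1000.0, 80.0, k) for k in (1, 2, 4, 8)]
```

In this model the effect falls monotonically toward 0 dB as the ripple spacing 1/T becomes small relative to the filter width, which is the trend described above.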
This effect can be explained if the firing rate of the fiber is likened to the output of a filter whose attenuation characteristic has the form of a normalized, inverted tuning curve. The output of this hypothetical auditory filter is largely determined by the stimulus energy near the center frequency of the filter. In those conditions where T is small, the maxima and minima are broad relative to the passband of the filter. When a maximum is positioned at the center frequency of the filter, the output is large; but when the polarity of the delayed noise is reversed and a minimum occurs in the passband, the output of the filter is markedly reduced. In those conditions where T is large, the peaks of the ripple are more densely packed and the passband of the filter encompasses a number of maxima and minima. In this case, when the polarity of the delayed noise is reversed, the output of the filter does not vary appreciably because, although the peaks do shift, the number of peaks in the passband remains roughly constant. But Evans and Wilson (1973) went beyond this qualitative analysis, calculating the filter outputs that would be expected in the various conditions of their experiment when the normalized, inverted tuning curve was used as an estimate of the shape of the auditory filter. And their calculations confirmed that the tuning curve of a particular fiber provides a good basis for predicting that fiber's response to rippled noise.

B. Psychophysical filter shapes
Houtgast (1972, 1974) used a rippled noise stimulus in a psychophysical experiment with human observers. His observers were required to detect a pulsed sinusoid in the presence of a continuous rippled noise background. Masked threshold was determined as the duration and polarity of the delayed noise were varied using a technique similar to Wilson and Evans (1971). Whereas Evans and Wilson (1973) compared the frequency selectivity of tuning curves with the selectivity indicated by their rippled noise data, Houtgast was able to derive the shape of the auditory filter from his data directly. Houtgast also included two additional conditions in which the delay was adjusted so that the tone was midway between a maximum and a minimum; in one case the nearest maximum was at a lower frequency than the nearest minimum, and in the other case the maximum was above and the minimum below the tone. These stimulus conditions, where the noise spectrum has a distinct slope in the immediate vicinity of the tone, can be contrasted to reveal whether the auditory filter is symmetric.

The model of noise masking on which Houtgast's derivation of the filter shape is based has been described many times (Fletcher, 1940; Schafer et al., 1950; Swets, Green, and Tanner, 1962). It is assumed that, in order to improve the detectability of the tone, an auditory filter is centered at the tone frequency. The tone and any components of the noise in its immediate vicinity pass through the filter largely unattenuated, whereas the remaining noise components are progressively attenuated as their distance from the tone increases. Thus the power of the tone at threshold is simply a weighted proportion of the noise power entering the ear, and the weighting function is the auditory filter shape. If the power of the tone at threshold and the spectrum of the masking noise are $P_s$ and $N(f)$, respectively, and if the shape of the auditory filter is $|H(f)|^2$, then the relationship can be expressed as follows:

$P_s = K \int_{-\infty}^{\infty} N(f)\,|H(f)|^2\,df$,    (2)

where $K$ is a proportionality constant. Houtgast held the level of the tone constant at 45 dB and varied the mean spectrum level of the noise $N_0$ to determine the level at which the tone was just audible. Thus his threshold data describe how the level of the noise must be varied as a function of the polarity and duration of the delay to maintain constant detectability, that is, a constant noise level at the output of the auditory filter.

Houtgast derived the shape of the filter from the threshold data by means of the model summarized in Eq. (2) and an intriguing application of Fourier analysis. $P_s$ and $K$ are constants in this situation and thus do not affect the shape of the filter, and the function $N(f)$ is known from Eq. (1). What is required, then, is a general description of the filter which permits an analytic solution to the integral in Eq. (2) and which, at the same time, is written in terms of useful unknowns. Houtgast employed a trigonometric series of the form

$\left|H\!\left(\frac{f-f_0}{f_0}\right)\right|^2 = 1 + \sum_{n=1}^{\infty}\left[a_n \cos 2\pi n\!\left(\frac{f-f_0}{f_0}\right) + b_n \sin 2\pi n\!\left(\frac{f-f_0}{f_0}\right)\right]$    (3)

to represent the filter; $a_n$ and $b_n$ are the Fourier coefficients, $n$ is an integer, and $f_0$ is the tone frequency. By and large it will prove more convenient to measure frequency from the tone frequency, and to measure it relative to the tone frequency. This relative frequency variable will be designated $g$, and so $g = (f - f_0)/f_0$ and $df = f_0\,dg$. Now, if Eqs. (1) and (3) are substituted for $N(f)$ and $|H(f)|^2$ in Eq. (2), the result is

$P_s = K N_0 f_0 \int_{-\infty}^{\infty} \left[1 \pm M\cos(2\pi m g)\right]\left\{1 + \sum_{n=1}^{\infty}\left[a_n\cos(2\pi n g) + b_n\sin(2\pi n g)\right]\right\} dg$,    (4)

where $m$ is an integer. The integral was evaluated over an integer number of cycles, from $f_0/2$ to $3f_0/2$, and, as a result, the only trigonometric term that contributes to the integral is $M\cos(2\pi m g)\,a_m\cos(2\pi m g)$, produced when the index of summation $n$ is equal to $m$. Since $M a_m \cos^2(2\pi m g) = \tfrac{1}{2} M a_m\,[1 + \cos(4\pi m g)]$, Eq. (4) reduces to

$P_s = K N_0 f_0\,\left(1 \pm \tfrac{1}{2} M a_m\right)$.    (5)

Thus, by representing the filter with a Fourier series and using a masker with a cosinusoidal spectrum, Houtgast was able to extract the cosine Fourier coefficients $a_n$ from the corresponding threshold noise levels. Similarly the sine coefficients $b_n$ can be obtained from the experimental conditions wherein the masker spectrum is sinusoidal, such that the tone is midway between a maximum and a minimum of the masker.

The coefficients derived for a 1.0 kHz tone are presented in Fig. 1 as open circles. The data are from Houtgast (1974, Fig. 9.5, direct masking). In Houtgast (1972), a curve was fitted to a subset of these data. The curve, which has been omitted for clarity, had basically the same shape as the solid line from de Boer in Fig. 1; it intersected a horizontal line 10 dB down from the peak at about 835 and 1180 Hz on the low and high sides of the filter, respectively. Thus Houtgast's findings are in rather good agreement with de Boer's, which is perhaps surprising since de Boer's data were gathered from cochlear fibers in the cat and Houtgast's are from a psychophysical experiment on humans.

Houtgast did not measure the full shape of the auditory filter at other center frequencies. He did, however, use rippled noise, together with the simplifying assumption that the auditory filter could be approximated by a Gaussian curve, to estimate the bandwidth of the filter for five observers at frequencies ranging from 0.25 to 4.0 kHz. The equivalent rectangular bandwidth at 1.0 kHz was found to be about 170 Hz. Although the bandwidth estimates showed considerable spread as a function of observer, the average bandwidth across observers was in extremely good agreement with the critical band curve presented originally by Zwicker, Flottorp, and Stevens (1957) and later reinforced by Scharf (1970).

The experiments on auditory filter shape reviewed to this point give the impression that the frequency selectivity of the auditory system may be largely, if not exclusively, determined in the cochlea, and that succeeding stages of neural processing do little to sharpen auditory filter shape. There are, however, several papers on auditory filter shape which suggest that a much sharper filter is available in some psychophysical experiments.
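Houtgast's use of ripple polarity to isolate one Fourier coefficient at a time can be sketched numerically. The example below is an illustration under stated assumptions, not Houtgast's analysis: the filter is a hypothetical cosine series with three made-up coefficients, the integral is taken over one unit of the relative frequency g, and the inversion formula in `recover_coefficient` is derived here from the fact that the filter output for the two polarities is proportional to 1 + M a_m / 2 and 1 - M a_m / 2.

```python
import math

def filter_power(g, cos_coeffs):
    """|H(g)|^2 = 1 + sum_n a_n cos(2 pi n g); cos_coeffs[n - 1] holds a_n."""
    return 1.0 + sum(a * math.cos(2.0 * math.pi * (n + 1) * g)
                     for n, a in enumerate(cos_coeffs))

def filter_output(cos_coeffs, m, depth, sign, steps=2000):
    """Integrate [1 +/- M cos(2 pi m g)] |H(g)|^2 over -0.5 <= g <= 0.5 (midpoint rule)."""
    dg = 1.0 / steps
    total = 0.0
    for i in range(steps):
        g = -0.5 + (i + 0.5) * dg
        ripple = 1.0 + sign * depth * math.cos(2.0 * math.pi * m * g)
        total += ripple * filter_power(g, cos_coeffs) * dg
    return total

def recover_coefficient(cos_coeffs, m, depth):
    """Recover a_m from the two threshold noise levels (ripple peak or dip at the tone)."""
    # With the tone level fixed, the noise level at threshold is inversely
    # proportional to the filter output per unit noise level.
    level_peak = 1.0 / filter_output(cos_coeffs, m, depth, +1)
    level_dip = 1.0 / filter_output(cos_coeffs, m, depth, -1)
    return (2.0 / depth) * (level_dip - level_peak) / (level_dip + level_peak)

true_coeffs = [0.9, 0.5, 0.2]  # hypothetical a_1, a_2, a_3
recovered = [recover_coefficient(true_coeffs, m, depth=0.8) for m in (1, 2, 3)]
```

Each ripple density m picks out only the matching coefficient because the other terms of the series integrate to zero over a whole number of cycles.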
Patterson (1970, 1971, 1974) measured threshold for a tone of constant frequency as the cutoff of a low-passed noise was increased. As expected, tone threshold rose from absolute threshold when the noise edge was far below the tone to wide-band threshold as the noise edge passed the tone. The experimental paradigm is shown schematically in Fig. 2(a). The shaded area represents the noise that is effective in masking the tone. Care was taken to insure that the power spectrum of the low-passed noise fell abruptly beyond the cutoff frequency so that the noise edge could be approximated by a step function as shown in the diagram. Patterson argued, similarly to Evans and Wilson (1973), that threshold rises because the noise at the output of the filter increases as the noise edge approaches the tone. He pointed out that if the auditory filter is centered on the tone, then each threshold provides an estimate of the area under the filter up to the position of the noise edge. Therefore, he concluded, the shape of the filter can be obtained by taking the derivative of the tone threshold curve.

FIG. 2. Schematic representation of auditory filtering when the masker is a low-passed noise, (a) and (b), and a notched noise, (c). The simplest model, wherein the filter is assumed to be centered on the tone, is portrayed in (a). The area where the noise and the filter overlap represents the noise that is effective in masking the tone. When the masker is a low-passed noise and the filter is shifted, as in (b), detection improves in those cases where the shift produces more noise reduction than signal reduction. When the masker is a notched noise, as in (c), shifting the filter produces little, if any, improvement in detection, because the noise reduction on one side of the tone is accompanied by a noise increase on the other side.
[Figure 2: panel labels NOISE, AUDITORY FILTER, f0; abscissa FREQUENCY.]

In mathematical terms, Patterson, like Houtgast, isolated $|H(f)|^2$ in Eq. (2) by choosing a noise masker that simplified the integration. In this case Eq. (2) becomes

$P_s = K N_0 f_0 \int_{-\infty}^{\Delta f/f_0} |H(g)|^2\,dg$,    (6)

where $\Delta f$ is the distance between the tone and the noise edge. The derivative of Eq. (6) with respect to $\Delta f/f_0$ is

$\frac{dP_s}{d(\Delta f/f_0)} = K N_0 f_0 \left|H\!\left(\frac{\Delta f}{f_0}\right)\right|^2$.    (7)

Therefore, the filter shape is given by the derivative of the threshold curve divided by the constant $K N_0 f_0$.

The experiment was run at five tone frequencies (0.5, 1.0, 2.0, 4.0, and 8.0 kHz) and each of these conditions was replicated using a high-passed noise in place of the low-passed noise, a procedure which leads to a more sensitive measure of the upper skirt of the filter. The curve with the broad dashes in Fig. 1 shows the derived filter shape in the case where the tone was 1.0 kHz. The passband of this filter is clearly much narrower than the passbands of the filters presented earlier. The 3-dB bandwidth of the filter is 59 Hz, whereas the bandwidth estimated by Houtgast for the same center frequency using the Gaussian filter shape approximation was 170 Hz. And, in general, Patterson's estimates were about one-third those reported by Zwicker, Flottorp, and Stevens (1957); that is, they were much more in line with those obtained by Fletcher (1940, 1953) and Hawkins and Stevens (1950), both of whom used the critical ratio method.

Patterson found that the estimated shapes were well approximated by the expression

$\left|H\!\left(\frac{\Delta f}{f_0}\right)\right|^2 = \frac{1}{\left[1 + (\Delta f/\alpha f_0)^2\right]^2}$,    (8)

where $1.29\alpha$ is the 3-dB bandwidth of the filter. This function is symmetric on a linear frequency scale. The data showed a slight asymmetry, with the upper skirt being less sharp; however, the symmetric approximation was considered satisfactory. Finally, he showed that this filter was quite successful in predicting some typical noise masking data of Webster et al. (1952), Egan and Hake (1954), and Greenwood (1961).

Margolis and Small (1975) reported a similar, although independent, determination of the filter shape. They also used a low-passed noise with an abrupt cutoff to mask a tone of constant frequency, and measured the detectability of the tone as a function of the noise cutoff.¹ However, their noise was generated digitally by summing the appropriate values of sinusoids spaced 1 Hz apart from zero to the cutoff frequency. Thus, the edge of the noise was extremely sharp and provided a better approximation to a step function than did Patterson's "multiplied" noise (after Greenwood, 1961). Margolis and Small (1975) measured the shape of the filter for tones at 0.5, 1.0, and 4.0 kHz and obtained filter shapes, like Patterson, by differentiating the curve relating detectability to the noise cutoff. The filters were quite similar to those reported by Patterson, although they were shifted up in frequency by about 15 Hz and thus were slightly less symmetric. The respective bandwidths reported by Margolis and Small (1975) and Patterson (1974) were 30 and 29 Hz at 0.5 kHz, 57 and 59 Hz at 1.0 kHz, and 240 and 200 Hz at 4.0 kHz.

Taken at face value, the experiments just described would suggest that the neural stages of the auditory system subsequent to the cochlea are capable of sharpening the cochlear auditory filter in at least some circumstances. However, one of the assumptions made both by Patterson and Margolis and Small is probably incorrect. Specifically, in the derivation of the filter shape it was assumed that the filter is centered on the tone as shown in Fig. 2(a). If the auditory filter has reasonably steep skirts and a fairly flat section about the center frequency, then the signal-to-noise ratio at the output of the filter is not at its maximum when the filter is centered on the tone. Rather the signal-to-noise ratio can be improved by centering the filter somewhat to the side of the tone, as shown in Fig. 2(b). For in that case, the skirt of the filter markedly reduces the amount of noise leaking through the filter without producing a comparable reduction of the tone, because the attenuation characteristic is flatter in the region of the tone. Therefore, if it is assumed that the filter is not centered on the tone but rather is positioned where it produces the maximum signal-to-noise ratio, then filters with wider passbands than those reported by Patterson or Margolis and Small could account for the thresholds they obtained.

C. Filter shape from notched noise

To test the assumption that the filter was centered on the tone, Patterson's experiment has been replicated using a notched noise, that is, a broad-band noise with a gap in the region of the tone. The notched noise has the advantage of minimizing the improvement in signal-to-noise ratio that can be achieved by shifting the filter. This experiment is the topic of the remainder of this paper. In short, however, the auditory filter shapes derived with the aid of the notched noise have much wider passbands than those reported in Patterson (1974) and, in fact, the bandwidths are close to those of Zwicker, Flottorp, and Stevens (1957) and Houtgast (1974). This result leads once again to the conclusion that the frequency selectivity evident at the output of the cat's cochlea is comparable to that displayed by the auditory filter measured in noise-masking experiments with human observers.

The experiment consisted of centering the notch in the masker on the signal tone and measuring tone threshold as a function of the width of the notch. The paradigm is presented schematically in Fig. 2(c). It was assumed that at these fairly low intensities the auditory filter is reasonably symmetric. Support for this assumption comes from several of the experiments reviewed earlier. De Boer's filter corresponding to the 1.0-kHz fiber (the solid line in Fig. 1) shows 25- and 27-dB attenuation at points 300 Hz above and below the center frequency. The filter for the 3.0-kHz fiber is not as symmetric; however, de Boer's technique is limited to frequencies below 4.0 kHz and, consequently, this filter shape is probably not as reliable. It is also worth noting that the extreme asymmetry in tuning curves that span an amplitude range of 60 dB or more is not as apparent in that portion of the curve within 30 or 40 dB of the tip, that is, at low intensities. This lack of asymmetry was revealed by de Boer, who plotted the tip of his tuning curves on a linear rather than logarithmic frequency scale as in Fig. 1. The experiment of Wilson and Evans (1971) did not include conditions where asymmetry could be assessed. Houtgast (1974) specifically included a test for asymmetry in his design. No pronounced asymmetry was encountered, but the amplitude range of his experiment is limited to about 10 dB and therefore does not provide a strong test for asymmetry. Both Patterson (1974) and Margolis and Small (1975) found a small asymmetry, but in the opposite direction to that of the tuning curve.

If the filter is reasonably symmetric, it can be assumed to be centered at a point near the tone when the masker is a notched noise, because this will be the region where the signal-to-noise ratio at the output of the filter is greatest. The two cases, where the filter is shifted and not shifted, are shown in Fig. 2(c). When the filter is centered on the tone, approximately equal amounts of noise pass under each skirt; when the filter is shifted up in frequency, less noise passes under the lower skirt, but the increase in noise entering via the upper skirt more than compensates for the reduction on the low side. The nature of the tradeoff associated with shifting the filter is perhaps most clearly revealed if one imagines that the area under the filter is divided into thin vertical rectangles, as when an integral is approximated by a summation. Consider the two rectangles adjacent to each noise edge; in both cases the rectangle closer to the tone is larger. If the filter is shifted up in frequency by the width of one rectangle, the noise passing through the filter below the center frequency is reduced by an amount equal to one small rectangle, but the noise coming through above the center frequency is increased by one large rectangle. And thus the total noise power at the output of the filter is increased. At the same time, the power of the tone is slightly decreased. And so, for a symmetric filter, the maximum signal-to-noise ratio is obtained when the filter is centered on the tone.

The edges of the noise were made particularly sharp, as in Patterson (1974), so that they could be approximated by step functions. This, in turn, makes it possible to extract the filter shape from the general masking equation because, on substituting the step functions for $N(f)$, Eq. (2) becomes

$P_s = K N_0 f_0 \int_{-\infty}^{-\Delta f/f_0} |H(g)|^2\,dg + K N_0 f_0 \int_{\Delta f/f_0}^{\infty} |H(g)|^2\,dg$.    (9)

And if the filter is symmetric, then Eq. (9) reduces to

$P_s = 2 K N_0 f_0 \int_{\Delta f/f_0}^{\infty} |H(g)|^2\,dg$.

The derivative of this equation with respect to $\Delta f/f_0$ is

$\frac{dP_s}{d(\Delta f/f_0)} = -2 K N_0 f_0 \left|H\!\left(\frac{\Delta f}{f_0}\right)\right|^2$.    (10)
rivative.
ample, when the tone frequency is 1.0 kHz such that feo is 0. 3 kHz, the masker is composed of two noise bands with flat tops 0. 0 kHz wide and skirts that fall 34 dB in the first 100 Hz outside the passband.
I. METHOD
B. The independent variable
Two broad noise bands with sharp edges were positioned symmetrically about a tone f0, and tone threshold
the relative
And so, as before, an analytic expression for the shape of the auditory filter can be produced by fitting a polynomial to the set of tone thresholds and taking its de-
Ps was measured as a function of the distance from the tone to the edge of the noise bands &f. The experiment was replicated at three tone frequencies, 0. 5, 1.0, and
2.0 kHz. The independentvariable &f/fo rangedfrom
The independentvariable in the experiment, zXf/fo, is distance
between
the tone and the noise
edge. For the purpose of defining •Xf, the position of the noise edge is taken to be the half-power point on the shoulder of the noise because this point provides an excellent approximation to the equivalent rectangular
0. 0 to 0. 4; that is, notch width 2•f was varied from
cutoff for these noise bands.
0. 0 to 0. 8 times the tone frequency.
produced with the aid of two low-pass Butterworth filters in series, the half-power point does not occur at the nominal cutoff, fmsñfoo, but rather at a point slightly closer to fins. The half-power points, which can be
A. Spectrum of the masking noise It was important in this experiment that the edges of the noise maskers be sharp, inasmuch as the derivation of the filter shape presented in the introduction includes the assumption that the noise edges can be approximated by step functions. A modulation technique suggested by
Greenwood (1961) and described in detail by Patterson (1974) was used to produce the sharp edges. Each band was prepared by low-pass filtering a broad-band noise
with two Khronhite filters (model 3342) in series, and subsequently multiplying the resultant low-passed noise with a sinusold. The slope of the edge of the multiplied
noise increases with the decibel/octave rating of the
low-passfilters andthe ratio fms/feo,wherefinsis the frequency of the multiplying sinusold and foo is the cutoff frequency of the low-pass filter. The Khronhite model 3342 is an eight-stage Butterworth filter, and so the weighting applied by each filter was
[1+(f/fe o)lø ]ß
-
(11)
Since there were two of these filters in series, the spectrum of the low-passed noise at the output of the second filter was

$$N_0 / [1 + (f/f_{co})^{16}]^{2},$$

where N0 is the spectrum level of the broad-band noise. The low-passed noise was then multiplied by a sinusoid to produce the required noise band. The multiplication process shifts the position of the low-passed noise so that those components of the noise between 0 and f_co Hz are shifted to between f_ms and f_ms + f_co, and a mirror image of the low-passed noise appears in the region f_ms to f_ms - f_co. The spectrum of the resulting multiplied noise is given by

$$N_0 / \{1 + [(f - f_{ms})/f_{co}]^{16}\}^{2}.$$

The complete noise masker N(f) is the sum of two such bands, one below and the other above the tone. Separate noise generators were used to insure the independence of the noise bands. For a particular tone frequency, the cutoff f_co was the same for all four of the Krohn-Hite filters used in the production of the two noise bands; for tone frequencies of 0.5, 1.0, and 2.0 kHz, f_co was 0.2, 0.3, and 0.4 kHz, respectively. Thus, for ex-

Since the noise bands are

found by setting Eq. (12) equal to N0/2 and solving for f, occur at

$$f_{3dB} = f_{ms} \pm f_{co}(\sqrt{2} - 1)^{1/16},$$

and thus for the band below the tone

$$\Delta f = f_0 - (f_{ms} + 0.9464 f_{co}), \tag{13}$$

and for the band above the tone

$$\Delta f = (f_{ms} - 0.9464 f_{co}) - f_0. \tag{14}$$

C. Threshold procedure

The tone thresholds were determined with a blocked, two-alternative forced-choice procedure. The masking noise was on continuously throughout the experiment. Each trial was composed of a 0.2-sec warning interval, two 0.6-sec observation intervals, one of which contained the tone, and a 0.9-sec response interval. The intervals were designated by lights; 0.3-sec pauses were interposed between the intervals. The tone was turned on and off over a comparatively long time (0.1 sec) to minimize the spread of signal energy and so prevent off-frequency listening. The trials were presented in blocks of 20 during which the stimulus parameters were not altered. The basic behavioral measure was percent correct per 20 trials.

To obtain an estimate of threshold for a particular combination of tone frequency f0 and noise width Δf/f0, the observer was presented a run composed of 14 blocks of 20 trials. All stimulus parameters were held constant during a run except for signal power, which was varied in the region of threshold between blocks. The 14 values of percent correct produced in a run were plotted as a function of signal power and a psychometric function was fitted to the points by eye. Threshold was taken to be that tone power where the psychometric function intersected the line designating 75% correct identification of the observation interval containing the tone. Data from a second and sometimes a third run were later added and a final psychometric function was fitted to the total set of points. Thus, each threshold is based on 500 to 800 trials.

All of the thresholds associated with a particular tone frequency were gathered before another was introduced;
J. Acoust. Soc. Am., Vol. 59, No. 3, March 1976 Downloaded 30 Jun 2013 to 132.206.27.25. Redistribution subject to ASA license or copyright; see http://asadl.org/terms
the thresholds were not gathered in a systematic order.

D. Observers

A total of 10 observers participated in the experiment; all had audiometrically normal hearing in the range below 4.0 kHz. One observer, MS, completed all of the experimental conditions. The remaining nine observers were assigned three to each tone frequency, and they completed all of the thresholds at that frequency. One of the observers assigned to the 2.0-kHz frequency, BS, was available for a slightly longer period, during which he also completed the thresholds at 0.5 kHz. Thus, there are results for five observers at 0.5 kHz and four observers at 1.0 and 2.0 kHz.

II. RESULTS AND AUDITORY FILTER SHAPES

The average thresholds are shown in Fig. 3(a) for the three tone frequencies, 0.5, 1.0, and 2.0 kHz. For clarity, the data at 1.0 and 2.0 kHz have been displaced down 5 and 10 dB, respectively. The data at 0.5 kHz show that although the relationship between tone threshold and notch width is reasonably linear, there is a "backward S" shape to the curve, with an inflection point occurring in the region Δf/f0 = 0.2. The 1.0-kHz data are very similar in form. The inflection point occurs somewhat earlier in the 2.0-kHz data, but the shape is essentially the same.

A. Filter shapes and bandwidths from the average data

A filter shape was extracted from each of the three sets of thresholds in Fig. 3(a) by fitting a polynomial to the appropriate data and taking the derivative of the polynomial. The details of the fitting process are described in Appendix A. The filter shapes produced in this manner are presented in Figs. 3(b) and 3(c); the symbols on the curves do not represent data but rather are provided to identify the center frequency of the filter. In Fig. 3(b) attenuation is plotted on a linear scale, placing the emphasis on that portion of the attenuation characteristic within about 15 dB of the filter maximum. As expected, the plot shows three well-tuned filters with similar shapes. The half-power point of the 2.0-kHz filter, indicated by the horizontal line in Fig. 3(b), occurs at Δf/f0 = 0.0518, and thus the 3-dB bandwidth, BW, of the 2.0-kHz filter is 207 Hz. Similarly, the 3-dB bandwidths of the 0.5- and 1.0-kHz filters are 69.2 and 140 Hz, respectively.
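As a quick check on the arithmetic in this paragraph, the following Python sketch converts a half-power point expressed as a relative notch width Δf/f0 into a two-sided 3-dB bandwidth. It assumes, as the derivation does, that the filter is symmetric about its center frequency, so the bandwidth is twice the one-sided distance to the half-power point. The function name `bandwidth_3db` is a hypothetical helper introduced only for illustration; it is not part of the original procedure.

```python
def bandwidth_3db(half_power_rel: float, f0_hz: float) -> float:
    """Two-sided 3-dB bandwidth (Hz) of a filter assumed symmetric about f0.

    half_power_rel is the relative distance (delta f / f0) at which the
    filter is 3 dB down; the bandwidth spans that distance on both sides.
    """
    return 2.0 * half_power_rel * f0_hz


# Half-power point quoted in the text for the 2.0-kHz filter.
bw_2k = bandwidth_3db(0.0518, 2000.0)
print(f"2.0-kHz filter: BW = {bw_2k:.1f} Hz")  # about 207 Hz

# Relative bandwidths (BW / f0) implied by the three quoted bandwidths.
for f0_hz, bw_hz in [(500.0, 69.2), (1000.0, 140.0), (2000.0, 207.0)]:
    print(f"f0 = {f0_hz:.0f} Hz: BW/f0 = {bw_hz / f0_hz:.3f}")
```

Run as written, the ratios come out at roughly 0.14, 0.14, and 0.10 of the center frequency, which makes the relative sharpness of the 2.0-kHz filter easy to see at a glance.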
FIG. 3. Threshold signal power is plotted as a function of relative notch width Δf/f0 in (a); Δf is the distance from the signal to the edges of the noise, and f0 is the signal frequency. The three curves show the data obtained with signal frequencies of 0.5, 1.0, and 2.0 kHz as represented by the circles, squares, and triangles, respectively. The auditory filter shapes derived from the data in (a) are plotted in (b) using a linear ordinate, and again in (c) using a logarithmic ordinate. The abscissa is the same for all three sections of the figure.

These values are more than double those
reported by Patterson (1974) for the same tone frequencies, which suggests that the auditory filter was not centered on the tone in that experiment. Although the present bandwidths are wider, the rate of growth of bandwidth with center frequency is remarkably similar. In both cases, the growth is found to be close to linear when log BW is plotted as a function of log f0, and in the case of the notched noise experiment the line is

$$10 \log BW = 7.91 \log f_0 - 2.71. \tag{15}$$

Patterson (1974) reported slope and intercept values of 8.34 and -7.37, respectively, for a similar fit. Thus, the two lines have essentially the same slope and differ primarily in overall level, the most recent bandwidths being about 2.2 times the older ones.

The 3-dB bandwidth underestimates the equivalent rectangular bandwidth of these filters by about 14%. The equivalent rectangular bandwidths are 78.5, 160, and 238 Hz for the 0.5-, 1.0-, and 2.0-kHz filters, respectively. The corresponding estimates from Zwicker, Flottorp, and Stevens (1957) are approximately 110, 160, and 300 Hz. As before, lines fitted to these two sets of values are nearly parallel with slopes of 8.01 for the former and 7.24 for the latter. Again the basic difference is in the overall level; the intercepts are -2.43 and +0.679, respectively. But in this case the level difference is noticeably smaller; in the region of interest, 0.5 to 2.0 kHz, the present values are about 0.83 of those reported by Zwicker et al. (1957).

The similarity in the rate of growth of bandwidth is not particularly surprising. Both Zwicker et al. (1957) and more recently Scharf (1970) have pointed out that virtually all bandwidth experiments have found the same rate of growth. The difference, traditionally, has been that experiments which employ the "empirical" method provide bandwidth estimates two to three times those found with the "critical ratio" method. The present experimental technique does not fit into either of these categories particularly well, since it was designed to measure the shape of the auditory filter. Nevertheless, it is more similar to a "critical ratio" experiment and thus it is somewhat surprising to find that the bandwidth estimates are much closer to those of Zwicker et al. (1957) and Houtgast (1974) than to those of Patterson (1974) or Margolis and Small (1975). The fact that these latter estimates now appear excessively narrow brings us back to Evans and Wilson's (1973) suggestion: that the frequency selectivity available at the output of the cochlea would appear to be sufficient to accommodate the results obtained in a wide range of psychophysical experiments.

The abscissa in Fig. 3(b) is a normalized frequency scale, and so the fact that the 0.5- and 1.0-kHz filters are virtually identical indicates that the frequency selectivity of the system is essentially proportional to center frequency in this region. By comparison, the 2.0-kHz filter is somewhat sharper, indicating that relative frequency selectivity improves with center frequency beyond 1.0 kHz. The three filter shapes are replotted in Fig. 3(c), this time using a logarithmic ordinate which emphasizes the tails of the attenuation characteristic at the expense of the central section. The frequency scale is the same in all three sections of the figure. Figure 3(c) shows that the 0.5- and 1.0-kHz characteristics do not diverge in the tails and that the faster rate of attenuation displayed by the 2.0-kHz filter in Fig. 3(b) continues in the frequency region beyond Δf/f0 = 0.2. However, the data of MS, the one observer who completed all three center frequency conditions, indicate that the average curves overestimate the size of the selectivity difference.

B. Individual filter shapes

The data and filters of observer MS are presented in Fig. 4 using the same format as that of Fig. 3. As in the average data, the 0.5- and 1.0-kHz threshold curves are virtually parallel, and so the corresponding filters in Figs. 4(b) and 4(c) are almost identical. Similarly, MS's 2.0-kHz threshold curve is steeper than his 0.5- and 1.0-kHz curves, and consequently his 2.0-kHz filter displays relatively more selectivity. But the increase in selectivity between 1.0 and 2.0 kHz is not as great as that shown in Fig. 3(c), indicating that part of the effect seen in the average data is probably due to the chance assignment of more sensitive observers to the 2.0-kHz condition.

FIG. 4. In (a), threshold signal power is plotted as a function of relative notch width and signal frequency for observer MS. The corresponding filter shapes are shown in (b) and (c) using linear and logarithmic ordinates, respectively.

The 1.0-kHz average curve in Fig. 3(a) is an average of data from four observers. The individual data points and polynomial fits are presented in Fig. 5(a). When the notch in the noise is narrow, tone threshold shows little variation as a function of observer; however, as the notch widens, consistent and significant differences emerge. In the frequency range 0.3 to 0.4, for example, the best observer, SS, can detect a tone almost an order of magnitude less intense than that audible to the least sensitive observer, RB.

FIG. 5. In (a), threshold signal power is plotted as a function of relative notch width and observer for the case where the signal frequency is 1.0 kHz. The individual filter shapes are shown in (b) and (c) using linear and logarithmic ordinates, respectively.

The filter shapes obtained from the individual observers' data are presented in Figs. 5(b) and 5(c). As would be expected, the most sensitive observer in Fig. 5(a), SS, produced the filter with the most severe skirt, shown by the circles in Fig. 5(c). Similarly, the threshold data of observer FO show greater sensitivity than those of observer MS in the region above 0.25; correspondingly, in Fig. 5(c), the skirt of FO's filter is seen to apply more attenuation than that of MS.

The correspondence would at first appear to break down with observer RB. His threshold data show that he is the least sensitive observer when the notch is wide, which would seem to conflict with the fact that his filter in Fig. 5(c) is slightly more severe than that of MS. However, RB is also less sensitive when the notch
is narrowest; the polynomial fitted to his threshold data intercepts the ordinate slightly above those of the other observers, and it is this relative sensitivity that is reflected in the filter shapes. Continuing this argument, the first column in Table I(a) shows the relative sensitivity of each observer as measured by his data range, that is, his threshold at Δf/f0 = 0.0 minus his threshold at Δf/f0 = 0.4. The observers have been grouped by center frequency and then ordered by data range. The second column of Table I(a) gives the range of each filter over the same frequency region. A comparison of these two columns shows that the data-range order predicts the filter-range order well; there are only two reversals in the filter-range order (BS and MM at 0.5 kHz and MS and RB at 1.0 kHz), and in each case the difference in data range is less than 0.5 dB.

And finally it is perhaps worth pointing out that the central portions of Figs. 5(a) and 5(c) also correspond. For example, RB is relatively more sensitive than FO or MS in the region 0.1 to 0.2, as shown in Fig. 5(a) by the convergence of RB's threshold curve with those of FO and MS. This too is reflected in Fig. 5(c), where it can be seen that RB's filter lies below those of FO and MS in the region 0.1 to 0.2.

TABLE I. Filter shape statistics. In (a), the observers are ordered according to data range; in (b), according to bandwidth. The rows marked with asterisks are the means and standard deviations without observer BG. The fitting process that produced these values was restricted to filters that were flat at the center frequency.

                       (a)                         (b)
f0         Data     Filter    OBS       OBS      BW       K        a0
(kHz)      range    range     (DR)      (BW)     (Hz)              (dB)
           (dB)     (dB)

Average data
0.5        33.58    34.5                         69.2     0.497    55.9
1.0        33.67    34.0                         140.0    0.320    57.1
2.0        36.48    39.9                         207.0    0.404    59.8
2.0*       37.47    41.0                         213.0    0.456    60.5

Individual observers' data
0.5        36.5     37.7      JB        DR       61.2     0.623    56.5
           35.3     35.4      BS        MM       65.8     0.488    55.6
           35.2     36.2      MM        MS       70.5     0.460    55.7
           31.8     32.3      MS        JB       72.8     0.563    56.6
           29.1     30.7      DR        BS       80.8     0.370    55.2
Mean       33.58    34.46                        70.22    0.5008   55.92
Std dev             2.58                         6.63     0.087    0.54

1.0        38.2     39.0      SS        SS       124.0    0.341    56.8
           33.4     33.3      FO        RB       136.0    0.391    57.9
           31.7     31.3      MS        MS       144.0    0.308    57.1
           31.4     32.1      RB        FO       167.0    0.243    56.5
Mean       33.67    33.93                        142.8    0.3208   57.08
Std dev             3.02                         15.7     0.054    0.52

2.0        43.2     51.3      MB        BS       195.0    0.605    61.3
           35.7     39.5      BS        BG       196.0    0.262    57.8
           33.7     36.4      BG        MB       208.0    0.497    60.7
           33.5     34.1      MS        MS       239.0    0.322    59.5
Mean       36.48    40.33                        209.5    0.4225   59.83
Std dev             6.62                         17.8     0.137    1.34
Mean*      37.47    41.63                        214.0    0.4747   60.50
Std dev*            7.18                         18.5     0.117    0.75

C. Individual bandwidths

It might seem only reasonable to expect the consistent and often large differences in the tails of the threshold curves to predict the bandwidth differences among the observers. For example, observer SS, whose threshold curve drops fastest in Fig. 5(a), has the smallest 3-dB bandwidth at the 1.0-kHz center frequency. However, in general, data range does not predict bandwidth at all well. Although the tail of SS's threshold curve is almost 10 dB below that of RB, his 3-dB bandwidth is only marginally narrower than that of RB, as can be seen in Fig. 5(b); their filters are displaced by only 12 Hz as they fall through the gap in the horizontal line. The 3-dB bandwidths of the observers are shown in Table I(b). This time, after grouping, the observers have been ordered in accordance with bandwidth, and a comparison of the two orderings [the last column in Table I(a) and the first column in Table I(b)] reveals the lack of correspondence between bandwidth and data range or filter range. One observer, DR, at 0.5 kHz, has both the smallest data range and the smallest bandwidth.

Thus it would appear that the frequency selectivity of the central portion of the auditory filter, characterized
by the bandwidth, is reasonably independent of the frequency selectivity exhibited by the skirts of the filter as measured by filter range. This implies that masking experiments which employ a wide-band noise masker are not likely to predict the relative success of individual observers in tone-masking experiments where the signal and masker are widely separated. It also indicates that an auditory filter shape such as that presented in Patterson (1974) must have restricted success, because its 3-dB bandwidth and skirt height were determined by the same parameter.

III. GAUSSIAN APPROXIMATION TO THE FILTER SHAPE

The procedure for extracting the auditory filters from the data was designed to produce an accurate rather than a simple expression for the shape of the filter. To this end, the data were fitted with a fifth-order polynomial so that the derived filters would not be artificially constrained by the fitting process. As a result, the expression for the filter shape has five parameters, which seems excessive when compared with the smooth shapes shown in Figs. 3(b) and 3(c). In an effort to find a mathematically more tractable expression, the filter shape was compared with several common functions. No good match for the entire filter shape was discovered. The skirts of the universal resonance curve do not fall fast enough to provide a satisfactory approximation, and the skirts of the filter suggested in Patterson (1974) fall too rapidly. Although the Gaussian curve does not provide a good fit to the filter over its entire range, it does provide a reasonable approximation in the region of the passband. Since the Gaussian curve is a particularly convenient filter shape and the passband is the important part of the filter function in most cases, the Gaussian approximation to the auditory filter shapes will be described here in detail.

The average filters of Fig. 3(c) have been replotted in Fig. 6(a), with the 1.0- and 2.0-kHz filters displaced down 10 and 20 dB, respectively. The solid curves with the center frequency symbols are the filter shapes. They display an inflection point in the region Δf/f0 ≈ 0.2, where they change from curving downwards to curving upwards. The solid curves, without symbols, indicate the Gaussian approximations to the filters over the entire frequency range. Since the Gaussian function is a parabola on these coordinates, it cannot accommodate the change in curvature of the filters, with the result that there are major deviations between the filters and the fits in the central section and at the center frequency. The dashed curves present the Gaussian approximations to the filters up to Δf/f0 = 0.25. The filters fall 20 dB or more over this frequency range. Thus, for most purposes, this is the important part of the attenuation characteristic. For the 0.5- and 1.0-kHz filters, the effect of the change in curvature is minimal in this region and the Gaussian function provides a good approximation; the discrepancy between the filters and the fits exceeds 1 dB only at the high-frequency ends of the curves. The inflection point occurs earlier for the 2.0-kHz filter, and consequently the deviations between the fit and the filter are larger (1.5, 2.0, and 4.0 dB when Δf/f0 is 0.0, 0.15, and 0.25, respectively). But the 2.0-kHz filter is 27 dB down at Δf/f0 = 0.25, whereas the 0.5- and 1.0-kHz filters are 21.4 and 20.8 dB down, respectively. The 2.0-kHz filter is 21.8 dB down at Δf/f0 = 0.20, and over this frequency range the discrepancy between the 2.0-kHz filter and its Gaussian approximation is similar to those shown for the 0.5- and 1.0-kHz filters.

FIG. 6. Gaussian approximations to the auditory filters. The solid lines with symbols are the average filters from Fig. 3(c); the curves for 1.0 and 2.0 kHz are displaced down 10 and 20 dB for clarity. In each case, the solid line without symbols is the Gaussian fit to the entire filter and the dashed curve is the fit up to Δf/f0 = 0.25.

The Gaussian approximation has the form

$$|H(\Delta f/f_0)|^2 = \exp[-\pi(\Delta f/f_0 BW_{ER})^2], \tag{16}$$

where BW_ER is the equivalent rectangular bandwidth. The Gaussian curve is flatter than the filter shapes near the center frequency and, as a result, the bandwidths of the approximations are wider and the proportionality constants smaller than those associated with the derived filters. The equivalent rectangular bandwidths of the dashed curves in Fig. 6(a) are 99.5 and 202 Hz for the 0.5- and 1.0-kHz filter shapes. And the corresponding proportionality constants are 0.433 and 0.279. The dashed curve for the 2.0-kHz filter has a bandwidth of 343 Hz; however, the Gaussian approximation that was restricted to the region below Δf/f0 = 0.20 is more comparable to those at the lower center frequencies and the bandwidth of that fit is 316 Hz, with a proportionality constant of 0.200. The differences between these bandwidth estimates and those provided by Houtgast (1974), who also used Gaussian approximations, are negligible. Both sets of estimates are in good agreement with those of Zwicker, Flottorp, and Stevens (1957).

The Gaussian approximation should be reasonably successful in predicting tone-in-noise thresholds whenever the signal and masker occupy the same frequency region. However, it should be noted that the approximation will consistently underestimate threshold when the signal and masker are separated by more than about one-quarter of an octave, because the Gaussian curve falls well below the derived filter in the region above Δf/f0 = 0.25.

IV. CONCLUSIONS

The auditory filter shapes derived with the aid of the notched noise masker reveal a well-tuned filter whose skirts fall steadily from about 6 dB down at Δf/f0 = 0.1 to around 35 dB down at Δf/f0 = 0.4. The 3-dB bandwidths of the 0.5-, 1.0-, and 2.0-kHz filters are 69.2, 140, and 207 Hz, respectively; a reasonable summary of the relationship between the 3-dB bandwidth and the center frequency of the filters is provided by the line

$$10 \log BW = 7.9 \log f_0 - 2.7. \tag{17}$$

Since these filters are more than twice as wide as those reported by Patterson (1974) and Margolis and Small (1975), it would appear that the auditory filter was not centered on the tone in those experiments as assumed, and that they primarily measured the shape of the skirts of the filter. The wider bandwidths emanating from the notched noise experiment also support Evans and Wilson's (1973) contention that the primary neurons of the VIIIth nerve display sufficient frequency selectivity to predict much of the noise-masking data obtained in psychophysical experiments.

The central portion of the filter shape, where the attenuation is less than about 22 dB, is well approximated by a Gaussian curve whose equivalent rectangular bandwidth, BW_ER, is about 20% of the center frequency. More specifically,

$$|H(\Delta f/f_0)|^2 = \exp[-\pi(\Delta f/f_0 BW_{ER})^2], \tag{18}$$

where BW_ER is given by

$$10 \log BW_{ER} = 8.3 \log f_0 - 2.3 \tag{19}$$

and the proportionality constant K is 0.34. The equivalent rectangular bandwidth values are in good agreement with those presented by Zwicker, Flottorp, and Stevens (1957) and by Houtgast (1974), who also used a Gaussian approximation. Outside the passband, the filter shape flattens out while the Gaussian curve falls ever more rapidly.

In the region of the passband, there is little variation in either the shape or bandwidth of the filters across observers; for the three center frequencies used, the mean-to-sigma ratio of the bandwidth was about 10. As Δf/f0 increases from 0.2 to 0.4, consistent and significant differences among the observers emerge and grow. Thus, the ability of the average filters of Fig. 3 to predict the masking data of a specific observer will decrease as the distance between the tone and the masker increases.

ACKNOWLEDGMENTS

This research was carried out at the Defence and Civil Institute of Environmental Medicine, Downsview, Ontario, with the assistance of W. Garland, B. Crabtree, and B. Rodden. I would also like to thank M. M. Taylor for numerous helpful discussions.

APPENDIX A

I. Derivation of the auditory filter shape and the proportionality constant

This section of Appendix A describes how the polynomial was fitted to a specific set of tone thresholds, and how the filter shape and proportionality constant were subsequently derived in terms of the polynomial coefficients produced by the fitting process.

At the end of the Introduction, it was concluded that the shape of the auditory filter is proportional to the derivative of the tone threshold curve obtained when the masker is a notched noise. The first step, then, was to fit a polynomial, Q_n(Δf/f0), to the data. The fit had the form

$$10 \log P_s = Q_n(\Delta f/f_0), \tag{A1}$$

where

$$Q_n(\Delta f/f_0) = a_0 + a_1(\Delta f/f_0) + a_2(\Delta f/f_0)^2 + \cdots + a_n(\Delta f/f_0)^n. \tag{A2}$$

The polynomial was fitted to a logarithmic rather than a linear measure of threshold because the variability of the threshold estimates was more uniform on the logarithmic scale, as indicated by the fact that the slopes of the psychometric functions did not vary appreciably as the notch in the noise was widened. Taking the antilogarithm of both sides of Eq. (A1) gives

$$P_s = \exp[(\ln 10/10) Q_n(\Delta f/f_0)]. \tag{A3}$$

The derivative of the tone threshold curve is, then,

$$\frac{dP_s}{d(\Delta f/f_0)} = \left(\frac{\ln 10}{10}\right) Q_n'(\Delta f/f_0) \exp\left[\left(\frac{\ln 10}{10}\right) Q_n(\Delta f/f_0)\right], \tag{A4}$$

where Q'_n(Δf/f0) is the derivative of Q_n(Δf/f0).

The precise relationship between the filter shape and the tone threshold curve is detailed in Eq. (10). Thus the desired expression for the filter shape is obtained by substituting Eq. (A4) into Eq. (10) as follows:

$$|H(\Delta f/f_0)|^2 = -(1/2KN_0 f_0)(\ln 10/10) Q_n'(\Delta f/f_0) \exp[(\ln 10/10) Q_n(\Delta f/f_0)]. \tag{A5}$$

Since the slopes of the curves in Fig. 3(a) are always negative, |H(Δf/f0)|² is invariably positive.

The only unknown on the right-hand side of Eq. (A5) is the proportionality constant K. And K can be eliminated from the equation if the attenuation applied by the filter is expressed as a proportion of the attenuation
applied at the center frequency of the filter, that is, if |H(0)|² is set equal to 1. For, in that case, when Eq. (A5) is evaluated at 0 as follows,

$$|H(0)|^2 = -(1/2KN_0 f_0)(\ln 10/10) Q_n'(0) \exp[(\ln 10/10) Q_n(0)], \tag{A6}$$

and the known values of |H(0)|², Q'_n(0), and Q_n(0) (1, a1, and a0, respectively) are substituted into Eq. (A6) to produce

$$1 = -(1/2KN_0 f_0)(\ln 10/10) a_1 \exp[(\ln 10/10) a_0], \tag{A7}$$

then the normalized filter shape is found by dividing Eq. (A5) by Eq. (A7). And thus,

$$|H(\Delta f/f_0)|^2 = (1/a_1) Q_n'(\Delta f/f_0) \exp\{(\ln 10/10)[Q_n(\Delta f/f_0) - a_0]\}. \tag{A8}$$

This method of splitting the attenuation applied by a filter into a constant part K and a variable part |H(Δf/f0)|² also makes it possible to derive an expression for K. In this case, it is only a matter of multiplying both sides of Eq. (A7) by K to produce

$$K = -(1/2N_0 f_0)(\ln 10/10) a_1 \exp[(\ln 10/10) a_0]. \tag{A9}$$

Since a1 is invariably negative, K is always positive.

In fitting the data, the degree of the polynomial was varied from 1 to 5. The first-order polynomial, a line, accounted for better than 90% of the variability in the data in every case; however, the data were consistently above the line in the frequency region 0.05 to 0.15, consistently below the line in the region 0.25 to 0.35 and above the line again at 0.4, indicating that a polynomial of at least third order was warranted. Third- and fourth-order polynomials both provided excellent approximations to the data, but the coefficient of the highest-order term was often positive in these fits. In this case, Eq. (20) becomes very large as Δf/f0 rises above 0.4 and it is not possible to extrapolate even a short distance beyond the data. No general solution to this problem was found; however, when a fifth-degree polynomial was fitted to the average data, the coefficient of the highest-order term was always negative and the curve obtained from Eq. (20) proceeded toward zero in an orderly fashion in the region above the data. Consequently, the fifth-degree fits were used in producing the filter shapes even though the fifth term is not always negative for individual observers. The coefficients for the average data are presented in Table A-I.

TABLE A-I. Polynomial coefficients. These values summarize the fits to the average threshold data shown in Fig. 3(a). The lines in Fig. 3(a) are produced by the coefficients in conjunction with Eq. (A1). The filters in Figs. 3(b) and 3(c) are produced by the same coefficients in conjunction with Eq. (A8).

f0 (kHz)   a0      a1       a2       a3       a4       a5
0.5        55.80   -51.26   -308.2   322.0    1772.0   -2946.0
1.0        56.79   -49.65   -295.3   289.1    1589.0   -2492.0
2.0        59.47   -69.56   -566.2   1859.0   -575.8   -2253.0

II. Smoothing the individual data

Initially, the polynomials were fitted to the raw data. This procedure proved quite satisfactory in the case of the average data, and indeed the threshold curves and filter shapes shown in Fig. 3 were produced this way. In the region above 0.2, however, where the data points are more widely spaced, small undulations appeared in the filters of individual observers when the polynomial was fitted to their raw data. The deviations in the threshold data of a particular observer are relatively small, as can be seen in Fig. 5(a). Nevertheless, they lead to irregularities in the filter because the polynomial fit is sufficiently accurate to follow the threshold variations; and the differentiation process, whereby the filter is obtained from the threshold curve, amplifies any irregularities in the fit. In all probability, the threshold variations simply represent sampling variability and, consequently, with the individual observers, a smooth curve was drawn through the data by eye and the polynomial was fitted to points taken from the smooth curve, instead of fitting the polynomial to the raw data directly. In the majority of applications, the shape of the filter's passband was more important than the shape of the tails, and so more thresholds were measured in the region of the passband. This emphasis was preserved when fitting the polynomial to the smooth curves by choosing one smooth-curve point per data point.

III. Filter shape near the center frequency and bandwidth variability

It would seem reasonable to assume that the auditory filter exhibits a maximum at its center frequency and is continuous at that point; or, in other words, that the filter is flat at Δf/f0 = 0.0. The filter shape derivation described in Secs. I and II of Appendix A does not, of necessity, produce filters that are flat at zero, and when the filter shapes were first produced using this procedure some of them displayed peaks or notches about the center frequency. The peaks and notches would appear to arise because there are no data points at negative values of Δf/f0 to restrict the polynomial, and because the threshold curves do not include a data point at Δf/f0 = 0.0. The importance of measuring threshold at zero, as opposed to 0.02, was not recognized until the filters were derived, by which time the observers were no longer available. The peaks and notches do not extend beyond about Δf/f0 = 0.03 and thus are not particularly important as far as the overall shape of the filters is concerned. However, the attenuation applied by the filter is measured relative to the attenuation at the center frequency and, consequently, the presence or absence of a peak or notch affects the position of the half-power point and thus the 3-dB bandwidth. It was also evident that this apparent artifact of the fitting process was contributing to bandwidth variability, so a method for restricting the fitting process to flatter filters was incorporated into the procedure. The shape of the filter in a particular frequency region is primarily determined by the relative level of the thresholds in the same region, and it was found that the size of the peaks and notches could be controlled, without affecting the shape of the filter in the region above Δf/f0 = 0.03, by adding a point to the data at Δf/f0 = 0.0 and adjusting its level over a small range. Two criteria for positioning the added point were evaluated in terms of the variability of the resulting BW and K values:

(1) Criterion I: In the first, the added point was adjusted so that the filter was between 0.1 and 0.11 dB down at Δf/f0 = 0.01, a procedure that essentially eliminated the peaks and notches. All of the curves in Figs. 3-6 and the values in Tables I and A-I were produced with the aid of this criterion. Bandwidth variability is not commonly discussed in papers dealing with the critical band or auditory filter shape, perhaps because the number of observers is typically three or less, but what information there is suggests that the variability is high. Houtgast (1974) reported that the bandwidths of his five observers ranged from 70 to 200 Hz at 0.5 kHz, from 140 to 200 Hz at 1.0 kHz, and from 280 to 440 Hz at 2.0 kHz. The corresponding bandwidth ranges presented in Table I are substantially smaller; in each case, the range is less than 5% of the center frequency. Swets, Green, and Tanner (1962) estimated filter bandwidth in seven different but highly similar experimental conditions. For each observer they reported a mean-to-standard-deviation ratio calculated across

(2) Criterion II: In addition, a closer examination of Table I leads to the conclusion that the standard deviations, small as they are, may still be excessive. A comparison of the column of bandwidths with the column of proportionality constants reveals that at each center frequency there is a tendency for K to decrease as BW increases across observers. To understand the implications of this relationship, consider what happens when a point is added to the data at Δf/f0 = 0.0. As the point is adjusted upwards from a low position, the notch in the corresponding filter becomes smaller and smaller and is eventually replaced by a peak; at the same time, K increases and BW decreases. This suggests that if the criterion for positioning the added point were relaxed slightly, it would be possible to reduce the variability in BW and the variability in K simultaneously. Before proceeding to illustrate this argument, it is perhaps worth pointing out that the results of this analysis will differ from those in Table I only inasmuch as the variability of BW and K will be reduced. And since the analysis involves an ad hoc restriction and the deletion
of one observer's data, it is presented here as an addendum rather than as the main analysis.
the seven
experimental conditions. The ratios were 4.4, 5. 1, and 6.6 for their three observers. No strictly comparable measure is available in the present experiment, but for each center frequency, a mean-to-standard-de-
There is one major exception to the general tendency for K to decrease as bandwidth increases, and that is the data of observer BG, who has both a narrow band-
viation
tensive experience in psychoacoustic experiments, whereas the other observers had no previous experience. In addition, it is likely that BG was better motivated than the others because he did the testing. Both of these factors might be expected to increase his efficiency and so reduce his K, and since this discussion depends on the K values being comparable, BG's data
ratio
can be calculated
across
observers.
These ratios might be expected to be lower than those
of Swets, Green, and Tanner, inasmuch as the standard deviation includes between-subject variability; however, the ratio values are in fact nearer 10, which suggests that population bandwidth may not be as variable as previously suspected.
width
and the smallest
K at 2.0 kHz.
TABLE A-II. Filter shape statistics. The observers are ordered according to data range in (a) and bandwidth in (b). The fitting process that produced these statistics was restricted to specific values of K such that the variability of K was reduced by one-half relative to Table I(b). (a)
f0
0.5
Mean
(b)
Data
Filter
range
range
OBSDR
OBSBw
K
a0
36.5
37.4
JB
MM
65.0
0.494
55.7
35.3
36.2
BS
DR
66.8
0. 561
56.4
35.2
36.3
MM
MS
68.0
0.480
55.8
31.8
32.5
MS
BS
68.9
0.435
55.4
29.1
30.2
DR
JB
76.3
0. 532
56.5
33.58
34.52
69.0
0. 5008
55.96
ski dev
Individual
observers'
1.0
data
Mean
2.72
Mean
3.88
0.043
0.42
38.2
38.9
SS
SS
128.0
0. 331
56.8
33.4
34.0
FO
MS
141.0
0. 314
57.1
31.7
31.4
MS
FO
145.0
0. 282
56.7
31.4
31.7
RB
RB
147.0
0. 356
57.8
33.67
34.00
140.3
0. 3208
57.10
std dev
2.0
BW
3.00
7.40
0. 027
0.43
43.2
51.5
MB
MS
198.0
0. 398
59.8
35.7
38.9
BS
BS
212.0
0. 540
61.2
33.5
35.0
MS
MB
213.0
0.486
60.6
207.7
0.4747
60.53
0. 058
0.57
37.47
41.70
ski dev
6.83
6.85
,
J. Acoust. Soc. Am., Vol. 59, No. 3, March 1976
Downloaded 30 Jun 2013 to 132.206.27.25. Redistribution subject to ASA license or copyright; see http://asadl.org/terms
BG had had ex-
654
were
R.D. Patterson: Auditory filter shapes
654
omitted.
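The mean-to-standard-deviation ratios discussed in this addendum can be checked against the bandwidth column of Table A-II(b). The following is a minimal sketch, not the procedure used in the original analysis: the function names are hypothetical, and the use of the population standard deviation (division by N) is an assumption, adopted because it reproduces the tabulated bandwidth standard deviations of 3.88, 7.40, and 6.85. The helper halve_deviations illustrates the kind of restriction applied to K in this addendum, in which each observer's deviation from the mean is halved.

```python
from statistics import mean, pstdev

# Bandwidths (BW column) from Table A-II(b), keyed by center frequency in kHz.
BANDWIDTHS = {
    0.5: [65.0, 66.8, 68.0, 68.9, 76.3],
    1.0: [128.0, 141.0, 145.0, 147.0],
    2.0: [198.0, 212.0, 213.0],
}


def mean_to_sd_ratio(values):
    """Return mean / standard deviation, using the population SD (an assumption)."""
    return mean(values) / pstdev(values)


def halve_deviations(values):
    """Move each value halfway toward the mean, which halves the standard deviation."""
    m = mean(values)
    return [m + (v - m) / 2.0 for v in values]


if __name__ == "__main__":
    for f0, bws in BANDWIDTHS.items():
        print(f"{f0} kHz: mean = {mean(bws):.1f}, SD = {pstdev(bws):.2f}, "
              f"ratio = {mean_to_sd_ratio(bws):.1f}")
```

Under these assumptions the bandwidth ratios come out at roughly 18, 19, and 30 for the 0.5-, 1.0-, and 2.0-kHz filters.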
For each center frequency, the standard deviation of K was halved by halving the deviation of each observer's K from the mean value; the new values are shown in Table A-II. Then each observer's data were refitted; that is, the point added at zero was adjusted until that observer's revised K value was produced. The statistics of the filter shapes that follow from this modified fitting process are also presented in Table A-II and, as predicted, the variation in bandwidth is substantially reduced; the mean-to-standard-deviation ratios have increased to around 20. The observers are ordered according to bandwidth, but the order is quite different from that in Table I and the proportionality constant no longer varies systematically with bandwidth. There is also a reduction in the variability of a0, the estimate of wide-band threshold. But aside from these effects, which are all directly related to the adjusting of K, there are no important differences in the summary statistics, that is, in mean bandwidth, mean a0, mean filter range, or filter-range variability. The largest shift of the point added to zero was 0.49 dB for observer MS at 2.0 kHz, and the average shift was just 0.18 dB. Since the individual thresholds were estimated only to the nearest 0.5 dB, the magnitude of the adjustment does not seem unreasonable, and it is not surprising to find that there are virtually no changes in the shapes of the filters produced by the modified fitting process. Finally, it should perhaps be pointed out that the correspondence between filter range and bandwidth is not improved by the modified fitting process.

*The paper also appears as DCIEM Report 75 RP X6, and portions of it were presented at the 88th Meeting of the Acoustical Society of America, J. Acoust. Soc. Am. 56, S36(A) (1974).

¹To be precise, Margolis and Small held the tone power constant and measured d' as a function of the position of the noise edge. While this procedure avoids the criticism that filter shape may vary with tone power, it does not provide accurate estimates of filter shape for frequencies that are remote from the tone. Neither of these differences would be expected to alter the overall correspondence in the results of Patterson and Margolis and Small.

²The author is indebted to both M. M. Taylor and M. R. Schroeder for this suggestion.

³If the 3-dB bandwidth BW is used instead of the equivalent rectangular bandwidth, the approximation is exp[-4 ln2 (Δf/f0 BW)²]. The 3-dB bandwidths for the 0.5-, 1.0-, and 2.0-kHz approximations over the first 20 dB of attenuation are 93.5, 190, and 297 Hz.

de Boer, E. (1967). "Correlation studies applied to the frequency resolution in the cochlea," J. Aud. Res. 7, 209-217.
de Boer, E. (1968). "Reverse correlation I," Proc. K. Ned. Akad. Wet. 71, 472-487.
de Boer, E. (1969). "Reverse correlation II," Proc. K. Ned. Akad. Wet. 72, 129-151.
de Boer, E., and Jongkees, L. B. W. (1968). "On cochlear sharpening and cross-correlation methods," Acta Oto-Laryngol. Stockholm 65, 97-104.
de Boer, E., and Kuyper, P. (1968). "Triggered correlation," IEEE Trans. Biomed. Eng. BME-15 (3), 169-179.
Egan, J. P., and Hake, H. W. (1950). "On the masking pattern of a simple auditory stimulus," J. Acoust. Soc. Am. 22, 622-630.
Evans, E. F., and Wilson, J. P. (1973). "Frequency selectivity of the cochlea," in Basic Mechanisms of Hearing, edited by A. R. Møller (Academic, New York), pp. 519-551.
Fletcher, H. (1940). "Auditory patterns," Rev. Mod. Phys. 12, 47-61.
Fletcher, H. (1953). Speech and Hearing in Communication (Van Nostrand, New York), 2nd ed.
Greenwood, D. D. (1961). "Auditory masking and the critical band," J. Acoust. Soc. Am. 33, 484-502.
Hawkins, J. E., and Stevens, S. S. (1950). "The masking of pure tones and of speech by white noise," J. Acoust. Soc. Am. 22, 6-13.
Houtgast, T. (1972). "Psychophysical experiments on grating acuity," paper presented at the Symposium on Hearing Theory, IPO, Eindhoven, The Netherlands.
Houtgast, T. (1974). "Lateral suppression in hearing," doctoral dissertation (University of Amsterdam, Amsterdam, The Netherlands).
Margolis, R. H. (1974). "The measurement of critical masking bands," doctoral dissertation (University of Iowa).
Margolis, R. H., and Small, A. M. (1975). "The measurement of critical masking bands," J. Speech Hear. Res. 18, 571-587.
Patterson, R. D. (1970). "Experimental determination of auditory filter shape for tones masked by bands of noise," J. Acoust. Soc. Am. 47, 107(A).
Patterson, R. D. (1971). "Effect of amplitude on auditory filter shape," J. Acoust. Soc. Am. 49, 81(A).
Patterson, R. D. (1974). "Auditory filter shape," J. Acoust. Soc. Am. 55, 802-809.
Schafer, T. H., Gales, R. S., Shewmaker, C. A., and Thompson, P. O. (1950). "The frequency selectivity of the ear as determined by masking experiments," J. Acoust. Soc. Am. 22, 490-496.
Scharf, B. (1970). "Critical bands," in Foundations of Modern Auditory Theory, edited by J. Tobias (Academic, New York), Vol. 1.
Swets, J. A., Green, D. M., and Tanner, W. P., Jr. (1962). "On the width of critical bands," J. Acoust. Soc. Am. 34, 108-113.
Webster, J. C., Miller, P. H., Thompson, P. O., and Davenport, E. W. (1952). "The masking and pitch shifts of pure tones near abrupt changes in a thermal noise spectrum," J. Acoust. Soc. Am. 24, 147-152.
Wilson, J. P., and Evans, E. F. (1971). "Grating acuity of the ear: psychophysical and neurophysiological measures of frequency resolving power," Proceedings of the 7th International Congress on Acoustics 3, Akademiai Kiado, Budapest, pp. 397-400.
Zwicker, R., Flottorp, G., and Stevens, S. S. (1957). "Critical bandwidth in loudness summation," J. Acoust. Soc. Am. 29, 548-557.