Effects of fluctuating noise and interfering speech on the speech-reception threshold for impaired and normal hearing

Joost M. Festen and Reinier Plomp
Experimental Audiology, Department of Otolaryngology, Free University Hospital, P.O. Box 7057, 1007 MB Amsterdam, The Netherlands

(Received 26 July 1989; revised 17 January 1990; accepted 15 May 1990)

The speech-reception threshold (SRT) for sentences presented in a fluctuating interfering background sound of 80 dBA SPL is measured for 20 normal-hearing listeners and 20 listeners with sensorineural hearing impairment. The interfering sounds range from steady-state noise, via modulated noise, to a single competing voice. Two voices are used, one male and one female, and the spectrum of the masker is shaped according to these voices. For both voices, the SRT is measured both in noise spectrally shaped according to the target voice and in noise shaped according to the other voice. The results show that, for normal-hearing listeners, the SRT for sentences in modulated noise is 4-6 dB lower than for steady-state noise; for sentences masked by a competing voice, this difference is 6-8 dB. For listeners with moderate sensorineural hearing loss, elevated thresholds are obtained without an appreciable effect of masker fluctuations. The implications of these results for estimating a hearing handicap in everyday conditions are discussed. By using the articulation index (AI), it is shown that hearing-impaired individuals perform poorer than suggested by the loss of audibility for some parts of the speech signal. Finally, three mechanisms are discussed that contribute to the absence of unmasking by masker fluctuations in hearing-impaired listeners. The low sensation level at which the impaired listeners receive the masker seems a major determinant. The second and third factors are reduced temporal resolution and a reduction in comodulation masking release, respectively.

PACS numbers: 43.66.Mk, 43.66.Sr, 43.71.Gv [NFV]

a) Portions of this article were presented at the 114th Meeting of the Acoustical Society of America, Miami, FL [J. Acoust. Soc. Am. Suppl. 1 82, S4 (1987)].

J. Acoust. Soc. Am. 88 (4), October 1990, p. 1725. 0001-4966/90/101725-12$00.80. © 1990 Acoustical Society of America.

INTRODUCTION

Research on speech reception in noise has a long history. French and Steinberg (1947) investigated the effects of signal intensity, the addition of noise, and filtering of either the speech or the noise upon speech intelligibility. To describe the observed data and to predict intelligibility in new conditions, they introduced the articulation index (AI) and presented an algorithm for it based on the levels of speech and masking sound as a function of frequency. The main elements in this model are: (1) the contribution of each frequency band above the absolute threshold is a linear function of the signal-to-noise ratio in dB between -12 and 18 dB, and (2) the overall intelligibility is the weighted sum of these contributions over the frequency bands from 250 to 7000 Hz. Their study was restricted to normal-hearing listeners and steady-state interference.

At about the same time, effects of masker fluctuations and speech interruptions were investigated in a series of studies (Miller, 1947; Miller and Licklider, 1950). The advantage of masker interruptions was very clearly demonstrated by the intelligibility of monosyllables as a function of the interruption rate of noise with a fixed duty cycle (0.5). For rates below about 200 Hz, the intelligibility increases as the rate is lowered. Maximum intelligibility is reached at about ten interruptions per second; for lower rates, intelligibility drops again because complete words are masked. So far, all data on speech reception in noise were collected for normal-hearing listeners.

Effects of hearing loss on speech reception were initially studied for conditions without interference. Later, speech reception in noise for hearing-impaired listeners attracted more and more attention (among others, Palva, 1955; Lindeman, 1967; and Jokinen, 1973). After concluding that speech intelligibility in noise is governed by the speech-to-noise ratio, Plomp (1978) described the speech-reception threshold (SRT) as a function of the level of steady-state noise with a simple model. In this model, as in the present study, SRT is defined as the level of speech for a fixed 50% score. The model contains two parameters, one for the threshold in quiet and one for the threshold in noise, and it assumes a fixed "effective" speech-to-noise ratio at threshold. The effective noise level in the model is determined by the sum of an internal noise (responsible for the absolute threshold) and the externally applied noise. For normal-hearing listeners, this model excellently fits the data by Hawkins and Stevens (1950). For hearing-impaired listeners, the model contains two extra parameters. One parameter is the elevation of the threshold in quiet, which may be interpreted as a higher level of the internal noise. The other parameter is the increase in speech-to-noise ratio needed to reach the SRT in noise. The model was validated in a number of experiments with hearing-impaired listeners, recently reviewed by Plomp (1986). In general, the second parameter, called hearing loss for speech in noise, is only small (0-10 dB), but it has no counterpart in the articulation theory by French and Steinberg (1947), nor in its updated versions (Kryter, 1962; Pavlovic, 1984). For sloping audiograms, part of the hearing loss for speech in noise will, according to the articulation theory, be attributed to the inaudibility of high-frequency components in the speech. However, even in an experiment in which speech and noise were presented to hearing-impaired listeners above threshold between 250 and 4000 Hz, van Dijkhuizen et al. (1989) showed elevated speech-reception thresholds.

Apart from hearing loss for speech in steady-state noise, it appears that listeners with sensorineural hearing loss have an extra handicap in perceiving speech that is masked by competing speech. Carhart and Tillman (1970) measured discrimination for monosyllables against competing sentences for various groups of listeners. For two groups with sensorineural hearing loss and different maximum discrimination scores, the effectiveness of the competing speech was 12-15 dB greater than for listeners with normal hearing or conductive hearing loss. Duquesnoy (1983) determined the binaural SRT for sentences masked by either competing speech or noise matched in spectrum and level to the masking speech. He tested two groups of listeners, a younger group with normal hearing and an elderly group with presbycusis, and he observed that the unmasking occurring for competing speech in the normal-hearing group was absent for presbycusis.

To exclude the possibility of the extra hearing loss for speech in a fluctuating masker being an effect of age, Festen and Plomp (1986) measured the SRT in a fluctuating interfering sound also for a group of young hearing-impaired listeners (pupils of a high school for the hearing impaired). To further explore the effect of the nature of masker fluctuations, sinusoidally intensity-modulated noise was used and time-reversed speech from the same talker as the target speech. For an optimum modulation rate of the noise (about 16 Hz), the normal-hearing listeners gained nearly 5.5 dB in S/N ratio relative to the threshold in steady-state noise, whereas, for the young hearing-impaired listeners, the average gain was only 1.2 dB.
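The SRT model of Plomp (1978) summarized earlier in this Introduction can be illustrated with a short numerical sketch. It is not the published formulation: it merely assumes that the threshold in quiet and the threshold in noise combine as a power sum, which is one way of expressing a fixed effective speech-to-noise ratio relative to the sum of internal and external noise. The function name, the parameter names, and the way the numbers are combined in the example are assumptions made for illustration.

```python
import math


def srt_two_parameter_model(noise_level_db, srt_quiet_db, snr_at_threshold_db,
                            quiet_elevation_db=0.0, snr_increase_db=0.0):
    """Sketch of an SRT-versus-noise-level model (all names hypothetical).

    noise_level_db       level of the external steady-state noise
    srt_quiet_db         SRT in quiet for normal hearing
    snr_at_threshold_db  speech-to-noise ratio at threshold for normal hearing
    quiet_elevation_db   elevation of the threshold in quiet (impaired hearing)
    snr_increase_db      extra speech-to-noise ratio needed in noise
    """
    threshold_quiet = srt_quiet_db + quiet_elevation_db
    threshold_noise = noise_level_db + snr_at_threshold_db + snr_increase_db
    # Power sum: whichever threshold is higher dominates the result.
    return 10.0 * math.log10(10.0 ** (threshold_quiet / 10.0)
                             + 10.0 ** (threshold_noise / 10.0))


# Illustration only: an 80-dBA masker, a threshold in quiet of 52 dBA, a
# threshold S/N ratio of -4.7 dB, and 4.0 dB extra S/N ratio needed in noise.
example_srt = srt_two_parameter_model(80.0, 52.0, -4.7, snr_increase_db=4.0)
```

With these illustrative numbers the threshold in quiet lies far below the threshold in noise, so the predicted SRT is governed almost entirely by the noise term.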
From these results, two conclusions can be drawn: First, unmasking of speech through masker fluctuations is not restricted to competing speech, and, second, the absence of unmasking with masker fluctuations, as now also obtained in young hearing-impaired listeners, is an effect of hearing loss rather than of age.

The present study is a further exploration in which irregularly modulated maskers are used and without a constraint on listener age. SRT is measured for conditions with steady-state noise, interfering speech, noise modulated with the wideband envelope of running speech, and noise in which the low and high frequencies are separately modulated with speech envelopes from the corresponding frequency regions.

I. METHOD

A. Speech material

Two different sets of short everyday sentences were used, one set read by a male speaker and the other by a female speaker. Each set comprises ten lists of 13 sentences. Individual sentences contain eight or nine syllables; they were recorded at a constant level and were adjusted to be equally intelligible. The sentences read by the female speaker were described by Plomp and Mimpen (1979a). Both sets of sentences are digitally stored with a sampling frequency of 15 625 Hz and eight bits of resolution. The long-term average spectra of the two voices are shown in Fig. 1. Over a broad range of frequencies, the spectra of the male and female voice are very much alike. However, they diverge at the low frequencies where the male voice is stronger due to the lower fundamental and above 2.5 kHz where the female voice contains more energy.

FIG. 1. Long-term average 1/3-oct spectra of the male voice (bold curve) and female voice (light curve) plotted for an overall level of 80 dBA as used for the correspondingly shaped masker. Additionally, median values (open circles) and quartiles (vertical bars) of the pure-tone thresholds at octave frequencies are given for the group of 20 hearing-impaired listeners. [Horizontal axis: frequency (Hz), 125 to 8000.]

B. Interfering signals

Interfering signals were chosen to cover a range between steady-state noise and running speech of a single voice. A pseudo-running speech was obtained by generating various sentences in succession without pauses. The noise has a spectrum equal to the long-term average spectrum of either the male or the female voice. To mimic the intensity fluctuations of speech, two kinds of fluctuating noise maskers were derived from both voices. In the first masker, noises spectrally shaped to either of the voices are multiplied by the envelope of speech from the corresponding voice. The fluctuations of the speech imposed on this masker are coherent over the entire spectrum. In the second masker, this coherence is disturbed by separately modulating high and low frequencies with the envelope of speech from the corresponding frequency regions. The on-line generation of these modulated maskers was performed digitally in a signal processor (TMS 320-10).

A block diagram of the signal processing for the two-band modulated masker is shown in Fig. 2. Speech and noise with equal long-term average spectra are read from disk. Both signals are split up in a low- and a high-frequency part with 1000-Hz crossover frequency. We used sixth-order Chebyshev filters with 1-dB passband ripple and slopes of approximately 36 dB/oct. From the filtered speech, the envelope is determined by taking the modulus of the speech samples, following the signal peaks with a recovery time constant of 10 ms, and smoothing the signal by an exponential smoothing filter with a 40-Hz cutoff frequency. In both frequency bands, the noise is multiplied by the speech envelope and, after level adjustment, the two frequency bands are added. Because of the sloping spectrum of speech, the power above 1000 Hz is less than below. As a result, the absolute strength of the speech fluctuations also differs between the two frequency bands. To restore, after the modulation, the original level ratio between high and low frequencies, the low-frequency band must be attenuated. This attenuation was 8 dB for the female voice and 10.5 dB for the male.

FIG. 2. Block diagram of the signal processing for the two-band-modulated masker. Speech and noise with equal long-term average spectrum are split into a low- and a high-frequency part with 1000-Hz cutoff frequency. From both bands of filtered speech, the envelope is determined and multiplied by the noise from the corresponding frequency band. After level adjustment the two frequency bands of modulated noise are added.

Figure 3 shows the envelope spectra of the two voices, wideband and 1000-Hz high- and low-pass filtered. The modulation index is the rms of the intensity envelope per 1/3 octave of modulation divided by the average intensity. (Note that, for a 100% sinusoidal intensity-modulated signal, the modulation index is 1.0 in the 1/3-oct band corresponding to the modulation frequency.) The envelope spectra of the two voices are very similar; they both have a bandpass characteristic and show a maximum at about 4 Hz. For frequencies above 1000 Hz, the signal is more strongly modulated than below. The modulation spectra for low-pass-filtered speech and wideband speech are nearly identical.

FIG. 3. One-third-oct band spectra of the intensity envelope normalized by the average intensity for 67 s of speech from the two voices. The envelope spectra for the wideband speech are shown by the fully drawn curves and for high- and low-pass-filtered speech by curves with long and short dashes, respectively. [Panels: female voice and male voice; horizontal axis: modulation frequency (Hz); vertical axis: modulation index.]

For both voices, the SRT was measured under ten masker conditions. Four kinds of maskers (steady-state noise, single-band modulated noise, two-band modulated noise, and time-reversed speech) were applied twice each, once based on the same voice as the target, and once based on the other voice. Of the two remaining conditions, one was in quiet and, in the other, the interfering sound was normally running speech from the other speaker. For each sentence presentation, the interference started 800 ms before the start of the sentence and lasted for 3.9 or 4.7 s, depending on the length of the sentence. In all presentations, the interference stopped at least 800 ms after the end of the sentence. The interference was switched on and off smoothly within 100 ms. Throughout the experiment the level of the interference was fixed at 80 dBA.
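The envelope detection described above for the modulated maskers (modulus of the samples, peak following with a 10-ms recovery time constant, exponential smoothing with a 40-Hz cutoff) can be sketched as follows. This is a minimal illustration rather than the original signal-processor code: the band-splitting filters and the level adjustment are omitted, the function names are hypothetical, and the mapping from the 40-Hz cutoff to a one-pole smoothing coefficient is an assumption.

```python
import math


def speech_envelope(samples, sample_rate_hz=15625.0,
                    recovery_ms=10.0, cutoff_hz=40.0):
    """Rectify, follow peaks with an exponential recovery, then smooth."""
    # Per-sample decay of the peak follower for the given recovery time constant.
    peak_decay = math.exp(-1.0 / (sample_rate_hz * recovery_ms / 1000.0))
    # One-pole (exponential) smoother; coefficient derived from the cutoff
    # frequency (assumed mapping).
    smooth_coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    peak = 0.0
    smoothed = 0.0
    envelope = []
    for sample in samples:
        rectified = abs(sample)                   # modulus of the speech sample
        peak = max(rectified, peak * peak_decay)  # jump up at peaks, decay after
        smoothed += smooth_coeff * (peak - smoothed)
        envelope.append(smoothed)
    return envelope


def modulate(noise, envelope):
    """Multiply a noise band by the envelope of the matching speech band."""
    return [n * e for n, e in zip(noise, envelope)]
```

For the two-band masker, the same two functions would be applied separately to the low-pass and high-pass filtered signals before the bands are level-adjusted and added.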

C. Procedure

For each listener, the monaural speech-reception threshold was measured for both voices in ten masking conditions. The listener and the investigator were seated in a sound-insulated room. Sentences were presented to the listener over a headset (Beyer DT48) and to the investigator visually on a computer terminal.

TABLE I. Descriptive and audiometric data of the hearing-impaired listeners. Pure-tone audiograms are represented by two numbers: PTA (average air-conduction hearing level at 500, 1000, and 2000 Hz) and the average slope between 250 and 4000 Hz. For speech reception, the maximum score obtained with monosyllables presented in quiet is given, as well as the score obtained at a speech level of 80 dBA.

Listener  Sex  Age (years)  Ear  PTA (dB)  Slope (dB/oct)  Max score monosyll. (%)  Score at 80 dBA (%)
 1        F    39           R    55        -2              100                      90
 2        M    47           L    46        11               85                      75
 3        M    62           R    48         3              100                      85
 4        F    71           R    57         2               90                      85
 5        F    72           R    37         4               93                      93
 6        F    70           R    47         8              100                      80
 7        M    57           R    42        12               90                      90
 8        F    43           R    50        11               85                      85
 9        M    64           R    47         9               85                      70
10        M    37           L    58         5               97                      70
11        M    68           R    35         7               95                      95
12        M    21           L    57        10              100                      85
13        F    65           R    47         5              100                      95
14        M    41           R    39         9              100                      87
15        M    72           L    45         8               93                      75
16        M    46           R    41         7              100                      70
17        M    77           R    48         6               97                      55
18        M    68           R    38        12              100                      85
19        F    40           L    57        15               95                      10
20        M    76           L    47         8               90                      80

In all conditions, a list of 13 sentences, unknown to the listener, was used to estimate the level at which 50% of the sentences were reproduced without any error. In each condition, the first sentence started below the reception threshold. This sentence was repeated, at 4-dB higher levels with each repetition, until the listener was able to reproduce it correctly. The 12 other sentences in a list were presented only once in a simple up-down procedure (cf. Levitt, 1971) with a step size of 2 dB. An errorless reproduction of the entire sentence was required for a correct response. The average presentation level over sentences 4-13 was taken as the SRT. The standard error for the SRT in steady-state noise determined according to this procedure is less than 1 dB for normal-hearing listeners (Plomp and Mimpen, 1979a). To avoid confounding of both measurement order and sentence lists with condition effects, the order of conditions was counterbalanced over subjects according to a digram-balanced Latin square, whereas the order of sentence lists was kept fixed. This ensured that ten conditions appeared only once in every position of the order for every ten listeners, and also all pairs of succeeding conditions occurred only once. A group of ten listeners started with the female voice and a second group of ten listeners with the male voice.
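The adaptive SRT measurement described in this section can be sketched in a few lines. The sketch assumes that the second sentence is presented 2 dB below the level at which the first sentence was finally reproduced correctly; the text does not state this explicitly. The function name, the `is_correct` callback, and the `max_level_db` safety limit are hypothetical.

```python
def measure_srt(is_correct, start_level_db, n_sentences=13,
                max_level_db=120.0):
    """Simulate one 13-sentence adaptive run and return (srt, levels).

    is_correct(index, level_db) -> bool stands in for the listener: True
    when sentence `index` is reproduced without any error at `level_db`.
    """
    level = start_level_db
    # Sentence 1: repeat at 4-dB higher levels until reproduced correctly.
    while not is_correct(0, level):
        level += 4.0
        if level > max_level_db:
            raise ValueError("first sentence never reproduced correctly")
    levels = [level]
    correct = True
    # Sentences 2-13: simple up-down rule with a 2-dB step, one trial each.
    for index in range(1, n_sentences):
        level += -2.0 if correct else 2.0
        levels.append(level)
        correct = is_correct(index, level)
    # SRT: average presentation level over sentences 4-13.
    srt = sum(levels[3:13]) / 10.0
    return srt, levels


# Idealized listener who is correct whenever the level is at least -5 dB.
srt, levels = measure_srt(lambda index, level: level >= -5.0, -12.0)
```

For such an idealized listener the track alternates between the two levels that bracket the threshold, so the average over sentences 4-13 lands midway between them.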

D. Listeners

Twenty normal-hearing listeners and 20 listeners with sensorineural hearing loss participated in the experiment. The age of the normal-hearing listeners ranged from 16 to 36 years and the majority of them were university students. They were tested at the ear of their preference. One listener in the normal-hearing group had a loss of 20 dB re: ISO 389 at 4 kHz. All other thresholds for the normal-hearing listeners were better than 15 dB re: ISO 389 from 250 to 4000 Hz.

Listeners with sensorineural hearing impairment were selected from files of the audiology department of the Free University Hospital; no constraint on age was used. The hearing-impaired listeners were tested at the ear with the better PTA (average air-conduction hearing level at 500, 1000, and 2000 Hz). PTA ranged from 35 to 58 dB and all listeners reached a maximum intelligibility score for monosyllables of at least 85%. A summary of the relevant audiometric data is given in Table I. Median pure-tone thresholds and quartiles are shown in Fig. 1, together with a one-third octave analysis of the long-term average spectrum of the male and the female voice plotted at 80 dBA. As is shown in this figure, part of the high frequencies in the signal will be below threshold for these listeners.

II. RESULTS

Speech-reception thresholds obtained for the normal-hearing listeners are given in Fig. 4. For each condition the range of signal-to-noise ratios containing 68% of the thresholds (average plus and minus one standard deviation) is shown. For each masker category, separate results are shown for conditions with similar and different spectra for signal and masker. The open bars give the results for the female voice and the hatched bars for the male voice. In general, lower thresholds are found for fluctuating maskers than for steady-state noise of the same average intensity. On the average, the SRT expressed as signal-to-noise ratio is -11.4, -8.4, and -9.5 dB for a speech masker, single-band-modulated noise, and two-band-modulated noise, respectively, compared to -4.7 dB for steady-state noise. Because of the variations in the spread of the data among conditions, the significance of differences in SRT between conditions was tested with a Wilcoxon test for matched samples. All differences between conditions with a fluctuating masker and the corresponding conditions with a steady-state masker are highly significant (p < 0.001). The only exception is the condition in which speech is masked by time-reversed speech from the same talker. Here, relatively high thresholds are found (average -2.3 dB S/N ratio) for the female voice, with a large variability between listeners. The average standard deviation in SRT for the conditions with a fluctuating masker is 2.4 dB (excluded are conditions in which signal and time-reversed interfering speech are from the same talker). For steady-state noise this standard deviation is 1.3 dB. In conditions with different spectra for masker and signal, the male voice is more adequately masked by a sound with the spectrum of the female voice than the other way around. Apart from this spectrum-dependent effect, the male voice as used in this experiment seems to be, in general, more susceptible to masking than the female voice. This also applies for conditions in which signal and masker have the same spectrum.

FIG. 4. Speech-reception thresholds for sentences relative to the masker level for 20 normal-hearing listeners and for various masker conditions, as indicated on top of each panel. Bars represent the range of thresholds between the average plus and minus one standard deviation (68%). Open bars are for the female voice, hatched bars for the male voice. For each masker category, separate results are shown for conditions with similar and different spectra for signal and masker. Masker level in all conditions was 80 dBA.

The results for the group of hearing-impaired listeners are shown in Fig. 5. For ease of comparison, the normal-hearing data are replotted in this figure with thin lines. Thresholds for the hearing-impaired group are higher in all conditions and, in addition, display virtually no differences among conditions. The only exception to this general observation is, again, the result for speech masked by time-reversed speech from the same talker. Here, relatively high thresholds are obtained as for the normal-hearing listeners. The average difference between the two groups of listeners for steady-state noise, a speech masker, single-band modulated noise, and two-band modulated noise is 4.0, 10.3, 7.0, and 9.3 dB, respectively. The condition with both signal and time-reversed speech masker from the same talker was excluded from this comparison. For all conditions in Fig. 5, the differences between the two groups of listeners are highly significant (p < 0.001) on a Mann-Whitney test for independent samples. The average standard deviation across conditions was, of course, larger than for the normal-hearing group: 2.1 dB for steady-state noise and 2.8 dB for the fluctuating-masker conditions. The SRT in quiet was 52 dBA with a standard deviation of 9.6 dB. For only one listener a few masked speech-reception thresholds were obtained at less than 10 dB above the SRT in quiet, with a minimum distance of 7.5 dB.

FIG. 5. Speech-reception threshold for sentences relative to the masker level as in Fig. 4. Bold bars are for 20 hearing-impaired listeners and light bars are a replot of the normal-hearing data.
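The group averages quoted in this section can be combined into a rough estimate of the release from masking in each group. The sketch below is simple arithmetic on the reported means; it assumes that the normal-hearing averages and the group differences refer to the same sets of conditions, which the text does not guarantee, so the derived hearing-impaired values are approximate. The variable names are illustrative.

```python
# Average SRT (dB S/N ratio) for the normal-hearing group, as reported.
srt_normal = {
    "steady-state noise": -4.7,
    "speech masker": -11.4,
    "single-band modulated noise": -8.4,
    "two-band modulated noise": -9.5,
}

# Average difference (dB) between the hearing-impaired and the
# normal-hearing group, as reported.
group_difference = {
    "steady-state noise": 4.0,
    "speech masker": 10.3,
    "single-band modulated noise": 7.0,
    "two-band modulated noise": 9.3,
}

# Derived (not reported) averages for the hearing-impaired group.
srt_impaired = {name: srt_normal[name] + group_difference[name]
                for name in srt_normal}


def masking_release(srt_by_masker):
    """SRT in steady-state noise minus SRT in each fluctuating masker."""
    reference = srt_by_masker["steady-state noise"]
    return {name: round(reference - value, 1)
            for name, value in srt_by_masker.items()
            if name != "steady-state noise"}


release_normal = masking_release(srt_normal)
release_impaired = masking_release(srt_impaired)
```

Under this assumption the normal-hearing group gains roughly 4 to 7 dB from masker fluctuations, whereas the derived values for the hearing-impaired group stay within about 1 dB of zero, in line with the statement that their thresholds show virtually no differences among conditions.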

Figures 4 and 5 pertain to levels at which 50% of the sentences are correctly reproduced. If the steepness of the psychometric function were very different for various masking conditions, then the results would depend upon the target score in the adaptive procedure. To estimate the psychometric function for a number of masker conditions, individual runs of the adaptive procedure were shifted to the average result and added. Figure 6 shows, for both groups of listeners, the average psychometric function for conditions with a steady-state masking noise, a two-band modulated noise, and a speech masker (reversed speech from the same talker excluded). Each curve is based upon 1040 responses (13 sentences × 20 listeners × 4 conditions). Due to the adaptive procedure, the majority of the responses is concentrated on signal levels for which a score of about 50% is obtained. Toward both extremes of the function, the number of responses decreases and, along with it, the reliability of the score decreases. Scores based on less than 20 responses are not shown and those based on between 50 and 20 responses are represented by open symbols.

FIG. 6. Average discrimination curves for sentences presented in steady-state noise, two-band-modulated noise, and with an interfering voice, for normal and hearing-impaired listeners. Each curve is based on 1040 responses; closed symbols contain more than 50 responses and open symbols 20-50 responses. [Horizontal axis: speech-to-masker ratio (dB).]

The steepness of the psychometric function was estimated from a maximum-likelihood fit to the data of the function

$$p(x) = \{1 + \exp[(M - x)/S]\}^{-1}, \qquad (1)$$

where p(x) is the probability of a correct response, x is the stimulus level, M is the level for which a score of 50% is obtained, and S is the steepness. For normal-hearing listeners the slope of the psychometric function at 50% is 21.0%, 12.0%, and 11.9% per dB for steady-state noise, a speech masker, and two-band modulated noise, respectively. For the hearing-impaired listeners, the corresponding slopes are 20.4%, 14.9%, and 15.4% per dB, respectively. Although there is some variation in slope of the psychometric function, this variation is not so large that the differences among conditions and groups of listeners really depend upon the performance level.
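The logistic function of Eq. (1) and its slope at the 50% point can be illustrated with a short sketch. Differentiating Eq. (1) at x = M gives a slope of 1/(4S), which links the parameter S to the slopes in % per dB quoted above. The grid-search fit is a simple stand-in for a maximum-likelihood fit and is not claimed to be the authors' procedure; all function names are hypothetical.

```python
import math


def p_correct(x, m, s):
    """Eq. (1): probability of a correct response at stimulus level x."""
    return 1.0 / (1.0 + math.exp((m - x) / s))


def slope_at_midpoint(s):
    """Slope at the 50% point in % per dB; the derivative of Eq. (1) at x = M is 1/(4S)."""
    return 100.0 / (4.0 * s)


def steepness_from_slope(slope_percent_per_db):
    """Invert slope_at_midpoint: the S (in dB) that gives the quoted slope."""
    return 100.0 / (4.0 * slope_percent_per_db)


def log_likelihood(levels, n_correct, n_total, m, s):
    """Binomial log likelihood of response counts under Eq. (1)."""
    total = 0.0
    for x, k, n in zip(levels, n_correct, n_total):
        # Clamp to keep the logarithms finite at the extremes.
        p = min(max(p_correct(x, m, s), 1e-12), 1.0 - 1e-12)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


def fit_by_grid_search(levels, n_correct, n_total, m_grid, s_grid):
    """Return the (M, S) pair on the grid with the highest likelihood."""
    best = max((log_likelihood(levels, n_correct, n_total, m, s), m, s)
               for m in m_grid for s in s_grid)
    return best[1], best[2]
```

For example, `steepness_from_slope(21.0)` gives an S of about 1.19 dB for the normal-hearing slope in steady-state noise, and `steepness_from_slope(12.0)` gives about 2.08 dB for the speech masker.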

III. DISCUSSION

Before going into detail on aspects of auditory analysis that may be related to these results, we will briefly touch on the practical implications of our results. Fluctuating interferences of speech are much more common in daily situations than steady-state noises. Consequently, the SRT in a fluctuating interfering masker offers a better measure for the ability of hearing-impaired listeners in speech communication than a steady-state masker. Speech intelligibility for normal-hearing listeners is affected less by fluctuating interfering signals, like competing speech, than by steady-state noise (see also Carhart et al., 1969). For sensorineurally hearing-impaired listeners, this benefit is much smaller or even absent (see Fig. 5). As a result of this, the differences in SRT between hearing-impaired and normal-hearing listeners are considerably larger with fluctuating maskers than with steady-state maskers (see also Carhart and Tillman, 1970; Duquesnoy, 1983; Festen and Plomp, 1986). At the same time, the accuracy of the thresholds expressed in decibels has not diminished, as shown by the steepness of the discrimination curves in Fig. 6. Therefore, it is easier to discriminate between individual listeners or groups of listeners on the basis of the SRT in a fluctuating interference than on the SRT in steady-state noise.

The type of fluctuating interference (speech, single-band-modulated noise, or two-band-modulated noise) seems to be more or less irrelevant. The only exception is a masker that is time-reversed speech from the same talker as the target. For both normal and hearing-impaired listeners, these thresholds are much higher than for other maskers with comparable modulations. Probably the similarity in character of masker and signal makes this condition more difficult than the other ones [see Dirks and Bower (1969) for similar effects].

For conditions with different masker and signal spectra, there is, in the normal-hearing group, systematically a larger masking effect of the female voice on the male speech than conversely. There may, of course, be voice-related effects in this study, like one voice being more clearly articulated than the other one. However, because the threshold differences between voices, present in the normal-hearing group, are absent in the impaired group, a purely spectral effect is most probable. As seen in Fig. 1, spectral components between 3 and 6 kHz are about 10 dB stronger in the female voice than in the male voice. For the impaired group, this difference is only of marginal importance due to the high-frequency hearing losses.

A. Relation with listener age

It is generally acknowledged that audiometric hearing loss increases progressively with increasing age (Robinson and Sutton, 1979). Also, hearing loss for speech increases with age. The onset and the extent of the deterioration in speech understanding strongly vary over listeners and seem to depend upon the speech material and the characteristics of the test. A survey of factors affecting speech understanding is given in a recent review of the literature by a Working Group on Speech Understanding and Aging (1988). Roughly speaking, these factors can be divided into peripheral factors, factors related to the central auditory nervous system, and cognitive factors. Apart from effects introduced by a decreasing peripheral auditory sensitivity, effects of aging on speech understanding are generally small, in particular in tests based on the mere reproduction of simple sentences. Kalikow et al. (1977) obtained, for high-predictability items in a sentence test, a difference of 1-2 dB in S/N ratio between young and elderly (age 60-75) listeners; both groups had thresholds better than 20-dB hearing level at all frequencies through 4 kHz. In a study on the speech-reception threshold as a function of age, Plomp and Mimpen (1979b) found a good correlation between the hearing loss for speech in quiet and the average pure-tone hearing loss for 500, 1000, and 2000 Hz up to an age group between 80 and 89 years. They concluded that hearing loss for speech with increasing age is caused by a deterioration in auditory processing rather than in central processing. In a nonsense-syllable test in noise, Gelfand et al. (1986) found a relation between test scores and age for essentially normal-hearing listeners. But, after partialing out small high-frequency hearing losses, the correlation between performance in noise and age was no longer statistically significant. They, too, concluded that the origin of hearing loss for speech with increasing age is probably a peripheral auditory deterioration. In peripheral deterioration, loss of sensitivity is only one element; additionally, reduced frequency and temporal resolution in the ear may play a role [see also Gelfand et al. (1988)]. Patterson et al. (1982) found age-related effects in frequency selectivity. Their results on speech intelligibility in noise measured with words were correlated with both the audiogram and frequency selectivity, but no indication was obtained for a deterioration of the processing efficiency of speech with increasing age.

In the present study, hearing-impaired listeners were selected without a constraint on age. Effects of listener age on the SRT masked by steady-state noise or by a competing voice are virtually absent, as shown in Fig. 7. This result corroborates our previous conclusion, obtained with a group of young hearing-impaired listeners (Festen and Plomp, 1986), that the high SRT in a fluctuating interfering sound is not an effect of listener age.

FIG. 7. Scattergram of the SRT versus listener age for the hearing-impaired group. Circles and triangles represent the SRT masked by steady-state noise and an interfering voice, respectively. Correlation coefficients with age are 0.14 and 0.19, respectively. [Horizontal axis: listener age (years).]
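The correlation coefficients with age reported for the hearing-impaired group (0.14 and 0.19, with 20 listeners) can be put in perspective with the standard t statistic for a Pearson correlation. This check is added for illustration and is not part of the original analysis; it assumes the coefficients are Pearson correlations, and the function name is hypothetical.

```python
import math


def t_statistic_for_correlation(r, n):
    """t statistic for testing a Pearson correlation r from n pairs against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


N_LISTENERS = 20
# Two-sided 5% critical value of Student's t for 18 degrees of freedom.
T_CRITICAL_DF18 = 2.101

t_steady = t_statistic_for_correlation(0.14, N_LISTENERS)
t_voice = t_statistic_for_correlation(0.19, N_LISTENERS)
significant = [abs(t) > T_CRITICAL_DF18 for t in (t_steady, t_voice)]
```

Both t values fall well below the critical value, which is consistent with the statement that effects of listener age on the SRT are virtually absent.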

B. Relation with the articulation index

According to the theory underlying the articulation index (French and Steinberg, 1947), speech intelligibility for normal-hearing listeners can be predicted from the spectral levels of speech and masking noise in combination with the detection threshold. The relative performance under various conditions is expressed in a numerical index, the articulation index, which represents the weighted sum of the signal-to-noise ratios for the speech peaks above threshold, truncated between 0 and 30 dB, over a number of frequency bands. The relation between speech intelligibility and AI was determined with talkers and listeners for various stimuli like sentences, monosyllables, etc. The method was modified by Kryter (1962) and standardized, again with modifications, according to ANSI (1969). Based on several studies, Pavlovic (1987) suggested a further update. In his version, the importance of the various frequency bands depends upon the speech material (monosyllables, average speech, easy speech) in such a way that the low frequencies are more heavily weighted for more redundant speech stimuli.

Recently, the AI has also been applied to speech intelligibility for hearing-impaired listeners (see also Pavlovic, 1984; Kamm et al., 1985; Zurek and Delhorne, 1987). In these studies, the elevated threshold of hearing-impaired listeners is assumed to be caused by an additional internal masking noise. As for low-level speech in quiet in the original procedure, it is assumed that the spectral density of this apparent noise can be estimated by subtracting the critical ratio (CR) from the pure-tone thresholds. In the studies cited above, the CR for normal-hearing listeners was used.

Pavlovic (1984) measured speech discrimination with PB words, both filtered and in noise. For listeners with normal hearing and with minor hearing loss, he found good predictions, but for those with larger impairments, the measured discrimination was poorer than predicted. In a second experiment, it was shown that the correction factor needed to improve the predictions (proficiency factor) must be less than 1, and smaller for frequency regions with a greater loss. From these results, it was concluded that additional deficits, peculiar to hearing loss, like a reduced frequency selectivity, should be included in the AI procedure. Kamm et al. (1985) used the AI to predict performance of normal and hearing-impaired listeners on a nonsense-syllable test in quiet. For hearing-impaired listeners with mild-to-moderate hearing loss and high speech-recognition scores, they found a close correspondence to the results from normal-hearing listeners. However, for listeners with a poor recognition ability, the AI appeared to be a scanty predictor of the recognition score. They concluded that, in its present form, the AI procedure is inadequate to produce a generally applicable estimate of speech-recognition performance.

Zurek and Delhorne (1987) went one step further and measured consonant reception in noise in order to test the hypothesis that the elevated reception threshold for hearing-impaired listeners can be attributed simply to a loss of audibility. For this purpose, they tested normal-hearing listeners with an additional masking noise shaped to produce pure-tone thresholds comparable to those for the hearing-impaired listeners. The results for both groups of listeners were also compared in terms of the articulation index. Although there was a large variability in their data, they found no evidence to reject the hypothesis.

Evidently, for masker levels and pure-tone thresholds as shown in Fig. 1, loss of audibility will affect the results to some extent, but a precise evaluation critically depends upon the way in which the AI is calculated or on how normal and hearing-impaired listeners are compared. When using the low CR from normal-hearing listeners to estimate the apparent noise level (threshold-CR) that is assumed to be responsible for the elevated thresholds of the impaired listeners, we obtain a noise level that, when actually applied, would lead to higher thresholds due to reduced frequency selectivity.

The effect of the CR from normal-hearing listeners on AI is demonstrated in Fig. 8. Shown is a scatterplot of SRT for the female voice masked by steady-state noise with the same spectrum and the corresponding AI calculated according to the 1/3-oct procedure for normal hearing and "easy speech" by Pavlovic (1987). For normal-hearing listeners (open symbols), signals were far above the absolute threshold, and, therefore, differences in SRT give corresponding differences in AI. The hearing-impaired listeners are divided in three subgroups on the basis of the number of 1/3-oct bands for which the apparent noise, modeling the threshold, exceeds the masking noise. The data clearly show the anticipated effect of lower AI at the speech-reception threshold for those listeners for whom a wide part of the speech spectrum is below the absolute threshold.

FIG. 8. Scattergram of the articulation index at threshold versus SRT for the female voice masked by steady-state noise with a corresponding spectrum. AI was calculated for easy speech according to the 1/3-oct method by Pavlovic (1987) with a correction for headphone listening. For normal-hearing listeners (open symbols), the whole speech spectrum is above threshold and, therefore, SRT and AI are linked. The dashed lines mark an area of two standard deviations in normal-hearing AI around the mean. The hearing-impaired listeners are divided in three subgroups: (1) listeners for whom the speech is below threshold in four 1/3-oct bands or less (downward-pointing triangles); (2) speech below threshold in five or six 1/3-oct bands (circles); and (3) speech below threshold in more than six frequency bands (upward-pointing triangles). [Axes: articulation index at threshold (ordinate) versus SRT re masker level in dB (abscissa).]

A more correct AI calculation should be based on the CR as obtained from hearing-impaired listeners and as suggested by Pavlovic et al. (1986). This procedure adds 1 dB to the CR for each 10 dB of hearing loss, which corresponds to the average CR for various frequencies, as found in a review of the literature by Tyler (1986). In order to obtain the same AI for normal-hearing and hearing-impaired listeners, Pavlovic (1986) additionally introduced a "speech desensitization factor" to account for further suprathreshold deteriorations of hearing. The results of an application of these procedures to our SRT data for steady-state masker conditions are summarized in Table II. The differences in the average AI between hearing-impaired listeners and the reference group were tested with a Mann-Whitney test for independent samples. With the normal-hearing procedure, significant differences between the two groups of listeners were obtained for only a few conditions. However, in modeling the absolute threshold, this procedure overestimates the apparent noise, giving artificially low AIs for hearing-impaired listeners in conditions where a part of the speech spectrum is below threshold. With a correction for the impaired CR in the AI, hearing-impaired listeners need a significantly better AI at threshold than normal-hearing listeners. This shows that the higher SRT in the hearing-impaired group cannot be simply attributed to a loss of audibility for a part of the speech spectrum. With a correction for both the impaired CR and the suprathreshold "speech desensitization factor," the AI for hearing-impaired listeners is, in general, significantly lower than for normal-hearing listeners, independent of the importance function applied. Therefore, in the conditions with steady-state maskers as applied in our experiments, the speech desensitization factor overestimates the suprathreshold deterioration of hearing.

TABLE II. Average articulation index with standard deviation in parentheses for two voices and two steady-state noise maskers. AI was calculated according to a procedure for normal hearing by Pavlovic (1987) and by two procedures modified for hearing impairment by Pavlovic et al. (1986). In all procedures a threshold correction for headphone listening was applied. The differences in average AI between impaired and reference group for corresponding signal and masker conditions were tested on a Mann-Whitney test for independent samples. Results for the impaired group which differ significantly from the corresponding result in the reference group are indicated with a superscript (^1 for the 5% and ^2 for the 1% level).

Importance function: average speech

Signal                                        F              M              F              M
Masker                                        F              M              M              F
Reference group
  Normal-hearing procedure                    0.25 (0.04)    0.26 (0.03)    0.18 (0.04)    0.27 (0.06)
Impaired group
  Normal-hearing procedure                    0.25 (0.07)    0.27 (0.06)    0.24^2 (0.07)  0.28 (0.06)
  Procedure modified for impaired CR          0.31^2 (0.05)  0.31^2 (0.05)  0.29^2 (0.07)  0.32^? (0.06)
  Procedure modified for impaired CR
    and speech desensitization                0.19^2 (0.04)  0.21^2 (0.04)  0.17 (0.04)    0.20^2 (0.04)

Importance function: easy speech

Signal                                        F              M              F              M
Masker                                        F              M              M              F
Reference group
  Normal-hearing procedure                    0.25 (0.04)    0.26 (0.03)    0.17 (0.04)    0.29 (0.06)
Impaired group
  Normal-hearing procedure                    0.28 (0.06)    0.29^1 (0.05)  0.25^2 (0.06)  0.31 (0.06)
  Procedure modified for impaired CR          0.33^2 (0.05)  0.33^2 (0.05)  0.29^2 (0.07)  0.35^2 (0.06)
  Procedure modified for impaired CR
    and speech desensitization                0.20^2 (0.04)  0.21^2 (0.04)  0.18 (0.04)    0.22^2 (0.05)

C. Relation with the modulation-transfer function
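Before the modulation-based index is considered, the band-weighted AI calculation of Sec. III B can be made concrete with a small sketch. It is not the Pavlovic (1987) procedure: the four bands, equal importance weights, levels, thresholds, and critical ratios are all invented for illustration, the 12-dB peak offset is an assumption, and taking the stronger of external and internal noise in each band is a simplification. Only three steps come from the text: the apparent internal noise equals threshold minus CR, the impaired CR grows by 1 dB per 10 dB of hearing loss, and band contributions are truncated between 0 and 30 dB.

```python
PEAK_OFFSET_DB = 12.0    # assumption: speech peaks lie about 12 dB above the rms level
DYNAMIC_RANGE_DB = 30.0  # band contributions are truncated between 0 and 30 dB


def impaired_cr(normal_cr_db: float, hearing_loss_db: float) -> float:
    """Critical ratio increased by 1 dB for each 10 dB of hearing loss."""
    return normal_cr_db + hearing_loss_db / 10.0


def apparent_noise(threshold_db: float, cr_db: float) -> float:
    """Level of the internal noise that models a pure-tone threshold (threshold - CR)."""
    return threshold_db - cr_db


def articulation_index(speech_db, masker_db, internal_db, weights) -> float:
    """Weighted sum over bands of the truncated peak signal-to-noise ratio."""
    ai = 0.0
    for s, m, n, w in zip(speech_db, masker_db, internal_db, weights):
        noise = max(m, n)  # simplification: the stronger noise source dominates the band
        snr_peaks = s + PEAK_OFFSET_DB - noise
        ai += w * min(max(snr_peaks, 0.0), DYNAMIC_RANGE_DB) / DYNAMIC_RANGE_DB
    return ai


# Hypothetical four-band example (all numbers invented for illustration).
weights = [0.25, 0.25, 0.25, 0.25]
speech = [40.0, 40.0, 35.0, 30.0]     # speech rms level per band (dB)
masker = [45.0, 45.0, 40.0, 35.0]     # masker level per band (dB)
threshold = [20.0, 30.0, 50.0, 60.0]  # pure-tone thresholds (dB)
loss = [10.0, 20.0, 40.0, 50.0]       # hearing loss per band (dB)
normal_cr = [18.0, 18.0, 20.0, 22.0]  # normal-hearing critical ratios (dB)

noise_normal_cr = [apparent_noise(t, c) for t, c in zip(threshold, normal_cr)]
noise_impaired_cr = [
    apparent_noise(t, impaired_cr(c, h)) for t, c, h in zip(threshold, normal_cr, loss)
]

ai_normal_cr = articulation_index(speech, masker, noise_normal_cr, weights)
ai_impaired_cr = articulation_index(speech, masker, noise_impaired_cr, weights)
print(f"AI with normal-hearing CR: {ai_normal_cr:.3f}")
print(f"AI with impaired CR:       {ai_impaired_cr:.3f}")
```

With these invented numbers, the normal-hearing CR places the internal noise above the masker in the highest band and therefore yields the lower AI, which is the direction of the bias described in Sec. III B.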

Application of the articulation index to compare speech intelligibility across conditions is limited to those conditions with only distortion in the frequency domain, such as interfering noise or bandpass filtering. In our experiment, this only holds for the conditions in quiet and with steady-state noise. More widely applicable is the speech-transmission index (STI) based on the modulation-transfer function (Steeneken and Houtgast, 1980; Houtgast and Steeneken, 1985). This method also takes into account time-domain distortions like peak-clipping, reverberation, and automatic gain control. The STI is based upon the importance of modulations for speech intelligibility, and is calculated from the relative reduction of intensity modulations (in terms of the modulation index) in 14 1/3-oct intervals of modulation (0.63 Hz up to 12.5 Hz) for each of seven octave frequency bands (125 Hz-8 kHz). These modulation indices are converted to an apparent signal-to-noise ratio, truncated to a range between ±15 dB, averaged over modulation frequencies, and expressed in a numerical index (STI) by summing over octave frequencies with weighting factors comparable to those in the AI. Although this procedure correctly accounts for time-domain distortions in the envelope of the signal intensity, it fails in case of interfering modulated noise, because it can only account for the strength of modulations, irrespective of their source.

Recently, Ludvigsen et al. (1990) proposed an alternative algorithm for the STI based on a regression analysis (Y = aX + b) of the intensity contours of speech (X) and disturbed speech (Y) in 23 frequency bands and weighted summation. Within each frequency band, the apparent signal-to-noise ratio is defined as the logarithm of the ratio between intensity fluctuations that are correlated and uncorrelated between disturbed and original speech:

$$(S/N)' = 10 \log_{10}\left(\frac{a\bar{X}}{b}\right), \qquad (2)$$

where $\bar{X}$ is the average intensity of the noise-free speech and a and b are constants from the regression. For steady-state noise, the result of this calculation is equivalent to the normal signal-to-noise ratio. For a fluctuating interference, however, it distinguishes between modulations present in the speech and those from the interfering sound. As an example, Table III shows, for an octave band at 1000 Hz, the correlation between intensity fluctuations of speech and disturbed speech for various signal-to-noise ratios together with the apparent signal-to-noise ratio. When comparing an interfering voice and steady-state noise, the correlation drops faster for the speech masker, but the apparent S/N ratio is clearly less sensitive to the fluctuating masker. This trend agrees with our data, but the predicted threshold shift in decibels is too small to account for the measured SRT, with an exception for the condition in which speech was masked by noise modulated with the wideband envelope of a second voice. A detailed application of this procedure in predicting the intelligibility of speech disturbed by a modulated masker needs further validation but is beyond the scope of this study. Moreover, although a calculation scheme focussed at predicting intelligibility may be valuable, it does not explain why lower thresholds are obtained for a modulated masker, and what the effect of hearing loss is in this respect.

TABLE III. Correlation between the fluctuations of intensity in speech and disturbed speech together with the apparent signal-to-noise ratio (dB) for two disturbing sounds: noise and an interfering voice. The apparent signal-to-noise ratios (S/N)' are calculated from a regression analysis on the intensity modulations according to Ludvigsen et al. (1990) on a 12-s segment of running speech for an octave band at 1000 Hz.

                   Noise                  Interfering voice
S/N (dB)        r        (S/N)'          r        (S/N)'
    5         0.999        5.1         0.971        5.6
    0         0.995        0.0         0.807        0.8
   -5         0.972       -5.1         0.454       -3.5
  -10         0.825      -10.3         0.233       -6.7
  -15         0.406      -15.7         0.154       -8.6

D. Relation with temporal resolution

Temporal resolution of the ear can be measured in several ways: (1) by the detection of a brief pause in an otherwise continuous sound, "gap detection" [for normal hearing see Plomp (1964), and, for hearing-impaired listeners, see Irwin et al. (1981) and Buus and Florentine (1985)]; (2) by temporal masking [Fastl (1979) for normal hearing, and Nelson and Freyman (1987) for impaired hearing]; (3) by detection of modulations (cf. Viemeister, 1979); and (4) by discrimination of signals having identical energy spectra such as Huffman sequences (cf. Green, 1985). Smiarowski and Carhart (1975) suggested that forward masking and gap detection represent effects of the same underlying mechanism of auditory persistence. However, not all of these methods yield the same temporal parameters for the auditory system. This may be due partly to the difficulty of measuring the temporal characteristics independent from contaminating effects like, e.g., frequency selectivity. However, the alternative hypothesis, namely, that different tasks call on different processes in the auditory system, cannot be ruled out.

Among the various data available, those on temporal masking are presumably the most relevant in relation to the masked speech-reception threshold. For normal-hearing listeners, we know that the rate at which forward masking drops with time delay between masker and probe signal strongly depends upon the level of the masker (Widin and Viemeister, 1979; Jesteadt et al., 1982). Forward-masking curves on a log-time scale can be represented approximately by straight lines that reach the absolute threshold after a time interval between 150 and 300 ms, irrespective of the level of simultaneous masking. When the masker is only attenuated, instead of switched off, the masked threshold expressed in dB SPL drops along the same curves, except that this rolloff is intercepted at the threshold of simultaneous masking for the attenuated masker (Pollack, 1955). Therefore, the period of time over which the threshold drops after a sudden attenuation of the masker depends upon the amount of attenuation and may be considerably shorter than for switching off the masker.

The time course of forward masking in impaired ears is mainly determined by the lower sensation level of the masker. Just as for normal hearing, the recovery from masking seems to depend upon the sensory response evoked by the stimulus (Nelson and Freyman, 1987). Because for equal physical stimulus level (SPL) the sensation level is much lower for the hearing-impaired listeners, the slope of forward masking is also much more shallow. Therefore, with respect to temporal masking, hearing-impaired listeners by no means behave like noise-masked normal-hearing listeners. Additionally, small effects of a reduced temporal resolution have been found, but not in all sensorineurally impaired ears (cf. Zwicker and Schorn, 1982; Nelson and Freyman, 1987). This conclusion was based on the observation that, when comparing temporal masking data from normal and impaired ears on equal sensation level, some of the impaired listeners showed a greater persistence of masking than the normals, which was described with a larger time constant. Nelson and Freyman, using pure-tone maskers and probe signals, observed a gradual increase of this time constant with hearing loss up to about twice the average normal-hearing time constant for a loss of 50 dB. Within the normal-hearing group, a scatter in the time constant was found with a standard deviation of about 25%.

When masking speech with a modulated masker, the SRT will strongly depend on forward masking. For a fixed sound-pressure level, the forward masking in hearing-impaired listeners is determined by a combination of sensitivity loss and reduced temporal resolution. Even long before the absolute threshold of the impaired listeners is reached after termination of the masker, the difference in forward masking compared with normal-hearing listeners may be several tens of decibels. The relevant duration of masker interruptions for our experiment can be estimated from the temporal characteristics of speech, as shown in Fig. 3. The most prominent modulation frequency in the speech and the masker is 4 Hz, corresponding to half a period of 125 ms. Over this period of time, the release from masking elicited by an 80-dBA masker can be estimated to be about 40 dB for normal-hearing listeners. For hearing-impaired listeners with a loss of 47 dB (the average loss of our impaired listeners), this release of masking will be no more than about 10 dB. Of course, these estimated forward masking data cannot directly be converted into differences in SRT, but it is conceivable that a factor of 4 difference in forward masking gives a comparable difference in the release from masking due to masker fluctuations.

E. Relation with comodulation masking release

When a tone is masked by noise, the detection threshold of the tone increases with an increasing noise bandwidth up to what has become known as the critical bandwidth and remains constant for a further widening of the noise band. However, the threshold of the tone decreases after reaching the critical bandwidth, instead of being constant, when the noise contains modulations that are coherent over the entire noise band (Hall et al., 1984). For this phenomenon, called comodulation masking release (CMR), several explanations have been proposed (see Buus, 1985; Hall and Grose, 1988); in all of them, the auditory system is supposed to combine information from several critical bands in order to detect the signal. For instance, the temporal envelope in bands flanking the signal band provides information on the location of dips in the masker. These moments offer the best chance to detect the signal because then the varying S/N ratio in the critical band containing the signal is highest. Alternatively, the envelope correlation among frequency bands may be used; this correlation decreases by the presence of the signal. We may conclude that, in CMR, both temporal resolution and spectral resolution of the ear play a role. This conclusion is supported by the reduced CMR as found in hearing-impaired listeners and its correlation with reduced frequency resolution (see Hall et al., 1988a).

As suggested by Hall et al. (1988b), the CMR phenomenon may not only have an effect in experiments on temporal resolution in which the masker is coherently modulated over a wide band of frequencies (e.g., Zwicker and Schorn, 1982), but also in experiments on speech reception in modulated noise like those presented in this article, and those reported by Festen (1987). So far, the effect of CMR has not been assessed directly with speech signals, but, for a narrow band of noise, the effect has been demonstrated (Hall and Grose, 1988) as also for complex tones (Hall et al., 1988b). If part of the release from masking due to wideband masker fluctuations, as shown in Fig. 4, is mediated by CMR, then a reduced CMR is a third factor in the SRT in a fluctuating masker for hearing-impaired listeners, in addition to loss of sensitivity and reduced temporal resolution.

IV. CONCLUSIONS

The results show that, for normal-hearing listeners, the SRT for sentences in noise strongly depends upon the temporal distribution of the masker. For a noise masker varying in level like the envelope of speech, thresholds are found that are 4 to 6 dB lower than in steady-state noise. For a single interfering voice, this difference is between 6 and 8 dB. For listeners with moderate sensorineural hearing loss, elevated SRTs in noise are obtained without an appreciable effect of masker fluctuations and without an effect of listener age. Because individual SRTs do not have a larger measurement error in a fluctuating masker, the difficulties of hearing-impaired listeners with speech reception in noisy environments are much more distinct for a fluctuating masker than for a steady-state masker. The differences in SRT between the impaired group and the normal-hearing group increase from about 4 dB in steady-state noise to about 10 dB for a single interfering voice.

In describing the SRT in steady-state noise for hearing-impaired listeners, application of the AI procedure for normal-hearing listeners is incorrect because the internal noise modeling the threshold is overestimated, giving an artificially low AI at threshold. With a correction for the impaired CR in the AI procedure, hearing-impaired listeners need a significantly better AI at threshold than normal-hearing listeners. For a description of the results in fluctuating noise, the calculation of STI on simply the depth of modulations is inadequate, because no distinction is made between signal and masker modulations. A recently proposed algorithm based on the correlation of intensity fluctuations between speech and disturbed speech (Ludvigsen et al., 1990) gives a first-order approximation to the data for normal-hearing listeners.

The severe reduction in the release from masking with masker fluctuations, as observed in sensorineurally hearing-impaired listeners, can for the larger part be accounted for by data on temporal masking. The low sensation level at which the impaired listeners receive the masker seems a major determinant. A second factor may be reduced temporal resolution per se. Finally, part of the release from the masking of speech in wideband modulated maskers may be related to frequency resolution, in a similar way as in comodulation masking release. As a result of reduced frequency resolution, CMR is also reduced in the impaired listeners.
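Two of the index calculations weighed in the conclusions, the conversion of modulation depth to an apparent signal-to-noise ratio used in the STI and the regression-based ratio of Eq. (2), can be illustrated with a short sketch. It is a minimal illustration and not the published procedures. The conversion 10 log10[m/(1 - m)] is taken from the STI literature rather than from this text. The synthetic 4-Hz envelope, the sampling rate, and the function names are assumptions. The sketch only checks the statement that, for steady-state noise, Eq. (2) reduces to the normal signal-to-noise ratio.

```python
import math


def snr_from_modulation_index(m: float, limit_db: float = 15.0) -> float:
    """Apparent S/N (dB) for a modulation index 0 < m < 1, truncated to +/- limit_db."""
    snr = 10.0 * math.log10(m / (1.0 - m))
    return max(-limit_db, min(limit_db, snr))


def regression_snr(clean, disturbed) -> float:
    """Apparent S/N (dB) from a least-squares fit Y = aX + b, as in Eq. (2)."""
    n = len(clean)
    mean_x = sum(clean) / n
    mean_y = sum(disturbed) / n
    sxx = sum((x - mean_x) ** 2 for x in clean)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(clean, disturbed))
    a = sxy / sxx            # slope: part of Y that follows the speech intensity
    b = mean_y - a * mean_x  # intercept: part of Y unrelated to the speech intensity
    return 10.0 * math.log10(a * mean_x / b)


# Synthetic intensity envelope (assumption): 12 s sampled at 100 Hz, 4-Hz modulation.
FS = 100
clean = [1.0 + 0.8 * math.sin(2.0 * math.pi * 4.0 * i / FS) for i in range(12 * FS)]

true_snr_db = -5.0
noise_intensity = (sum(clean) / len(clean)) / 10.0 ** (true_snr_db / 10.0)
disturbed = [x + noise_intensity for x in clean]  # steady-state noise adds a constant

print(f"regression-based S/N: {regression_snr(clean, disturbed):.2f} dB")
print(f"S/N for m = 0.5:      {snr_from_modulation_index(0.5):.1f} dB")
```

For the steady-state case the regression returns the imposed signal-to-noise ratio. The sketch makes no claim about the values in Table III, which depend on real speech and masker envelopes.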

ANSI (1969). ANSI S3.5-1969, "Methods for the calculation of the articulation index" (American National Standards Institute, New York).
Buus, S. (1985). "Release from masking caused by envelope fluctuations," J. Acoust. Soc. Am. 78, 1958-1965.
Buus, S., and Florentine, M. (1985). "Gap detection in normal and impaired listeners: the effect of level and frequency," in Time Resolution in Auditory Systems, edited by A. Michelsen (Springer, London), pp. 159-177.


Carhart, R. C., and Tillman, T. W. (1970). "Interaction of competing speech signals with hearing losses," Arch. Otolaryng. 91, 273-279.
Carhart, R. C., Tillman, T. W., and Greetis, E. S. (1969). "Perceptual masking in multiple sound backgrounds," J. Acoust. Soc. Am. 45, 694-703.
Dirks, D. D., and Bower, D. R. (1969). "Masking effects of speech competing messages," J. Speech Hear. Res. 12, 229-245.
van Dijkhuizen, J. N., Festen, J. M., and Plomp, R. (1989). "The effect of varying the amplitude-frequency response on the masked speech-reception threshold of sentences for hearing-impaired listeners," J. Acoust. Soc. Am. 86, 621-628.
Duquesnoy, A. J. (1983). "Effect of a single interfering noise or speech source upon the binaural sentence intelligibility of aged persons," J. Acoust. Soc. Am. 74, 739-743.
Fastl, H. (1979). "Temporal masking effects: III. Pure tone masker," Acustica 43, 282-294.
Festen, J. M. (1987). "Speech-reception threshold in fluctuating background sound and its possible relation to temporal auditory resolution," in The Psychophysics of Speech Perception, edited by M. E. H. Schouten (Nijhoff, The Netherlands) (NATO ASI Series D).
Festen, J. M., and Plomp, R. (1986). "The extra effect of masker fluctuations on the SRT for hearing-impaired listeners," Proc. 12th Int. Congr. Acoust., Toronto, Vol. 1, Paper B 11-4.
French, N. R., and Steinberg, J. C. (1947). "Factors governing the intelligibility of speech sounds," J. Acoust. Soc. Am. 19, 90-119.
Gelfand, S. A., Piper, N., and Silman, S. (1986). "Consonant recognition in quiet and in noise with aging among normal hearing listeners," J. Acoust. Soc. Am. 80, 1589-1598.
Gelfand, S. A., Ross, L., and Miller, S. (1988). "Sentence reception in noise from one versus two sources: Effects of aging and hearing loss," J. Acoust. Soc. Am. 83, 248-256.
Green, D. M. (1985). "Temporal factors in psychoacoustics," in Time Resolution in Auditory Systems, edited by A. Michelsen (Springer, London), pp. 122-140.
Hall, J. W., and Grose, J. H. (1988). "Comodulation masking release: evidence for multiple cues," J. Acoust. Soc. Am. 84, 1669-1675.
Hall, J. W., Haggard, M. P., and Fernandes, M. A. (1984). "Detection in noise by spectro-temporal pattern analysis," J. Acoust. Soc. Am. 76, 50-56.
Hall, J. W., Davis, A. C., Haggard, M. P., and Pillsbury, H. C. (1988a). "Spectro-temporal analysis in normal-hearing and cochlear-impaired listeners," J. Acoust. Soc. Am. 84, 1325-1331.
Hall, J. W., Grose, J. H., and Haggard, M. P. (1988b). "Comodulation masking release for multicomponent signals," J. Acoust. Soc. Am. 83, 677-686.
Hawkins, J. E., and Stevens, S. S. (1950). "The masking of pure tones and of speech by white noise," J. Acoust. Soc. Am. 22, 6-13.
Houtgast, T., and Steeneken, H. J. M. (1985). "A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria," J. Acoust. Soc. Am. 77, 1069-1077.
Irwin, R. J., Hinchcliff, L. K., and Kemp, S. (1981). "Temporal acuity in normal and hearing-impaired listeners," Audiology 20, 234-243.
Jesteadt, W., Bacon, S. P., and Lehman, J. R. (1982). "Forward masking as a function of frequency, masker level, and signal delay," J. Acoust. Soc. Am. 71, 950-962.
Jokinen, K. (1973). "Presbyacusis. VI. masking of speech," Acta Oto-Laryngol. 76, 426-430.
Kalikow, D. N., Stevens, K. N., and Elliott, L. L. (1977). "Development of a test of speech intelligibility in noise using sentence material with controlled word predictability," J. Acoust. Soc. Am. 61, 1337-1351.
Kamm, C. A., Dirks, D. D., and Bell, T. S. (1985). "Speech recognition and the articulation index for normal and hearing-impaired listeners," J. Acoust. Soc. Am. 77, 281-288.
Kryter, K. D. (1962). "Methods for the calculation and use of the articulation index," J. Acoust. Soc. Am. 34, 1689-1697.

Levitt, H. (1971). "Transformed up-down methods in psychoacoustics," J. Acoust. Soc. Am. 49, 467-477.
Lindeman, H. E. (1967). "Bepaling van de validiteit van het gehoor met behulp van een bedrijfsspraakaudiometer," Tijdschr. Soc. Geneeskd. 45, 814-837.
Ludvigsen, C., Elberling, C., Keidser, G., and Poulsen, T. (1990). "Prediction of intelligibility of non-linearly processed speech," Acta Otolaryngol. Suppl. 469, 190-195.
Miller, G. A. (1947). "The masking of speech," Psychol. Bull. 44, 105-129.
Miller, G. A., and Licklider, J. C. R. (1950). "The intelligibility of interrupted speech," J. Acoust. Soc. Am. 22, 167-173.
Nelson, D. A., and Freyman, R. L. (1987). "Temporal resolution in sensorineural hearing-impaired listeners," J. Acoust. Soc. Am. 81, 709-720.
Palva, T. (1955). "Studies of hearing for pure tones and speech in noise," Acta Oto-Laryngol. 45, 231-143.
Patterson, R. D., Nimmo-Smith, I., Weber, D. L., and Milroy, R. (1982). "The deterioration of hearing with age: Frequency selectivity, the critical ratio, the audiogram, and speech threshold," J. Acoust. Soc. Am. 72, 1788-1803.
Pavlovic, C. V. (1984). "Use of the articulation index for assessing residual auditory function in listeners with sensorineural hearing impairment," J. Acoust. Soc. Am. 75, 1253-1258.
Pavlovic, C. V. (1987). "Derivation of primary parameters and procedures for use in speech intelligibility predictions," J. Acoust. Soc. Am. 82, 413-422.
Pavlovic, C. V., Studebaker, G. A., and Sherbecoe, R. L. (1986). "An articulation index based procedure for predicting speech recognition performance of hearing-impaired individuals," J. Acoust. Soc. Am. 80, 50-57.
Plomp, R. (1964). "Rate of decay of auditory sensation," J. Acoust. Soc. Am. 36, 277-282.
Plomp, R. (1978). "Auditory handicap of hearing impairment and the limited benefit of hearing aids," J. Acoust. Soc. Am. 63, 533-549.
Plomp, R. (1986). "A signal-to-noise ratio model for the speech-reception threshold of the hearing impaired," J. Speech Hear. Res. 29, 146-154.
Plomp, R., and Mimpen, A. M. (1979a). "Improving the reliability of testing the speech-reception threshold for sentences," Audiology 18, 43-52.
Plomp, R., and Mimpen, A. M. (1979b). "Speech-reception threshold for sentences as a function of age and noise level," J. Acoust. Soc. Am. 66, 1333-1342.
Pollack, I. (1955). "Masking by periodically interrupted noise," J. Acoust. Soc. Am. 27, 353-355.
Robinson, D. W., and Sutton, G. J. (1979). "Age effect in hearing--A comparative analysis of published threshold data," Audiology 18, 320-334.
Steeneken, H. J. M., and Houtgast, T. (1980). "A physical method for measuring speech-transmission quality," J. Acoust. Soc. Am. 67, 318-326.
Smiarowski, R. A., and Carhart, R. (1975). "Relations among temporal resolution, forward masking, and simultaneous masking," J. Acoust. Soc. Am. 57, 1169-1174.
Tyler, R. S. (1986). "Frequency resolution in hearing-impaired listeners," in Frequency Selectivity in Hearing, edited by B. C. J. Moore (Academic, London), pp. 309-371.
Viemeister, N. F. (1979). "Temporal modulation transfer functions based upon modulation thresholds," J. Acoust. Soc. Am. 66, 1364-1380.
Widin, G. P., and Viemeister, N. F. (1979). "Intensive and temporal effects in pure-tone forward masking," J. Acoust. Soc. Am. 66, 388-395.
Working Group on Speech Understanding and Aging (1988). "Speech understanding and aging," J. Acoust. Soc. Am. 83, 859-895.
Zurek, P. M., and Delhorne, L. A. (1987). "Consonant reception in noise by listeners with mild and moderate sensorineural hearing impairment," J. Acoust. Soc. Am. 82, 1548-1559.
Zwicker, E., and Schorn, K. (1982). "Temporal resolution in hard-of-hearing patients," Audiology 21, 474-492.
