J Am Acad Audiol 24:671–683 (2013)

Measuring the Long-Term SNRs of Static and Adaptive Compression Amplification Techniques for Speech in Noise

DOI: 10.3766/jaaa.24.8.4

Ying-Hui Lai* Pei-Chun Li† Kuen-Shian Tsai* Woei-Chyn Chu* Shuenn-Tsong Young‡

Abstract

Background: Multichannel wide-dynamic-range compression (WDRC) is a widely adopted amplification scheme in modern digital hearing aids. It attempts to provide individuals with loudness recruitment with superior speech intelligibility and greater listening comfort over a wider range of input levels. However, recent surveys have shown that compression processing (operating in the nonlinear regime) usually reduces the long-term signal-to-noise ratio (SNR).

Purpose: The purpose of this study was to determine the long-term SNR in an adaptive compression-ratio (CR) amplification scheme called adaptive wide-dynamic-range compression (AWDRC), and to determine whether this concept is better than static WDRC amplification at improving the long-term SNR for speech in noise.

Design and Study Sample: AWDRC uses the input short-term dynamic range to adjust the CR to maximize audibility and comfort. Various methods for evaluating the long-term SNR were used to observe the relationship between the CR and output SNR performance in AWDRC for seven typical audiograms, and to compare the results with those for static WDRC amplification.

Results: The results showed that the variation of the CR in AWDRC amplification can maintain the comfort and audibility of the output sound. In addition, the average long-term SNR improved by 0.1–5.5 dB for a flat hearing loss, by 0.2–3.4 dB for a reverse sloping hearing loss, by 1.4–4.8 dB for a high-frequency hearing loss, and by 0.3–5.7 dB for a mild-to-moderate-sloping high-frequency hearing loss relative to static WDRC amplification. The output long-term SNR differed significantly (p < .001) between static WDRC and AWDRC amplification.

Conclusions: The results of this study show that AWDRC, which uses the characteristics of the input signal to adaptively adjust the CR, provides better long-term SNR performance than static WDRC amplification.
Key Words: Hearing aids, adaptive compression amplification, long-term SNR, AWDRC

Abbreviations: ADRO = adaptive dynamic-range optimization; ANOVA = analysis of variance; AWDRC = adaptive wide-dynamic-range compression; CR = compression ratio; DCL = discomfort level; HLG = high-level gain; HT = hearing threshold; HTH = higher threshold; I/O = input/output; LLG = low-level gain; LTH = lower threshold; Maxest = maximum estimate; Minest = minimum estimate; SNR = signal-to-noise ratio; WDRC = wide-dynamic-range compression

*Institute of Biomedical Engineering, National Yang-Ming University, Taipei, Taiwan; †Department of Audiology and Speech-Language Pathology, Mackay Medical College, New Taipei City, Taiwan; ‡Center of Holistic Education, Mackay Medical College, New Taipei City, Taiwan Shuenn-Tsong Young, No. 46, Sec. 3, Jhong-Jheng Rd., Sanzhi Dist., New Taipei City 252, Taiwan; E-mail: [email protected] This research was supported by the National Science Council of Taiwan.


Journal of the American Academy of Audiology/Volume 24, Number 8, 2013

INTRODUCTION

Wide-dynamic-range compression (WDRC) is a common amplification scheme in modern hearing aids. It attempts to provide individuals with loudness recruitment with superior speech intelligibility and greater listening comfort over a wider range of input levels by applying lower gain to louder sounds (Sammeth & Levitt, 2000; Dillon, 2001; Rosengard et al, 2005). Although hearing aids using compression can offer consistently audible and comfortable signals across a wide range of listening conditions in quiet without the need for adjustable volume controls, the benefit of compression for speech-in-noise conditions remains controversial. Several studies have found that compression amplification is more beneficial than linear amplification under speech-in-noise conditions. Olsen et al (2005) investigated three characteristics of fast-acting compression that could be beneficial to normal-hearing listeners. The results showed that compression provided better performance than linear amplification under all of the tested conditions when the signal-to-noise ratio (SNR) was negative. Gatehouse et al (2006) evaluated the benefits of fast-acting WDRC and linear reference fittings for speech intelligibility. They found that hearing-aid users with a high cognitive ability benefit from fast-acting compression in terms of speech recognition under a speech-in-noise condition. Shi and Doherty (2008) evaluated a commercial four-channel compression-based digital hearing aid. The compression ratio (CR) was set individually for the participants in each channel and ranged from 1.07:1 to 2.67:1. The results showed that sentence intelligibility and speech clarity were higher for compression than for linear processing.

In contrast, other studies have found that linear amplification is superior to several different types of compression under speech-in-noise conditions. van Buuren et al (1999) used methods with two CRs (4:1 and 2:1), linear amplification (1:1), and methods with two expansion ratios (1:2 and 1:4) to amplify the input sound when testing the speech intelligibility for 52 volunteers (26 with normal hearing and 26 with sensorineural hearing impairment) in the speech-in-noise condition. The results showed that the speech intelligibility under speech-in-noise conditions was, on average, better for linear amplification than for compression or expansion. Boike (2004) compared the recognition of linearly amplified and WDRC-amplified speech in noise by older and younger listeners with similar hearing losses. The average recognition scores were slightly lower (by ~2%) for WDRC-amplified conditions than for linear amplification. Souza et al (2007) measured the speech recognition ability of elderly listeners with different hearing losses for speech-in-noise conditions, and found that the recognition scores were slightly lower for WDRC-amplified speech than for linearly amplified speech across all groups and noise conditions.


There are many possible reasons for this lack of consensus in performance comparisons between linear and WDRC amplification techniques in speech-in-noise conditions, such as differences in device configurations, input signal types, outcome evaluation methods, and the selection criteria applied to the listener groups. A new objective long-term SNR measure for WDRC was proposed by Hagerman and Olofsson (2004). This measure is designed to separate the signal and noise components from the combined system output of nonlinear amplification, and provides a better estimation of SNR than traditional SNR measures, which were originally designed for linear systems. Some researchers (Souza et al, 2006; Naylor and Johannesson, 2009) found that using this objective evaluation method to quantify the effects of WDRC in speech-in-noise conditions produced consistent outcomes. Naylor and Johannesson (2009) measured the long-term SNRs at the input and output of a hearing aid with amplitude compression by changing the CR under different input noise types, channel numbers, attack times, and release times. The results showed that differences in these parameter values affected the output long-term SNR performance, with CR variations having the largest impact. The output long-term SNR was constant for all values of input SNR when CR = 1:1, but when the input SNR was positive, the output SNR after WDRC processing decreased as the CR increased, possibly because the gain of WDRC amplification increases low-level noise during speech pauses in the signal. During such pauses, the input level is dominated by the noise level, which therefore also determines the WDRC gain. Because the noise signal is lower than the speech signal for an SNR > 0 dB, the gain will be higher during these periods, and this reduces the output SNR relative to the input SNR.
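The speech-pause argument above can be illustrated with a small worked calculation. The sketch below is not taken from the cited studies: it assumes an idealized compressor with an instantaneous level detector, speech that is either at a fixed level or silent, steady noise, and all levels inside the compression region. The function names and the 65/55 dB example levels are hypothetical.

```python
import math


def db_to_power(level_db):
    """Convert a level in dB to linear power (arbitrary reference)."""
    return 10 ** (level_db / 10)


def power_to_db(power):
    """Convert linear power to a level in dB."""
    return 10 * math.log10(power)


def long_term_output_snr(speech_db, noise_db, cr, speech_fraction=0.5):
    """Long-term output SNR of an idealized compressor (illustrative only).

    Assumptions: speech is present for `speech_fraction` of the time and
    silent otherwise, noise is always present, and the gain follows the
    instantaneous input level with a compression ratio `cr`.
    """
    # In the compression region each extra dB of input costs (1 - 1/CR) dB of gain.
    slope = 1.0 - 1.0 / cr
    # Level seen by the compressor while speech is present (speech + noise).
    mix_db = power_to_db(db_to_power(speech_db) + db_to_power(noise_db))
    # Gains relative to an arbitrary offset; the offset cancels in the SNR.
    gain_speech_db = -slope * mix_db   # while speech is present
    gain_pause_db = -slope * noise_db  # during pauses (noise only)
    speech_out = speech_fraction * db_to_power(speech_db + gain_speech_db)
    noise_out = (speech_fraction * db_to_power(noise_db + gain_speech_db)
                 + (1.0 - speech_fraction) * db_to_power(noise_db + gain_pause_db))
    return power_to_db(speech_out / noise_out)


# Hypothetical example: speech 10 dB above the noise while it is present.
linear_snr = long_term_output_snr(65.0, 55.0, cr=1.0)
compressed_snr = long_term_output_snr(65.0, 55.0, cr=2.0)
print(round(linear_snr, 2), round(compressed_snr, 2))
```

With CR = 1 the gains are identical in speech and pause segments, so the output SNR equals the input SNR; with CR > 1 the pauses receive more gain than the speech segments, which raises the long-term noise power and lowers the output SNR in this idealized setting.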
Based on the above assumption, when the CR of WDRC amplification is > 1 and the input SNR is positive, the amplification method seems to have a negative impact on the long-term SNR. Although the dynamic range of sounds can exceed 100 dB, the range required for daily oral communication is typically only ~40 dB. The range is even narrower if the dynamic range is measured for a short-term signal (e.g., 100 msec). Therefore, it is possible to use time-varying CR amplification while fulfilling the audibility and comfort requirements. The main purpose of this study was to determine the mean difference in the long-term SNR between adaptive wide-dynamic-range compression (AWDRC) and static WDRC amplification when processing common speech signals under different noise conditions for typical types of hearing loss. The secondary purpose was to observe whether the factors of audiogram type, noise type, input signal level, and input SNR affect the long-term SNR performance of adaptively changing the CR of signal processing.

SNR of Static and Adaptive Compression Amplification/Lai et al

METHODS

The AWDRC Amplification Scheme

We propose an adaptive compression amplification method called adaptive wide-dynamic-range compression (AWDRC) that is based on adjusting the CR according to the current short-term dynamic range. There were three main goals when developing the new type of adaptive amplification: (1) ensuring that the signal is audible, (2) ensuring user comfort, and (3) selecting CR to be as close to unity (i.e., linear) as possible to reduce the “speech pause” effect. Fig. 1 shows a block diagram of our AWDRC amplification method. The items within the dotted area in the figure constitute traditional multichannel WDRC processing. Sound is received by the microphone and then is processed by multiple filters to divide the input into multiple frequency channels. The WDRC compressor in each channel determines the signal level, which is used to calculate the corresponding gain that is applied in that channel. The outputs of all of the channels are summed to produce the final amplified sound supplied to the hearing-aid user.

A typical WDRC-processing input/output (I/O) function for a single channel is depicted in Fig. 2. The signal processing is divided into four regions based on the input level: (i) a linear gain for input levels below the lower threshold (LTH), (ii) WDRC for input levels between the LTH and the higher threshold (HTH), with the output level increasing by 1/CR dB for each dB increase in the input level (e.g., in Fig. 2, the output level increases by 1/2 dB for each 1 dB increase in the input level), and (iii) a linear gain for input levels above the HTH until (iv) compression limiting is activated for very high signal levels to keep the output lower than a predefined level. A CR value of 1 equates to linear amplification for all except very high inputs. The shape of the I/O curve (Fig. 2) can be adjusted by varying the low-level gain (LLG), LTH, high-level gain (HLG), HTH, and CR (Souza, 2002; Venema, 2006), where CR = (HTH – LTH) / [(HTH + HLG) – (LTH + LLG)]. In our design, the parameters of the WDRC amplifier can be set to any nonlinear prescription formula, such as NAL-NL1 (National Acoustic Laboratories, Nonlinear Version 1) (Dillon, 2001) and DSL input/output (DSL[i/o]) (Cornelisse et al, 1995), based on the corresponding values of LTH, LLG, HTH, HLG, and CR.

Each channel includes a feedback component (adaptive concept) whose input is the output of the channel, and the output of the feedback component is used to adjust the CR of that channel. The first task of this feedback component is to perform a boundary-check calculation, whereby the output level of the WDRC amplifier is estimated on a frame basis and the result is stored in a first-in-first-out queue of finite length L. When a new estimate is obtained, the maximum estimate (Maxest) and the minimum estimate (Minest) of the L estimates in the queue are used to drive the operations according to the AWDRC rules by the delta_gain parameter (with units of decibels per millisecond), which will increase or decrease the HLG or LLG in the interval between two consecutive estimates (i.e., delta_t), and thus change the CR. The rules (which are executed every delta_t) governing the adaptive operations are described in detail below.
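The four-region static I/O function and the CR relation given in this section can be written out as a minimal sketch. The function names and the example parameter values (LTH = 40, LLG = 30, HTH = 80, HLG = 10, limiter at 100, all in dB) are hypothetical and chosen only so that CR = 2, as in the Fig. 2 example; they are not prescription values from the study.

```python
def compression_ratio(lth, llg, hth, hlg):
    """CR = (HTH - LTH) / [(HTH + HLG) - (LTH + LLG)], all values in dB.

    Assumes the output at HTH is above the output at LTH, so the
    denominator is positive.
    """
    return (hth - lth) / ((hth + hlg) - (lth + llg))


def wdrc_output_level(in_db, lth, llg, hth, hlg, max_out_db):
    """Steady-state output level (dB) of one WDRC channel for a given input level."""
    if in_db < lth:
        # Region (i): linear gain below the lower threshold.
        out_db = in_db + llg
    elif in_db <= hth:
        # Region (ii): compression, output rises by 1/CR dB per input dB.
        cr = compression_ratio(lth, llg, hth, hlg)
        out_db = (lth + llg) + (in_db - lth) / cr
    else:
        # Region (iii): linear gain above the higher threshold.
        out_db = in_db + hlg
    # Region (iv): compression limiting keeps the output below a predefined level.
    return min(out_db, max_out_db)


# Hypothetical single-channel settings (not taken from the study).
params = dict(lth=40.0, llg=30.0, hth=80.0, hlg=10.0)
print(compression_ratio(**params))
for level in (30.0, 60.0, 85.0, 95.0):
    print(level, wdrc_output_level(level, max_out_db=100.0, **params))
```

Setting LLG equal to HLG makes the denominator equal to HTH – LTH, which gives CR = 1 and reduces the function to linear amplification below the limiter.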

Figure 1. Signal processing of a hearing aid with AWDRC amplification. The AWDRC amplifier comprises a traditional WDRC amplifier and an adaptive component.


Figure 2. Steady-state I/O function for a single frequency channel in a digital hearing aid. The I/O curve can be divided into four regions: low-input linear gain, compression, high-input linear gain, and compression limiting. The shape of the I/O curve is determined by the LLG, LTH, HLG, HTH, and CR.

1. Decrease-CR Rule

The purpose of the decrease-CR rule is to keep the WDRC processing as close to linear as possible. As long as the Maxest is lower than the discomfort level (DCL) and the Minest is higher than the hearing threshold (HT), the LLG will be decreased and the HLG will be increased simultaneously by delta_gain, unless the CR already equals 1 (i.e., linearity). An example of the effect of this rule on the I/O function is shown in Fig. 3(a).

2. Comfort Rule

The purpose of the comfort rule is to ensure that the output sound will not be amplified to an uncomfortable level. When the Maxest is higher than the DCL, the HLG will be decreased by delta_gain, unless the HLG is equal to the initially prescribed value. An example of the effect of this rule on the I/O function is shown in Fig. 3(b).

3. Audibility Rule

The purpose of the audibility rule is to ensure that the output sound is audible to the hearing-impaired individual. When the Minest is lower than the HT, the LLG will be increased by delta_gain, unless the LLG is equal to the initially prescribed value. An example of the effect of this rule on the I/O function is shown in Fig. 3(c).

The priorities of these three rules decrease in the following order: comfort rule, audibility rule, and decrease-CR rule. In addition, the I/O relationship in AWDRC differs markedly from its traditional static behavior in a single WDRC channel. Fig. 3(d) shows an example of the I/O function of AWDRC, where the compressor seeks to match the design goals by using CRs that fall within the shaded area.

Experimental Design

Types of Hearing Loss


Fig. 4 shows the seven audiograms that were used in this study to test the efficacy of the new method. Audiograms 1–5 were from a series of studies conducted by National Acoustic Laboratories to compare the gain/frequency response of NAL-NL1 and other prescriptions (Byrne et al, 2001; Keidser et al, 2003). They represent a flat loss (Audiogram 1), a reverse sloping loss (Audiogram 2), a moderately sloping high-frequency loss (Audiogram 3), a steeply sloping high-frequency loss with a normal low-frequency threshold (Audiogram 4), and a steeply sloping high-frequency loss with a mild low-frequency hearing loss (Audiogram 5). However, none of these five audiograms represents a mild-to-moderate sloping loss, which accounts for a non-negligible proportion of hearing losses. Based on Ciletti and Flamme (2008), we added two audiograms of mild-to-moderate-sloping high-frequency loss (Audiograms 6 and 7) to better cover general hearing loss patterns.

Test Signals

The same speech signal was used in all measurements; it was the concatenated speech of Mandarin sentences lasting for 10 sec. The sentences were extracted from Taiwan news broadcasts spoken by a female newscaster. Three noise signals (range-hood noise, babble noise, and traffic noise) were also concatenated for 10 sec. First, the root-mean-square sound intensity of each signal was normalized to 65 dB SPL, corresponding to a moderate speech level. Next, speech and noise were combined at four SNR values (–2, +2, +6, or +10 dB) that are typical of everyday situations and produce a range of scores with ceiling or floor effects in the conditions used (Souza et al, 2006). The speech and noise signals were adjusted simultaneously by the same absolute amounts to produce the different SNRs; for example, when the input SNR was increased by 10 dB, the speech was increased by 5 dB and the noise was decreased by 5 dB.
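The symmetric level adjustment described above (half of the target SNR added to the speech and half removed from the noise) can be sketched as follows. This is an illustrative outline rather than the authors' LabVIEW implementation: the helper names are hypothetical, the signals are toy waveforms, and both signals are normalized to an arbitrary common RMS value because the digital level corresponding to 65 dB SPL is not given in the text.

```python
import math
import random


def rms(x):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def normalize_rms(x, target_rms=1.0):
    """Scale a signal to a common RMS value (stand-in for the 65 dB SPL step)."""
    k = target_rms / rms(x)
    return [k * v for v in x]


def apply_gain_db(x, gain_db):
    """Apply a gain expressed in dB to a signal."""
    k = 10 ** (gain_db / 20)
    return [k * v for v in x]


def mix_at_snr(speech, noise, target_snr_db):
    """Raise speech by SNR/2 dB and lower noise by SNR/2 dB, then add them."""
    s = apply_gain_db(normalize_rms(speech), +target_snr_db / 2)
    n = apply_gain_db(normalize_rms(noise), -target_snr_db / 2)
    mixture = [a + b for a, b in zip(s, n)]
    return s, n, mixture


def measure_snr_db(s, n):
    """SNR in dB from the RMS values of the separate components."""
    return 20 * math.log10(rms(s) / rms(n))


# Toy signals (hypothetical stand-ins for the speech and noise recordings).
rng = random.Random(0)
speech = [math.sin(0.05 * i) * (1 + 0.5 * math.sin(0.001 * i)) for i in range(4000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]

for target in (-2.0, 2.0, 6.0, 10.0):
    s, n, mixture = mix_at_snr(speech, noise, target)
    print(target, round(measure_snr_db(s, n), 6))
```

Keeping the scaled components `s` and `n` alongside the mixture is convenient because the same components are needed later when the long-term SNR is evaluated.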
The combined sounds were then adjusted to three output levels: loud (75 dB SPL), moderate (65 dB SPL), and soft (50 dB SPL).

Measurement Procedure

The purpose of the measurement procedure was to quantify the variation of the CR in AWDRC amplification,


Figure 3. Example effects of AWDRC amplification on the I/O function. AWDRC amplification is based on the input sound modifying the I/O function adaptively. The solid line is the original I/O function of WDRC, and the dashed line is the I/O function of AWDRC amplification adjusted adaptively for the decrease-CR rule (a), comfort rule (b), and audibility rule (c). The gray lines in (b) and (c) are for an I/O function adjusted by the decrease-CR rule. The shaded region in (d) represents the adjustable range of AWDRC amplification in this example.
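The boundary check and the three adaptive rules illustrated in Fig. 3 can be sketched for a single channel as below. This is a reading of the description in the AWDRC section, not the authors' code. The class and attribute names are hypothetical, it assumes that only the highest-priority applicable rule fires in each delta_t step, it treats delta_gain as a fixed dB change per step, and it shortens the last decrease-CR step so that LLG and HLG meet exactly at CR = 1.

```python
from collections import deque


class AwdrcChannel:
    """Adaptive component of one AWDRC channel (illustrative sketch)."""

    def __init__(self, lth, hth, llg0, hlg0, ht, dcl, queue_len=10, delta_db=1.0):
        self.lth, self.hth = lth, hth        # lower/higher thresholds (dB)
        self.llg0, self.hlg0 = llg0, hlg0    # initially prescribed gains (dB)
        self.llg, self.hlg = llg0, hlg0      # current, adaptively adjusted gains
        self.ht, self.dcl = ht, dcl          # hearing threshold, discomfort level
        self.delta_db = delta_db             # gain change applied per step
        self.levels = deque(maxlen=queue_len)  # FIFO queue of output-level estimates

    @property
    def cr(self):
        """Current compression ratio implied by the thresholds and gains."""
        return (self.hth - self.lth) / ((self.hth + self.hlg) - (self.lth + self.llg))

    def step(self, output_level_db):
        """Store one frame-level output estimate and apply one rule."""
        self.levels.append(output_level_db)
        maxest, minest = max(self.levels), min(self.levels)
        if maxest > self.dcl:
            # Comfort rule: lower HLG, but never below the prescribed value.
            self.hlg = max(self.hlg - self.delta_db, self.hlg0)
            return "comfort"
        if minest < self.ht:
            # Audibility rule: raise LLG, but never above the prescribed value.
            self.llg = min(self.llg + self.delta_db, self.llg0)
            return "audibility"
        if maxest < self.dcl and minest > self.ht and self.llg > self.hlg:
            # Decrease-CR rule: move LLG down and HLG up until CR reaches 1.
            d = min(self.delta_db, (self.llg - self.hlg) / 2)
            self.llg -= d
            self.hlg += d
            return "decrease-CR"
        return "hold"


# Hypothetical channel: prescribed CR of 2, comfortable and audible output.
channel = AwdrcChannel(lth=40.0, hth=80.0, llg0=30.0, hlg0=10.0, ht=30.0, dcl=100.0)
for _ in range(12):
    rule = channel.step(70.0)
    print(rule, round(channel.cr, 3))
```

In a full simulation the value passed to `step` would be the estimated output level of the WDRC amplifier for the current frame, so the gains and the measured levels would interact; here the levels are supplied directly to keep the rule logic visible.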

and compare differences in the long-term SNR between WDRC and AWDRC amplification methods for the same hearing-loss audiograms using a software simulation method. A platform developed using LabVIEW software was used to simulate the proposed AWDRC amplification method for hearing aids with four channels. The crossover frequencies between the channels were 0.56, 1.43, and 3.56 kHz. The parameters of AWDRC amplification needed to be set before testing. This study used the NAL-NL1 prescription for the WDRC amplification process. Fitting software from Audifon (a manufacturer of hearing aids) was used to calculate the WDRC parameters of NAL-NL1 for compensating for the hearing losses defined by the seven audiograms, and these parameters were also applied to the WDRC amplifier included in AWDRC amplification. In addition, the boundary of the DCL in AWDRC amplification was set according to the result predicted by the fitting software. The delta_gain in AWDRC was set to 1 dB and delta_t was set to 100 msec in this study, which corresponds to the LLG being increased by 1 dB every 100 msec. In addition, the attack and release times of WDRC were set as 5 and 26 msec, respectively, which are the same values used by Naylor and Johannesson (2009). Once the above parameters were set, 12 types of test signals were used to compare the long-term SNR

performance between the WDRC and AWDRC amplification schemes with the aid of objective methods (Olsen et al, 2005; Naylor and Johannesson, 2009).

Methods of SNR Evaluation

The separation technique of the long-term SNR developed by Hagerman and Olofsson (2004) was used to extract the speech and noise components from WDRC-amplified (Olsen et al, 2005; Souza et al, 2006) and AWDRC-amplified sounds. Three types of speech-plus-noise files were created: (1) the original speech and the original noise, (2) the inverted original speech and the original noise, and (3) the original speech and the inverted original noise. In each case, the speech and noise were combined to produce the desired input SNR by adjusting the sound level in each channel of a two-channel waveform file. The speech-plus-noise files were processed by WDRC and AWDRC amplifiers, after which the speech or noise files were ready to be extracted. Two files were used to obtain the waveform of the speech after processing: signal plus noise, and signal plus inverted noise. The noise was canceled by adding these files, leaving only the speech. This extracted speech-alone file included speech at double


Figure 4. Audiograms of the sensorineural hearing losses used for comparing the AWDRC and WDRC amplification methods: (1) flat loss, (2) reverse sloping loss, (3) moderately sloping high-frequency loss, (4) steeply sloping high-frequency loss with a normal low-frequency threshold, (5) steeply sloping high-frequency loss with a mild low-frequency hearing loss, and (6) and (7) mild-to-moderate-sloping high-frequency hearing loss.
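The phase-inversion separation described under Methods of SNR Evaluation can be outlined in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: `process` is a hypothetical stand-in for the WDRC or AWDRC amplifier, the function names are invented for this example, and halving the summed waveforms corresponds to the 6 dB reduction mentioned in the text. The technique relies on the amplifier applying essentially the same gain to the paired mixtures.

```python
import math


def separate_speech_and_noise(process, speech, noise):
    """Recover processed speech and noise by adding phase-inverted mixtures."""
    mix = [s + n for s, n in zip(speech, noise)]              # (1) speech + noise
    mix_inv_speech = [-s + n for s, n in zip(speech, noise)]  # (2) inverted speech + noise
    mix_inv_noise = [s - n for s, n in zip(speech, noise)]    # (3) speech + inverted noise

    out_mix = process(mix)
    out_inv_speech = process(mix_inv_speech)
    out_inv_noise = process(mix_inv_noise)

    # Adding (1) and (3) cancels the noise and doubles the speech; halve to correct.
    speech_out = [0.5 * (a + b) for a, b in zip(out_mix, out_inv_noise)]
    # Adding (1) and (2) cancels the speech and doubles the noise; halve to correct.
    noise_out = [0.5 * (a + b) for a, b in zip(out_mix, out_inv_speech)]
    return speech_out, noise_out


def long_term_snr_db(speech_out, noise_out):
    """Long-term SNR: speech level minus noise level in dB over the whole signal."""
    p_speech = sum(v * v for v in speech_out) / len(speech_out)
    p_noise = sum(v * v for v in noise_out) / len(noise_out)
    return 10 * math.log10(p_speech / p_noise)


def linear_amplifier(x):
    """Toy stand-in for the hearing-aid processing: a fixed gain of 2."""
    return [2.0 * v for v in x]


# Toy signals (hypothetical); a real test would use the recorded speech and noise.
speech = [math.sin(0.1 * i) for i in range(1000)]
noise = [0.3 * math.cos(0.37 * i) for i in range(1000)]

speech_out, noise_out = separate_speech_and_noise(linear_amplifier, speech, noise)
print(round(long_term_snr_db(speech_out, noise_out), 3))
```

For a linear amplifier the recovered components are exact copies of the scaled inputs, so the output SNR equals the input SNR; this makes a linear stand-in a convenient sanity check before a compressor is substituted for `process`.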

the amplitude, and hence the speech level was reduced by 6 dB to obtain accurate speech levels. The waveform of the noise after processing was obtained similarly by using signal plus noise, and inverted signal plus noise, with the level of the extracted noise also being reduced by 6 dB to obtain accurate noise levels. The long-term SNR is defined as the difference between the speech and noise levels in decibels.

Data Analysis

The mean difference of the long-term SNR between the AWDRC and static WDRC amplification schemes was first determined using the paired-samples t-test. We then investigated whether the factors of audiogram type, noise type, input signal level, and input SNR affect the long-term SNR of adaptively changing the CR. This was achieved using one-way analysis of variance (ANOVA), and multiple-comparison procedures (post hoc testing of the Tukey method) were then used to detect means that differed for each factor. All statistical analyses were performed using SPSS version 19.0.

RESULTS

To monitor the change in CR with time, we sampled the instantaneous CRs every 100 msec in each test condition. The trends of the change in CR in each test condition were similar, and hence the results for the range-hood noise condition were chosen as a representative


example. Fig. 5 shows the results of WDRC and AWDRC processing for the flat type of hearing loss with the speech-in-range-hood-noise condition. Fig. 5(a) shows the unprocessed speech sentences that include range-hood noise for an input SNR of 6 dB, Fig. 5(b) shows the results for the WDRC amplifier with the NAL-NL1 prescription, Fig. 5(c) shows the corresponding results for AWDRC amplification, and Fig. 5(d) shows the changes in CRs of channels 1–4 in AWDRC amplification. In this example, because the dynamic ranges of the speech signals were ~30 dB, which is lower than the residual dynamic range (~50 dB) of the hearing loss, the CR of each channel will eventually be close to 1. Comparisons of the long-term SNR between WDRC (NAL-NL1 prescription) and AWDRC amplification methods for three input noise conditions are shown in Figs. 6–8. In each figure, the vertical axis represents the average of the long-term SNR difference between the AWDRC and WDRC amplification methods, and data are shown for different input SNRs for the seven audiograms depicted in Fig. 4. A positive difference means that the long-term SNR was higher for AWDRC amplification than for WDRC amplification. For audiograms 3–5, the respective long-term SNRs in range-hood, babble, and traffic noise were 1.4–3.5, 2.6–4.2, and 2.6–4.8 dB higher for AWDRC than for WDRC amplification (the ranges correspond to input SNRs ranging from –2 to 10 dB); the corresponding differences were 0.1–2.7, 2.0–5.5, and 1.5–4.3 dB for


Figure 5. Comparisons in the presence of range-hood noise: (a) unprocessed sentences, (b) sentences with WDRC amplification processing, and (c) sentences with AWDRC amplification processing. In (a) to (c), a y-axis magnitude of 8500 corresponds to 102 dB SPL. (d) Variation of CR (channels 1–4) for AWDRC amplification processing.

audiogram 1; 0.3–3.4, 0.4–2.9, and 0.2–2.9 dB for audiogram 2; and 0.3–2.1, 0.4–5.5, and 0.5–5.7 dB for audiograms 6 and 7. The overall mean ± SD value of the output long-term SNR was 5.8 ± 4.9 dB for AWDRC and 3.2 ± 4.1 dB for WDRC. The paired-samples t-test indicated the presence of significant differences between AWDRC and WDRC (t = 18.1, p < .001). Table 1 presents the results of one-way ANOVA and Tukey post hoc comparisons for the four factors. The ANOVA testing revealed that the mean difference varied

Figure 6. Long-term SNR difference between WDRC and AWDRC amplification methods for range-hood noise.


Figure 7. Long-term SNR difference between WDRC and AWDRC amplification methods for babble noise.

significantly with the audiogram type across the seven groups (F = 6.15, p < .001), and Tukey post hoc comparisons of the seven groups indicated significant differences for the group pairs (1,6), (2,3), (2,4), (2,7), (3,6), (4,6), (5,6), and (6,7). The mean difference also varied significantly across the three groups of noise type (F = 7.04, p = .001). Tukey post hoc comparisons of the three groups indicated significant differences for the group pairs (range-hood noise, babble noise) and (range-hood noise, traffic noise). The mean difference also varied significantly across the three groups of input signal level (F = 59.74, p < .001). Tukey post hoc comparisons of the

Figure 8. Long-term SNR difference between WDRC and AWDRC amplification methods for traffic noise.


Table 1. Mean Differences of the Long-Term SNR between AWDRC and WDRC Amplification Schemes in Which Each Factor Is Included in the One-Way ANOVA Statistical Analysis Test and Post Hoc Tukey Testing Is Applied

Variable       Group Notation   n    Mean Difference (dB)   SD     F       p        Post Hoc Comparison* (Group i, Group j)

Audiogram                                                          6.15    <0.001   (1,6) (2,3) (2,4) (2,7) (3,6) (4,6) (5,6) (6,7)
  1            1                36   2.58                   2.06
  2            2                36   1.67                   2.72
  3            3                36   3.20                   2.40
  4            4                36   3.26                   2.30
  5            5                36   3.00                   2.11
  6            6                36   1.04                   0.74
  7            7                36   3.27                   2.05

Noise type                                                         7.04    0.001    (R,B) (R,T)
  Range hood   R                84   1.84                   1.69
  Babble       B                84   2.97                   2.48
  Traffic      T                84   2.92                   2.38

Input level                                                        59.74   <0.001   (L,S) (M,S)
  Loud         L                84   3.57                   2.03
  Moderate     M                84   3.39                   1.69
  Soft         S                84   0.76                   1.87

Input SNR                                                          10.53   <0.001   (10,2) (10,–2) (6,–2)
  10 dB        10               63   3.58                   2.53
  6 dB         6                63   2.95                   2.20
  2 dB         2                63   2.21                   1.92
  –2 dB        –2               63   1.56                   1.86

Note: Dependent variable: Mean difference = the mean output long-term SNR of AWDRC minus the mean output long-term SNR of WDRC (in dB). *Where the mean difference is significant at the α = 0.05 level.

three groups indicated significant differences for the group pairs (loud, soft) and (moderate, soft). Finally, the mean difference also varied significantly across the four groups of input SNR (F = 10.53, p < .001). Tukey post hoc comparisons of the four groups indicated significant differences for the group pairs (10, 2), (10, –2), and (6, –2).

DISCUSSION

This study has proposed a novel method of amplification, named AWDRC, for hearing aids. Adjusting the CR in AWDRC amplification improved the SNR in speech-in-noise conditions relative to using WDRC amplification. In the applied test conditions, the output long-term SNRs were significantly higher for AWDRC amplification than for static WDRC amplification (p < .001) when the methods were used to process common speech signals under different noise conditions for typical types of hearing loss. This result implies that AWDRC could improve speech intelligibility for hearing-aid users listening to speech in a noisy environment. The results also showed that the factors of the audiogram type, noise type, input signal level, and input SNR all affect the long-term SNR performance of AWDRC amplification. AWDRC provided long-term SNR improvements of more than 2.5 dB for audiogram

types 1, 3, 4, 5, and 7; of 1.67 dB for audiogram 2; and of 1.04 dB for audiogram 6. This indicates that the benefit of using AWDRC varies with the type of hearing loss. AWDRC also improved the SNR by 2.97 and 2.92 dB relative to WDRC for babble and traffic noise, respectively, and by 1.84 dB for range-hood noise, which implies that the effects of AWDRC on SNR are better for unstable types of noise. For the input signal level, AWDRC improved the SNR by 3.57 and 3.39 dB relative to WDRC for loud and moderate input levels, respectively, and by 0.76 dB for soft input levels, which indicates that the effects of AWDRC on SNR are better for louder input signals. Finally, AWDRC improved the SNR by 3.58 and 2.95 dB relative to WDRC for input SNRs of 10 and 6 dB, respectively, by 2.21 dB for an input SNR of 2 dB, and by 1.56 dB for an input SNR of –2 dB. Hence, the long-term benefit in SNR performance obtained by using AWDRC will decrease as the input SNR decreases. In summary, the improvement in the output long-term SNR obtained by using AWDRC amplification is influenced by various factors, but in the conditions tested this method was consistently superior to WDRC amplification. The left panels in Fig. 9 show an example of the input level variation in channels 1–4 for an input SNR of 6 dB with a loud input in the range-hood noise condition. Although the input SNR was positive (in a long-term sense), the speech signal level in a speech pause


segment (in a short-term sense) will be lower than the noise level. WDRC amplification will increase the low-level noise during the pauses in the speech signal and thereby decrease the SNR performance; in contrast, the rules of AWDRC decrease the CR of WDRC, so that the low-level noise during pauses in the speech is likely to receive a lower gain, thereby improving the SNR performance relative to WDRC amplification. We have already mentioned the assumption that compression processing (CR > 1) decreases the long-term SNR when the input SNR is positive due to transient nonlinear amplification of the speech pauses, and the expectation that when the input SNR is negative, the long-term SNR would be higher for WDRC than for linear amplification. The current data show that the long-term SNR was higher for AWDRC amplification than for WDRC, even when the input SNR was –2 dB. The right panels in Fig. 9 show an example of the input level variation in channels 1–4 for an input SNR of –2 dB with a loud input in the speech-in-range-hood-noise condition. Although the long-term signal power is 2 dB lower than the noise, for most cases the frequency power distribution would differ between the noise and speech, and thus the short-term levels in different channels (which determine the corresponding WDRC gains) were not always determined by the noise. For an extreme example, if the noise power is concentrated in the frequency range of channel 1, the WDRC gains in other channels will be determined by the speech signals in those channels. When the level in a channel is mostly determined by speech, the WDRC gain for the speech will be lower than the gain for noise during a speech pause, which might have a negative impact

on the long-term SNR. In the right panels of Fig. 9, the variation of the green curve (corresponding to a combination of speech and noise in channel 1) was relatively small, and the corresponding variation of the WDRC gain was also small. However, in channels 2–4 the green curve was alternately dominated by speech or noise. For example, the green curve in channel 3 was dominated by the speech signal from 0.3 to 0.4 sec, but by the noise from 0.4 to 0.5 sec. This alternating behavior was possibly responsible for the long-term SNR performance of WDRC not being superior to that of AWDRC for an input SNR of –2 dB. Naylor and Johannesson (2009) also pointed out that the long-term SNR was worse for WDRC than for linear amplification for an input SNR of up to –6 dB, because this represents a crossover point in the input SNR “bias”; this is consistent with the results of the present study. Naylor and Johannesson (2009) also found that the long-term SNR performance of linear amplification deteriorated closer to the point of the input SNR “bias.” AWDRC is based on the concept of remaining as close as possible to linear amplification, which implies that AWDRC amplification cannot provide better long-term SNR performance when close to the point of input SNR “bias.” Therefore, the long-term SNR performance of the compression system might be improved even more by adding a decision rule about whether to activate the adaptive compression based on the comparison of input SNR and the bias. The results showed that the AWDRC amplification scheme can adaptively adjust CR toward 1 under the conditions used in this study. Another adaptive amplification scheme, named adaptive dynamic-range optimization (ADRO) and proposed by Blamey et al (2004),

Figure 9. Examples of input level variation in channels 1–4 for speech in range-hood noise over 5 sec. The left and right panels are for input SNRs of 6 and –2 dB, respectively. Black, red, and green curves are the input speech segment, input noise segment, and speech-plus-noise segment (test signal), respectively.



uses statistical analysis to select the frequency channel containing the most information (the intensity statistical distribution of the input signal between 30 and 90% in each frequency channel) of the input sound, and uses fuzzy logic rules to control the gain in each frequency channel to ensure that the signals are presented at an audible and comfortable level (Blamey, 2005a, 2005b; Blamey et al, 2004, 2005). The ADRO amplifier scheme could be considered to provide a type of automatic linear volume control without using compression processing in each channel (Blamey et al, 2005; Venema, 2006). In contrast, AWDRC will approach a response that is as linear as possible whenever there is sufficient dynamic range for signal amplification without compression, and will transition to the minimum CR when the dynamic range is insufficient for signal amplification. As indicated above, these two amplification processors are controlled by different parameters: when the residual dynamic range of an individual with hearing loss is larger than the input-sound dynamic range, ADRO and AWDRC will adaptively adjust their individual parameters (the target for ADRO [Blamey et al, 2004] and the CR for AWDRC) to produce linear amplification. The output long-term SNR performances of these two amplification schemes should be similar. However, AWDRC requires fewer computing resources than ADRO, because it calculates the Maxest and Minest values after the amplification process without using the intensity statistical distribution, thereby greatly reducing the calculation time. Moreover, for extreme speech-in-noise conditions (i.e., where the input-sound dynamic range is larger than the residual dynamic range of an individual with hearing loss), certain speech signals will be lost in ADRO amplification. In contrast, AWDRC amplification will adaptively adjust the CR in an attempt to make all speech signals fall within the residual dynamic range of an individual with hearing loss.
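The CR behavior described above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the AWDRC implementation evaluated in this study: the function name `adaptive_cr`, its arguments, and the specific rule (the smallest CR that maps the estimated input range into the residual range, never exceeding the initially prescribed CR) are hypothetical, and HT and DCL are assumed here to be the lower and upper limits of the residual dynamic range in a channel.

```python
def adaptive_cr(max_est_db, min_est_db, ht_db, dcl_db, cr_initial):
    """Hypothetical CR choice for one channel (illustration only).

    max_est_db, min_est_db: estimated upper and lower short-term input levels.
    ht_db, dcl_db: assumed lower and upper limits of the residual dynamic range.
    cr_initial: CR prescribed for static WDRC; used as the upper bound.
    """
    input_range = max_est_db - min_est_db
    residual_range = dcl_db - ht_db
    if residual_range <= 0:
        # No usable residual range: keep the prescribed WDRC value.
        return cr_initial
    # Smallest CR that squeezes the input range into the residual range.
    cr_needed = input_range / residual_range
    # Never go below linear (CR = 1) or above the prescribed CR.
    return min(max(cr_needed, 1.0), cr_initial)


# Residual range (60 dB) exceeds the input range (30 dB): linear amplification.
cr_wide = adaptive_cr(80.0, 50.0, 40.0, 100.0, cr_initial=2.0)
# Input range (60 dB) exceeds the residual range (40 dB): moderate compression.
cr_mid = adaptive_cr(90.0, 30.0, 60.0, 100.0, cr_initial=2.0)
# Input range far exceeds the residual range: CR stays at the prescribed value.
cr_narrow = adaptive_cr(90.0, 30.0, 80.0, 100.0, cr_initial=2.0)
```

In this sketch the worst case simply returns the prescribed static WDRC value, which mirrors the qualitative point that the adaptive scheme should not compress more than the static one.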
Therefore, although the two amplification schemes have similarities, they have notable differences in the controlled parameters, the boundary values used (Maxest and Minest for AWDRC; 30% and 90% of the intensity statistical distribution for ADRO), and the ability to retain speech signals within the audibility range under extreme input conditions.

WDRC amplification sometimes uses expansion (i.e., CR < 1), mainly to reduce the gain for very soft input noises (e.g., < 40 dB SPL), such as those due to the internal noise of the microphone and low-level environmental noises. The concept of expansion is to reduce the gain for very soft input sounds, and then rapidly increase the gain as the input increases, until the expansion threshold of WDRC amplification is reached (Venema, 2006). Therefore, when the environmental noise (or the noise during speech pauses) is below the expansion threshold, this expansion process of WDRC can improve the output SNR performance (Ghent et al, 2000; Kuk, 2002). However, when the input speech signal is contaminated by noise, the signal level is usually above the expansion threshold even during speech pauses. In this study, the input levels were set to simulate normal speech conditions and were higher than the expansion threshold. Expansion would therefore not reduce the gain for the noise during speech pauses. In contrast, AWDRC will continuously and adaptively adjust the LLG value of WDRC based on the input speech signal. This approach could result in AWDRC providing a more suitable amplifier gain for speech pauses under speech-in-noise conditions. Although the expansion threshold of WDRC amplification can be adjusted to increase the range of expansion, studies have shown that expansion negatively affects the recognition of low-level speech in both quiet and noisy conditions (Plyler et al, 2005; Plyler et al, 2007). Therefore, increasing the range of expansion would not overcome the problem of the decreased output long-term SNR of the WDRC amplification scheme in various noise conditions.

The WDRC amplification scheme could also be viewed as adaptive, in the sense that the actual CR will vary with the input amplitude. However, the nominal CR is only achievable when using very short attack times (shorter than 5 msec) and release times (shorter than 30 msec) (Kuk, 2002). Kuk and Ludvigsen (1999) quantitatively illustrated how the release time changes the nominal CR. Their results showed that a long release time (typically longer than 1–2 sec) was more likely to make the effective amplification closer to linear (i.e., CR = 1). However, this effect differs from the direct CR adjustment in AWDRC. In addition, a potential problem with a long release time is that if the hearing aid has just responded to a higher-level sound by decreasing the gain, it might not provide sufficient gain for a following lower input sound.
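The effect of expansion and compression on the long-term SNR discussed above can be reproduced with a toy calculation. The following Python sketch rests on assumptions that are not taken from this study: a single channel, an instantaneous (idealized fast-acting) gain rule, a constant noise level, speech that alternates between active frames and silent pauses, and hypothetical parameter values and function names (`static_gain_db`, `long_term_snr_db`). The gain derived from the speech-plus-noise level is applied to the speech and noise powers separately, and the long-term SNR is the ratio of the summed output powers.

```python
import math


def static_gain_db(level_db, gain_db=20.0, exp_thr_db=40.0,
                   comp_thr_db=50.0, cr=3.0, exp_ratio=0.5):
    """Static gain (dB) with expansion, linear, and compression regions."""
    if level_db < exp_thr_db:
        # Expansion (ratio < 1): gain drops as the input gets softer.
        return gain_db - (exp_thr_db - level_db) * (1.0 / exp_ratio - 1.0)
    if level_db <= comp_thr_db:
        return gain_db
    # Compression (CR > 1): gain drops as the input gets louder.
    return gain_db - (level_db - comp_thr_db) * (1.0 - 1.0 / cr)


def db_to_power(db):
    return 10.0 ** (db / 10.0)


def long_term_snr_db(speech_frames_db, noise_db, **curve):
    """Long-term output SNR; a frame value of None marks a speech pause."""
    noise_p = db_to_power(noise_db)
    out_speech = 0.0
    out_noise = 0.0
    for s_db in speech_frames_db:
        speech_p = 0.0 if s_db is None else db_to_power(s_db)
        mix_db = 10.0 * math.log10(speech_p + noise_p)
        g = db_to_power(static_gain_db(mix_db, **curve))
        # The same mixture-driven gain acts on speech and noise alike.
        out_speech += speech_p * g
        out_noise += noise_p * g
    return 10.0 * math.log10(out_speech / out_noise)


frames = [65.0, None] * 50  # speech at 65 dB alternating with pauses

# Noise at 55 dB: above the 40-dB expansion threshold in every frame.
linear_snr = long_term_snr_db(frames, 55.0, cr=1.0, exp_ratio=1.0)
compressed_snr = long_term_snr_db(frames, 55.0, cr=3.0, exp_ratio=1.0)
expanded_snr = long_term_snr_db(frames, 55.0, cr=3.0, exp_ratio=0.5)

# Noise at 35 dB: below the expansion threshold during pauses.
quiet_no_exp = long_term_snr_db(frames, 35.0, cr=3.0, exp_ratio=1.0)
quiet_exp = long_term_snr_db(frames, 35.0, cr=3.0, exp_ratio=0.5)
```

Under these assumptions, compression lowers the long-term SNR relative to linear amplification because the pauses receive more gain than the speech frames, and expansion changes the result only when the noise during pauses falls below the expansion threshold.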
The attack time and release time in this study were set according to a fast-acting compressor; therefore, the CR of WDRC in AWDRC amplification could be viewed as the nominal CR. The CR in AWDRC amplification changes adaptively according to the characteristics of the input signal, while maintaining the gain as close as possible to linear amplification. However, when the range of input levels exceeds the residual dynamic range of hearing, the CR of AWDRC will not be decreased by the decrease-CR rule, instead remaining at the initial WDRC parameter values. Therefore, the worst condition of AWDRC amplification was keeping the original values of the WDRC parameters.

Noise-reduction techniques are popular for enhancing the SNR in hearing aids, but their performance can be affected by the compression inherent in WDRC amplification (Chung, 2007). AWDRC amplification adaptively adjusts the WDRC parameters, and decreases the CR as much as possible according to the particular input signal. Therefore, if noise-reduction


Journal of the American Academy of Audiology/Volume 24, Number 8, 2013

techniques are used in conjunction with AWDRC amplification, AWDRC will for most of the time not affect the gain obtained by the noise-reduction algorithm, which implies fewer interaction effects between the algorithms and that the benefits of noise-reduction processing might be closer to the design expectations.

Most of the signal-processing steps in AWDRC amplification are similar to those in WDRC amplification, with only an extra feedback component being required to implement the AWDRC rules, and the DCL and HT parameters being checked in each channel. Therefore, the additional computing resources needed represent a small percentage of those required for all of the amplification steps, making it feasible to implement AWDRC using the microprocessors that are already present in existing hearing aids.

A limitation of this study was that the experiments involved software simulations only, without the use of any actual hearing aids, and did not consider the effects of various factors such as the characteristics of the microphone, receiver, and housing. Therefore, the overall effectiveness of the AWDRC amplification system in real hearing aids needs to be investigated. Also, the objective and subjective speech intelligibility and sound quality should be evaluated in the clinical situation.

CONCLUSION

This study has shown that the characteristics of the input signal can be used to adaptively adjust the CR of compression amplification. The results show that AWDRC provided significantly better long-term SNR performance than static-CR compression amplification (i.e., WDRC) when tested with speech in three types of noise for input SNRs from –2 to 10 dB. This finding implies that AWDRC amplification could provide hearing-aid users listening to speech in a noisy environment with superior speech intelligibility. In addition, because AWDRC amplification is based on the traditional concept of WDRC and uses a similar fitting method, audiologists will find it relatively easy to implement using their existing experience of WDRC amplification.

Acknowledgments. We thank Ing-Tiau Kuo, Hsin-Yi Huang, and Hui-Chen Lee of the Division of Experimental Surgery Biostatistics Task Force, Taipei Veterans General Hospital, for help with the statistical analyses.

REFERENCES

Blamey PJ. (2005a) Adaptive dynamic range optimization (ADRO): a digital amplification strategy for hearing aids and cochlear implants. Trends Amplif 9(2):77–98.

Blamey PJ. (2005b) Sound processing in hearing aids and CIs is gradually converging. Hear J 58(11):44.

Blamey PJ, Macfarlane DS, Steele BR. (2005) An intrinsically digital amplification scheme for hearing aids. EURASIP J Appl Signal Process 2005:3026–3033.

Blamey PJ, Martin LFA, Fiket HJ. (2004) A digital processing strategy to optimize hearing aid outputs directly. J Am Acad Audiol 15(10):716–728.

Boike KT. (2004) Influence of Age on the Recognition of Amplitude Compressed Speech in Backgrounds Varying in Temporal Complexity. University of Washington.

Byrne D, Dillon H, Ching T, Katsch R, Keidser G. (2001) NAL-NL1 procedure for fitting nonlinear hearing aids: characteristics and comparisons with other procedures. J Am Acad Audiol 12(1):37–51.

Chung K. (2007) Effective compression and noise reduction configurations for hearing protectors. J Acoust Soc Am 121(2):1090–1101.

Ciletti L, Flamme GA. (2008) Prevalence of hearing impairment by gender and audiometric configuration: results from the National Health and Nutrition Examination Survey (1999–2004) and the Keokuk County Rural Health Study (1994–1998). J Am Acad Audiol 19(9):672–685.

Cornelisse LE, Seewald RC, Jamieson DG. (1995) The input/output formula: a theoretical approach to the fitting of personal amplification devices. J Acoust Soc Am 97(3):1854–1864.

Dillon H. (2001) Hearing Aids, Chapter 6. Thieme Medical.

Gatehouse S, Naylor G, Elberling C. (2006) Linear and nonlinear hearing aid fittings—1. Patterns of benefit. Int J Audiol 45(3):130–152.

Ghent RM, Jr, Nilsson MJ, Bray VH, Jr. (2000) Uses of expansion to promote listening comfort with hearing aids. Poster presented at the American Academy of Audiology Convention, Chicago, IL.

Hagerman B, Olofsson A. (2004) A method to measure the effect of noise reduction algorithms using simultaneous speech and noise. Acta Acustica 90(2):356–361.

Keidser G, Brew C, Peck A. (2003) How proprietary fitting algorithms compare to each other and to some generic algorithms. Hear J 56(3):28–38.

Kuk FK. (2002) Considerations in modern multichannel nonlinear hearing aids. In: Valente M, ed. Hearing Aids: Standards, Options, and Limitations. New York: Thieme, 178–213.

Kuk FK, Ludvigsen C. (1999) Variables affecting the use of prescriptive formulae to fit modern nonlinear hearing aids. J Am Acad Audiol 10(8):458–465.

Naylor G, Johannesson RB. (2009) Long-term signal-to-noise ratio at the input and output of amplitude-compression systems. J Am Acad Audiol 20(3):161–171.

Olsen HL, Olofsson A, Hagerman B. (2005) The effect of audibility, signal-to-noise ratio, and temporal speech cues on the benefit from fast-acting compression in modulated noise. Int J Audiol 44(7):421–433.

Plyler PN, Hill AB, Trine TD. (2005) The effects of expansion on the objective and subjective performance of hearing instrument users. J Am Acad Audiol 16(2):101–113.

Plyler PN, Lowery KJ, Hamby HM, Trine TD. (2007) The objective and subjective evaluation of multichannel expansion in wide dynamic range compression hearing instruments. J Speech Lang Hear Res 50(1):15–24.

Rosengard PS, Payton KL, Braida LD. (2005) Effect of slow-acting wide dynamic range compression on measures of intelligibility and ratings of speech quality in simulated-loss listeners. J Speech Lang Hear Res 48(3):702–714.

Sammeth CA, Levitt H. (2000) Hearing Aid Selection and Fitting in Adults: History and Evolution. New York: Thieme.

Shi LF, Doherty KA. (2008) Subjective and objective effects of fast and slow compression on the perception of reverberant speech in listeners with hearing loss. J Speech Lang Hear Res 51(5):1328–1340.

Souza PE. (2002) Effects of compression on speech acoustics, intelligibility, and sound quality. Trends Amplif 6:131.

Souza PE, Boike KT, Witherell K, Tremblay K. (2007) Prediction of speech recognition from audibility in older listeners with hearing loss: effects of age, amplification, and background noise. J Am Acad Audiol 18(1):54–65.

Souza PE, Jenstad LM, Boike KT. (2006) Measuring the acoustic effects of compression amplification on speech in noise. J Acoust Soc Am 119(1):41–44.

van Buuren RA, Festen JM, Houtgast T. (1999) Compression and expansion of the temporal envelope: evaluation of speech intelligibility and sound quality. J Acoust Soc Am 105(5):2903–2913.

Venema TH. (2006) Compression for Clinicians. 2nd ed. Clifton Park, NY: Thomson Delmar Learning.


