Quarterly Journal of Experimental Psychology

ISSN: 0033-555X (Print) (Online) Journal homepage: http://www.tandfonline.com/loi/pqje19

Backward masking and asymmetry of processing for vowels differing in acoustical similarity John Allen & Mark Haggard To cite this article: John Allen & Mark Haggard (1978) Backward masking and asymmetry of processing for vowels differing in acoustical similarity, Quarterly Journal of Experimental Psychology, 30:1, 43-55, DOI: 10.1080/14640747808400653 To link to this article: http://dx.doi.org/10.1080/14640747808400653

Published online: 21 Jun 2007.


Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=pqje19 Download by: [NUS National University of Singapore]

Date: 06 November 2015, At: 05:23

Quarterly Journal of Experimental Psychology (1978) 30, 43-55.


BACKWARD MASKING AND ASYMMETRY OF PROCESSING FOR VOWELS DIFFERING IN ACOUSTICAL SIMILARITY

JOHN ALLEN*
Laboratory of Experimental Psychology, University of Sussex, U.K.

AND

MARK HAGGARD
MRC Institute of Hearing Research, Medical School, University of Nottingham, U.K.

Several investigations have reported problems in demonstrating for vowels the phenomena of backward masking and right ear advantage so easily demonstrated for consonants. An experiment is reported on the dichotic backward masking (BM) of acoustically similar and dissimilar sets of vowels in consonant-vowel syllables. The results suggest that increasing perceptual difficulty by varying the acoustic similarity of the stimulus set augments BM. A second experiment showed that the right ear advantage (REA) was augmented by manipulating the acoustic similarity of the stimulus set and by decreasing intelligibility through the addition of noise. This and other evidence was employed to ascribe variations in both REA and BM to the interaction of perceptual processing time with information decay in precategorical acoustical storage. It is argued that this process interaction underlies statistical interactions in dichotic data which show that variations in difficulty (discriminability, noise addition, brain deterioration) affect most adversely the least favoured items (those presented earlier, those on the unattended ear, those on the left ear).

* Now at: Central School of Speech and Drama, Eton Avenue, London NW3 3HY, U.K., to where requests for reprints should be directed.

Introduction

A variety of experimental situations have been held to support a difference in the mode of processing between vowels and consonants. Specifically, the place feature of voiced stop consonants has been shown (Pisoni, 1973) to display a greater degree of categoricality in perception (i.e. higher relative discriminability for acoustical differences at, rather than within, phoneme category boundaries). This feature also shows greater susceptibility to backward masking (BM) (Porter, Shankweiler and Liberman, 1969), greater tendency to give the right ear advantage (REA) (Studdert-Kennedy and Shankweiler, 1970) and lesser tendency to produce recency effects in the short term memory serial position curve (Crowder, 1973a). The question then arises: are these differences due to the same properties in each case? And if so, are they due to primary acoustical properties of the sounds, to discriminability of the set of sounds, to the form of processing typically elicited by each, to the phoneme class per se, or to artifacts such as dissimilar performance levels?

These five variables are not easy to dissociate, and in proselytising for the importance of the high degree of “encodedness” (i.e. contextual variability) of stop consonant sounds in explaining the phenomena involved, Liberman, Cooper, Shankweiler and Studdert-Kennedy (1967) and Liberman, Mattingly and Turvey (1972) did not attempt any empirical dissociation. In the case of the REA a dissociation was achieved by Haggard (1971), who ruled out phoneme class by showing that an REA may be obtained with vowels when subjects are uncertain about the size of the vocal tract that produces them. In the case of memory effects a dissociation was achieved by Darwin and Baddeley (1974), who also ruled out phoneme class and such acoustical concomitants as the duration of individual sounds in a class. They succeeded in demonstrating recency effects with recall lists composed of syllables contrasting in consonants or vowels which were acoustically dissimilar (e.g. /g, ʃ, m/ or /I, æ, u/), but found little or no evidence of such effects with lists of syllables distinguished by acoustically similar items (e.g. /b, d, g/ or /I, æ, ɛ/).

Darwin and Baddeley attributed these differing effects to decay of information in auditory memory; they argued that despite degradation in auditory storage prior to recall, sufficient information may remain to facilitate the recall of acoustically dissimilar speech sounds, but insufficient information may survive to enable subjects to make judgements fine enough to distinguish between acoustically similar items. These authors extended the notion of confusions arising as a consequence of decay in auditory memory in order to account for variations in REA and BM.
According to their account, if the stimuli comprising a dichotic pair are acoustically distinct, useful information will effectively persist for longer in auditory memory, thus giving the left hemisphere more time to identify the left ear stimulus, which is held to be degraded (relative to the right ear) as a consequence of the poorer neural connections to that hemisphere (Sparks, Goodglass and Nickel, 1970). In the case of acoustically similar dichotic stimuli, however, insufficient information may be available for the identification of the degraded left ear input. In the extension of Darwin and Baddeley’s account to BM it is assumed that the processing of the leading member of an asynchronous dichotic pair of speech sounds is disrupted by the arrival of the lagging member. Following this disruption some representation of the initial sound is assumed to persist, and this may be utilized by the categorizing mechanism after the lagging stimulus has been identified. The degree of BM will, therefore, depend upon whether sufficient information remains in auditory memory to enable the leading stimulus to be distinguished. Thus this account predicts that both the REA and BM should be greater with acoustically confusable stimulus sets.

The present paper assesses and extends Darwin and Baddeley’s argument concerning REA and BM in the light of data derived from two experiments in which the acoustic confusability of the stimulus sets was manipulated. Using the same and further evidence it then corrects three possible misinterpretations of the contrast between Darwin and Baddeley’s account and that of Liberman et al. The first misinterpretation might be that Darwin and Baddeley’s account implicated only the absolute level of performance. The second misinterpretation might be that it implicated specifically memory processes, rather than categorisation processes, even in the three experimental tasks not obviously concerned with memory. The third misinterpretation might be that it totally demolished the relevance of “encoding” as an explanatory concept. We argue here that the Darwin-Baddeley thesis illustrates a very general property of experimental paradigms that influences whether or not underlying processing differences can be revealed; the encodedness of stimuli can affect that property, although many other stimulus and task factors also affect it.

Given that vowel sounds display little backward masking and minimal REA under most conditions, can we, as Darwin and Baddeley suggested, demonstrate substantial BM and REA by selecting an appropriately confusable set of vowels? Experiment I addresses the masking issue and Experiment II the REA issue.

In order to determine whether the acoustic similarity of the set of items emerges as a particularly potent determinant of backward masking and ear effects, experiments must employ more than one basis of perceptual difficulty. Abbreviating stimuli down to 50 ms does not of itself give REA (Darwin, 1971) nor banish the suffix effects associated with precategorical acoustic storage (PAS) (Foreit, 1976). We, therefore, held duration constant and varied both the acoustic similarity and the signal-to-noise ratio of a set of vowels in dichotic presentation.

Experiment I

No strong evidence exists for qualitatively different results between those tasks requiring subjects to report both dichotic stimuli on each trial and those tasks requiring single report from a designated ear; we therefore simplified the subjects’ task and our interpretation by adopting the latter method. Likewise there is no evidence for variation in the ordering of qualitative determinants of BM effects over the range of 0-100 ms in onset asynchrony for vowels (Porter and Mirabile, 1977). We therefore chose a single value of stimulus onset asynchrony (SOA) in the region that gives the biggest effects with consonants and simply sought a BM effect for vowels.

Method

Stimulus preparation

The four syllables /bI, bɛ, bæ, bu/ were synthesized and stored digitally using the University of Sussex digital speech synthesizer. These stimuli were identical (same synthesizer and control parameters) to those used by Darwin and Baddeley (1974). Each syllable lasted 60 ms, of which the final 30 ms was a steady-state vowel segment. The frequencies of the first two formants of each vowel are shown in Fig. 1. The acoustical specifications of the stimuli are summarised in Table I. The fixed /b-/ context was employed to give a constant degree of linguistic set (cf. Spellacy and Blumstein, 1970). It will be noted that the onset frequencies of the formant transitions did not vary with the vowel in the fashion found in natural speech. Although not readily audible because of the categorical perception effect for stop consonants, this had the effect of ensuring that minimal vowel information was carried on the transition, a desirable precaution in view of effects of possible sequential encoding upon processing times (Wood and Day, 1975) and in view of the possibility that transitions as such may contribute to REA (Allen and Haggard, 1977). A second set of the basic four syllables was produced and stored digitally, using a software procedure which permitted 0-3800 Hz broad-band noise to be added to each previously synthesized utterance, giving a S/N ratio of -10 dB.
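The noise-addition step can be illustrated with a minimal sketch. It is not the original Sussex software: the test tone, the 10 kHz sample rate and the function names are all assumptions made for illustration, and the noise here is plain Gaussian noise with no band-limiting to 0-3800 Hz. The sketch only shows how a noise gain can be chosen so that the ratio of signal power to noise power equals a requested S/N ratio in dB.

```python
import math
import random

def mean_power(samples):
    """Mean squared amplitude of a list of samples."""
    return sum(x * x for x in samples) / len(samples)

def add_noise_at_snr(signal, snr_db, rng):
    """Add Gaussian noise scaled so that 10*log10(P_signal / P_noise) == snr_db.

    Returns (noisy_signal, scaled_noise). No band-limiting is applied.
    """
    raw_noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Rearranging the S/N definition gives the noise power we need to hit.
    target_noise_power = mean_power(signal) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / mean_power(raw_noise))
    noise = [gain * n for n in raw_noise]
    noisy = [s + n for s, n in zip(signal, noise)]
    return noisy, noise

# Hypothetical stand-in for a synthesized syllable: 60 ms of a 500 Hz tone
# sampled at 10 kHz (both values are assumptions, not taken from the paper).
SAMPLE_RATE = 10_000
signal = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE) for t in range(600)]

noisy, noise = add_noise_at_snr(signal, snr_db=-10.0, rng=random.Random(0))
measured_snr_db = 10 * math.log10(mean_power(signal) / mean_power(noise))
print(round(measured_snr_db, 2))
```

At -10 dB the noise carries ten times the power of the signal, which is why this manipulation lowers intelligibility so sharply.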

FIGURE 1. First and second formant frequencies of the vowel stimuli used in Experiment I. (Plot not reproduced; horizontal axis: first formant frequency, Hz.)

TABLE I
Acoustical specifications of the stimuli used in Experiment I

Formant transitions (duration 30 ms; fundamental frequency trajectory 130-122 Hz)

            Onset (Hz)
  F1           190
  F2           760
  F3          2600

Vowel steady states (duration 30 ms; fundamental frequency trajectory 122-115 Hz)

            /I/     /ɛ/     /æ/     /u/
  F1 (Hz)    520     640     760     250
  F2 (Hz)   2050    2020    1830     880
  F3 (Hz)   2500    2500    2500    2500
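One rough way to see why /I, æ, ɛ/ counts as the acoustically similar set and /I, æ, u/ as the dissimilar set is to compare how far apart the vowels lie in the F1-F2 plane. The sketch below uses the steady-state values as read from Table I. The Euclidean distance in Hz is an assumption chosen for illustration; the paper does not define a similarity metric, and the dictionary keys are ASCII stand-ins for the phonetic symbols.

```python
import itertools
import math

# Steady-state (F1, F2) values in Hz, as read from Table I.
FORMANTS = {
    "I": (520, 2050),
    "e": (640, 2020),   # stands for /ɛ/
    "ae": (760, 1830),  # stands for /æ/
    "u": (250, 880),
}

def mean_pairwise_distance(vowels):
    """Mean Euclidean distance in the F1-F2 plane over all pairs in the set."""
    pairs = list(itertools.combinations(vowels, 2))
    total = sum(math.dist(FORMANTS[a], FORMANTS[b]) for a, b in pairs)
    return total / len(pairs)

similar = mean_pairwise_distance(["I", "ae", "e"])     # acoustically similar set
dissimilar = mean_pairwise_distance(["I", "ae", "u"])  # acoustically dissimilar set
print(similar < dissimilar)
```

Replacing /ɛ/ with /u/ is what spreads the set out, since /u/ lies far from the two front vowels on both formants.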

Four randomized sequences of nonidentical dichotic pairs were generated by the synthesizer and recorded on a Revox (A77) tape recorder. The first sequence consisted of dichotic pairs composed of syllables drawn from the acoustically similar set /bI, bæ, bɛ/, and the second sequence was made up of dichotic pairs consisting of syllables drawn from the acoustically dissimilar set /bI, bæ, bu/. The remaining two sequences corresponded to the first two but with noise added. The SOA value was 30 ms on every trial. Each block consisted of a random sequence of all three possible pairings replicated 10 times for each channel assignment, giving a total of 60 trials. The intertrial interval was 7 s. Within each block the lead and lag conditions for a particular channel occurred in random order; thus the subjects knew in advance the ear on which the target syllable would occur but not the temporal order of target and mask.
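The arithmetic of one block (3 pairings x 2 channel assignments x 10 replications = 60 trials) can be sketched as follows. This is an illustrative reconstruction, not the original synthesizer procedure. The function name, the dictionary fields and the ASCII syllable labels are hypothetical, and the even split of lead and lag across replications is an assumption, since the text states only that lead and lag occurred in random order.

```python
import itertools
import random

SOA_MS = 30  # stimulus onset asynchrony used on every trial

def build_block(syllables, reps=10, seed=0):
    """Return a shuffled list of dichotic trials for one three-syllable set."""
    rng = random.Random(seed)
    trials = []
    # All nonidentical pairings of the three syllables (3 unordered pairs).
    for a, b in itertools.combinations(syllables, 2):
        # Both channel assignments of each pairing.
        for left, right in ((a, b), (b, a)):
            for i in range(reps):
                # Assumption: each channel leads on half of the replications.
                leading = "left" if i % 2 == 0 else "right"
                trials.append(
                    {"left": left, "right": right,
                     "leading": leading, "soa_ms": SOA_MS}
                )
    rng.shuffle(trials)  # lead and lag trials end up in random order
    return trials

similar_block = build_block(["bI", "bae", "be"])
dissimilar_block = build_block(["bI", "bae", "bu"], seed=1)
print(len(similar_block), len(dissimilar_block))
```

Because the attended ear is fixed for a block while `leading` varies from trial to trial, the listener knows where the target is but not whether it comes first, as the design requires.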


Subjects and procedure

A group of 16 right-handed undergraduates acted as paid subjects. None had any known hearing loss and all were native speakers of English. The stimuli were presented over high quality headphones. In order to familiarize subjects with the stimuli and the dichotic listening task, a dichotic test was taken by all subjects after the four basic syllables had been demonstrated. In the dichotic test subjects were required to attend to one ear only during any particular block of trials and to report the vowel presented to that ear. Subjects recorded their responses on specially prepared data sheets. The response categories “i, e, a, oo” were used to represent the vowels /I, ɛ, æ, u/ respectively.

Following the dichotic practice trials the subjects were randomly divided into four groups of four. Each group listened to the four different dichotic sequences in a counterbalanced order determined by a Latin-square design. Under each of the four stimulus conditions subjects received 60 dichotic trials twice. The ear to be attended was indicated prior to each block of trials. Any sequential effects on attention were counterbalanced across the eight blocks of trials using a repeated ABBA sequence for half the subjects in each of the four groups, and a BAAB sequence for the remainder. To counterbalance any inequality between the two channels of the reproducing equipment, all subjects reversed their headphones after each successive two blocks of trials. Additionally, in each group, one of the subjects assigned to each of the two orders of attention began the dichotic trials with one channel on the left ear, while the other initially received the same channel on the right ear.
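The counterbalancing scheme can be sketched in a few lines. The paper does not say which Latin square was used, so the cyclic square below is an assumption, as are the condition labels, the function names and the choice to run the two blocks of each condition consecutively. A and B stand for the two attended ears, and the text does not say which is which.

```python
def cyclic_latin_square(conditions):
    """Each condition appears once in every row (group) and column (position)."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]

def attention_order(pattern, n_blocks=8):
    """Repeat a four-block pattern such as 'ABBA' across the eight blocks."""
    return [pattern[i % len(pattern)] for i in range(n_blocks)]

# Hypothetical labels for the four stimulus conditions.
CONDITIONS = ["similar", "dissimilar", "similar+noise", "dissimilar+noise"]

square = cyclic_latin_square(CONDITIONS)
for group, order in enumerate(square, start=1):
    # Assumption: the two 60-trial blocks of a condition are run back to back.
    blocks = [condition for condition in order for _ in range(2)]
    schedule = list(zip(blocks, attention_order("ABBA")))
    print(group, schedule)
```

Under these assumptions every condition is heard once with attention to A and once with attention to B, whichever of ABBA or BAAB a subject receives.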

Results

The accuracy of report for all four stimulus conditions is indicated in Figure 2, which shows the mean percentages of correct responses (pooled over both conditions of attention for both leading and lagging target stimuli). A four-way analysis of variance (ANOVA) was conducted upon the number of correct responses with factors: S/N ratio, acoustic similarity, ear of report and temporal offset. All the main effects were significant except the ear of report. There was a highly significant overall superiority for the identification of lagging as opposed to leading vowels (F = 32.09, df = 1, 15, P < 0.001). The highly significant acoustic similarity × offset interaction (F = 20.25, df = 1, 15, P
