Qualitative acoustic analysis in the study of motor speech disorders Julie M. Liss

Department of Communication Disorders, 115$hevlinHall, University ofMinnesota, Minneapolis, Minnesota 55455

Gary Weismer Department of Communicative Disorders andtheWaismanCenter,University of Wisconsin, Madison, Wisconsin 53 705

(Received9 March 1992;acceptedfor publication29 July 1992)

Traditionalmeasurements performedon the acousticsignalsof normalspeechare frequently usedto quantifythe acousticcharacteristics of disorderedspeechaswell. This letter demonstrates howimportantaspects of speechproductiondeficitsin motorspeechdisorders may be overlookedif stringentquantificationprocedures are employed,especiallyin the stage of exploratorydataanalysis.It is suggested that qualitativeprocedures, whereinphenomena are inferredfrom visualexaminationof certainacousticdisplays,are usefulto supplement traditional measurements,and moreover,that they be usedto point to the typesof measurements that shouldbe madein the finer-grainedstagesof quantitativeanalysis. PACS numbers: 43.70.Dn

INTRODUCTION We have asserted elsewhere that traditional

I. METHODS

acoustic

measuresof temporaland spectralcharacteristics of normal speechmay not necessarilyreveal the inherently"important" aspectsof disorderedspeechproduction (Weismer and Liss, 1991;seealsoWestbury, 1991). By "important," we mean thosecharacteristicsof the acousticsignalthat are likely to reflect aspectsof disorderedsensorimotorcontrol and/or perceptualphenomenathat will play a prominent role in a theory of motor speechdisorders.We have also arguedthat becausethe traditional parametricapproaches necessarilyexcludeidiosyncraticand aberrant speechproduction from the analyses,and in fact may be compromised by the variability,muchpotentiallyrevealinginformationis sacrificed (Sussmanet al., 1988).

The purposeof this letter is to describehow the inclusionof a qualitativelevel of acousticanalysiscan elucidate potentially"important" phenomenain the disorderedproductionof vocalicsegments, and how this can leadto theoretical notions that yield testablehypothesesin both the quantitativeand qualitativedomains.By "qualitative"level of analysis,we meana visualinspectionandinterpretationof acousticdata, especiallyinvolvingrepetitionsof utterances by individualspeakers.The examplesdescribedin this letter were obtainedin the contextof a larger investigationof the

acoustic effects of contrastive stress • amongnormalgeriatric men,and subjectsexhibitingapraxiaof speechand ataxic dysarthria (Liss and Weismer, submitted). Whereas the currentexamplesinvolvequalitativeanalyses offormant trajectories associatedwith vocalic nuclei, the analysisapproachmight be appliedto any setof acousticor kinematic data (see Cooke and Brown, 1986, for a discussionof a similar method in researchon limb motor control, and Corcos et

al., 1990,for examplesof suchanalyses). 2984

J. Acoust.Soc. Am. 92 (5), November1992

AND PROCEDURES

For the largerstudy,subjectsincludedfour eachof control (C), apraxic (A), and ataxicdysarthric(D) speakers. Detaileddescriptions of thesesubjectscanbe foundin other publishedworks (e.g., McNeil et al., 1990). For the purposesof demonstration, we will examinedatafromonenormal male subject (age 63), and two male speakerswith apraxiaof speech(age 54 and 62) who werefreeof significant concomitantdysarthriaor aphasia. Subjectsproduced phrasesand sentencesfollowing tape-recordedstimuli. Theseproductionswere recordedon cassettetape usinghigh fidelity equipment.The two utterancesanalyzedin this investigation,"buy Bobbya poppy," and "build a big building,"wereproducedfivetimeseachin two conditions(for a total of 40 testutterancesper subject). These were randomly distributed among other sentences and phrases.Two speakingconditionswere utilized in the larger study:neutral and contrastivestress.Each sentence wasproducedin a neutral condition,for which no specific directionsaboutstressplacementweregiven,andin a condition in which each of the content words was contrastively stressed.

The quantitativeacousticanalysisof the largerinvestigationconsistedof measurementof segmentand utterance durations, and measurement of formant transition charac-

teristics.Here we will discussonly formant transitioncharacteristicsfor/aI/("buy") and/I1/("build"). Thesesegments were selected because they are associatedwith relativelylargeandcomplexchangesin articulatoryconfiguration that are reflectedmostnotablyin the trajectoryof the secondformant frequency(F2). We usethe term "trajectory" to referto the time-frequencypath of the centerof the formant band across the entire duration of the vocalic nu-

cleus.Transition duration (TD, in ms) and transition extent 0001-4966/92/112984-04500.80

¸ 1992 AcousticalSocietyof America

2984

Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 155.33.120.209 On: Wed, 03 Dec 2014 17:55:25

(TE, in Hz) of the most rapidly changingportionsof the trajectoriesweremeasuredusinga Kay DSP 5500 workstation usinga wide-band( 300 Hz) spectrographic display(04 kHz scaleexpansion)and the associatedwaveforms(see WeismerandLiss, 1991). Slopevalueswerecalculatedfrom thesemeasures(TE/TD). Slopevalueswerenot calculated for productionsthat did not containthe expectedtrajectory shape[e.g.,Fig. 1(b), production# 2 doesnot containa 20Hz changein any 20-ms segment---ouroperationaldefinition of "flat" (see Weismer et al., 1988) ]. Inter- and intrajudgereliabilitywasacceptablefor all quantitativemeasures (Liss and Weismer, submitted).

In the qualitativephaseof the analysis,multipletrajectory tracingswere superimposed and visuallyexaminedto identify and describepatternsand phenomena.This technique was originally designedto accommodateformants that were excludedfrom quantitative assessment because they could not be measuredaccordingto our operational definitions.Suchformantstypicallycorresponded with aberrantproductionsof the vowelthat did not yieldthe expected trajectoriesfor the segmentsof interest.In the figures shownhere,therearefivetrajectoriesplottedper panel,correspondingto five repetitionsof a particularutterance/con-

CONTROL BIG

dition. Mean slopevalues(from the quantitativeanalysis) and a transcriptionof eachof the syllablescontainingthe plottedformanttrajectoriesare alsoincludedin the figures. The transcriptionsare includedonly to describethe general segmentalcharacteristics of the syllables,and not asindices of the normality or aberrancyof the productions. The trajectoriesshownin thispaper (Figs. 1 and 2) are usedto illustratehow the qualitativeprocedurecanbe used to identify and cataloguephenomenaat the individualsubject levelof analysis.Our pointisthat thiskind of qualitative examinationcanexposephenomenathat mustbe explained in theoreticalaccounts,and in fact takesadvantageof intraand intersubjectvariability as objectsof theoreticalinterest, asopposedto a viewwhereinthat variabilityisan obstacleto successful statistical treatment

II. RESULTS

of the data.

AND DISCUSSION

A. Temporal translocation and gesture scaling

Figure1(a) displaysfiveF 2 trajectories for thesegment /I1/(from "build") producedby a controlsubject,andFig. 1(b) showsF2 trajectoriesfor the samesegmentproduced

D

APRAXIC

R

BUILD SLOPE (N=5)

N

SLOPE (N=5)

800 't

X= 9.62 Hzlms SD= 1,48

•1800

z

X= 5.48 Hzlms SD= 1.58 TRANSCRIPTION I/II/ 2/Ill 3/It/ 4/II/ 5/œ1/

,,z,1300

LU1300

3

o

800

800

100 200 TIME (ms)

(a)

APRAXIC BIG

100 200 TIME (ms)

(a)

T

APRAXIC BIG

1800

1800

SLOPE (N=4)

X= 5.23 Hz/ms SD= .81

N

R

SLOPE (N=5)

X= 4.22 Hzlms $D= 1.74

N

TRANSCRIPTION

-

z

2

LU1300

12345-

/II/ /I:l/ /I:l/ /I:t/ /I:l/

TRANSCRIPTION 1-/It/ 2-/ill 3/11/ 4lit/ 5Ifil

o z

,,,1300 D

o

o 1

800

800

0 (b)

100

200

FIG. 1. (a) Five F2 trajectoriesfor the syllablenucleus/I1/producedin the "BIG" stressconditionby a neurologicallynormalsubject;(b) fiveformant trajectoriesfor the samesyllableandstressconditionproducedby a subject with apraxiaof speech. 2985

100

TIME (ms)

J. Acoust. Soc. Am., Vol. 92, No. 5, November 1992

(b)

200

TIME (ms)

FIG. 2. (a) FiveF2 trajectories forthesyllablenucleus/fl/producedin the "BUILD" stressconditionby a subjectwith apraxiaof speech;(b) five formanttrajectories producedby thesamesubjectin the "BIG" stresscondition.

J.M. Liss and G. Weismer: Letters to the Editor

2985

Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 155.33.120.209 On: Wed, 03 Dec 2014 17:55:25

by an apraxicspeaker.All of thesetrajectorieswere taken from utterancesin which the word "big" wascontrastively

is probablyrelatedto a gesturescalingproblemthat is expressed in theformof variabilityof thegesturemagnitude.A stressed. directquantitativepredictionof thisviewisthathyperscaled Usingthesetwo setsof trajectories, it ispossible to demgestures(those with large initial swingsin F2) shouldbe onstratehowtraditionalmeasures of formantslopeandseg- associated with later-occurringonsetsof the majorarticulament duration can under-represent important differences torygesture.Examinationof thefivetrajectories in Fig. 2 (b) betweenthesesetsof formants.Considerthatthemeanslope generallysupports thisidea.Hereagain,thedevelopment of value of the steepestportionsof the trajectoriesfrom the suchpredictionsis basedlargelyon visualinspectionof the normal speaker[ Fig. 1(a) ] is 9.62 Hz/ms (s.d. = 1.48). superimposed trajectories, andwouldnot emergefromstrict Comparedto the mean slope(5.23, s.d. = 0.81) of four of adherenceto traditionalquantification. the trajectoriesproducedby the apraxic speakerin Fig. B. Articulatory decomposition/segmentalization 1(b), we can concludethat the trajectoriesof the control speakerare steeper.Further comparisonrevealsthat the The precedingexamplesillustrate that observations transitiondurationsof the apraxicsubjectarealsogenerally from qualitativeanalysis(temporaltranslocation)canlead greaterthan thoseof the normal speaker.Thus, from this to testablehypothesesabout the underlying mechanism traditionalquantitativeapproach,we caninfer that the ma(gesturescaling).Taking thisonestepfurther,the qualita-

jor articulatory gesture 2 for/I1/was produced at a slower

tive analysiscan alsopoint to hypotheses that beardirectly on the development of theoriesof motorspeechdisorders. The scalingproblemdescribedaboveand its possible This conclusion is consistent with other studies of the relationshipto temporal translocationof transitionsmay articulatorydeficitin apraxiaof speech(cf. Kent andRosen- alsobe relatedto, or be a byproductof, articulatorydecombek, 1983),andonecouldbeinclinedto stophere.However, position( or, segmentalization) of speech. Thisphenomenon the conclusionof slowness and the possiblelengtheningof isthoughtto becharacterized by a reductionin thedegreeof the major articulatorygesturedoesnot captureall that is overlapbetweensuccessive articulatorygestures(Weismer unusualaboutthetrajectoryrepetitions. Despitethegeneral andLiss,1991), resultingin speechacousticwaveformsthat similaritiesamongthemostrapidlychangingportionsof the reflect reduced coarticulationor coproduction(i.e., evitrajectories,thereare substantial differences amongthe dudenceof motor controldeficits).Articulatorydecomposirationsof the segments thatprecedethem.In Fig. 1(a), the tion hasbeenobservedpreviouslyfor both apraxicand dysegmentof the trajectoryprecedingthe downwardsloping sarthricspeakers(Kent and Rosenbek,1983;Weismerand portionisrelativelybriefandtopologically uniform3 across Liss,1991;Weismeret al., 1992), andis likely to be largely four of the fivetokens.This is not true for the repetitionsof responsiblefor the perceptionof "scanningspeech"(rethe apraxicspeaker[Fig. 1(b)] wherethe trajectoriesapducedsegmentdurationcontrasts)in thesedisorders(e.g., pearto be "translocated"acrossthe time axis.Specifically, Zieglerand von Cramon,1986). Comparethe productions thesetrajectoriesgenerallyend in the expecteddownward of an apraxicspeakershownin Fig. 2 (a) and (b) to thoseof slope,but the durationsof the flat portionsprecedingthe thecontrolspeakerin Fig. 1(a). Four of the fiveF 2 valuesat downwardsegmentsrangefrom approximately25 to nearly theonsetof thenormaltrajectories,or at thepointwherethe 200 ms.Theseplateauswouldbe consistent with periodsof consonantconstrictionis just releasedinto the vocalicsegrelativearticulatoryimmobility,and mustbe explainedin ment, range between 1600-1700 Hz. This is a reasonable any comprehensive theoryof motor speechdisorders.The "target" value for/I/produced by a geriatricmale in this pointhereis that withoutvisualexamination of thesesuper- phoneticcontext.Four trajectoriesshowa brief steadystate imposedformanttrajectories, onewouldnotthinkapriorito (about 30 ms) around thesefrequencies,followedby the measurethe time precedingthe downwardslopingportion characteristicfallingtransition.In contrast,sevenof the ten of thetrajectories.Thus,thequalitativestepof visualexami- /I1/trajectoriesproducedby apraxicspeaker[Fig. 2 (a) and nationof the disorderedtrajectoriesrevealsa phenomenon (b) ] have startingfrequenciesthat are substantiallylower that lendsitselfeasilyto quantification; that is not apparent than expectedfor/I/, due to the failure to positionthe from thetracingsof a normalspeaker;andthat maybeasso- tonguein a high-frontpositionduringthe periodof vocal ciatedwith underlyingissuesof motor control. tract closure for the word-initial/b/. At the release of the Another examplecan be found in Fig. 2 (a). Note that /b/, then, the tonguemust moveto the requiredhigh-front four of thetrajectorieshaveessentially thesamestartingfreposition,a gesturereflectedin severalof thesetrajectoriesas quency (around 1300 Hz). Nevertheless,the onsetsof the the initial risingportionof the F 2 trajectory.Thesevarious downwardslopingtransitionin this setof trajectories--the attemptsresultin differentscalingsof the gesture,aswell as "temporal translocation"of the transitions--are highly the phenomenonidentifiedaboveas "temporaltranslocavariable,apparentlyas a consequence of variabilityin gestion" of transitions. ture scalingfollowingreleaseof the/b/. Thesegesturescaling difficulties,as evidencedhereby large variationsin the C. Theoretical considerations magnitudesof the initial risingfrequencyswingsalongthe F2 trajectories,shouldbe associatedwith variationsin the In our experience,the kindsof phenomenadescribedin onsettime of the major articulatorygesture(i.e., the transithispaperarecommonin the speechof individualswith motion). In otherwords,temporaltranslocationof transitions tor speechdisorders.We haveshownhowa qualitativeanalrate and over a greaterperiodof time, as comparedto the gesturefor the normal speaker.

2986

J. Acoust. Soc. Am., Vol. 92, No. 5, November 1992

J.M. Liss and G. Weismer: Letters to the Editor

2986

Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 155.33.120.209 On: Wed, 03 Dec 2014 17:55:25

• Contrastive stress is a commontherapytechnique wherebya speaker is ysisprocedure (i.e.,visualexamination of superimposed trataught to emphasizethe contentor importantwordsin an utteranceto jectories)canleadto theoreticalperspectives on the nature enhanceintelligibilityand message transfer. of the speechproductiondeficitin motorspeechdisorders By "majorarticulatorygesture"wemeanthepartof thevocalicnucleus andyieldsometestable(and quantifiable)ideas.Specifical- that demandsthe mostextensive(and perhapsthe mostrapid) changein vocal tract configurationto producethe desiredsoundsequence.The ly, amongthe articulatoryaberrancies identifiedhere (temacousticcorrelateof thisis the operationallydefinedtransitionportionof poraltranslocation of transitions, inappropriate scalingof the formant trajectory. gestures, articulatorydecomposition/segmentalization), we "Topologically uniform"impliesa consistent trajectory shape across reppostulate thattheformertwoareprobablybyproducts of the etitions.The implicationis not that the specificfrequency-timecoordinatesare identicalor nearly so acrossrepetitions,but that the different latter. One possibilityis that segmentalization of articulatrajectoriescould be translatedup or down the frequencyscale(or, in tory gestures inducescompensatory responses onthepart of somecases,the durationscale)with the resultingsuperimposition showthe speakerthat leadto variationsin gesturescaling,and ing a minimumvariabilityfor a particularportionof the trajectory.In thustemporaltranslocationof transitions. normal speakers,the amount of frequencyor time scaletranslationreThis is an attractive theoretical notion for at least three

quiredto demonstrate topologicaluniformityistypicallyveryminor.Trajectoriesthat are not topologicallyuniform,by this definition,wouldnot showa reductionof the trajectoryvariabilitywhenthisscaletranslationis performed.

reasons. First,it attemptsto unifytheexplanatory apparatus of variousspatio-temporal articulatorydifficultiesin some motor speechdisordersby meansof a general,well-attested Cooke, J. D., and Brown, S. H. (1986). "Science and statisticsin motor phenomenon(i.e., segmentalization). Second,it fitsin with physiology,"J. Motor Behav.17, 489-492. certain contemporaryaccountsof successive articulatory Corcos,D. M., Agarwal,G. C., Flaherty,B. P., andGottlieb,G. L. (1990). overlapsnamely,thenotionof gesture slidingandblending "Organizingprinciplesfor single-joint movements. IV. Implicationsfor isometriccontractions,"J. Neurophysiol.64, 1033-1042. describedby Saltzmanand Munhall (1989)--that might Kent, R. D., andRosenbek,J. C. (1983). "Acousticpatternsof apraxiaof permittheoreticalanalysisof motorspeechdisorders within speech,"J. SpeechHear. Res.26, 231-249. the frameworkof a theoryof normalspeechproduction. Liss,J. M., and Weismer,G. (submitted). "Acousticcharacteristicsof conThird,asnotedabove,it suggests certainempiricalteststhat trastivestressproductionin control geriatric,apraxic,and ataxic dysarthricspeakers," submittedto Clin. Linguist.Phon. are amenableto quantitativeanalysis. McNeil, M. R., Liss,J. M., Tseng,C-H., andKent, R. D. (1990). "Effects The full potencyof qualitativeanalyseslikely will be of speechrateon the absoluteandrelativetimingof apraxicandconducrealizedwhensuchattemptsare guided--butnot limited-tion aphasicsentence production,"Brain Lang.38, 135-158. bytheoretical considerations. In effortsto identifythepoten- Saltzman,E. L., and Munhall, K. G. (1989). "A dynamicalapproachto gesturalpatterningin speech production,"Ecol.Psychol.1, 333-382. tially "important"acousticmeasures in motorspeech disorSussman, H. M., Marquardt,T. P., MacNeilage,P. F., and Hutchinson,J. ders,we regardthequalitativeprocedure asindispensable. A. (1988). "Anticipatorycoarticulationin aphasia:SomemethodologACKNOWLEDGMENTS

ical considerations," Brain Language35, 369-379. Weismer,G., Kent, R. D., Hodge,M., andMartin, R. (1988). "The acoustic signature for intelligibilitytestwords,"J. Acoust.Soc.Am. 84, 12811291.

This researchwassupportedby a Universityof Minnesota Graduate SchoolGrant-in-Aid of Research,Artistry, and Scholarship,and NIH Grant No. NS 18797.We extend our gratitudeto Karen Forrestfor her thoughtfulcomments on an earlierversionof this paper.We alsowishto acknowledgethe laboratoryassistance providedby Kristin Little and StephanieHanson. Addressreprint requeststo Julie Liss, Ph.D., Departmentof CommunicationDisorders,115Shevlin Hall, Universityof Minnesota,Minneapolis,MN 55455.

2987

J. Acoust. Soc. Am., Vol. 92, No. 5, November 1992

Weismer,G., and Liss,J. M. (1991). "Acoustic/perceptualtaxonomiesof disordered speech,"in Dysarthriaand•4praxiaof Speech: Perspectives on Management,editedby C. Moore, K. Yorkston,and D. Beukelman (Brookes,Baltimore), pp. 245-270. Weismer, G., Martin, R., Kent, R. D., and Kent, J. F. (1992). "Formant trajectoriesin men with amyotrophiclateral sclerosis," J. Acoust.Soc. Am. 91, 1085-1098.

Westbury,J. R. (1991). "On the analysisof speechmovements," J. Acoust. Soc. Am. 89, 1870(A).

Ziegler,W., and von Cramon,D. (1986). "Disturbedcoarticulationin apraxiaof speech;acousticevidence,"Brain Lang.29, 34-47.

J.M. Liss and G. Weismer: Letters to the Editor

2987

Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 155.33.120.209 On: Wed, 03 Dec 2014 17:55:25

Qualitative acoustic analysis in the study of motor speech disorders.

Traditional measurements performed on the acoustic signals of normal speech are frequently used to quantify the acoustic characteristics of disordered...
602KB Sizes 0 Downloads 0 Views