Computerized design of speech prostheses

Rebecca J. Leonard, PhD (a)
University of California, Davis, Sacramento, Calif.

[...] glossal [...] allow tests of speech parameters prior to actual fabrication. Advantages of the techniques are also discussed. (J PROSTHET DENT

The design of prostheses to improve speech in glossectomized speakers is typically encumbered by the necessity of repeatedly "trying in" a prosthesis. That is, the prosthesis is initially made in some rudimentary form, then worn by the patient while speech and possibly other types of data recordings are made for subsequent analyses. Based on the results of these analyses, the prosthesis may be modified and tried again, perhaps many times, before a presumably "best" design is determined and made in permanent form. While some degree of trial and error will always be necessary to satisfy the comfort and fit requirements of each patient, computer-aided design of a prosthesis on an "a priori" basis may substantially expedite those aspects of fabrication related to speech improvement. Pertinent features of this approach are described in this article. It is assumed that patients have previously undergone dental and speech evaluations and have met the respective prosthodontic and speech criteria as candidates for prostheses.

RADIOGRAPHIC AND DENTAL STUDIES

Certain anatomic and physiologic characteristics serve as inputs to the design process. In the author's clinic, initial information is obtained from dynamic, videofluoroscopic x-ray studies of swallowing and food management capabilities. These studies are a routine part of clinical evaluation and follow-up in patients undergoing treatment for oropharyngeal cancer. Alternatively, lateral still x-ray studies can be used. Each patient is accompanied to the radiology suite and is given different amounts and consistencies of barium or barium-coated materials to swallow. The patient is filmed in both lateral and anterior-posterior views, from the lips to the cervical esophagus just below the larynx. A radiopaque centimeter stick is placed in the same plane of x-ray imaging to allow accurate measurements from the films obtained. Experience suggests that the best speech data are obtained just after a small amount of thin, liquid barium has been swallowed, and before large amounts, or thick consistencies, of barium have been introduced.

This research was supported by a grant from the National Cancer Institute, grant No. 1 KO4 CA01125-03.
(a) Associate Adjunct Professor, Department of Otolaryngology, Head and Neck Surgery.

When a prosthesis to improve speech is under consideration, the patient is also asked to produce a number of speech stimuli that represent extreme positions attained by the tongue during connected speech. Stimuli currently used include the sustained vowels "ee" (/i/, as in see), "ah" (/a/, as in sock), and "oo" (/u/, as in suit), and consonants requiring elevation of the anterior and posterior tongue, respectively, for example, repetition of "ta-ah" and "ka-ah." Connected speech is assessed by having the speaker count from 1 to 10 and repeat the sentence "I see a fox in the chicken coop." While data for both vowels and consonants are collected, procedures currently in use relate primarily to vowel productions. Thus the examples presented focus on these stimuli. Productions of isolated vowels are filmed first in lateral view and then repeated in frontal view. All other speech stimuli are completed only in lateral view. Speaking tasks add a negligible amount of time to the radiographic study and are believed to be well worth the extra expenditure. All aspects of the speech and swallowing studies are recorded for later review and analysis.

The x-ray study permits a complete investigation of tongue dynamics in the lateral view. However, relative capabilities of right and left portions of the tongue are better assessed by a combination of radiographic and dental techniques.
Frontal view x-ray films of vowels in isolation provide information about the extent of elevation of left versus right tongue movements, both anteriorly and posteriorly. Standard pressure paste dental techniques are used to supplement this assessment by specifying the location on the palate and the extent (anterior to posterior, right to left) of residual tongue-palate contact during production of "ta-ah" and "ka-ah."

AUGUST 1991 VOLUME 66 NUMBER 2
COMPUTER-DESIGNED SPEECH PROSTHESES

Fig. 1. Composite drawing of normal speaker's tongue and jaw positions for three different vowels. Numbers indicate landmarks from x-ray view of each separate vowel that are overlaid in producing composite drawing.

Fig. 2. Boundaries used in determination of range of tongue and jaw motion (in area, in square centimeters) from composite drawing. Numbers indicate anterior (1), superior (2), posterior (3), and inferior (4) boundaries. Tongue shapes for vowels "ee" (/i/), "ah" (/a/), and "oo" (/u/), representing extreme displacements, are also indicated.
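The radiopaque centimeter stick placed in the imaging plane is what makes measurements from the films possible. As a minimal sketch, assuming the tracings have been digitized as pixel coordinates (the function names and sample values below are hypothetical, not taken from the article), the conversion to centimeters could look like this:

```python
import math


def cm_per_pixel(p0, p1, known_cm):
    """Scale factor from two digitized stick marks a known distance apart."""
    pixel_dist = math.dist(p0, p1)
    if pixel_dist == 0:
        raise ValueError("calibration points must be distinct")
    return known_cm / pixel_dist


def to_cm(points_px, scale, origin_px=(0.0, 0.0)):
    """Convert traced pixel coordinates to centimeters relative to origin_px."""
    ox, oy = origin_px
    return [((x - ox) * scale, (y - oy) * scale) for x, y in points_px]


# Hypothetical values: two stick marks 5 cm apart, digitized 200 pixels apart.
scale = cm_per_pixel((100, 50), (100, 250), known_cm=5.0)
tongue_cm = to_cm([(140, 90), (180, 130)], scale, origin_px=(100, 50))
print(scale, tongue_cm)
```

The sketch assumes a single uniform scale, which holds only if the stick and the traced structures lie in the same imaging plane, as the procedure specifies.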

ANALYSES OF X-RAY AND DENTAL STUDIES

Upon completion of radiographic and dental assessments, the videotaped record of the x-ray study is played back until steady-state portions of each speech sound are identified. A tape playback system capable of stop-frame and slow motion, such as the Panasonic NV-8930 (Matsushita Electric Industrial Co., Ltd., Osaka, Japan), is ideal for this task. For vowel sounds, the steady-state position for "ee" (/i/, as in see) corresponds to the point at which the tongue tip is at its highest, most forward point, while for "ah" (/a/, as in sock), the optimal position is that in which the tongue and mandible are lowest and most open, respectively. For "oo" (/u/, as in suit), the point where the posterior part of the tongue is most elevated is targeted as the steady-state position.

Once the target positions for each vowel sound have been isolated on the videotape, a clear plastic overlay is affixed to the TV monitor and pertinent features of the lateral view x-ray film are traced. Alternatively, this task may be performed with a computer equipped with a frame-grabber video digitizing board and appropriate image enhancement/analysis software. For each vowel production, care is used in tracing the hard palate and teeth and the upper three or four cervical vertebrae. In addition to these superior and posterior landmarks, the tongue, as completely as possible, and two or three landmarks of the lower jaw, such as the lip, teeth, and chin, are traced.

THE JOURNAL OF PROSTHETIC DENTISTRY

Fig. 3. Tracings of tongue/jaw positions for sustained vowels /i/, /a/, and /u/ in normal female speaker. Different constriction patterns produce acoustically different formant patterns and consequently perceptually different vowels. Key: .... /i/ (see); -- /a/ (sock); --- /u/ (suit).

Fig. 4. Tracings of tongue/jaw positions for sustained vowels /i/, /a/, and /u/ in female speaker with moderate glossal resection. Anterior tongue mobility appears somewhat reduced compared with normal, but greatest reduction in range of motion is in posterior part of tongue. Key: .... /i/ (see); -- /a/ (sock); --- /u/ (suit).

From the completed tracings of /i/, /a/, and /u/, a composite drawing incorporating tongue plus jaw positions for all three productions is made. To this end, tracings of the three vowels are superimposed so that the landmarks (Fig. 1), including the alveolus, posterior nasal spine of the hard palate, and cervical vertebrae, are directly overlaid. Tongue plus jaw positions for each vowel, represented in the figure by dashed, dotted, and solid lines, respectively, are then simultaneously displayed. Those portions of each tongue shape in the composite drawing that represent the maximal displacement of the tongue from rest are also highlighted (solid black line in Fig. 1). Once the composite drawing is completed, the following measurements are obtained.

1. Overall area of tongue displacement

This measure is arbitrarily determined from the composite drawing by defining four boundaries (oral, tongue, pharyngeal, and floor of mouth) that collectively encompass the two-dimensional articulatory tongue plus jaw positions assumed by the speaker across the three vowel productions. As illustrated in Fig. 2, the oral boundary (labeled 1 in the figure) is represented by a straight line connecting the positions of a landmark on the mandible, usually a lower tooth, at its points of maximal elevation and lowering. The superior border (2 in the figure) is formed by the shape of the outline of the tongue noted in Fig. 1; that is, the tongue's dorsal contour at points of maximal displacement from rest. Posteriorly, the boundary (3 in the figure) is a line reflecting the maximum anterior-posterior displacement of a point on the epiglottis, typically at the lingual-epiglottic junction. The inferior boundary (4 in the figure) is arbitrarily determined to be the shortest distance between the anterior and posterior boundaries.

Overall area of tongue motion occurring within the four boundaries, in square centimeters, can be determined by hand (or computer) and then compared with data previously computed for normal speakers (to date, the author has data for 10 adult women and 10 adult men). Range of motion for glossectomized speakers is expressed as a percentage of the mean range (considered to be 100%) determined for normal speakers. Examples of tracings demonstrating both normal and diminished ranges of motion for isolated vowel productions are presented for one normal and two impaired speakers (Figs. 3 through 5).

Fig. 5. Tracings of tongue/jaw positions for sustained vowels /i/, /a/, and /u/ in female speaker with extensive glossal resection. Range of motion of tongue, particularly as this relates to forming constrictions against palate and pharynx, is markedly diminished. Key: .... /i/ (see); -- /a/ (sock); --- /u/ (suit).

The information obtained is helpful in characterizing the extent of a mobility deficit. It has also proved particularly useful in selecting patients most likely to benefit from a prosthesis. Experience to date suggests that patients who achieve percentages of 50% to 80%, or less than 20%, on the area measurement may be the best candidates for prostheses. This observation, while preliminary and based on only 12 speakers, is supported by evaluations of speech before and after prosthesis placement. The prosthodontist may have other criteria that enter into this decision. The author's impression from these data is that speakers with ranges exceeding 80% are typically performing well on other speech measures and may benefit as much or more from speech therapy focusing on improved articulation and increased mobility of residual tongue as from a prosthesis. On the other hand, speakers with ranges between 20% and 50% have generally shown only small improvements with either intervention strategy. In contrast, patients with 50% to 80% of the normal range have frequently made significant improvements on a variety of speech parameters with maxillary prostheses¹ designed to complement residual tongue dynamics. Similarly, patients with extremely large tongue resections, for example, one quarter or less residual tongue, have also demonstrated substantial improvements in speech with mandibular prostheses.² As noted, these observations are tentative and await further exploration and explanation. However, when time and cost factors must be considered in selecting oropharyngeal patients as prospective candidates for speech prostheses, even such preliminary impressions may deserve attention.
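The area measurement can be done by hand, but when the four boundaries have been digitized it reduces to a polygon area. The following is a minimal sketch, assuming the closed outline is available as ordered (x, y) points in centimeters. The shoelace formula is a standard technique chosen here for illustration, not the author's stated method. The normal-speaker mean is a hypothetical placeholder, because the article does not report it, and the groupings simply restate the tentative observations above.

```python
def polygon_area(points):
    """Area of a closed outline via the shoelace formula (points in cm)."""
    n = len(points)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def percent_of_normal(area_cm2, normal_mean_cm2):
    """Range of motion as a percentage of the normal-speaker mean (100%)."""
    return 100.0 * area_cm2 / normal_mean_cm2


def tentative_band(percent):
    """Restates the article's tentative groupings; not a clinical rule."""
    if percent > 80:
        return "over 80%: speech therapy may help as much as a prosthesis"
    if percent >= 50:
        return "50% to 80%: possible maxillary prosthesis candidate"
    if percent >= 20:
        return "20% to 50%: small gains reported with either approach"
    return "under 20%: possible mandibular prosthesis candidate"


# Hypothetical outline (a 4 cm x 3 cm box) and a hypothetical normal mean.
outline_cm = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
area = polygon_area(outline_cm)
pct = percent_of_normal(area, normal_mean_cm2=20.0)
print(f"{area:.1f} cm^2, {pct:.0f}% of normal, {tentative_band(pct)}")
```

In practice the normal mean would presumably be chosen to match the speaker's sex, since the normative data are collected separately for women and men.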

2. Anterior, mid, and posterior displacement

Of particular importance to the design of prostheses is the differential analysis of tongue displacement (Fig. 6). In this scheme, the vocal tract is arbitrarily divided into anterior, middle, and posterior locations, and the percentage of overall tongue displacement occurring within each location is again compared with observations made for normal speakers across the vowel stimuli. As shown in Fig. 6, the anterior and mid regions are delimited by perpendicular lines extending from the maxillary alveolus, mid hard palate, and posterior hard palate, respectively, to the inferior range-of-motion boundary previously noted. Remaining displacement is considered posterior.
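The regional breakdown can be sketched in code by cutting the digitized displacement outline with the perpendicular lines and measuring each piece. The version below is an illustration under several assumptions that are not in the article: the outline is in centimeters with x increasing from anterior to posterior, the dividing lines are vertical in that coordinate frame, and everything in front of the mid-palate line is counted as anterior. The clipping routine is the standard Sutherland-Hodgman step for one half-plane.

```python
def polygon_area(points):
    """Shoelace area; returns 0.0 for degenerate outlines."""
    n = len(points)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def clip_at_x(points, x_cut, keep_left):
    """Keep the part of the polygon on one side of the line x = x_cut."""
    def inside(p):
        return p[0] <= x_cut if keep_left else p[0] >= x_cut

    out = []
    n = len(points)
    for i in range(n):
        a, b = points[i], points[(i + 1) % n]
        if inside(a):
            out.append(a)
        if inside(a) != inside(b):
            # The edge crosses the cut line: add the crossing point.
            t = (x_cut - a[0]) / (b[0] - a[0])
            out.append((x_cut, a[1] + t * (b[1] - a[1])))
    return out


def regional_percentages(points, x_mid, x_post):
    """Percent of total area lying anterior, mid, and posterior."""
    total = polygon_area(points)
    anterior = polygon_area(clip_at_x(points, x_mid, keep_left=True))
    posterior = polygon_area(clip_at_x(points, x_post, keep_left=False))
    mid_part = clip_at_x(clip_at_x(points, x_mid, keep_left=False),
                         x_post, keep_left=True)
    mid = polygon_area(mid_part)
    return tuple(100.0 * a / total for a in (anterior, mid, posterior))


# Hypothetical 6 cm x 2 cm outline with cut lines at x = 2 cm and x = 4 cm.
outline_cm = [(0.0, 0.0), (6.0, 0.0), (6.0, 2.0), (0.0, 2.0)]
shares = regional_percentages(outline_cm, x_mid=2.0, x_post=4.0)
print(shares)
```

Each speaker's three percentages would then be compared with the corresponding values for normal speakers, as the text describes.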

3. Location/extent of maximum anterior, mid, and posterior constriction

In producing vowels, it is not just the ability of speakers to move the tongue, but their ability to effect rather tight constrictions at certain locations along the hard palate and/or pharynx, that is critical. In fact, regardless of tongue mobility, if the speaker is unable to effect constrictions of less than 1 cm between the tongue and other structures, the vocal tract is, acoustically, dramatically neutralized.³ Thus the third set of measurements obtained from the x-ray study includes an estimate of the anterior, mid, and/or posterior locations where the greatest constriction is effected by residual tongue during vowel production. Actual distances between tongue and palate or pharynx at these sites of maximal constriction are also computed. In Fig. 2 these locations are approximately indicated by the arrows denoting each vowel.

At the completion of the x-ray analysis, it should be possible to specify for tongue tip, mid, and posterior tongue, and for the jaw, not only the range of motion observed across the speech stimuli, but also the location and extent of maximal constrictions in the oropharynx noted for these same stimuli.

An optional value that can be extracted from the x-ray study is the approximate resting length of the vocal tract, measured from the larynx to the lips. This measure is desirable, but not absolutely necessary to the subsequent prosthesis design process, since reasonable predictions of acoustic events in the vocal tract can be made by assuming a tract length of 15 cm in adult women and 17 cm in adult men.

As noted, relative contributions of right and left halves of the tongue to mobility data are estimated from frontal view x-ray films and from pressure paste studies of tongue-palate contact. Differences observed are quantified as much as possible and are incorporated in the assessment of overall tongue displacement.

Fig. 6. Range of motion differentiated according to anterior, mid, and posterior locations within vocal tract.

COMPUTER SIMULATIONS

The next stage of the design process involves the use of computerized techniques developed by Ladefoged⁴ and colleagues at the UCLA Phonetics Laboratory for use on Macintosh (Apple Computer, Inc., Cupertino, Calif.) computers. One program option provides a schematized lateral view of the human vocal tract similar to the one obtained from lateral view radiographs. With the tract in view on the monitor, the user can quantitatively specify a number of physiologic variables, including degree of elevation of the front and back of the tongue, lip opening and protrusion, laryngeal height, and mandibular and soft palate opening. When all features have been specified, the program computes area functions for 1 cm segments of the vocal tract from the larynx to the lips. Algorithms based on these data are then applied to determine the acoustic consequences of such a tract shape, that is, the effects it would have on sound (voice) generated at the larynx and transmitted to the mouth opening (Fig. 7). In another option, the user simply modifies, via a mouse input device, a given vocal tract shape until tongue and/or palate, for example, conform to a particular configuration. Once the desired shape is attained, the program again predicts the likely acoustic characteristics of speech output from the tract.

Fig. 7. Output from computer. At right are values (in pixels) specifying characteristics of certain vocal tract parameters. For shape specified, area functions for 1 cm sections of tract are computed. Algorithms performed on these data determine acoustic effects on sound directed through tract. Predicted values of F1, F2, and F3, respectively, are presented at bottom. These formants provide powerful perceptual cues to a listener attempting to identify which vowel is heard.

Currently, the program allows only for determination of acoustic characteristics most critical to vowel production and perception, i.e., vowel formants. Formants are frequency locations at which the greatest amount of sound energy is allowed to pass when a complex sound, such as that produced by the human larynx, travels through the airway to the mouth opening.³ They correspond closely to the natural resonance characteristics associated with the three-dimensional configuration of the vocal tract. Several formant frequencies are apparent in the output spectrum (frequency by intensity display) of any vowel (Fig. 8). However, only the lowest three, F1, F2, and F3, are thought to change significantly with each change of shape of the tract, that is, with each vowel articulated. Since these frequency locations are unique for a given vocal tract shape and thus unique for each vowel, it is not surprising that they represent a powerful perceptual cue to a listener. In fact, formant frequencies provide the major cue to a listener's identification of vowel sounds.

Fig. 8. Amplitude peaks in output spectra (amplitude by frequency displays) indicate formant locations associated with each vowel. These locations differ according to shape of vocal tract and hence are acoustically (and perceptually) unique for each vowel. Panels: /i/ (see), /a/ (sock), /u/ (suit), with peaks labeled F1, F2, and F3; horizontal axis: FREQUENCY (cps).

The author has found that formant data resulting from Ladefoged's computerized prediction scheme provide reasonable approximations to actual acoustic data obtained from vowel recordings. These recordings are collected at the time of the x-ray study and are subjected to separate formant analyses using spectrographic techniques.
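To give a feel for the formant values involved, the sketch below uses the textbook quarter-wavelength model of a uniform tube closed at the larynx and open at the lips, F_n = (2n - 1)c / (4L). This is not the algorithm of the program described above, which works from area functions of the shaped tract. It only estimates the neutral resonances for the assumed tract lengths of 15 cm and 17 cm mentioned earlier. The speed of sound and the measured values are assumptions introduced for illustration.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed value for warm, humid air


def neutral_formants(tract_length_cm, count=3):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other."""
    return [(2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * tract_length_cm)
            for n in range(1, count + 1)]


def percent_errors(predicted_hz, measured_hz):
    """Signed percent difference of each predicted formant from the measured one."""
    return [100.0 * (p - m) / m for p, m in zip(predicted_hz, measured_hz)]


women = neutral_formants(15.0)  # assumed adult female tract length
men = neutral_formants(17.0)    # assumed adult male tract length
print([round(f) for f in women], [round(f) for f in men])

# Hypothetical spectrographic measurements for a comparison of the kind
# described in the text (predicted versus recorded formants).
measured = [520.0, 1500.0, 2600.0]
print([round(e, 1) for e in percent_errors(men, measured)])
```

A speaker who cannot form constrictions tighter than about 1 cm would be expected to produce formants near these neutral values for every vowel, which is one way to read the "neutralized" vocal tract described earlier.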


Fig. 9. Formant characteristics (F1, F2, and F3) associated with different degrees of palatal contouring are presented. Three shapes produce acoustically different formant characteristics and, in this case, perceptually different vowels.

Vocal tract shapes are generated on the computer that closely approximate radiographic data for an individual speaker's vowel productions. For a given shape, the program predicts formant locations in the output spectra that can be compared with the actual formant data obtained. As noted, correlations between actual and predicted values have been found to be reasonable. This potential for accurate prediction is particularly powerful, of course, because it enables modification via computer of simulated vocal tracts and prediction of certain acoustic consequences of the modification, which is the final stage of the design process.

COMPUTERIZED PROSTHESIS DESIGN

Once the necessary anatomic and range of motion (tongue plus jaw) characteristics of the speaker have been specified and the accuracy of predicted vowel formants has been determined for vowel productions, it is possible to modify via computer an individual vocal tract in accordance with possible prosthetic objectives. For example, if the patient's difficulty appears to be related to an inability to effect anterior constrictions between the tongue and palate, then those vowels that require such constrictions, referred to as high, front vowels, are likely to be impaired. With the computerized system described, the user might lower the palate in an anterior location according to a desired contour and then project an appropriate voicing signal through the altered vocal tract. The computerized prediction of vowel formants likely to be "output" by the modified tract, compared with normative data, provides insight into the benefits of the design for speech.

The palatal modification could also be held constant while several other variables, such as mandibular opening or tongue elevation, are systematically varied and their effects on speech output are tested. For example, the vowel "ee" requires a very high, anterior tongue constriction with little jaw opening, while the vowel "i" (as in "bit") requires slightly less anterior tongue constriction and greater jaw opening, and the vowel "e" (as in "bet") requires even less anterior tongue constriction with slightly more jaw lowering. Knowing these characteristics of normal speech and guided by mobility data previously determined, the user can "set" a speaker's probable tongue plus jaw shape for a whole series of vowels and then make reasonable predictions of the palatal modification's effects on each production. The process may then be repeated for a new palatal modification and continued until the extent and nature of palatal contouring most likely to maximize vowel improvement have been determined. Examples of the effects of such systematic changes in vocal tract shape on vowel acoustics are presented in Fig. 9. Similar techniques can also be used in the design of a mandibular prosthesis.

When an "optimal" design has been determined from computer experimentation, an actual prototype prosthesis, either a palatal reshaping prosthesis that accommodates residual tongue plus mandible activity or a mandibular prosthesis that interacts with residual palate, or even both, can be fabricated and "tried in" the patient. Although some speech-related changes in the prosthesis may still be required at this stage, experience suggests that these will be minimal.

SUMMARY

The procedures described represent a preliminary, introductory approach to the use of computerized analysis and simulation techniques in the design of speech prostheses for glossectomized patients. As noted, programs currently available allow only vowel analysis on an "a priori" basis. Similar techniques for consonants, as well as for other speech characteristics, remain to be developed. In addition, the programs described assume a normal vocal tract and a normal speaker, which means that desired modifications of the vocal tract are sometimes cumbersome. However, modifications of the programs currently in progress will allow greater flexibility in specification of variables and a more direct approach to prosthesis design.

[...] of speech prostheses for patients experiencing speech impairment as a consequence of oropharyngeal cancer. With the completion of improvements already in progress, it is expected that computer-assisted prosthesis design will assume an even more important role in the rehabilitation of speech, and eventually swallowing, in this patient population.

REFERENCES
1. Leonard R, Gillis R. Differential effects of pr[ostheses] on speech in glos[sectomized patients]. J Prosthet Dent 1990;64:701-8.
2. Leonard R, Gillis R. Effects of a prosthetic tongue on vowel formants and isovowel lines in a patient with total glossectomy. J Speech Hear Disord 1983;48:423-6.
3. Lieberman P. Speech physiology and acoustic phonetics. New York: MacMillan Publishing Co, Ltd, 1977.
4. Ladefoged P. Vocal--a Macintosh computer program. Los Angeles, Calif.: Phonetics Laboratory, Linguistics Department, UCLA.

Reprint requests to:
DR. REBECCA J. LEONARD
DEPT. OF OTOLARYNGOLOGY/HEAD AND NECK SURGERY
2500 STOCKTON BLVD.
SACRAMENTO, CA 95817


Computerized design of speech prostheses.

The use of computerized techniques to assist in the design of palatal and/or glossal prostheses is described. Patients with oropharyngeal resection an...