Ann Otol 84: 1975

OBJECTIVE EVALUATION OF VOCAL PATHOLOGY USING VOICE SPECTROGRAPHY

EUGENE RONTAL, M.D.
MICHAEL RONTAL, M.D.
MICHAEL I. ROLNICK, PH.D.

SOUTHFIELD, MICHIGAN

SUMMARY - Permanent objective evaluation of vocal changes associated with laryngeal pathology is a goal which has been difficult for the laryngologist and speech pathologist to attain. Most attempts at achieving objective records have focused on direct visual examination of the larynx using techniques such as high speed photography, x-ray studies or histologic sectioning. However, the important subjective qualities of the voice are difficult to translate into objective visual patterns. In order to produce these patterns, certain individual components of the voice (i.e., breathiness, periodicity and formant structure) must be analyzed. Recently, modifications of the sound spectrograph have enabled the clinician to objectively visualize these components. The patterns produced by the spectrograph may be applied to a variety of clinical situations. First, the technique aids greatly in determining the success or failure of medical and surgical management for vocal cord lesions. Secondly, voice spectrography can readily show improvements or deficiencies in vocal rehabilitation for functional dysphonia. Lastly, this method provides an objective, permanent record of the voice which may be useful from a medicolegal standpoint. Sound spectrographic analysis of vocal pathology is an important diagnostic tool for the clinician. Its future use should be encouraged as a more precise aid in the evaluation of the voice.

Permanent objective evaluation of vocal changes associated with laryngeal pathology is a goal which has been difficult for the laryngologist and speech pathologist to attain. Most attempts at achieving objective records are focused on direct visual examination of the larynx, using techniques such as high speed photography, x-ray studies, or histologic sectioning. However, the subjective qualities of the voice are difficult to translate into objective visual patterns. In order to produce these patterns, the individual components of the voice must be analyzed. Clinical modifications in voice spectrographic analysis have allowed adequate visualization of these components. Once these patterns are formalized, they can be applied to a variety of clinical situations. Although attempts have been made in the past to use spectrography in the evaluation of certain types of hoarseness, direct clinical application to voice problems has been a recent development. This paper is a presentation of our use of spectrography in a clinical situation for evaluation of vocal rehabilitation, surgical treatment and medical management of a variety of vocal cord disorders.

DESCRIPTION OF VOICE SPECTROGRAPHY

The voice spectrograph is not a new method of voice analysis. Potter et al1 first developed this technique at the Bell Telephone Laboratories in the 1940's. Spectrography has been a useful tool in voice research since its first development. It differs from a visible speech translator in that an actual graphic analysis of the speech pattern, rather than a display in a cathode ray tube, is produced. The graphic display is known as a spectrogram.

Presented at the meeting of the American Broncho-Esophagological Association, Atlanta, Georgia, April 7-8, 1975.


VOICE SPECTROGRAPHY

Iwata and Von Leden7 have discussed the use of spectrograms in the study of laryngeal disease. They suggested that different laryngeal diseases might yield different spectrographic characteristics. Cooper8 reported spectrographic analysis as a tool to describe and compare fundamental frequencies and hoarseness in dysphonic patients before and after vocal rehabilitation. Isshiki et al9 have presented a classification system for hoarseness using spectrograms. Rontal et al4 pointed out the usefulness of spectrographic analysis as a clinical tool in evaluation of the voice following Teflon®* paste injection of paralyzed vocal cords. In spite of these contributions to the literature, few laryngologists or speech pathologists are utilizing this important, objective method of voice analysis.

Sound spectrography, through a series of filters, presents a picture of the intensity, frequency and duration characteristics of the voice. These characteristics are translated into electrical impulses, which an electrical stylus prints out as a pattern on a rotating paper graph. Either steady state vowels such as "ah" or "ee" or spoken words and sentences can be evaluated by this technique. Depending on the information desired, a number of different patterns can be elucidated. A trained individual may then "read" the spectrograms produced and view the phonatory, articulatory and resonance qualities of the human voice.

The technique called "voice printing" has confused the clinical application of voice spectrography. The voice print is a form of voice spectrogram used to identify individuals in criminal investigations, based on vocal characteristics of their spoken words. The voice spectrograph, on the other hand, analyzes the specific components of the voice by relating acoustic energy to physiology, so that vocal function can be assessed. The sound spectrogram best analyzes steady state vowels for the purpose of evaluation of phonatory characteristics.
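For readers who want to reproduce a comparable display digitally, the filter-bank analysis described above can be approximated with a short-time Fourier transform. The following is only a minimal sketch and not the instrument used in this paper: the function name `spectrogram`, the 16 kHz sampling rate (chosen so that the frequency axis runs from 0 to 8,000 Hz), and the 3 ms analysis window (a short window, standing in for a broad band filter) are all assumptions made for illustration.

```python
import numpy as np

def spectrogram(signal, sample_rate, window_ms=3.0, hop_ms=1.0):
    """Return (times_s, freqs_hz, magnitude_db) for a mono signal.

    A short window gives fine time resolution, so individual vocal cord
    cycles appear as vertical striations. The window and hop lengths are
    illustrative assumptions, not values taken from the paper.
    """
    win = int(sample_rate * window_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    window = np.hanning(win)
    n_frames = 1 + (len(signal) - win) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + win] * window for i in range(n_frames)]
    )
    # Magnitude spectrum of every frame, converted to decibels.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    magnitude_db = 20 * np.log10(magnitude + 1e-10)
    freqs = np.fft.rfftfreq(win, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + win / 2) / sample_rate
    return times, freqs, magnitude_db
```

Each row of `magnitude_db` corresponds to one instant in time and each column to one frequency band, so plotting the transposed array with time on the horizontal axis and frequency on the vertical axis gives the familiar spectrogram layout.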


The patient is asked to produce a vowel in front of the input microphone for a period of 2.4 seconds. A recording is made on a special tape loop in the spectrograph. This sample is then repeated many times and analyzed through a series of filters. The analysis can be performed in approximately three minutes. Broad band spectrograms are made for the analysis of phonation and can be kept permanently in a patient's file. A spectrographic analysis of phonatory characteristics is performed at periodic intervals so that changes during treatment can be monitored.

SPECTROGRAPHIC CHARACTERISTICS OF THE NORMAL VOICE

In order to adequately understand voice spectrography, the normal physiology of voice production at the level of the glottis must be understood.10,11 This physiologic sound production is then related to the graphic display on the spectrogram. Sound is the alternate compression and rarefaction of air molecules. Subglottic air, trapped in the trachea by the closed vocal cords, is compressed by the accessory muscles of respiration. As the pressure beneath the glottis increases, the vocal cords are separated, allowing the escape of the compressed air. When subglottic pressure falls, the vocal cords cannot be held open. They move back passively by their intrinsic tension. The subglottic pressure can then build to produce another wave of compressed air. The result is a train of compressions and rarefactions, which is heard as sound. The sound spectrogram will show the various physiologic characteristics of the normal voice. Within this range of normality, there are certain definable differences, such as those between a male and a female voice, the characteristics of trained singers, the whisper voice and vocal fatigue. In the analysis of the spectrographic characteristics of the normal voice,

* E.I. DuPont de Nemours and Co., Wilmington, Delaware.


RONTAL ET AL.

Fig. 1. Spectrogram of a normal male "ah" vowel produced by an individual with no vocal pathology. Regular, periodic vertical striations indicate synchronous vocal cord movements with no irregularity of the vibratory pattern. The horizontal bars going across the pattern are formants and represent frequency regions of energy selectively amplified by the resonating cavities. The top half of the spectrogram is relatively clear, indicating that no excessive breathiness exists.

phonatory samples are made from steady state vowels. Because of interference from articulatory characteristics, spoken words are less apt to give reproducible, analyzable data. The patterns visible on the spectrogram are evaluated in terms of acoustic parameters (Fig. 1). Periodicity of vocal cord movement is a measure of the regularity of the opening and closing of the vocal cords. It is translated on the spectrogram as the vertical striations seen (Fig. 1). The synchronous, periodic opening and closing of the vocal cords in a normal voice will produce a corresponding regularity in the vertical striations seen. While periodicity is measured by the pattern of individual vertical striations, certain pitch characteristics of the voice can be observed in the closeness of the vertical striations on the spectrogram. The spectrogram also measures the formants of the normal voice. These formants are the horizontal bars represented as the darker bands on the spectrogram (Fig. 1). These bars or formants relate to the size and shape of the resonating cavities of the vocal tract. Formants will shift position during connected speech and with each sound being produced. On the spectrogram, clear

Fig. 2. "Ah" vowel produced by a female speaker. Normal periodicity of vocal cord movement is seen by the regularly spaced vertical lines. These vertical lines differ from the male voice in Figure 1 inasmuch as they are closer together. This is due to the higher pitched voice. The horizontal bars or formants are typical of the female voice. A lack of breathiness in the upper half of the spectrogram indicates good closure of the vocal cords.

and adequate formant structure is dependent upon a good resonating system and a lack of breathiness, as well as normal periodicity of vocal cord movement.

Differences between the male and female voice are demonstrable on the spectrogram (Fig. 2). In the female voice, the fundamental pitch is generally higher. Consequently, the more rapid vibrations of the vocal cords are represented by closer vertical striations. The remainder of the spectrographic characteristics are usually the same. A child's voice closely parallels the female voice in fundamental pitch and resonance characteristics. Until vocal changes associated with pubescence occur, both the male and female child will exhibit a similar spectrographic pattern.

The singing voice differs from the spoken voice. Vocal cord closure is too rapid to distinguish individual striations on the pattern produced (Fig. 3). The extra formant structure present is indicative of a finely tuned vocal mechanism. This is one of the distinguishing features of the trained voice, or of a voice engaged in singing. The regular wavy motion of the formant structure is indicative of the vibrato commonly heard in a singer's voice.

Other variations of the normal voice can produce distinct changes on a spectrogram. A whispered voice with no vocal cord movement shows a distinct pattern (Fig. 4). The excessive energy throughout the pattern is created by air escaping through a constricted, partially open glottis. Frequency is represented in the vertical axis of the spectrogram from 0 to 8,000 Hz. When the whispered or breathy voice is produced, energy is present in the highest frequency regions of the spectrogram. Further, vocal fatigue will also be indicated as a normal variant in a voice spectrogram (Fig. 5). Cords which are able to move in a regular, periodic manner may suddenly go into an aperiodic, low-pitched tone with subsequent return to a normal voice. This can occur in a normal individual's phonatory pattern at some time during the speaking day, when the vocal cords are fatigued or mucus rests upon them.

Fig. 3. A trained singer singing an "ah" vowel at a very high pitch. The vertical lines are blurred together although they make up the actual spectrogram. Extra formant structure can be seen at different frequency regions on the spectrogram. The wavy characteristic relates to the tremolo so often found in the trained singing voice.
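The relation between striation spacing and pitch can be made concrete: each vertical striation marks one vocal cord cycle, so a spacing of T seconds corresponds to a fundamental frequency of 1/T Hz. The sketch below is an illustration under stated assumptions rather than the authors' procedure. It uses an idealized pulse train in place of a real vowel and a simple autocorrelation peak to recover the pitch; the names `pulse_train` and `estimate_f0`, the 16 kHz sampling rate, and the example pitches of 100 Hz and 200 Hz are hypothetical choices, not values from this paper.

```python
import numpy as np

def pulse_train(f0, sample_rate, duration_s):
    """Crude stand-in for phonation: one unit impulse per vocal cord cycle."""
    x = np.zeros(int(sample_rate * duration_s))
    period = int(round(sample_rate / f0))
    x[::period] = 1.0
    return x

def estimate_f0(signal, sample_rate, f0_min=60.0, f0_max=500.0):
    """Estimate fundamental frequency (Hz) from the autocorrelation peak."""
    x = signal - np.mean(signal)
    # Keep only non-negative lags of the autocorrelation.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sample_rate / best_lag

def striation_spacing_ms(f0):
    """Time between successive striations for a given pitch."""
    return 1000.0 / f0

sample_rate = 16000
low_voice = pulse_train(100.0, sample_rate, 0.2)   # assumed lower-pitched voice
high_voice = pulse_train(200.0, sample_rate, 0.2)  # assumed higher-pitched voice
low_f0 = estimate_f0(low_voice, sample_rate)
high_f0 = estimate_f0(high_voice, sample_rate)
```

Doubling the pitch halves the value returned by `striation_spacing_ms`, which is why the striations of a higher-pitched voice sit closer together, and why at a very high sung pitch they can no longer be told apart.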


Fig. 4. Whispered vocal quality showing no evidence of vocal cord movement. Only frictional, random noise is present, caused by escape of air through a partially opened glottis. This glottal fricative sound can be produced in isolation, resulting in a whispered voice or with phonation resulting in a breathy vocal quality. The formant structure can be seen occurring due to the fact that the glottal fricative sound is being resonated by the resonating cavities.
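Breathiness is read from the spectrogram as energy in the upper half of the 0 to 8,000 Hz display. One hedged way to put a number on that visual impression is the fraction of signal energy at or above 4,000 Hz. The function `high_band_energy_ratio`, the synthetic test signals, and the 16 kHz sampling rate below are assumptions introduced for illustration; the paper itself relies on visual reading of the pattern and does not define such an index.

```python
import numpy as np

def high_band_energy_ratio(signal, sample_rate, split_hz=4000.0):
    """Fraction of total spectral energy at or above split_hz (0.0 to 1.0)."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = power.sum()
    if total == 0:
        return 0.0
    return float(power[freqs >= split_hz].sum() / total)

sample_rate = 16000
t = np.arange(sample_rate) / sample_rate  # one second of samples

# Voiced-like signal: harmonics of 120 Hz that stop below 4,000 Hz.
voiced = sum(np.sin(2 * np.pi * 120 * k * t) / k for k in range(1, 26))

# Whisper-like signal: frictional noise only, no periodic component.
rng = np.random.default_rng(0)
whisper = rng.normal(0.0, 0.5, size=t.size)

# Breathy signal: phonation with the same noise added on top.
breathy = voiced + whisper

ratios = {
    "voiced": high_band_energy_ratio(voiced, sample_rate),
    "breathy": high_band_energy_ratio(breathy, sample_rate),
    "whisper": high_band_energy_ratio(whisper, sample_rate),
}
```

In this toy setup the ratio rises from the clear voiced signal through the breathy one to the whisper, mirroring the progressively darker upper half of the spectrogram described in the text.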

Fig. 5. Example of a normal speaker producing an aperiodic voice due to mucus on the vocal cords. No laryngeal pathology is present. This is a normal occurrence that can take place at different times during the speaking day. This type of pattern can also be caused by vocal fatigue. The spectrograph is capable of picking up and portraying these normal variations of the voice.
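Aperiodicity appears on the spectrogram as uneven spacing between successive vertical striations. A simple way to summarize that unevenness, offered here only as an illustrative sketch, is the mean absolute difference between consecutive cycle lengths divided by the mean cycle length. This cycle-to-cycle measure is commonly called jitter; it is a swapped-in technique and not one used in this paper, and the function name `local_jitter` and the example cycle lengths are hypothetical.

```python
import numpy as np

def local_jitter(periods_ms):
    """Mean absolute cycle-to-cycle change, relative to the mean period.

    periods_ms: spacing between successive striations, in milliseconds.
    Returns 0.0 for perfectly regular cycles; larger values mean a more
    aperiodic voice.
    """
    periods = np.asarray(periods_ms, dtype=float)
    if periods.size < 2:
        raise ValueError("need at least two cycle periods")
    return float(np.mean(np.abs(np.diff(periods))) / np.mean(periods))

# Hypothetical striation spacings, for illustration only.
regular_cycles = [8.0] * 10
irregular_cycles = [8.0, 10.0, 7.0, 11.0, 8.0]

regular_jitter = local_jitter(regular_cycles)
irregular_jitter = local_jitter(irregular_cycles)
```

A summary of this kind would not distinguish a transient aperiodic episode in a normal speaker from a pathologic voice, so it would need to be read alongside the full pattern, as the authors do with the spectrogram.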


The sound spectrograph was designed to show a graphic representation of the energy involved in the phonatory act. Consideration of the spectrographic characteristics of the normal voice shows what is occurring on a physiologic basis. An extension of this would then allow an observation of the acoustic end result of various dysphonic or pathological problems. In doing so, certain characteristics can be identified as representing specific spectrographic aspects of dysphonia. In order to evaluate the voice adequately, each parameter of the spectrogram must be analyzed: 1) aperiodicity, or loss of regular vocal cord closure; 2) breathiness, or increase in energy in the high frequency regions of the spectrogram; and 3) a breakdown in the formant or energy resonance of the voice. Individual abnormalities or combinations of abnormal conditions would create spectrograms characteristic of vocal cord pathology. Correction of vocal cord problems will be reflected by an improvement in the abnormalities of each of these three factors.

USES OF VOICE SPECTROGRAPHY IN THE EVALUATION OF VOCAL REHABILITATION

Both organic and functional disturbances of the vocal cords are frequently amenable to vocal rehabilitation. An objective description of the acoustic end result of the vocal cord pathology is important to the subsequent treatment program. Spectrographic analysis allows both the speech pathologist and the patient to be aware of ongoing improvement in the voice. Vocal cord nodules, vocal paralyses, hyperkinetic dysphonia and dysphonia plica ventricularis are just a few examples of disorders in which the spectrograph was an important tool in the total program of vocal rehabilitation.

The presence of vocal nodules prevents adequate movement of the vocal cords with a subsequent aperiodic breathy voice. Some patients with these nodules will respond well to vocal hygiene and rehabilitation. This is especially true in children. The objective measurement of change in the voice during the process of vocal rehabilitation can be accurately assessed by the voice spectrogram. Therefore, one may be able to see the exact progression of events in correction of the vocal pathology. Serial spectrograms made at different stages in treatment can clearly show a decrease in dysphonic characteristics (Fig. 6).

Fig. 6. Three spectrograms A, B and C in serial order showing improvement in dysphonia related to the resolution of vocal cord nodules. As the nodules reduce in size one can see a lessening of the breathiness, better periodicity of vocal cord movement, and more distinct formant structure.

Vocal cord paralysis shows most of the abnormal spectrographic characteristics of the voice. There is evidence of aperiodicity, breakdown of formant structure and breathiness. Vocal rehabilitation programs related to this condition involve improvement of each of these abnormal components of the voice. The spectrogram allows observation of the change in the voice following treatment. Both vocal rehabilitation and surgical correction of vocal cord paralysis have been used. In the vocal rehabilitation program, the voice spectrogram can evaluate the efficacy of the treatment technique. This type of program generally involves forced adduction exercises. The documentation of the improvement through spectrographic analysis makes a meaningful visual presentation to the referring laryngologist, speech pathologist and the patient (Fig. 7). Lack of resolution of high frequency energy indicative of continued breathiness will aid in the clinical consideration of vocal cord injection.

Fig. 7. Improvement in the voice of a patient with unilateral vocal cord paralysis (A). While the post treatment spectrogram (B) is obviously improved with the reduction in high frequency breathiness, and noticeably more adequate periodicity of vocal cord movement, continued problems are still evident. There is a continuation of high frequency breathiness indicating that complete closure of vocal cords is not possible due to remaining problems.

Vocal rehabilitation is the prime modality of treatment in functional dysphonias. Hyperkinetic dysphonia and dysphonia plica ventricularis are common examples of this type of vocal disorder. In hyperkinetic dysphonia, there is an incoordination of vocal cord movement. This is commonly associated with stressful situations or prolonged vocal strain. With spectrography, the before and after treatment progress of the patient is made visible (Fig. 8). Treatment generally involves hypofunctional voice use with specific attention to decreased vocal strain.

Dysphonia plica ventricularis is another example of a functional neuromuscular vocal disorder which can be treated effectively by vocal rehabilitation. In this situation, the false vocal cords meet in the midline before the true cords. This is clearly demonstrable on indirect laryngoscopy. The dysphonic characteristics of the voice in


Fig. 8. Dramatic improvement in the voice of a patient with hyperkinetic dysphonia. A severely dysphonic voice with very little evidence of periodic vocal cord movement (A) was vastly improved through the use of a hypofunctional approach to voice production. The post treatment spectrogram (B) shows marked improvement in periodicity. All of the breathiness has cleared due to more adequate vocal fold closure.

dysphonia plica ventricularis are related to an attempt by the false vocal cords to participate in the phonatory act (Fig. 9). There is an extreme amount of breathiness throughout the pattern, represented by high frequency energy. The formant structure is hidden within the breathiness which is caused by the escape of air through an apparently tense and nonactive pair of vocal cords. Vocal rehabilitation involves the use of the hypofunctional approach to voice production. With the emergence of formant structure and the decrease in breathiness as indicated by the posttreatment spectrogram, voice use is encouraged. As can be seen in each of

Fig. 9. Patient with dysphonia plica ventricularis (A) exhibits a much improved voice after a hypofunctional approach to treatment. A major acoustic characteristic that has been eliminated in the posttreatment spectrogram (B) is the excessive breathiness found throughout the pretreatment pattern.

these examples, the voice spectrograph is extremely useful in demonstrating objective changes in voice, related to the vocal rehabilitation program.

EVALUATION OF SURGICAL PROCEDURES OF THE LARYNX

Voice spectrography is useful in evaluation of certain surgical procedures of the larynx. Vocal cord injection, conservative laryngeal operations for carcinoma of the larynx and pre- and postoperative vocal cord stripping procedures have all been successfully evaluated by the use of voice spectrograms. The objective evidence indicated by spectrographic analysis is useful both in the management of the patient in the clinical situation, and in the professional evaluation of the voice by both the laryngologist and speech pathologist.

As mentioned previously, vocal cord paralysis can be treated by vocal rehabilitation. However, many of these patients are amenable to vocal cord injection with dramatic results. Widely displaced cords in the intermediate position or vocal cords paralyzed in the paramedian position with breathiness are all candidates for potential vocal cord injection. The voice spectrogram has been advocated in the evaluation of vocal cord injections with Teflon®.4 This is becoming increasingly useful in the proper evaluation of the pre- and postoperative voice (Fig. 10). As mentioned previously, the abnormalities of aperiodicity, breakdown of formant structure and breathiness are all seen in vocal cord paralysis. Their correction by vocal cord injection is dramatic when pre- and postoperative spectrograms are compared. Persistent abnormalities of the spectrogram can give information to the laryngologist and help him decide whether a second injection may be necessary.

Fig. 10. Pre- and postspectrograms of a patient who underwent Teflon® paste injection for unilateral vocal cord paralysis. The postsurgical spectrogram shows an essentially normal voice with none of the aperiodicity, breathiness or formant breakdown noted in the presurgical example.

Voice spectrography is also useful in evaluation of the voice associated with the various types of glottic reconstruction following vertical hemilaryngectomy. Vertical hemilaryngectomy has been advocated as an excellent means of removal of carcinoma of the true cord with sparing of the voice. Excessive breathiness and hoarseness following this procedure have prompted a variety of laryngoplasties to be advocated as a means of reconstructing tissue opposite to the normally mobile cord in an attempt to achieve glottic closure. Although several techniques have been advocated, there is no uniform opinion as to which is the best. An evaluation of these techniques by spectrography is helpful on an objective basis to decide which technique is most efficacious in the improvement of the voice following this surgical procedure. In a limited series, we have compared the results of sternohyoid muscle grafts to the use of hemilaryngectomy alone without laryngoplasty. The spectrograms when analyzed are compared with the normal voice. The objective evidence produced by the spectrograms would indicate that there is no improvement by laryngoplasty in the long-term follow-up of these individuals. However, it must be emphasized that only large studies with long-term follow-ups can be adequately used to produce a final statement on this problem.

Lastly, vocal cord nodules and polyps are frequent inciting etiologies responsible for many vocal cord strippings. The pre- and postoperative voice attained has not until this time been adequately evaluated. Frequently, hoarseness persists after vocal cord stripping. An objective evaluation of the pre- and postoperative voice is essential for the laryngologist to adequately evaluate his patients. Frequently, the ability to produce a spectrogram and show the patient that the voice, though it has not returned to normal levels, has been dramatically improved is extremely important in the postoperative management of these individuals (Fig. 11). Again, the objective evidence of vocal cord change cannot be overemphasized as an essential component in the adequate evaluation of laryngeal surgical candidates.

Fig. 11. Pre- and postsurgical example of phonation related to vocal cord stripping for bilateral nodules. The presurgical example (A) shows noticeable evidence of breathiness and aperiodicity of vocal cord movement. The postsurgical example (B) reflects an essentially normal voice with no evidence of dysfunction.

EVALUATION OF MEDICAL MANAGEMENT OF VOCAL CORD LESIONS

The voice spectrogram will in the future have significant use in the evaluation of medical treatment for certain illnesses causing laryngeal disorders. Both hypothyroidism and myasthenia gravis have distinct vocal cord changes associated with their respective metabolic disturbances. Improvement in the metabolic condition is frequently associated with marked improvement in the voice. The changes seen subjectively during the treatment program can then be demonstrated by objective changes in the spectrogram. It is further hoped that in the future the end point in the so-called Tensilon®** test for myasthenia gravis may be indicated more readily by close evaluation of the voice spectrogram. An improvement in periodicity in patients with myasthenia gravis could be an objective indication of improvement with injected Tensilon®. This may help establish the diagnosis of myasthenia gravis. Further studies will be needed for complete evaluation of this technique.

** Roche Laboratories, Nutley, New Jersey.

Further, the objective determination of the voice is of vital interest from a medicolegal standpoint. A quality tape recording can be a great asset in assessing the exact nature of the voice. The accompanying spectrogram produced from this tape will give increasing objective evidence of the laryngeal disorders described by the clinician. Just as the audiogram helps in evaluating the pre- and postoperative condition of the ear, so too the spectrogram will help in evaluating the pre- and postoperative condition of the voice. As the legal implications of laryngeal surgery become more and more important, the voice spectrogram will have increased usage.

LIMITATIONS OF VOICE SPECTROGRAPHY

As in any other objective technique, the limitations of voice spectrography should be well understood. The exact quantification of spectrogram measurements is not feasible. It is the visual qualitative change of pattern in the parameters of the voice on the spectrogram, rather than the actual measurable degree of change in formant structure, periodicity or breathiness, that is the significant aspect of voice spectrography. The spectrogram as such produces a clinically usable objective evaluation of the voice.


To date, spectrograms cannot differentiate between different types of vocal cord lesions. Vocal cord nodules do not produce acoustic characteristics different from a contact granuloma. Furthermore, dysphonic characteristics, as indicated on a spectrogram, cannot indicate the existence or nonexistence of a malignant lesion. It is important to remember that many types of vocal cord pathologies yield similar resultant acoustic characteristics as seen on the spectrogram.


Spectrograms should not be used in place of indirect laryngoscopy to determine whether or not lesions have been resolved. Spectrograms are not a substitute for direct or indirect observation of the larynx. It should be emphasized that voice evaluation by spectrography should not be used as a substitute for good clinical acumen. Again, the final judgment of a vocal condition should rest on an evaluation of all the parameters available to the clinician and not on a single test.

Request for reprints should be sent to Eugene Rontal, M.D., 21700 Northwestern Hwy., Tower 14, Suite 545, Southfield, Mich. 48075.

ACKNOWLEDGMENT - The authors acknowledge the assistance of Mrs. Sydnor Gilbreath for her generous support of these research activities through the Gilbreath Voice Analysis Laboratory, Speech and Language Pathology Department, William Beaumont Hospital, Royal Oak, Michigan. Technical assistance was given by the Audio-Visual Department at William Beaumont Hospital.

REFERENCES

1. Potter R, Kopp G, Green H: Visible Speech. New York, D. Van Nostrand Company, Inc., 1947
2. Arnold GE: Vocal rehabilitation of paralytic dysphonia: IX Technique of intracordal injection. Arch Otolaryngol 76:358-368
3. Rolnick MI, Hoops HR: Plosive phoneme duration as a function of palatopharyngeal adequacy. Cleft Palate J 8:65-76, 1971
4. Rontal E, Rontal M, Rolnick MI: The use of spectrograms in the evaluation of vocal cord injection. Laryngoscope 85:47-56, 1975
5. Yanagihara N: Significance of harmonic changes and noise components in hoarseness. J Speech Hearing Res 10:531-541, 1967
6. Holbrook A, Fairbanks G: Diphthong formants and their movements. J Speech Hearing Res 5:38-58, 1962
7. Iwata S, Von Leden H: Voice prints in laryngeal disease. Arch Otolaryngol 91:346-351, 1970
8. Cooper M: Spectrographic analysis of fundamental frequency and hoarseness before and after vocal rehabilitation. J Speech Hearing Disord 39:286-297, 1974
9. Isshiki N, Yanagihara N, Morimoto M: Approach to the objective diagnosis of hoarseness. Folia Phoniatr (Basel) 18:183-192, 1964
10. Koyama T, Kawasaki M, Ogura JH: Mechanics of voice production, regulation of vocal intensity. Laryngoscope 79:337-354, 1969
11. Timcke R, Von Leden H, Moore P: Laryngeal vibrations; measurement of the glottic wave. Part I. The normal vibratory cycle. Arch Otolaryngol 68:1-19, 1958
