A SIMPLE COMPUTERIZED METHOD FOR GENERATION OF DICHOTIC TAPES

JAMES WILSON
Department of Electrical Engineering, University of Toledo, Toledo, OH 43606, U.S.A.

and LEE T. ANDREWS
Department of Neurosciences, Medical College of Ohio, Caller Service Number 10008, Toledo, OH 43699, U.S.A.

and RAMESH PAREKH
Computer Lab, Medical College of Ohio, Caller Service Number 10008, Toledo, OH 43699, U.S.A.

and MARILYN L. PINHEIRO
Department of Neurosciences, Medical College of Ohio, Caller Service Number 10008, Toledo, OH 43699, U.S.A.

(Received 24 May 1977; received for publication 20 October 1977)

Abstract-A PDP-12 based system designed to allow use of dichotic listening for clinical research and diagnosis is described. Three computer programs were developed to generate dichotic pairs which have minimal distortion and ensure simultaneous onset time of auditory signals. Displays are used frequently throughout the program to make it easy to operate and to inform the operator of his responsibilities and options. Word length, skew delay and interpair intervals can be set by introducing desired values of these parameters into the program. Two different versions of the computer programs were developed. One uses 8 K of memory and generates dichotic pairs from words up to 614 msec long. A 32 K version is not limited to words but permits use of sentences up to 2.5 sec long.

Dichotic pair   Assembler   Onset time   Skew delay   Interpair intervals   Word length   PDP-12

INTRODUCTION

Research on dichotic listening tasks, defined as the simultaneous presentation of a different stimulus to each ear, has been done to further understand the auditory processing centers of the brain in relation to functional and anatomic asymmetries [1-6]. In the last decade dichotic stimulation has gone beyond a research tool and become a clinical diagnostic tool. Its clinical applications have been explored by Berlin, Lowe and many others [1,7-9]. It has been found that the results of auditory tasks made up of dichotic words or syllables can indicate the site of a brain lesion in many cases, especially when the region of the temporal lobe of either hemisphere of the brain is affected.

The neuronal processing networks for language are located in the temporal lobe of the left hemisphere in most individuals. Auditory neural pathways are predominantly contralateral; but, when only one stimulus word at a time is presented to either or both ears, it reaches both sides of the brain. It can, therefore, usually be processed accurately, even in the presence of a brain lesion. However, when two different words are presented simultaneously, one to each ear, the right ear word is transmitted directly to the left hemisphere, but the left ear word is “blocked” from this direct pathway by the right ear word and is routed first to the right (contralateral) hemisphere, then to the left hemisphere by way of the interconnecting commissural fibers of the corpus callosum. Thus, the left ear stimulus in dichotic tasks arrives at the left hemisphere processing area several msec later than the right ear stimulus. Even though the right ear has the advantage of the more direct route, a normal brain can easily process both dichotic words. However, patients with certain brain lesions cannot handle this complex task because of damaged neural elements, and the resulting scores of dichotic tasks will show a deficit in performance. The deficit is usually for the ear contralateral to the brain lesion, especially if the left hemisphere is impaired. But scores for either or both ears may be affected, depending upon the location of the lesion. Thus, the pattern of dichotic test results relates to and indicates the site of the brain lesion, contributing to neurological diagnosis.

However, in order to apply dichotic listening effectively for clinical diagnosis, dichotic tapes must be available which have minimum distortion and ensure simultaneous onset time of auditory signals. Most researchers and clinicians using dichotic stimuli make the dichotic tapes by rather crude methods. In one technique, two or more tape recorders are used and the tapes are spliced to make a dichotic pair. Yates et al. [10] used this method for recording dichotic digits. This procedure is very time consuming. In addition, onset differences are not consistent, and many special pieces of equipment must be used to make tapes by this method. Lowe et al. [11] also used tape recorders to align the onset times by monitoring moveable playback heads and then making a master tape recording of the results. Clifton and Bogartz [12] used tape loops on two recorders to make a master recording on a third recorder. Their accuracy was approx. 10 msec. An electromechanical delay line was used by Ptacek [13] and Berlin et al. [7] to align the onset time.
One group, Murray and Hitchcock [14], went as far as lip-reading a video recording of channel one to simultaneously record channel two.

Computers have been used in the past for producing dichotic tapes, but the systems were usually very expensive, and their need for large-memory computers made them undesirable. Weiss and House [15] used a hybrid computer to examine the temporal and intensive parameters. Darwin [16,17], Gerber and Goodman [18], and Springer [5] used the computer-controlled pulse code modulation (PCM) system of Haskins Laboratories. The procedure of the Haskins system begins by digitizing and storing synthetic speech in a disk file. The consonant-vowels (CV's) are then paired by computer, aligned for simultaneous onset, converted to analog, and recorded on a stereo tape recorder. The onset time differences using this method were approx. 1 msec. This system is an improvement over the other methods listed above because each onset time difference is the same. The other methods use an acceptable range of onset differences rather than one particular value and, therefore, do not produce consistent onset time differences. The major problem with the Haskins system is that it is unavailable to users. The only way to take advantage of this system is to buy a pre-recorded tape of dichotic CV pairs from Haskins Laboratories. A minicomputer was used by Knight and Kantowitz [19] to generate dichotic word pairs. The onset time difference was excellent (8.6 M), but there were problems with insufficient storage and distortion.

Berlin [1], in a review of the literature on dichotic effects, said that “most experimenters who studied laterality effects in normal subjects needed more carefully controlled stimulus pairs to study most of the effects which concerned them.” He also said that onset control is critical when words are used to test subjects who are not normal.

The purpose of this paper is to describe a method for improving onset time difference and minimizing distortion in dichotic recording. The Digital Equipment Corporation's PDP-12 minicomputer was used for this purpose. Two different versions of computer programs were developed. One uses 8 K of memory to generate dichotic pairs from words up to 614 msec long as developed by Wilson [20]. A 32 K version is not limited to words but permits use of sentences up to 2.5 sec long. The peripherals necessary to generate dichotic pairs include a real-time computer controlled clock, an analog-to-digital converter, two digital-to-analog converters, two LINC tape transports, a cathode ray tube (CRT) display and a teletype. Other necessary equipment includes a two channel tape recorder and two filters. All programming was done in assembler language.

The process of generating dichotic tapes is divided into three programs: a program to store the words, Part 1; a program to pair the words and adjust to the desired word length, Part 2; and a program to skew if desired and transfer the dichotic pairs to the tape recorder, Part 3.

COMPUTER STORAGE OF WORDS (PART 1)

Because both the word length and the sampling rate of the words to be stored depend on the amount of memory available for storage, these parameters were traded off against each other to attain the best conditions, and a sampling rate of 10 kHz was chosen. This makes it possible to sample any signal from 0 to 5 kHz, the Nyquist frequency, without losing its intelligibility. This is also an acceptable frequency range for any speech [21,22]. With this sampling rate the word length for the 8 K version program is limited to 614 msec.

The sampling by the A-to-D converter is done in the special fast sample mode. The normal mode takes 18.2 µsec to complete a conversion. The fast sample mode transfers the last conversion to the accumulator and then initiates a new conversion. The conversion takes place while the computer executes succeeding instructions. In this way the sampling time is shortened to 1.6 µsec. This is very important because of the number of instructions that must be executed between each clock interrupt and conversion.

The THRESH and HOLD routines check the negative and positive samples, respectively, against the range of ±20 mV to ±300 mV. The range was set low enough to take into account the fact that the intensity at the beginning of the word is lower than for the rest of the word. If the input is below 20 mV, or if there is no input, the CRT will continuously display “NO INPUT” until either an acceptable input or an input level which is too high is encountered. If the input is above 300 mV, the program will respond differently. The CRT will display “INPUT LEVEL TOO HIGH” and stop sampling, waiting for the output level of the tape recorder to be reset. After resetting the level, the operator must type a linefeed in order to check again for a proper threshold. If the incoming word is not in the proper range, the operator will have to recue the tape recorder to the beginning of the word list.

Before the word is permanently stored, its length is determined. If the input level is ±10 mV or less for one hundred consecutive samples (10 msec), the program interprets that as the end of the word. The LENGTH routine determines the duration of the word in msec and stores it in the first two memory locations of the reserved memory space. If the length is less than 300 msec the data is not saved, because a signal shorter than 300 msec is considered undesirable. Such signals can occur when the tape recorder is started or stopped. Occasionally a sound emitted just before a word falsely triggers the storage routine. If the data is not stored on tape, the program returns to the threshold routine rapidly enough so that the next word is not missed.

The program has the ability to store words in special locations. This feature is useful when a tape has been partially filled at some earlier date. Displays are used frequently throughout the program to make it easy to operate and to inform the operator of his responsibilities and options. The routines used to display the program messages on the CRT, such as special location messages, are called DISPLAY and INPUT routines. These routines are modified versions of QUANDA, an interactive subroutine using the VR12 display, written by a software specialist at Digital Equipment Corporation [23]. To keep a record of the words stored on LINC tape, each tape has an index. The INDEX routines can display, save, print, clear or format the index.
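The threshold and end-of-word logic described in this section can be illustrated with a short sketch. It is written in Python for readability and is not the PDP-12 assembler code; the function names are hypothetical, and two details are assumptions rather than statements from the text: the level check is applied only at word onset, and the trailing run of quiet samples is excluded from the reported length.

```python
from typing import Optional, Sequence, Tuple

SAMPLE_RATE_HZ = 10_000   # sampling rate stated in the text
LOW_MV = 20               # below this magnitude the CRT shows "NO INPUT"
HIGH_MV = 300             # above this magnitude: "INPUT LEVEL TOO HIGH"
SILENCE_MV = 10           # level treated as silence when looking for the end
SILENCE_RUN = 100         # consecutive quiet samples (10 msec at 10 kHz)
MIN_WORD_MS = 300         # shorter captures are treated as stray noise


def classify_level(sample_mv: float) -> str:
    """Classify one sample the way the THRESH/HOLD checks are described."""
    magnitude = abs(sample_mv)
    if magnitude < LOW_MV:
        return "NO INPUT"
    if magnitude > HIGH_MV:
        return "INPUT LEVEL TOO HIGH"
    return "OK"


def capture_word(samples_mv: Sequence[float]) -> Optional[Tuple[int, float]]:
    """Return (start_index, length_msec) of the first acceptable word, or None."""
    i, n = 0, len(samples_mv)
    while i < n:
        status = classify_level(samples_mv[i])
        if status == "INPUT LEVEL TOO HIGH":
            # Assumption: the operator must reset the level and start over.
            raise ValueError("INPUT LEVEL TOO HIGH")
        if status == "NO INPUT":
            i += 1
            continue
        start, quiet, j = i, 0, i
        # Scan until SILENCE_RUN consecutive quiet samples mark the end.
        while j < n and quiet < SILENCE_RUN:
            quiet = quiet + 1 if abs(samples_mv[j]) <= SILENCE_MV else 0
            j += 1
        end = j - quiet  # one past the last non-quiet sample
        length_ms = (end - start) * 1000 / SAMPLE_RATE_HZ
        if length_ms >= MIN_WORD_MS:
            return start, length_ms
        i = j  # too short: discard and resume the threshold search
    return None
```

Fed a 5 msec click followed by a 400 msec burst, this sketch discards the click and reports the burst, which mirrors the rejection of tape start/stop noise described above.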


PROCESSING OF THE WORDS (PART 2)

The main purpose of the Part 2 program is to pick two words that are to become a dichotic pair and then match them so that they will be of the desired length when the subject listens to them. The actual composition of the final dichotic tapes, except for determining whether or not to skew the onset time, is decided in this program. The program operator must not only pick the two words for the pair, but also decide which is to be the right word and which is to be the left word. If the two words are on separate tapes, the computer stops at the appropriate times and informs the operator. The special storage location routines are available, if needed, and operate in a manner similar to those in Part 1.

The SHORTR routine is used to match two words according to desired length, but SHORTR can only shorten words. The routine is essentially a two-stage process: shifting the data points to shorten the word length and clearing the memory locations that become unused because of the shift. The SHORTR routine shortens the word by removing a group of consecutive data points from the center of the word. The middle of the word was selected for removal of data points because this portion of the word contains the fewest transients and, therefore, its removal does not alter the sound. Since the difference between the actual length and desired length is given in msec, the milliseconds must be converted to an equal number of memory locations before the shifting can begin. The routine that accomplishes this is called SECLOC. The amount by which a word can be shortened is limited by the distortion that is created when the middle section is removed.

Several ways of storing the matched pair on tape were considered. Storing the two words one right after the other was rejected because the tape motion in Part 3 would become extremely time consuming, due to the necessity for the tape to oscillate between the two words stored on tape. Another approach considered, but rejected, was to store each word of a dichotic pair on a separate tape. The inadequacies of this technique are loss of time switching from transport to transport when the tapes are used in Part 3, and confusion in both Part 2 and Part 3 from having two LINC tapes for each dichotic pair. Therefore, the best method seemed to be to store the words so that they intermesh. There were two ways to accomplish this: storing the words in alternating tape blocks, or storing the words in alternating memory locations. The former method was used because it is easier to program and operate. There was also no additional saving in time from storing words in alternating memory locations.

CONSTRUCTION OF DICHOTIC TAPES (PART 3)

This program must operate on two words in the same amount of time used in Part 1 to store one word in core memory. The timing is very critical, and everything must be prepared to operate at its optimum speed. Before the start of pair transfer, the number of pairs to be transferred and the length of the interpair interval must be determined. The interpair interval is the time between the output of successive dichotic pairs. The skew delay is also available, if it is desired. Because of the critical timing, as much as possible is accomplished before the transfer begins. The tape transports are set to operate in no pause and extended address modes, and the memory is then filled to capacity. After the memory is filled, the clock is started and the output is initiated. When the output is complete, the status of the tape in use is checked. If all of the pairs have been read off the tape, the program is switched to the other tape transport. If the desired number of pairs has been recorded on the tape recorder, the CRT will inform the operator. The onset difference, that is, the starting time difference of the two words of the dichotic pair, is zero because the two words are loaded into their respective D/A converter buffer registers before transfer to the output registers.
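The output scheme described above can be pictured as a loop that, on every clock tick, first loads one sample for each channel into a buffer and only then releases both together. The following Python sketch is a simulation under stated assumptions, not the authors' assembler program: the name `dichotic_stream`, the zero-valued silence, and the padding of an unequal pair are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

SAMPLE_RATE_HZ = 10_000  # clock rate stated in the text

Pair = Tuple[Sequence[int], Sequence[int]]  # (left word, right word)


def dichotic_stream(pairs: Sequence[Pair], interpair_ms: int,
                    silence: int = 0) -> List[Tuple[int, int]]:
    """Return one (left, right) output value per clock tick."""
    gap_ticks = interpair_ms * SAMPLE_RATE_HZ // 1000
    stream: List[Tuple[int, int]] = []
    for index, (left, right) in enumerate(pairs):
        if index > 0:
            # Interpair interval: both channels silent between pairs.
            stream.extend([(silence, silence)] * gap_ticks)
        # Assumption: if the words differ in length, pad the shorter one.
        for t in range(max(len(left), len(right))):
            # Stage 1: load both buffer registers.
            left_buffer = left[t] if t < len(left) else silence
            right_buffer = right[t] if t < len(right) else silence
            # Stage 2: both values reach the outputs on the same tick,
            # so the two words start with zero onset difference.
            stream.append((left_buffer, right_buffer))
    return stream
```

Because each tick emits a single (left, right) tuple, neither channel can lead the other, which is the property the buffered D/A registers provide in hardware.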

Fig. 1. Tape recorder output of the word “COW”. The time scale for the photograph is 49 msec per division.

Fig. 2. Computer output for the word “COW”. Part 1 determined the length of the word to be 490 msec. The time scale for the photograph is 49 msec per division.

Fig. 3. The word “COW” shortened in Part 2 by 100 msec. The word was shortened by removing 100 msec from the center of the word. The time scale for the photograph is 49 msec per division.
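The shortening shown in Fig. 3 follows the Part 2 description of SHORTR: convert the excess length from msec to memory locations, shift the later samples down over the removed center section, and clear the locations freed at the end. A minimal Python sketch of that idea follows. The function names are hypothetical stand-ins for SECLOC and SHORTR, and the exact placement of the cut (centered on the word) is an assumption.

```python
from typing import List, Sequence, Tuple

SAMPLE_RATE_HZ = 10_000  # 10 samples (memory locations) per msec


def ms_to_locations(ms: int) -> int:
    """Convert a duration in msec to a count of memory locations."""
    return ms * SAMPLE_RATE_HZ // 1000


def shorten(samples: Sequence[int], desired_ms: int) -> Tuple[List[int], int]:
    """Remove a centered block so the word lasts desired_ms.

    Returns the full-size buffer (freed locations cleared to 0)
    and the new word length in locations.
    """
    remove = len(samples) - ms_to_locations(desired_ms)
    if remove < 0:
        raise ValueError("words can only be shortened, not lengthened")
    buffer = list(samples)
    new_length = len(buffer) - remove
    cut_start = new_length // 2  # assumption: cut is centered on the word
    # Stage 1: shift the samples after the cut down over the removed section.
    for i in range(cut_start + remove, len(buffer)):
        buffer[i - remove] = buffer[i]
    # Stage 2: clear the locations left unused by the shift.
    for i in range(new_length, len(buffer)):
        buffer[i] = 0
    return buffer, new_length
```

For the 490 msec word of Fig. 2, `shorten(word, 390)` removes 1000 locations from the middle, matching the 100 msec cut of Fig. 3.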


Fig. 4. Right skew of 40 msec of the word “CAT”. The time scale for the photograph is 20 msec per division.

Fig. 5. Right skew of 90 msec of the word “CAT”. The time scale for the photograph is 50 msec per division.
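The skews in Figs. 4 and 5 correspond to whole numbers of ticks of the 10 kHz clock: 400 ticks for 40 msec and 900 ticks for 90 msec. The sketch below shows one way such a delay could be applied to the right word, with the 99 msec limit enforced. It is an illustration in Python under assumptions, not the authors' implementation: `skew_ticks` and `apply_right_skew` are hypothetical names, and silence is represented by zero-valued samples.

```python
from typing import List, Sequence, Tuple

SAMPLE_RATE_HZ = 10_000  # clock frequency that times the delay
MAX_SKEW_MS = 99         # delay limit stated in the paper


def skew_ticks(skew_ms: int) -> int:
    """Convert a skew delay in msec to a count of clock ticks."""
    if not 0 <= skew_ms <= MAX_SKEW_MS:
        raise ValueError("skew delay must be between 0 and 99 msec")
    return skew_ms * SAMPLE_RATE_HZ // 1000


def apply_right_skew(left: Sequence[int], right: Sequence[int],
                     skew_ms: int) -> Tuple[List[int], List[int]]:
    """Delay the right word by skew_ms and pad both channels to equal length."""
    delay = skew_ticks(skew_ms)
    delayed_right = [0] * delay + list(right)
    total = max(len(left), len(delayed_right))
    padded_left = list(left) + [0] * (total - len(left))
    padded_right = delayed_right + [0] * (total - len(delayed_right))
    return padded_left, padded_right
```

Since the delay is counted in clock ticks, its resolution in this sketch is one sample period, that is, 0.1 msec at 10 kHz.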


The output from the digital-to-analog converters must be filtered before it is recorded on the tape recorder because of the noise created by the discrete output transitions of the D/A converter. Because the signals from the D/A converters are in their final form, and the intervals are all properly timed, the output could be sent directly to a pair of earphones instead of being recorded. However, this would be a waste of computer time if it became standard procedure. Normally, the output is recorded by a high quality tape recorder.

DISCUSSION

The three programs discussed here eliminate the onset time difference between the words of each pair. The onset difference is consistent for all of the recorded pairs. The quality of the dichotic tape recordings is a function of the three computer programs. The only time that the operator can affect the final result is in the preparation of the tape recordings used in Part 1. It is important that these recordings are consistent in intensity, or the dichotic pairs will not be balanced. The input recording should also have a good signal-to-noise ratio. If it does not, the threshold algorithm in Part 1 may trigger on noise on the tape or noise from the tape recorder. A Tandberg Tape Deck Model 9000X was used to determine the noise rejection parameters in Part 1. In playback mode at 7½ in./sec, the signal-to-noise ratio of the tape deck is listed in the specification as 56.5 dB for the left channel. In record/playback mode the ratio is 51 dB. Any tape recorder of comparable quality is sufficient to prepare the words for computer input.

Figure 1 is a photograph of the word COW before storage and manipulation by the computer. Figure 2 is the same word after processing by the three dichotic programs. It is evident from the two photographs that both the original word and the processed word have the same length and that the envelope of the two words is similar. Figure 3 is the word COW shortened by one hundred milliseconds by removing a one hundred millisecond section from the center of the word in Part 2. By comparing Fig. 3 to Fig. 2 it is possible to see where a section was removed. It is also possible to see that the envelope of the word did not change for the sections of the word that remain. A word shortened by an amount such as one hundred milliseconds does not sound distorted. The shortening is not apparent unless the original word is listened to in comparison.

Figures 4 and 5 are photographs of the right word delayed by forty and ninety milliseconds, respectively. The accuracy of the delay is a function of the crystal clock. The delay is timed by use of a counter and the 10 kHz frequency of the clock. Therefore, a delay of any acceptable length is accurate. Delays are limited to 99 msec.

Past literature has been limited for the most part to research in dichotic listening. However, the potential for using dichotic listening as a clinical tool is becoming more apparent. Clinical skills in diagnosis based on dichotic tapes must, of course, be accumulated before this test technique becomes more widely used in clinical settings. However, because of the sensitivity of these tests in diagnosing lesions of the central nervous system, dichotic tapes will certainly become an important future diagnostic tool. It is hoped that the method presented for generating dichotic tapes will help make increased use a reality by making the tapes easier to produce and easier to obtain.

SUMMARY

Most researchers and clinicians using dichotic stimuli make the dichotic tapes by rather crude methods. A computerized method to generate dichotic words has been developed to improve onset time difference and minimize distortion. The onset time difference is eliminated and the dichotic words reproduced by the computer are of excellent quality. The computer program also allows for setting of word length, skew delay, and interpair interval length by introducing appropriate values to the parameters whenever they are displayed by the program. Using 8 K of memory, the program can generate dichotic pairs from words as long as 614 msec. A 32 K version of the program allows for the use of short sentences up to 2.5 sec in duration. The past has produced much research in dichotic listening, but the future lies in increased clinical use. The sensitivity of these tests in the diagnosis of lesions and the lack of risk to the patient may well make the method important in pertinent clinical situations.

REFERENCES

1. C. I. Berlin, Critical review of literature on dichotic effects-1970, American Academy of Ophthalmology and Otolaryngology Reviews of Scientific Literature, Special Issue, (Pub.), pp. 80-90 (1972).
2. C. I. Berlin, S. S. Lowe-Bell, J. K. Cullen, Jr., C. L. Thompson and C. F. Loovis, Dichotic speech perception: an interpretation of right-ear advantage and temporal offset effects, J. Acoust. Soc. Am. 53, 3, 699-709 (1973).
3. D. E. Broadbent, The role of auditory localization in attention and memory span, J. exp. Psychol. 47, 191-196 (1954).
4. S. Oxbury, J. Oxbury and J. Gardiner, Laterality effects in dichotic listening, Nature 214, 742-743 (1967).
5. S. P. Springer, Ear asymmetry in a dichotic detection task, Perception Psychophys. 10, 4A, 239-241 (1971).
6. M. Studdert-Kennedy and D. Shankweiler, Hemispheric specialization for speech perception, J. Acoust. Soc. Am. 48, 2, 579-594 (1970).
7. C. I. Berlin, Review of binaural effects-1969, American Academy of Ophthalmology and Otolaryngology Reviews of Scientific Literature, pp. 7-28 (1971).
8. C. I. Berlin and S. S. Lowe, Temporal and dichotic factors in central auditory testing, in Handbook of Clinical Audiology (Ed. J. Katz), Williams & Wilkins, Baltimore, MD, Vol. 15, pp. 280-312 (1972).
9. C. I. Berlin, S. S. Lowe-Bell, P. J. Janetta and D. G. Kline, Central auditory deficits after temporal lobectomy, Arch. Otolaryng. 96, 4-10 (1972).
10. A. J. Yates, P. J. Smith, B. D. Burke and M. A. Keane, A technique for the construction of discrete dichotic stimulation material, Behavior Res. Methods Instrum. 1, 257-258 (1969).
11. S. S. Lowe, J. K. Cullen, Jr., C. I. Berlin, C. L. Thompson, L. L. Kirkpatrick and J. T. Ryan, Dichotic and monotic simultaneous and time-staggered speech, J. Acoust. Soc. Am. 47, 76 (Abstract) (1970).
12. C. Clifton, Jr. and R. S. Bogartz, Selective attention during dichotic listening by preschool children, J. exp. Child Psychol. 6, 483-491 (1968).
13. P. H. Ptacek, An experimental investigation of dichotic word presentation, J. Speech Hearing Dis. 19, 412-422 (1954).
14. D. J. Murray and C. H. Hitchcock, Attention and storage in dichotic listening, J. exp. Psychol. 81, 164-169 (1969).
15. M. S. Weiss and A. S. House, Perception of dichotically presented vowels, J. Acoust. Soc. Am. 53, 51-58 (1973).
16. C. J. Darwin, Dichotic backward masking of complex sounds, Q. J. Exp. Psych. 23, 386-392 (1971).
17. C. J. Darwin, Ear differences in the recall of fricatives and vowels, Q. J. Exp. Psych. 23, 46-62 (1971).
18. S. E. Gerber and P. Goodman, Ear preference for dichotically presented verbal stimuli as a function of report strategies, J. Acoust. Soc. Am. 49, 1163-1168 (1971).
19. J. L. Knight, Jr. and B. H. Kantowitz, A minicomputer method for generating dichotic word pairs, Behav. Res. Methods Instrum. 5, 231-234 (1973).
20. J. L. Wilson, A computer operated system to generate dichotic tapes, M.S. thesis, University of Toledo (1975).
21. I. G. Mattingly, Experimental methods for speech synthesis by rule, IEEE Trans. 16, 198-202 (1968).
22. I. G. Mattingly, Speech cues and sign stimuli, Am. Sci. 60, 327-337 (1972).
23. Quanda, An Interactive Subroutine Using the VR12 Display, Digital Equipment Corporation.

About the Author-JAMES WILSON was born in Toledo, Ohio in 1948. He received a Bachelor's degree in Electrical Engineering from the University of Toledo, Toledo, Ohio in 1971, and an M.S.E.E. from the same school in 1975. Mr. Wilson worked for five years in research on memory at the Medical College of Ohio at Toledo. Currently, he is employed in the Communications Group of Motorola, Inc. He is co-author of two papers on operant conditioning and behavioral analysis using a minicomputer.

About the Author-LEE T. ANDREWS received a B.S. degree in Electrical Engineering in 1964 from Milwaukee School of Engineering. After working as a design engineer for National Cash Register Company in Dayton, Ohio he returned to graduate school at the University of Toledo and received an M.S. in Electrical Engineering in 1968, an M.S. in Engineering Science in 1969, and the Ph.D. degree in 1973. Dr. Andrews has been on the faculty at the Medical College of Ohio in the Department of Neurosciences since 1971, where he is currently an Associate Professor. His current research interests include computer science, neurophysiology, cardiology, and bioengineering.

About the Author-RAMESH R. PAREKH received an M.S. degree in Industrial Engineering from the University of Toledo in 1970. Since then he has been associated with the Medical College of Ohio at Toledo. He is presently a Senior Systems Analyst in the Biomedical Computer Laboratory of the Department of Neurosciences. Mr. Parekh is interested in the applications of minicomputers in medicine, especially neurophysiology, clinical electromyography and audiology. His expertise lies in the area of software systems for real-time data acquisition and analysis.

About the Author-MARILYN L. PINHEIRO received a B.S. degree in English literature and creative writing from Boston University in 1945, and an M.A. degree in clinical audiology and speech pathology from Western Reserve University in Cleveland in 1957. The Ph.D. degree in audiology was completed at Case Western Reserve University in 1969, and was followed by two years of postdoctoral work on the nervous system and surface preparations of the inner ear. In September of 1971, Dr. Pinheiro accepted a position in the Department of Neurosciences at the Medical College of Ohio, where she is now Associate Professor. She directs the program in speech and hearing, and devotes much of her time to research in auditory perception. In 1975, Dr. Pinheiro was awarded a three year research grant from N.I.H. to continue investigation of auditory pattern perception in both normal and brain damaged children and adults. She has recently been appointed permanent research consultant by the Ear Research Institute in Los Angeles, where she is developing research programs in relation to electrical stimulation of the cochlear nerve in deaf patients.
