Brain & Language 148 (2015) 64–73


Temporal dynamics of contingency extraction from tonal and verbal auditory sequences

Alexandra Bendixen a,b,*, Michael Schwartze c,d, Sonja A. Kotz c,d

a Auditory Psychophysiology Lab, Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University of Oldenburg, D-26111 Oldenburg, Germany
b Institute of Psychology, University of Leipzig, D-04103 Leipzig, Germany
c School of Psychological Sciences, University of Manchester, M13 9PL Manchester, UK
d Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, D-04103 Leipzig, Germany

Article history: Accepted 15 November 2014. Available online 12 December 2014.

Keywords: Predictive coding; Regularity; Deviance detection; Mismatch negativity (MMN); Contingency learning; Event-related potential (ERP); Brain activity; Speech; Language

Abstract

Consecutive sound events are often to some degree predictive of each other. Here we investigated the brain's capacity to detect contingencies between consecutive sounds by means of electroencephalography (EEG) during passive listening. Contingencies were embedded either within tonal or verbal stimuli. Contingency extraction was measured indirectly via the elicitation of the mismatch negativity (MMN) component of the event-related potential (ERP) by contingency violations. MMN results indicate that structurally identical forms of predictability can be extracted from both tonal and verbal stimuli. We also found similar generators to underlie the processing of contingency violations across stimulus types, as well as similar performance in an active-listening follow-up test. However, the process of passive contingency extraction was considerably slower (twice as many rule exemplars were needed) for verbal than for tonal stimuli. These results suggest caution in transferring findings on complex predictive regularity processing obtained with tonal stimuli directly to the speech domain.

© 2014 Elsevier Inc. All rights reserved.

1. Introduction

Speech comprehension requires auditory information processing in real-time because the acoustic input is ephemeral – it disappears right after it is encountered. The computationally intense real-time processing load is partly reduced by the fact that speech sounds usually do not follow each other randomly. Instead, upcoming words and phonemes within natural speech are often predictable at various processing levels, such as high-level semantics (e.g., Federmeier, 2007) or low-level acoustics (e.g., Arnal & Giraud, 2012). In the present study, we investigated a particular type of predictability, namely contingent transitions between phonemes (e.g., A can be followed by B but not C). Such contingencies are relevant in phonotactic constraints (e.g., Steinberg, Truckenbrodt, & Jacobsen, 2010, 2011) and in learning words on the basis of consistent co-occurrence of a word's constituents.

We specifically wanted to assess how quickly such contingencies can be newly acquired from initially arbitrary streams of syllables. Based on the rationale that picking up contingent co-occurrences of phonemes is important during language acquisition, we hypothesized that the healthy human brain should be able to extract such contingencies quickly and efficiently. We capitalized on previous work on the extraction of feature contingencies from pure tone sequences (Bendixen, Prinz, Horváth, Trujillo-Barreto, & Schröger, 2008; Paavilainen, Arajärvi, & Takegata, 2007). These authors investigated the extraction of contingency rules of the type "the duration of one sound predicts the frequency of the next sound" (whose duration then again predicts the frequency of the following sound in the sequence, and so on; cf. Fig. 1a). By means of event-related potential (ERP) components extracted from continuous electroencephalography (EEG) recordings, both studies compared the brain activity elicited by tones following the contingency rule with activity elicited by tones violating this rule. Processing differences between these events were taken to demonstrate the successful acquisition of the contingency rule. These processing differences took the form of a frontocentral negative displacement of the ERP for tones violating the contingency rule from 140 to 180 ms following violation onset, the so-called mismatch negativity (MMN) component
of the ERP (e.g., Kujala, Tervaniemi, & Schröger, 2007; Näätänen, Paavilainen, Rinne, & Alho, 2007). The MMN component is often observed in response to deviations from sequential auditory regularities, even when participants pay no attention to the tones. MMN is interpreted as an indirect indicator that the auditory system has picked up the regularity (Schröger, 2007; Winkler, 2007). MMN results showed that human listeners can extract the above-mentioned feature contingency (the duration of one sound predicts the frequency of the next sound) even when the tone sequences are presented outside the focus of attention (Bendixen et al., 2008; Paavilainen et al., 2007). Furthermore, embedding the contingency rule into a dynamic stimulus protocol, in which contingencies kept emerging and vanishing, revealed that the process of contingency extraction happens very quickly: Just 15–20 exemplars of the contingency rule need to be encountered for the brain to pick up the contingent relations and detect subsequent contingency violations (Bendixen et al., 2008).

One may wonder why the auditory system should be able to extract such relatively arbitrary transition rules between the duration and frequency of consecutive tones in a sequence. In previous studies, this impressive capacity was interpreted with reference to its importance for language processing (Bendixen et al., 2008; Paavilainen et al., 2007). For instance, using the duration of one acoustic event to predict some characteristics of the next event would be relevant for learning phonotactic rules such as "a long [a] is followed by consonant 1 while a short [a] is followed by consonant 2".

Yet though this analogy of tone and speech contingencies is suggestive, no study has yet attempted to transfer the tonal contingency extraction paradigm to speech material. It is, therefore, unclear whether the principles of contingency extraction revealed by pure-tone ERP studies indeed translate to the speech domain. In the present study, we put this issue to a direct test by implementing strictly parallel manipulations of auditory feature contingencies based on tonal stimuli (closely following Bendixen et al., 2008) and on verbal stimuli (designed to be conceptually identical to the tonal version). We hypothesized common principles of contingency extraction for both stimulus types, which would provide further evidence for the close relation between language and basic auditory temporal processing (Kotz & Schwartze, 2010). More specifically, we tested (1) how many contingent exemplars would be needed within either stimulus set before an auditory contingency would be extracted. We then compared (2) the generators of the involved auditory processes across stimulus types by means of EEG source localization (Michel et al., 2004; Trujillo-Barreto, Aubert-Vázquez, & Valdés-Sosa, 2004). Finally, we investigated (3) whether the contingencies extracted outside the focus of attention (as revealed by ERPs obtained during passive listening) would be accessible during active listening, when participants were asked to detect and overtly report contingency violations. In both previous studies (Bendixen et al., 2008; Paavilainen et al., 2007), active detection performance had been relatively poor, which may have been due to the relatively arbitrary association of basic tone features. We hypothesized that active contingency extraction would be easier within speech stimuli; such a finding would lend further credence to the importance of the investigated processes for everyday language processing.

[Figure 1 appears here. The panels plot stimulus sequences over time: (a) tonal stimuli (frequency high/low; durations 250 ms long vs. 150 ms short; 100 ms ISI), (b) verbal stimuli (onset consonant K vs. T; short [ta], [ka] vs. long [ta:], [ka:]), and (c) active-listening sequences; panel annotations state the momentary rule, e.g., "short followed by low, long followed by high".]

Fig. 1. Exemplary stimulus sequences. (a) Tonal stimuli as presented during passive listening. (b) Verbal stimuli as presented during passive listening. (c) Stimulus sequence during active listening (illustrated here with tonal stimuli, but following identical principles for verbal stimuli). Stimulus categories are abbreviated as follows: S = standard, D = deviant, I = irregular stimulus (i.e., random sequences interspersed between consecutive regular sequences). Numbers indicate positions of the corresponding stimulus category. Question marks indicate positions after which the sequences were stopped during active listening to wait for the participant's judgment of the last tone as rule-conforming or rule-violating.


2. Material and methods

2.1. Participants

EEG was recorded from 23 healthy volunteers (mean age 25.9 years, range 20–29 years; all right-handed; 12 male, 11 female) in two sessions separated by an average of 7.3 days (range 3–12 days). All participants were native speakers of German, reported normal hearing, and were not taking any medication affecting the central nervous system. In compliance with the Declaration of Helsinki, all experimental procedures were explained prior to the beginning of the experiment, and participants gave written informed consent. Participants received modest financial compensation for their participation.

2.2. Apparatus and stimuli

Participants were seated in an electrically and acoustically shielded chamber. Depending on the condition, either tonal or verbal stimuli were presented binaurally via on-ear headphones. Tonal stimuli were created and verbal stimuli were post-processed with Matlab; stimulus delivery was controlled using the Presentation software. During passive listening blocks, participants were additionally presented with a silent movie composed of nature scenes on a screen approximately 1 m in front of them. During active listening blocks, participants were provided with a response keypad containing two buttons.

Two different stimulus sets were presented in different conditions. In the tonal condition (cf. Fig. 1a), pure tones were presented in a continuous series with an inter-stimulus interval (ISI) of 100 ms. There were four different tone types, resulting from a combination of two duration values (short, 150 ms; long, 250 ms; both with 10/10 ms raised cosine onset/offset ramps) and two frequency values (low, 1000 Hz; high, 1500 Hz). Note that the frequency and duration differences between the tones were an order of magnitude above the just noticeable difference. The four tone types occurred in equal proportions across a stimulus block; their local sequential order was experimentally manipulated to create emerging and vanishing contingency rules. The duration of each tone was chosen quasi-randomly, with the restriction that no more than three tones of the same duration succeeded each other. The frequency of each tone was determined by the duration of the immediately preceding tone according to one of the following, mutually exclusive contingency rules: (1) Short tones are followed by low tones, long tones are followed by high tones. (2) Short tones are followed by high tones, long tones are followed by low tones. Either one of these rules was valid for 9, 14, or 19 consecutive stimuli (standards). The rule was then violated with equal probability in position 10, 15, or 20; i.e., a deviant stimulus occurred whose frequency differed from the one predicted by the preceding tone's duration according to the currently valid rule. Between the deviant tone and the next standard sequence, 3–7 irregular tones were interspersed that quasi-randomly followed either rule; i.e., there was no consistent relation between tone duration and tone frequency. These irregular tones were chosen according to the following restrictions: (1) The same rule does not occur more than twice in succession. (2) In order to avoid overlap with the emergence of the new rule, the last two irregular tones violate the rule of the upcoming standard sequence.
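To make the sequence logic concrete, the following is a minimal Python sketch; it is illustrative only (the original stimuli were generated with Matlab, and all names are ours). It implements the duration restriction, the two rules, the deviant, and the interspersed irregular tones, but for brevity omits the restriction that the same rule may not govern more than two successive irregular transitions.

import random

SHORT, LONG = 150, 250                 # tone durations in ms
LOW, HIGH = 1000, 1500                 # tone frequencies in Hz
FLIP = {LOW: HIGH, HIGH: LOW}
RULE = {1: {SHORT: LOW, LONG: HIGH},   # rule 1: short -> low, long -> high
        2: {SHORT: HIGH, LONG: LOW}}   # rule 2: short -> high, long -> low

def pick_duration(durs):
    """Quasi-random duration: no more than three identical values in a row."""
    if len(durs) >= 3 and durs[-1] == durs[-2] == durs[-3]:
        return SHORT if durs[-1] == LONG else LONG
    return random.choice((SHORT, LONG))

def one_cycle(rule, next_rule, n_standards):
    """One regular run: a first unconstrained tone, rule-conforming standards
    up to position n_standards, one deviant, and 3-7 irregular tones whose
    last two violate the upcoming rule."""
    durs = [random.choice((SHORT, LONG))]
    seq = [(durs[0], random.choice((LOW, HIGH)))]       # position 1
    for _ in range(n_standards - 1):                    # standards: frequency
        durs.append(pick_duration(durs))                # predicted by previous
        seq.append((durs[-1], RULE[rule][durs[-2]]))    # tone's duration
    durs.append(pick_duration(durs))                    # deviant: prediction
    seq.append((durs[-1], FLIP[RULE[rule][durs[-2]]]))  # deliberately violated
    n_irr = random.randint(3, 7)
    for i in range(n_irr):                              # irregular tones
        durs.append(pick_duration(durs))
        if i >= n_irr - 2:                              # last two violate the
            freq = FLIP[RULE[next_rule][durs[-2]]]      # upcoming rule
        else:
            freq = random.choice((LOW, HIGH))
        seq.append((durs[-1], freq))
    return seq

# Example: 9 standards plus a deviant in position 10 under rule 1, then rule 2.
print(one_cycle(rule=1, next_rule=2, n_standards=9))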

The verbal condition (cf. Fig. 1b) was constructed according to the same principles, keeping stimulus parameters as close as possible to the tonal condition. Stimuli were four different consonant-vowel (CV) syllables ([ka], [ka:], [ta], [ta:]) with overall equal proportions, resulting from a combination of two syllable durations (short, 150 ms; long, 250 ms; produced by short or long durations of the vowel [a]) and two onset consonants ([k] vs. [t]). Again, the employed stimulus differences were clearly above discrimination threshold. Syllable duration and onset consonant were chosen according to the same principles as duration and frequency in the tonal condition. The two mutually exclusive contingency rules were thus specified as follows: (1) A short [a] is followed by a syllable starting with [k], a long [a:] is followed by a syllable starting with [t]. (2) A short [a] is followed by a syllable starting with [t], a long [a:] is followed by a syllable starting with [k]. Contingency rules were again violated in positions 10, 15, or 20, and irregular syllables were interspersed before a new rule emerged. As in the tonal condition, syllables were presented with an ISI of 100 ms.

The rationale for selecting the CV syllables [ka] and [ta] was based on the transition frequencies reported in the CELEX database for spoken language (Baayen, Piepenbrock, & van Rijn, 1993). We aimed for relatively balanced values for the summed frequencies of occurrence of the relevant transitions in order to avoid any bias towards one or the other contingency rule. The relevant transitions are the ones across syllables, i.e., the frequency of occurrence of short or long [a]s followed by [k] or [t], which are well balanced in German (summed occurrence frequencies of within-word transitions per million according to CELEX: [at] 2767, [a:t] 2256, [ak] 1221, [a:k] 1135).

Recordings of the syllables [ka] and [ta] were taken from AT&T (Natural Voices Text-to-Speech Demo: www2.research.att.com/~ttsweb/tts/demo.php). The syllables were post-processed to make them physically identical during the vowel part in order to have identical predictive cues in all stimulus exemplars. To this end, the [a] part of the syllable [ta] was taken to replace the [a] part of the syllable [ka]; care was taken to make the cut unnoticeable and to ensure that the [ka] syllable still sounded natural. As a consequence, the [ta] and [ka] syllables were physically identical from 70 ms after stimulus onset. The resulting CV syllables were shortened to 150 ms (short vowel exemplars: [ta], [ka]) or 250 ms (long vowel exemplars: [ta:], [ka:]) by applying a raised cosine offset ramp of 25 ms duration. The resulting CV syllables were thus identical in the following ways: [ta] = [ta:] from 0 to 125 ms, [ka] = [ka:] from 0 to 125 ms, [ta] = [ka] from 70 to 150 ms, and [ta:] = [ka:] from 70 to 250 ms.
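The gating operations described above (10 ms raised cosine onset/offset ramps for the tones; a 25 ms raised cosine offset ramp for shortening the syllables) amount to multiplying the waveform with half-cosine windows. Below is a minimal sketch, assuming a 44.1 kHz sampling rate (the paper does not state one); all function names are ours.

import numpy as np

FS = 44100  # sampling rate in Hz; an assumption, not stated in the paper

def half_cosine(n):
    """Raised cosine ramp rising from 0 to 1 over n samples."""
    return 0.5 * (1.0 - np.cos(np.pi * np.arange(n) / n))

def make_tone(freq_hz, dur_ms, ramp_ms=10):
    """Pure tone with 10/10 ms raised cosine onset/offset ramps (Section 2.2)."""
    n = int(FS * dur_ms / 1000)
    tone = np.sin(2 * np.pi * freq_hz * np.arange(n) / FS)
    r = int(FS * ramp_ms / 1000)
    tone[:r] *= half_cosine(r)           # onset ramp
    tone[-r:] *= half_cosine(r)[::-1]    # offset ramp
    return tone

def shorten(syllable, target_ms, ramp_ms=25):
    """Truncate a recorded syllable to target_ms and fade it out with a 25 ms
    raised cosine offset ramp, as used to derive the short and long vowel
    exemplars (assumes the recording is longer than target_ms)."""
    n = int(FS * target_ms / 1000)
    out = syllable[:n].copy()
    r = int(FS * ramp_ms / 1000)
    out[-r:] *= half_cosine(r)[::-1]
    return out

# Example: the four tone types of the tonal condition.
tones = [make_tone(f, d) for f in (1000, 1500) for d in (150, 250)]
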
2.3. Procedure

The two stimulus sets (tonal/verbal) were presented in separate stimulus blocks. Each block lasted 6 min and contained 1200 stimuli, comprising 20 deviant stimuli per sequence length (i.e., depending on the number of preceding standards consistently following one contingency rule: deviants in positions 10, 15, or 20). Ten blocks of each stimulus type (tonal/verbal) were administered; hence 200 deviant stimuli per sequence length and stimulus type were presented altogether. The stimulus blocks were distributed evenly across two recording sessions. Within each session, the blocks of the two stimulus types were presented in an alternating manner, with the starting block counterbalanced across participants and sessions. Participants were instructed to ignore the auditory stimuli and to focus on the movie presented on screen (passive listening).

At the end of the second session, participants were informed about the rules and were asked to perform a 15-min behavioral post-test, in which their capability to actively extract the contingency rules and report rule violations was probed (active listening). During this part of the experiment, stimuli were no longer presented in continuous series, but in short sequences of 11, 16, or 21 stimuli (cf. Fig. 1c).
The first stimulus of each sequence was randomly chosen, the second to penultimate stimuli followed one of the two contingency rules, and the last stimulus either followed or violated that rule with equal probability (50%). Participants were asked to judge the correctness of the final tone of the sequence by corresponding button presses. The next sequence was initiated 800 ms after the participant's button press. No feedback was provided. Two blocks per stimulus type were administered, each containing 18 rule-conforming and 18 rule-violating sequences distributed evenly across the length of the regular sequence (10/15/20 stimuli following the contingency rule). These blocks were preceded by a brief practice run of 6 rule-conforming and 6 rule-violating sequences. Button-response assignment and the order of the two stimulus types during active listening were again counterbalanced across participants.

2.4. Data recording and analysis

Tones with different frequencies and durations, as well as syllables with different onset consonants and durations, were collapsed during the analysis because only the rule conformance (standard vs. deviant status), not the feature values, was of interest. Note that the overall probability of each stimulus category was 25%; hence any processing difference arising from physical differences between stimuli would cancel out by averaging. During active listening, participants' responses were recorded and scored as correct or incorrect. The proportion of correct responses was calculated separately for each stimulus type (tonal/verbal) and sequence length (i.e., position of the final event), collapsing across sequences with rule-conforming and rule-violating endings. Performance was analyzed in an analysis of variance with repeated measurements (rmANOVA) comprising the factors stimulus type (2 levels: tonal vs. verbal) and sequence length (3 levels: 10/15/20). All ANOVA results for behavioral and EEG data (see below) are reported along with the partial η² effect size measure. Where applicable, post-hoc tests were conducted using the Bonferroni correction of the confidence level for multiple comparisons. The assumption of sphericity was not violated in any of the ANOVAs, as indicated by Mauchly's test.

During passive listening, EEG was continuously recorded with Ag/AgCl active electrodes attached to 61 locations according to the 10% extension of the International 10–20 system (Chatrian, Lettich, & Nelson, 1985; Jasper, 1958). Electrodes were mounted in a nylon cap. Recordings were made against a reference electrode placed at the tip of the nose. Additional electrodes were placed at the left and right mastoid sites. Eye movements were monitored by bipolar recordings from electrodes placed above and below the left eye (vertical electrooculogram, VEOG) and lateral to the outer canthi of both eyes (horizontal electrooculogram, HEOG). EEG and EOG signals were amplified (DC to 250 Hz) and sampled at a 500 Hz rate on a BrainAmp EEG system (BrainProducts, Gilching, Germany). Electrode impedances were kept below 5 kΩ.
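The behavioral scoring and statistics described above can be sketched as follows. The data-frame layout, the random placeholder accuracies, and the use of statsmodels' AnovaRM are our illustrative choices, not the authors' actual analysis code (AnovaRM also reports F tests only; partial η² would be computed separately).

import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.stats.anova import AnovaRM

# One accuracy value per participant, stimulus type, and sequence length;
# random placeholders stand in for the scored button presses.
rng = np.random.default_rng(0)
rows = [{"subject": s, "stimulus": stim, "length": ln,
         "accuracy": rng.normal(0.54, 0.05)}
        for s in range(23)
        for stim in ("tonal", "verbal")
        for ln in (10, 15, 20)]
df = pd.DataFrame(rows)

# One-sample t tests of mean accuracy against chance (50%); scipy returns a
# two-sided p, which is halved for the one-tailed test used in the paper.
for stim in ("tonal", "verbal"):
    acc = df[df.stimulus == stim].groupby("subject")["accuracy"].mean()
    t, p = stats.ttest_1samp(acc, 0.5)
    print(f"{stim}: t(22) = {t:.3f}, one-tailed p = {p / 2:.4f}")

# 2 (stimulus type) x 3 (sequence length) repeated-measures ANOVA.
print(AnovaRM(df, depvar="accuracy", subject="subject",
              within=["stimulus", "length"]).fit())
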
EEG data were analyzed with EEGlab (Delorme & Makeig, 2004). Correction of EEG activity related to ocular or other stereotypical artifacts (e.g., heartbeat) was achieved with independent component analysis (ICA; see Debener, Thorne, Schneider, & Viola, 2010, for a review). EEG preprocessing steps were optimized separately for the ICA decomposition and for the later ERP analysis (following Debener et al., 2010; see also Hauthal, Thorne, Debener, & Sandmann, 2014; Puschmann et al., 2013). For ICA purposes only, EEG data were filtered with a 1-Hz high-pass filter (finite impulse response [FIR] filter, Kaiser-windowed, Kaiser β = 5.65, filter length 9057 points) and were cut into arbitrary 1-s epochs. Epochs containing unique, non-stereotypical artifacts were removed prior to ICA based on joint probability and kurtosis with a threshold of three standard deviations (Delorme, Sejnowski, & Makeig, 2007). Principal component analysis (PCA) was applied to reduce data dimensionality, and 48 independent components (ICs) were computed using the extended infomax algorithm implemented in EEGlab. Stereotypical ICs related to eye blinks, lateral eye movements, and heartbeats were identified. These ICs were removed from the raw EEG data (i.e., without the 1-Hz high-pass filter applied for ICA purposes). The ICA-corrected data were filtered with a bandpass filter from 0.5 to 100 Hz (FIR filter, Kaiser-windowed, Kaiser β = 5.65, filter length 1813 points).

Epochs were extracted from the continuous EEG record from −100 to 350 ms relative to the onset of each stimulus. The 100-ms pre-stimulus interval was used for baseline correction. Remaining artifacts were removed by rejecting epochs with an amplitude change exceeding 100 µV on any channel; this led to retaining 95% of the epochs on average across participants and stimulus types. All artifact-free epochs were averaged separately for each stimulus type (tonal/verbal), rule conformance (standard/deviant), and sequence length (10/15/20) per participant. Difference waves were formed by subtracting the ERPs elicited by standard sounds from the ERPs elicited by their deviant counterparts, separately for each stimulus type and sequence length. If the contingency rule was extracted, contingency violations (deviants) should elicit the MMN component relative to standards. The interval for assessing MMN elicitation was taken from Bendixen et al. (2008) because the tonal condition was an almost exact replication of that study. MMN amplitude was thus quantified in an interval from 140 to 180 ms after stimulus onset in the single-subject ERPs, separately for each stimulus type and sequence length. The electrodes for assessing MMN were based on the known distribution of the component, which shows up as a fronto-central negativity with polarity inversion at the mastoid electrodes (e.g., Näätänen et al., 2007). Following standard recommendations for MMN analysis (e.g., Schröger, 2005), the ERP data were re-referenced against the average activity recorded at the mastoid electrodes, and MMN amplitude measurements were then taken at the FCz electrode. This procedure captures both the frontal and the mastoidal contributions to the MMN signal and is thus best suited to fully evaluate the MMN component (Kujala et al., 2007). To verify the presence of MMN, amplitudes were first tested against zero, separately for the two stimulus types and the three sequence lengths, using one-sample, one-tailed Student's t tests. To then compare MMN across conditions, MMN amplitudes were analyzed in an rmANOVA with the factors stimulus type (2 levels: tonal vs. verbal) and sequence length (3 levels: 10/15/20). MMN scalp topography was further studied by means of ERP voltage distributions of the deviant-minus-standard difference waves in the MMN latency range (140–180 ms), separately for each stimulus type and sequence length. These voltage distributions are displayed with the original nose reference in order to demonstrate the polarity inversion at the mastoid electrodes. In those conditions where a significant MMN was elicited at the sensor level, a source-space analysis was performed to reveal the cortical generators of the contingency MMN.
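Before turning to the source analysis, the sensor-level MMN quantification just described can be sketched as follows; the placeholder ERPs and all names are ours, and real single-subject ERPs at FCz (mastoid-referenced) would be substituted.

import numpy as np
from scipy import stats

FS = 500        # EEG sampling rate (Hz), as in the recording setup
PRESTIM = 0.1   # epochs start 100 ms before stimulus onset

def mmn_amplitude(deviant_erp, standard_erp, win=(0.140, 0.180)):
    """Mean deviant-minus-standard difference in the 140-180 ms MMN window.
    Inputs: single-subject ERPs at FCz, in microvolts, sampled at FS, with
    the epoch running from -100 to 350 ms (225 samples)."""
    i0 = int((PRESTIM + win[0]) * FS)
    i1 = int((PRESTIM + win[1]) * FS)
    return (deviant_erp - standard_erp)[i0:i1].mean()

# Illustrative test across 23 participants with placeholder ERPs:
rng = np.random.default_rng(1)
amps = np.array([mmn_amplitude(rng.normal(-0.5, 1.0, 225),
                               rng.normal(0.0, 1.0, 225))
                 for _ in range(23)])
t, p = stats.ttest_1samp(amps, 0.0)       # two-sided p, halved below
print(f"mean MMN = {amps.mean():.2f} uV, t(22) = {t:.2f}, "
      f"one-tailed p = {p / 2:.4f}")
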
To this end, brain electrical tomography analyses were conducted with variable resolution electromagnetic tomography (VARETA; Bosch-Bayard et al., 2001; Valdés-Sosa, Marti, García, & Casanova, 2000). With this technique, sources are reconstructed by finding a discrete spline-interpolated solution to the EEG inverse problem, i.e., estimating the spatially smoothest intracranial primary current density (PCD) distribution compatible with the observed scalp voltage distribution. This approach allows point-to-point variation in the amount of spatial smoothness and restricts the possible solutions to the gray-matter voxels, based on the probabilistic brain tissue maps available from the Montreal Neurological Institute (Evans et al., 1993). This procedure minimizes the possibility of "ghost sources", which are often present in linear inverse solutions (Trujillo-Barreto et al., 2004). A 3D grid of 3244 points (voxels, 7 mm grid spacing), representing possible sources of the scalp potential, and the recording array of 63 electrodes (excluding the bipolar eye channels) were registered with the average probabilistic brain atlas developed at the Montreal Neurological Institute. Subsequently, the scalp potentials in the MMN latency range were transformed into source space (at the predefined 3D grid locations) using VARETA. Statistical parametric maps (SPMs) of the PCD estimates were constructed based on a voxel-by-voxel Hotelling T² test against zero (group statistics, based on N = 23) in order to localize the sources of the component separately for the two conditions. For all SPMs, Random Field Theory (Worsley, Marrett, Neelin, & Evans, 1996) was used to correct the activation threshold for spatial dependencies between voxels. Results are shown as 3D activation images constructed on the basis of the average brain.

[Figure 2 appears here: accuracy (%) plotted by sequence length (10/15/20) for the two stimulus types (tonal/verbal).]

Fig. 2. Behavioral results. Mean proportions of correct responses in classifying sequence endings as rule-conforming or rule-violating, shown as a function of stimulus material (tonal, black; verbal, gray) and sequence length. Whiskers indicate standard deviation.

3. Results

During active listening, participants discriminated between rule-conforming (standard) and rule-violating (deviant) endings of the sequences with a mean accuracy of 53.7% (54.2% for tonal stimuli, 53.3% for verbal stimuli; see Fig. 2). Though relatively low, both accuracy values were significantly above chance level as determined by one-sample, one-tailed Student's t tests against 50% [tonal, t(22) = 2.988, p < 0.01; verbal, t(22) = 2.761, p < 0.05]. Performance did not vary as a function of stimulus type or sequence length, as indicated by a two-factorial rmANOVA: There was no main effect of stimulus type, F(1, 22) = 0.195, p = 0.663, η² = 0.009; no main effect of sequence length, F(2, 44) = 0.729, p = 0.488, η² = 0.032; and no interaction of the effects of stimulus type and sequence length, F(2, 44) = 1.522, p = 0.230, η² = 0.065.

Fig. 3 shows the ERPs elicited during passive listening by rule-conforming (standard) and rule-violating (deviant) stimuli, as well as the deviant-minus-standard difference waves, for each combination of stimulus type and sequence length. With tonal stimuli, significant MMN components were elicited for all sequence lengths as determined by one-sample, one-tailed t tests against zero [position 10, mean amplitude −0.52 µV, t(22) = −3.718, p < 0.01; position 15, mean amplitude −1.03 µV, t(22) = −3.550, p < 0.01; position 20, mean amplitude −0.68 µV, t(22) = −3.120, p < 0.01]. With verbal stimuli, a significant MMN component was elicited only for sequences of 20 elements [mean amplitude −0.72 µV, t(22) = −3.058, p < 0.01], but not for shorter sequences of 10 or 15 elements [position 10, mean amplitude −0.04 µV, t(22) = −0.275, p = 0.786; position 15, mean amplitude −0.01 µV, t(22) = −0.049, p = 0.961].

Fig. 3. ERP results. Grand-average ERPs (N = 23) elicited by standards (black) and deviants (blue), as well as deviant-minus-standard ERP difference wave (red) for sequence lengths of 10 (left), 15 (middle), and 20 (right) with tonal (top) and verbal (bottom) stimuli. Recordings are displayed at FCz re-referenced against the average activity at the mastoid electrodes. The latency range for assessing the MMN component is marked in gray. Note that significant MMN components were elicited for all sequence lengths with tonal stimuli, but only for a sequence length of 20 elements with verbal stimuli.


Fig. 4. MMN topography and source localization. Left: Voltage topographies (generated with a smoothing parameter of 10⁻⁷) in the latency range of the contingency MMN, separately for each stimulus type and sequence length. Right: VARETA source localizations of the contingency MMN for those conditions in which significant activity was observed at the sensor level. Significant centers of activation are coded in gray-scale, with darker colors for higher probability values (one-way ANOVA; thresholded to p < 0.0001).

Comparing MMN amplitudes across stimulus types and sequence lengths in a two-factorial rmANOVA revealed a main effect of stimulus type, F(1, 22) = 7.812, p < 0.05, η² = 0.262; a trend for a main effect of sequence length, F(2, 44) = 2.537, p = 0.091, η² = 0.103; and a trend for an interaction of the effects of stimulus type and sequence length, F(2, 44) = 2.559, p = 0.089, η² = 0.104. Follow-up one-factorial ANOVAs with the 3-level factor sequence length, conducted separately for the two stimulus types, yielded no significant effect of sequence length on MMN amplitude for tonal stimuli, F(2, 44) = 1.263, p = 0.293, η² = 0.054, but a significant effect of sequence length on MMN for verbal stimuli, F(2, 44) = 4.427, p < 0.05, η² = 0.168. Bonferroni-corrected follow-up pair-wise t tests indicated significantly larger MMN amplitudes in position 20 than in position 10 (p < 0.05), a tendency for a larger MMN amplitude in position 20 than in position 15 (p = 0.096), and no amplitude difference between positions 10 and 15 (p > 0.999).

The voltage topographies displayed in Fig. 4 (left) show that the contingency MMN had a highly similar topography across sequence lengths and stimulus types.
Fronto-central negativity and polarity inversion at the mastoid electrodes were observed, as is characteristic for the MMN component (Kujala et al., 2007; Schröger, 2005), suggesting generators in auditory cortical areas. This was confirmed by the statistical results of the source-space reconstruction for the activity in the MMN latency range (Fig. 4, right). The reconstructed generator configurations for the four conditions with significant activity at the sensor level were again highly similar, involving bilateral temporal regions with a maximum in the superior temporal gyrus (STG).

4. Discussion

The present study was designed to investigate whether auditory feature contingencies are extracted in a similar manner from tonal and verbal sequences. The results suggest that contingency extraction is possible with both types of stimuli, but occurs sooner (i.e., after encountering fewer exemplars) for tonal than for verbal sequences. For either stimulus type, participants had difficulty accessing the information about contingencies and their violations when asked to overtly detect contingency violations. These results are discussed in turn below.

4.1. Contingency extraction from tone sequences

Two previous studies had demonstrated that a tonal contingency rule of the type "the duration of one sound predicts the frequency of the next sound" can be extracted during passive listening (Bendixen et al., 2008; Paavilainen et al., 2007). The present results successfully replicate these findings. They also confirm that the contingency rule does not need to be valid for prolonged time periods, but can be extracted from short tone sequences following such a rule, embedded in an ever-changing tone series containing many instantiations of a conflicting rule (i.e., a rule making opposite predictions) as well as interspersed random stimulus arrangements (Bendixen et al., 2008). According to the present data, 10 rule-conforming tones are sufficient to extract the contingency rule amidst this complex dynamic tone series. This is suggestive of highly efficient local rule extraction mechanisms in the auditory system (cf. Costa-Faidella, Grimm, Slabu, Díaz-Santaella, & Escera, 2011; Ulanovsky, Las, Farkas, & Nelken, 2004), even for very abstract rules (cf. Näätänen, Astikainen, Ruusuvirta, & Huotilainen, 2010; Näätänen, Tervaniemi, Sussman, Paavilainen, & Winkler, 2001). The present observation that a minimum of 10 rule-conforming tones suffices even improves on previous estimates of 15–20 exemplars being necessary for the extraction of this particular rule (Bendixen et al., 2008). This difference between studies may be due to a higher signal-to-noise ratio in the present data (note that the sequence lengths could not be assessed individually in the previous study due to an unfavorable signal-to-noise ratio; Bendixen et al., 2008). Other differences relative to the previous study include a change in the global context of the stimulus sequences (by excluding regular sequences with only 5 elements) as well as changes in temporal parameters: Tones were longer (by 100 ms) and the silent ISIs were shorter (by 200 ms) in the present study. This led to a faster overall pace, which possibly facilitated the formation of predictive links spanning from one tone to the next (Winkler, Czigler, Jaramillo, Paavilainen, & Näätänen, 1998; Yabe et al., 1998). Once the contingency rule had been extracted, there was no further increase of MMN amplitude with more rule-conforming events (i.e., after sequence lengths of 15 or 20 tones as compared to 10 tones).
This contrasts with findings from MMN paradigms based on simple repetition rules, in which the brain response to a deviant sound is strongly affected by the number of preceding standards (e.g., Baldeweg, Klugman, Gruzelier, & Hirsch, 2004; Bendixen, Roeber, & Schröger, 2007; Haenschel, Vernon, Dwivedi, Gruzelier, & Baldeweg, 2005). Bendixen and Schröger (2008) suggested that the apparent increase of MMN amplitude with the length of the preceding regular sequence in the case of simple repetition rules is in fact due to confounding contributions from adaptation processes affecting the N1 component (see also Horváth et al., 2008). The present data strengthen the view that violations of more abstract rules elicit the MMN in an all-or-none fashion.

Topography and source localization of the tonal contingency MMN were consistent with generators in auditory cortical areas. The strong right-hemispheric preponderance of the cortical generators observed in Bendixen et al. (2008) was not replicated here. Instead, the three different EEG source solutions (for the three sequence lengths in the tonal condition) consistently yielded a symmetric activation pattern in the present study, suggesting the involvement of both left- and right-hemispheric temporal regions.

4.2. Contingency extraction from speech sequences

The main aim of the present study was to transfer the tonal contingency MMN paradigm to an analog version with speech stimuli, involving verbal contingency rules of the type "the duration of one CV syllable's vowel predicts the onset consonant of the next CV syllable". The present data show that this transfer was successful: It is indeed possible to extract structurally identical contingencies from verbal stimuli, again without paying attention to the stimuli. This is indicative of the generality of the underlying contingency learning mechanism in the auditory system. There was, however, an important difference between the tonal and the verbal condition: With CV syllable sequences, 10 or 15 exemplars were not sufficient to extract the contingency rule and detect violations of it; a minimum of 20 rule-conforming events had to be encountered. A possible explanation is that the long and short versions of the syllables ([ka], [ka:], [ta], [ta:]) both activated the same (and strongly overlearned) categorical representation "ka" or "ta", causing the auditory system to initially disregard the duration difference in the vowel. Only with longer exposure would the acoustic difference between [a] and [a:] "win" against this categorization process, thereby making rule extraction possible after a higher number of syllables following the contingency rule. Another explanation would be that the silent interval between the syllables, though only 100 ms long, weakened the binding across consecutive syllables, as it made the syllables stand out as individual events (more so than the individual tones in the tonal version of the paradigm). In order to tease apart these two explanations, future studies may employ differences in vowel identity rather than vowel duration (to counteract the categorization process), or they may reduce the duration of the silent interval between syllables (to strengthen the binding across syllables).

Yet even if more exemplars were needed than for tone sequences, it seems remarkable that the speech contingency could be extracted at all – and again from short (20-element) sequences within rapidly changing auditory event series. Together with recent findings on segmental predictions in speech perception (Gagnepain, Henson, & Davis, 2012), the present findings make a strong case for the contribution of local predictions (linking consecutive segments or phonemes) to the efficient processing of verbal stimuli. This complements previous studies on learning verbal contingencies across intervening elements, as is needed for acquiring syntactic rules (e.g., Friederici, Mueller, & Oberecker, 2011; Mueller, Friederici, & Männel, 2012; Mueller, Oberecker, & Friederici, 2009).


Nevertheless, it remains to be shown that the process of contingency learning demonstrated in the present experiment has implications for corresponding effects in natural speech, such as acquiring phonotactic constraints or learning words on the basis of contingent co-occurrences.

Based on the assumed "everyday" relevance of the contingency extraction process under study here, we had hypothesized that active contingency extraction would be easier with speech than with tonal stimuli. However, participants' accuracy in detecting contingency violations was as poor with speech stimuli as with tonal stimuli, and very similar to previous studies in the tonal domain, in which performance had likewise been just above chance level (Bendixen et al., 2008; Paavilainen et al., 2007). This leads us to reject the hypothesis that the arbitrariness of the tonal stimuli was responsible for precluding conscious access to the automatically extracted contingencies.

4.3. Relation between ERP and behavioral indicators of rule extraction

For tonal as well as verbal stimuli, active detection performance was surprisingly weak, given that the information about the contingency violations was available to the auditory system, as indicated by the MMN component. The results indicate a dissociation between implicit (covert) rule extraction as measured by MMN and explicit (overt) rule extraction as measured by the behavioral data. This is consistent with some previous studies in which active rule violation detection for complex auditory rules was poor despite MMN being elicited during passive listening (Paavilainen, Simola, Jaramillo, Näätänen, & Winkler, 2001; van Zuijen, Simoens, Paavilainen, Näätänen, & Tervaniemi, 2006). In contrast, a tight relation between MMN-based and behavioral indicators of rule extraction is typically observed for simpler auditory rules (e.g., Tiitinen, May, Reinikainen, & Näätänen, 1994). A possible explanation is that when confronted with complex rules, participants attempt to extract rules and detect rule violations by conscious (top-down) strategies instead of relying on the output of their pre-attentive (bottom-up) deviance detection system. This strategy may be advantageous in situations in which top-down rule extraction is actually more accurate than MMN-related processing (e.g., Chennu et al., 2013), but it would be disadvantageous in the present case, in which the MMN-generating system is evidently better able to extract the complex contingencies. Future studies should clarify whether the effortless acquisition of feature contingencies during passive listening can be exploited during active listening when a more elaborate feedback and training procedure is implemented than the one employed in the present and previous experiments (Bendixen et al., 2008; Paavilainen et al., 2007). If it is possible to teach participants to "read out" the output of their pre-attentive deviance detection system, this should selectively improve performance in those conditions in which MMN is elicited. Results along these lines would strengthen the behavioral relevance of the underlying brain mechanism of complex rule extraction. In any case, the present data and both previous studies (Bendixen et al., 2008; Paavilainen et al., 2007) constitute interesting examples of ERPs being more sensitive than behavior to the brain's capacities for processing complex auditory material.


4.4. Transferring findings from tonal to verbal stimuli

Transferring the tonal contingency paradigm to verbal stimuli was, among other aims, motivated by the question of whether it is generally possible to infer characteristics of speech processing from paradigms based on pure-tone sequences alone. The present data suggest that such inferences are possible at least on a coarse scale: Structurally identical contingencies could be extracted from both types of stimuli. Topographical and tomographical data support the hypothesis of similarity in the underlying processes of tonal and verbal contingency extraction. Both sensor- and source-space results suggest that largely the same brain areas are involved in detecting contingency violations for both stimulus types. Naturally, this finding is put into perspective by the limited accuracy of EEG source localization (Michel et al., 2004), and studies employing techniques with higher spatial resolution would provide valuable additional support. Crucially, it is clear from the present EEG data that any claims about analogies in contingency extraction from tonal and verbal stimuli must be qualified by differences in the temporal dynamics of the involved processes. Specifically, extraction occurred sooner (i.e., only half as many exemplars of the rule were needed) for tone than for syllable sequences. This is suggestive of a qualitative difference in the brain's treatment of the two stimulus types. Such material-specific constraints must be considered, for instance, when building computational models of sequential regularity extraction or auditory object formation (e.g., Mill, Bőhm, Bendixen, Winkler, & Denham, 2013): The number of elements needed to extract a sequential pattern emitted by a given sound source cannot be characterized for speech stimuli on the basis of results obtained with tonal stimuli (or vice versa), but must be determined empirically for each stimulus type.

By investigating the dynamics of regularity extraction, the present results provide an informative complement to other studies comparing regularity- and deviance-related processes between tonal and verbal stimuli. These studies have mainly focused on amplitude differences between the MMN components elicited by similar deviations from tonal or verbal stimuli (e.g., Aaltonen, Tuomainen, Laine, & Niemi, 1993; Jaramillo, Alku, & Paavilainen, 1999; Korpilahti, Krause, Holopainen, & Lang, 2001). Some studies have found larger MMN responses for acoustically similar deviations within verbal than within tonal stimuli, especially when the acoustic deviation was phonemically relevant in the native language of the listener (e.g., Jaramillo et al., 2001; Kuuluvainen et al., 2014). Other studies have produced opposite findings, showing larger MMN responses to similar changes in tonal than in corresponding verbal stimuli (e.g., Maiste, Wiens, Hunt, Scherg, & Picton, 1995; Tampas, Harkrider, & Hedrick, 2005). The latter finding is often attributed to the categorical nature of speech perception (cf. Winkler et al., 1999; Xi, Zhang, Shu, Zhang, & Li, 2010), which leads the auditory system to neglect acoustic variation in the stimulus as long as phonemic categories are retained. This account is similar to the one we give above for explaining the higher number of exemplars needed to extract the present verbal contingency rule. Altogether, there are relatively few direct comparisons of MMN elicited by tonal and verbal stimuli in the same participants, and even fewer studies using complex (non-repetition-based) rules such as the one employed here. Since MMN for verbal stimuli as such is by now very well established (see Pulvermüller & Shtyrov, 2006, for a comprehensive review), more systematic comparisons between the tone and speech domains may be performed in future studies to assess whether findings in either field can reliably inform the other.

5. Conclusions

The present results suggest that complex contingencies between consecutive sound events can be acquired from tonal or verbal stimuli in an effortless manner, without the need for focused attention on the auditory stimuli. The process of contingency extraction is more efficient for tonal than for verbal stimuli, which may be attributed to the categorical nature of speech perception. Despite the rapid acquisition of contingency rules during passive listening, participants do not seem to be able to access this information consciously for the overt detection of regularity and deviance.


Acknowledgments

This work was funded by the German Research Foundation (Deutsche Forschungsgemeinschaft [DFG]; individual Grant KO 2268/6-1 to S.A.K., DFG Cluster of Excellence 1077 "Hearing4all", and SFB/TRR 31 "The active auditory system"). EEG data analysis was performed with EEGlab (Delorme & Makeig, 2004) and additional EEGlab plug-ins written by Andreas Widmann, University of Leipzig, Germany. VARETA source localization was performed with scripts provided by Nelson Trujillo-Barreto, University of Manchester, UK. The authors are grateful to Ingmar Brilmayer and Cornelia Schmidt for assistance in data acquisition as well as to Chaitra Venkataramana Nayak for assistance in figure preparation.

References

Aaltonen, O., Tuomainen, J., Laine, M., & Niemi, P. (1993). Cortical differences in tonal versus vowel processing as revealed by an ERP component called mismatch negativity (MMN). Brain and Language, 44, 139–152.
Arnal, L. H., & Giraud, A.-L. (2012). Cortical oscillations and sensory predictions. Trends in Cognitive Sciences, 16, 390–398.
Baayen, R. H., Piepenbrock, R., & van Rijn, H. (1993). The CELEX lexical database. Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania.
Baldeweg, T., Klugman, A., Gruzelier, J., & Hirsch, S. R. (2004). Mismatch negativity potentials and cognitive impairment in schizophrenia. Schizophrenia Research, 69, 203–217.
Bendixen, A., Prinz, W., Horváth, J., Trujillo-Barreto, N. J., & Schröger, E. (2008). Rapid extraction of auditory feature contingencies. NeuroImage, 41, 1111–1119.
Bendixen, A., Roeber, U., & Schröger, E. (2007). Regularity extraction and application in dynamic auditory stimulus sequences. Journal of Cognitive Neuroscience, 19, 1664–1677.
Bendixen, A., & Schröger, E. (2008). Memory trace formation for abstract auditory features and its consequences in different attentional contexts. Biological Psychology, 78, 231–241.
Bosch-Bayard, J., Valdés-Sosa, P., Virues-Alba, T., Aubert-Vázquez, E., John, E. R., Harmony, T., et al. (2001). 3D statistical parametric mapping of EEG source spectra by means of variable resolution electromagnetic tomography (VARETA). Clinical Electroencephalography, 32, 47–61.
Chatrian, G. E., Lettich, E., & Nelson, P. L. (1985). Ten percent electrode system for topographic studies of spontaneous and evoked EEG activities. American Journal of EEG Technology, 25, 83–92.
Chennu, S., Noreika, V., Gueorguiev, D., Blenkmann, A., Kochen, S., Ibáñez, A., et al. (2013). Expectation and attention in hierarchical auditory prediction. Journal of Neuroscience, 33, 11194–11205.
Costa-Faidella, J., Grimm, S., Slabu, L., Díaz-Santaella, F., & Escera, C. (2011). Multiple time scales of adaptation in the auditory system as revealed by human evoked potentials. Psychophysiology, 48, 774–783.
Debener, S., Thorne, J., Schneider, T. R., & Viola, F. (2010). Using ICA for the analysis of multi-channel EEG data. In M. Ullsperger & S. Debener (Eds.), Simultaneous EEG and fMRI: Recording, analysis, and application (pp. 121–134). New York: Oxford University Press.
Delorme, A., & Makeig, S. (2004). EEGLAB: An open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods, 134, 9–21.
Delorme, A., Sejnowski, T., & Makeig, S. (2007). Enhanced detection of artifacts in EEG data using higher-order statistics and independent component analysis. Neuroimage, 34, 1443–1449.
Evans, A. C., Collins, D. L., Mills, S. R., Brown, E. D., Kelly, R. L., & Peters, T. M. (1993). 3D statistical neuroanatomical models from 305 MRI volumes. 1993 IEEE Conference Record on Nuclear Science Symposium and Medical Imaging Conference, 3, 1813–1817.
Federmeier, K. (2007). Thinking ahead: The role and roots of prediction in language comprehension. Psychophysiology, 44, 491–505.
Friederici, A. D., Mueller, J. L., & Oberecker, R. (2011). Precursors to natural grammar learning: Preliminary evidence from 4-month-old infants. PLoS ONE, 6, e17920.
Gagnepain, P., Henson, R. N., & Davis, M. H. (2012). Temporal predictive codes for spoken words in auditory cortex. Current Biology, 22, 615–621.
Haenschel, C., Vernon, D. J., Dwivedi, P., Gruzelier, J. H., & Baldeweg, T. (2005). Event-related brain potential correlates of human auditory sensory memory trace formation. Journal of Neuroscience, 25, 10494–10501.
Hauthal, N., Thorne, J., Debener, S., & Sandmann, P. (2014). Source localisation of visual evoked potentials in congenitally deaf individuals. Brain Topography, 27, 412–424.
Horváth, J., Czigler, I., Jacobsen, T., Maeß, B., Schröger, E., & Winkler, I. (2008). MMN or no MMN: No magnitude of deviance effect on the MMN amplitude. Psychophysiology, 45, 60–69.
Jaramillo, M., Alku, P., & Paavilainen, P. (1999). An event-related potential (ERP) study of duration changes in speech and non-speech sounds. NeuroReport, 10, 3301–3305.

Jaramillo, M., Ilvonen, T., Kujala, T., Alku, P., Tervaniemi, M., & Alho, K. (2001). Are different kinds of acoustic features processed differently for speech and nonspeech sounds? Cognitive Brain Research, 12, 459–466.
Jasper, H. H. (1958). The ten-twenty electrode system of the International Federation. Electroencephalography and Clinical Neurophysiology, 10, 371–375.
Korpilahti, P., Krause, C. M., Holopainen, I., & Lang, A. H. (2001). Early and late mismatch negativity elicited by words and speech-like stimuli in children. Brain and Language, 76, 332–339.
Kotz, S. A., & Schwartze, M. (2010). Cortical speech processing unplugged: A timely subcortico-cortical framework. Trends in Cognitive Sciences, 14, 392–399.
Kujala, T., Tervaniemi, M., & Schröger, E. (2007). The mismatch negativity in cognitive and clinical neuroscience: Theoretical and methodological considerations. Biological Psychology, 74, 1–19.
Kuuluvainen, S., Nevalainen, P., Sorokin, A., Mittag, M., Partanen, E., Putkinen, V., et al. (2014). The neural basis of sublexical speech and corresponding nonspeech processing: A combined EEG–MEG study. Brain and Language, 130, 19–32.
Maiste, A. C., Wiens, A. S., Hunt, M. J., Scherg, M., & Picton, T. W. (1995). Event-related potentials and the categorical perception of speech sounds. Ear and Hearing, 16, 68–90.
Michel, C. M., Murray, M. M., Lantz, G., Gonzalez, S., Spinelli, L., & Grave de Peralta, R. (2004). EEG source imaging. Clinical Neurophysiology, 115, 2195–2222.
Mill, R. W., Bőhm, T. M., Bendixen, A., Winkler, I., & Denham, S. L. (2013). Modelling the emergence and dynamics of perceptual organisation in auditory streaming. PLoS Computational Biology, 9, e1002925.
Mueller, J. L., Friederici, A. D., & Männel, C. (2012). Auditory perception at the root of language learning. Proceedings of the National Academy of Sciences of the United States of America, 109, 15953–15958.
Mueller, J. L., Oberecker, R., & Friederici, A. D. (2009). Syntactic learning by mere exposure: An ERP study in adult learners. BMC Neuroscience, 10, 89.
Näätänen, R., Astikainen, P., Ruusuvirta, T., & Huotilainen, M. (2010). Automatic auditory intelligence: An expression of the sensory–cognitive core of cognitive processes. Brain Research Reviews, 64, 123–136.
Näätänen, R., Paavilainen, P., Rinne, T., & Alho, K. (2007). The mismatch negativity (MMN) in basic research of central auditory processing: A review. Clinical Neurophysiology, 118, 2544–2590.
Näätänen, R., Tervaniemi, M., Sussman, E., Paavilainen, P., & Winkler, I. (2001). 'Primitive intelligence' in the auditory cortex. Trends in Neurosciences, 24, 283–288.
Paavilainen, P., Arajärvi, P., & Takegata, R. (2007). Preattentive detection of nonsalient contingencies between auditory features. NeuroReport, 18, 159–163.
Paavilainen, P., Simola, J., Jaramillo, M., Näätänen, R., & Winkler, I. (2001). Preattentive extraction of abstract feature conjunctions from auditory stimulation as reflected by the mismatch negativity (MMN). Psychophysiology, 38, 359–365.
Pulvermüller, F., & Shtyrov, Y. (2006). Language outside the focus of attention: The mismatch negativity as a tool for studying higher cognitive processes. Progress in Neurobiology, 79, 49–71.
Puschmann, S., Sandmann, P., Ahrens, J., Thorne, J., Weerda, R., Klump, G., et al. (2013). Electrophysiological correlates of auditory change detection and change deafness in complex auditory scenes. Neuroimage, 75, 155–164.
Schröger, E. (2005). The mismatch negativity as a tool to study auditory processing. Acta Acustica United with Acustica, 91, 490–501.
Schröger, E. (2007). Mismatch negativity: A microphone into auditory memory. Journal of Psychophysiology, 21, 138–146.
Steinberg, J., Truckenbrodt, H., & Jacobsen, T. (2010). Preattentive phonotactic processing as indexed by the mismatch negativity. Journal of Cognitive Neuroscience, 22, 2174–2185.
Steinberg, J., Truckenbrodt, H., & Jacobsen, T. (2011). Phonotactic constraint violations in German grammar are detected automatically in auditory speech processing: A human event-related potentials study. Psychophysiology, 48, 1208–1216.
Tampas, J. W., Harkrider, A. W., & Hedrick, M. S. (2005). Neurophysiological indices of speech and nonspeech stimulus processing. Journal of Speech, Language, and Hearing Research, 48, 1147–1164.
Tiitinen, H., May, P., Reinikainen, K., & Näätänen, R. (1994). Attentive novelty detection in humans is governed by pre-attentive sensory memory. Nature, 372, 90–92.
Trujillo-Barreto, N. J., Aubert-Vázquez, E., & Valdés-Sosa, P. A. (2004). Bayesian model averaging in EEG/MEG imaging. Neuroimage, 21, 1300–1319.
Ulanovsky, N., Las, L., Farkas, D., & Nelken, I. (2004). Multiple time scales of adaptation in auditory cortex neurons. Journal of Neuroscience, 24, 10440–10453.
Valdés-Sosa, P., Marti, F., García, F., & Casanova, R. (2000). Variable resolution electric-magnetic tomography. Proceedings of the Tenth International Conference on Biomagnetism, 2, 373–376.
van Zuijen, T. L., Simoens, V. L., Paavilainen, P., Näätänen, R., & Tervaniemi, M. (2006). Implicit, intuitive, and explicit knowledge of abstract regularities in a sound sequence: An event-related brain potential study. Journal of Cognitive Neuroscience, 18, 1292–1303.
Winkler, I. (2007). Interpreting the mismatch negativity. Journal of Psychophysiology, 21, 147–163.
Winkler, I., Czigler, I., Jaramillo, M., Paavilainen, P., & Näätänen, R. (1998). Temporal constraints of auditory event synthesis: Evidence from ERPs. NeuroReport, 9, 495–499.
Winkler, I., Lehtokoski, A., Alku, P., Vainio, M., Czigler, I., Csépe, V., et al. (1999). Preattentive detection of vowel contrasts utilizes both phonetic and auditory memory representations. Cognitive Brain Research, 7, 357–369.

Worsley, K. J., Marrett, S., Neelin, P., & Evans, A. C. (1996). Searching scale space for activation in PET images. Human Brain Mapping, 4, 74–90.
Xi, J., Zhang, L., Shu, H., Zhang, Y., & Li, P. (2010). Categorical perception of lexical tones in Chinese revealed by mismatch negativity. Neuroscience, 170, 223–231.


Yabe, H., Tervaniemi, M., Sinkkonen, J., Huotilainen, M., Ilmoniemi, R. J., & Näätänen, R. (1998). Temporal window of integration of auditory information in the human brain. Psychophysiology, 35, 615–619.
