
PROCESSING AND REPRESENTATION OF TONES IN SWEDISH

4.2.2 Processing of Stress and Tone

Experiments comparing the relative importance of suprasegmental and segmental cues can be divided into those dealing with prosodic word stress and those dealing with lexical tone. Experiments on lexical tone have exclusively examined Asian tonal languages. In these languages, tone plays a crucial role in word recognition because the segmental level alone is not sufficient to uniquely identify the correct item. This is not the case for the Scandinavian tonal languages. Linguistically, they undisputedly employ lexical tones, but the psychological importance of these tones is less clear, since these languages have very few minimal tonal pairs and since Swedish, Norwegian and Danish even have dialects with no tone at all. Therefore, we also briefly review the psycholinguistic literature on the impact of word stress on recognition.

4.2.2.1 Processing of Word Stress: Behavioural Experiments

Using cross-modal fragment priming, Soto-Faraco et al. (Soto-Faraco, Sebastian-Gallés & Cutler, 2001) examined how a deviance between prime and target in either stress or segments affected word access in Spanish. Spanish has very few minimal stress pairs (e.g. saBAna – SAbana, ‘savannah – sheet’; uppercase indicates stress placement), but stress frequently has a grammatical function (e.g. CAso – caSO, ‘I marry – he married’). They used pairs of trisyllabic words with identical first and second syllables except for stress placement on the first vs. second syllable (e.g. PRINcipe – prinCIpio, ‘prince – beginning’), the vowel of the second syllable (e.g. abanDOno – abunDANcia, ‘abandon – abundance’), or the consonant of the second syllable (e.g. paPIlla – paTIlla, ‘baby food – sideburn’).

When the target was preceded by its own first two syllables (e.g. prinCI-PRINCIPIO), lexical decision responses were facilitated compared to a control condition (e.g. mos-PRINCIPIO). When the target was preceded by the first two syllables of the pair member that deviated in either stress or a segment (e.g. PRINci-PRINCIPIO), responses were slower than in the control condition. That is, any deviance between prime fragment and target word inhibited responses. The amount of inhibition was similar across stress, consonants and vowels. The authors therefore concluded that segmental and stress information contribute equally to word activation in Spanish.
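The logic of these comparisons can be made concrete with a small sketch. It assumes (this is not taken from the original studies) that a priming effect is simply the mean lexical decision time in a prime condition minus the mean in the control condition, so that negative values indicate facilitation and positive values inhibition. The reaction times and the names `priming_effect` and `classify` are invented for illustration; the published analyses relied on inferential statistics rather than on raw differences.

```python
from statistics import mean


def priming_effect(condition_rts, control_rts):
    """Mean RT in a prime condition minus mean RT in the control condition (ms)."""
    return mean(condition_rts) - mean(control_rts)


def classify(effect_ms):
    """Label the direction of an effect; this ignores statistical reliability."""
    if effect_ms < 0:
        return "facilitation"  # faster than control
    if effect_ms > 0:
        return "inhibition"    # slower than control
    return "no effect"


# Invented lexical decision times (ms), for illustration only.
rts = {
    "control": [650, 670, 660],          # e.g. mos-PRINCIPIO
    "match": [600, 610, 620],            # e.g. prinCI-PRINCIPIO
    "stress_mismatch": [690, 700, 680],  # e.g. PRINci-PRINCIPIO
}

match_effect = priming_effect(rts["match"], rts["control"])
mismatch_effect = priming_effect(rts["stress_mismatch"], rts["control"])
```

With these made-up numbers, the matching prime yields a negative difference (facilitation) and the stress-mismatching prime a positive one (inhibition), which is the qualitative pattern described for Spanish.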

Similar findings were obtained for Dutch. Employing the same design, but restricted to stress differences (e.g. OCtopus – okTOber), van Donselaar et al. (van Donselaar, Koster & Cutler, 2005) obtained inhibition for stress differences in the test condition as compared to the control condition. In contrast, English listeners showed no inhibition in such cases (e.g. ADmiral – admiRAtion); reaction times to targets primed by the first two syllables of the differently stressed pair member were as slow as those in the control condition (Cooper, Cutler & Wales, 2002). When only one syllable, which could be either stressed or unstressed, was used as prime (e.g. MUsic – muSEum), responses in the stress mismatch condition were still slower than in the identical condition, but faster than in the control condition. A repetition priming experiment using Dutch minimal stress pairs (e.g. VOORnaam – voorNAAM, ‘first name – respectable’) yielded similar results (Cutler & van Donselaar, 2001). Lexical decisions to a target were faster when it was preceded by itself than when it was preceded by an unrelated word. When a word was preceded by the other member of its minimal stress pair, lexical decision times were the same as when it was preceded by an unrelated word. The different stress pattern prevented activation of the lexical entry.

In all studies mentioned above, the primes whose stress differed from that of the target still represented beginnings of existing words in the language. This was not the case in another cross-modal fragment priming study by Cutler and van Donselaar (2001). Trisyllabic Dutch target words (e.g. muSEum, ‘museum’) were preceded by disyllabic auditory primes consisting of the targets’ first two syllables (e.g. muSE-), the targets’ first two syllables with a different stress pattern (e.g. *MUse-) or with a different onset consonant (e.g. *luSE-), or by an unrelated prime (e.g. *VIba-). The items with correct and incorrect stress patterns were spoken by a female speaker who read them all as disyllabic nonsense strings (i.e. *muZEE, *MUzee, *luZEE, *VIba). Again, lexical decisions in the condition with the stress difference were slower than in the identical condition, but still faster than in the segmentally different condition, which did not differ from the control condition. That is, a stress mismatch reduced the activation of the target, but did not have as much impact as the segmental mismatch.

To sum up these behavioural experiments, stress appears to serve as a strong cue in lexical access. A mismatch in stress between prime and target has been shown to prevent preactivation (Cutler and van Donselaar, 2001; Cooper, Cutler & Wales, 2002) or even to inhibit the target (Soto-Faraco, Sebastian-Gallés & Cutler, 2001; van Donselaar, Koster & Cutler, 2005). Only if very little stress information is given (e.g. only the first syllable; Cooper, Cutler & Wales, 2002), or if the fragment with incorrect stress is not part of an existing word in the language (Cutler and van Donselaar, 2001), is the target activated despite mismatching stress. This preactivation, however, is weaker than when the prime matches in stress.

4.2.2.2 Processing of Word Stress: Electrophysiological Experiments

There are also some EEG studies on pitch information, specifically stress, at the word level. Friedrich et al. (Friedrich, Alter & Kotz, 2001) presented subjects with disyllabic German words, half of which originally carried stress on the first syllable and half on the second. All stimuli were manipulated so that each was presented once with correct stress and once with incorrect stress. This was achieved by imposing the pitch contours of one stressed and one unstressed item of the set on all stimuli. Duration and loudness were not changed. Strictly speaking, therefore, it was pitch rather than stress that was manipulated and investigated. Participants had to judge either the correctness of the stress pattern or the animacy of the items. Neither task yielded any observable differences in event-related potentials depending on the correctness of the pitch pattern. However, in posterior regions, the P2, a positive deflection between 200 and 280 ms, was more pronounced for initially unstressed than for initially stressed items, irrespective of the correctness of the stress pattern. The authors interpreted the enhanced positivity for the unstressed pitch contour as a kind of mismatch detection, because most German words are stressed on the first syllable.

The effects of stress mismatch in German were further examined in a cross-modal fragment priming study (Friedrich, Kotz, Friederici & Alter, 2004). The prime fragment was either identical to the first syllable of the visual target (e.g. re- reGAL, with re- taken from reGAL, ‘shelves’) or deviated from it in pitch (e.g. RE- reGAL, with RE- taken from REgel, ‘rule’). Segmentally unrelated targets with the same (e.g. re- akTEUR, ‘protagonist’, stressed on the second syllable) or a different stress pattern (e.g. re- WIRbel, ‘curl’, stressed on the first syllable) served as controls. All carrier words were spoken by a female speaker and the first syllable was isolated. F0 contours were then averaged within all stressed and within all unstressed first syllables. The resulting ‘typical’ contours were superimposed on all syllables, i.e. the originally unstressed syllable re- from reGAL was assigned a stressed as well as an unstressed pitch contour. The voiced parts of the signal were further manipulated so that duration and amplitude were kept constant across stressed and unstressed syllables. Thus it was again pitch, rather than stress, that was examined.
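The averaging step can be illustrated with a minimal sketch. The text does not state how syllables of different durations were aligned, so the sketch assumes that each F0 track is first time-normalized to a fixed number of points by linear interpolation and that the tracks are then averaged point by point. The function names and the F0 values are hypothetical, and the actual resynthesis of the syllables with the averaged contour is not shown.

```python
from statistics import mean


def resample(contour, n_points):
    """Linearly interpolate a list of F0 values (Hz) to n_points samples."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    if len(contour) == 1:
        return [float(contour[0])] * n_points
    resampled = []
    for i in range(n_points):
        # Fractional index into the original contour for output sample i.
        pos = i * (len(contour) - 1) / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        resampled.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return resampled


def average_contour(contours, n_points=3):
    """Time-normalize every contour, then average point by point."""
    normalized = [resample(c, n_points) for c in contours]
    return [mean(points) for points in zip(*normalized)]


# Invented F0 tracks (Hz) of different lengths, for illustration only.
stressed_syllables = [[200, 180], [220, 200, 180]]
typical_stressed = average_contour(stressed_syllables)
```

Running the same function over the unstressed syllables would give the second ‘typical’ contour; both could then be imposed on every syllable with a speech manipulation tool.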

It should be mentioned that it is unnatural for duration and amplitude not to vary between stressed and unstressed syllables. In a 300-400 ms time window, the segmental as well as the pitch mismatch elicited less negative ERP amplitudes than the identical condition. Pitch thus behaved in line with the P350 effect found for segments. The effects of segmental violations were significant in the posterior and in the left anterior region. Pitch mismatches affected the amplitude only in posterior regions. Interestingly, there was no interaction of pitch with segmental information. In a 400-600 ms time window, where the N400 (Kutas & Hillyard, 1980) is typically observed, amplitudes for segmental mismatch were more negative than for segmental match in the posterior regions. No such N400 effect was obtained for deviant pitch. The P350 thus proved sensitive to pitch/stress placement differences in a cross-modal fragment priming design. However, scalp topography differed between segments and pitch. While segmental effects were lateralized to the left and visible in posterior as well as anterior regions, the pitch effects were restricted to posterior regions and not lateralized. In addition, the authors inferred from the missing interaction between segments and pitch that pitch information is coded and processed independently of segmental information. The main effect of pitch indicates that unexpected pitch (e.g. RE-) reduces the activation of a target word (e.g. reGAL) despite matching segmental information. What might come as a surprise is that incorrect pitch also has an effect on target words with no segmental overlap, i.e. words that were not thought to be activated in the first place (e.g. RE- akTEUR). At least this conclusion has to be drawn from the missing segment-pitch interaction. It has to be noted, however, that the effect of pitch on P350 amplitudes is smaller than that of segments. Further, the N400 amplitude appeared to be exclusively sensitive to segments, not to pitch.
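Window-based comparisons of this kind are commonly computed as the mean voltage of an averaged waveform within the time window of interest. The following sketch shows that computation under stated assumptions: one averaged waveform per condition, a known sampling rate, and an epoch that starts at prime or target onset. The function name, the sampling rate and all voltage values are invented and do not reproduce the data or the exact procedure of the study.

```python
def mean_amplitude(samples_uv, srate_hz, start_ms, end_ms, epoch_start_ms=0.0):
    """Mean voltage (microvolts) of one waveform within [start_ms, end_ms)."""
    first = round((start_ms - epoch_start_ms) * srate_hz / 1000)
    last = round((end_ms - epoch_start_ms) * srate_hz / 1000)
    if not 0 <= first < last <= len(samples_uv):
        raise ValueError("time window lies outside the epoch")
    window = samples_uv[first:last]
    return sum(window) / len(window)


# Invented averaged waveforms, 600 ms at 100 Hz (one sample every 10 ms).
SRATE_HZ = 100
identical = [-2.0] * 60
pitch_mismatch = [-2.0] * 30 + [-1.0] * 10 + [-2.0] * 20

# Difference in the 300-400 ms window (mismatch minus identical).
p350_difference = (
    mean_amplitude(pitch_mismatch, SRATE_HZ, 300, 400)
    - mean_amplitude(identical, SRATE_HZ, 300, 400)
)
```

A positive difference in this toy example corresponds to the less negative amplitude reported for mismatching primes in the 300-400 ms window; in practice the values would be computed per participant and electrode region and then tested statistically.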

Since only pitch, but not duration and amplitude of the words, was varied between “stressed” and “unstressed” prime fragments, it is difficult to treat the results as word-stress effects. However, if the P350 is sensitive to isolated pitch information, it seems fairly safe to assume that it would also be sensitive to complete stress information.

In a study on the processing of German word stress, Domahs et al. (Domahs, Wiese, Bornkessel-Schlesewsky & Schlesewsky, 2008) had participants judge the correctness of word stress. All words in the experiment were trisyllabic German nouns with three stressable syllables. Correct stress was equally distributed across syllables (e.g. A.na.nas⁷, ‘pineapple’, bi.KI.ni, ‘bikini’, vi.ta.MIN, ‘vitamin’). In each trial, the word appeared in writing on the screen, then a neutral auditory sentence was played with the same word embedded in it (e.g. Er soll nun Ananas sagen, ‘He is supposed to say pineapple now.’), and participants had to judge whether the critical word (in this case Ananas) was stressed correctly.

Each word was presented with all three stress patterns. Words with correct and incorrect stress were recorded and spliced into the carrier sentence. According to analyses of the stress parameter values, correct and incorrect stress were realized with comparable F0, intensity and duration. Results revealed a late centro-parietal positivity occurring approximately between 500 and 1200 ms. Incorrect stress evoked more positive amplitudes than correct stress. However, this was only the case if the foot structure had to be reorganized. For instance, correctly stressed A.na.nas has the foot structure (σσ)(σ)⁸, while incorrectly stressed *a.NA.nas has the foot structure (σ)(σσ) and thus produced a pronounced positivity effect. In comparison, incorrectly stressed *a.na.NAS had the same foot structure as the correct form, namely (σσ)(σ), and did not lead to the positivity effect. This positivity effect was interpreted as a manifestation of the P300 effect, reflecting general task-dependent match-mismatch processing. Furthermore, the timing of the positive peak depended on the timing of the incorrectly stressed syllable in the word. That is, the peak in response to *BI.ki.ni was earlier than the one for *bi.ki.NI. Incorrect presence of stress, rather than unexpected absence of stress, seems to determine detection of a stress violation as reflected in the P300. Initial absence of expected (primary or secondary) stress (i.e. in *a.NA.nas, *a.na.NAS, *bi.ki.NI and *vi.TA.min) caused a negative deflection in the ERP around 400-900 ms. This negativity was interpreted as signalling the detection of an incorrectly de-stressed syllable, which on its own is not sufficient to reject the word as wrongly stressed.
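The dependence of the positivity on foot structure can be written down as a simple rule. The sketch below takes the two footings given for the Ananas example, (σσ)(σ) for initial and final stress and (σ)(σσ) for penultimate stress, and assumes that the same mapping holds for all trisyllabic items; that generalization and the function names are assumptions made for illustration only.

```python
# Footing by position of main stress in a trisyllabic word (1 = first syllable).
# Positions 1 and 3 share a footing in the Ananas example; extending this
# mapping to every trisyllabic item is an assumption of this sketch.
FOOTING = {1: "(σσ)(σ)", 2: "(σ)(σσ)", 3: "(σσ)(σ)"}


def needs_refooting(correct_pos, presented_pos):
    """True if the presented stress pattern implies a different foot structure."""
    return FOOTING[correct_pos] != FOOTING[presented_pos]


def predicts_late_positivity(correct_pos, presented_pos):
    """Positivity is expected only for stress errors that force refooting."""
    is_violation = presented_pos != correct_pos
    return is_violation and needs_refooting(correct_pos, presented_pos)


# Ananas carries correct stress on the first syllable.
ananas_penult_error = predicts_late_positivity(1, 2)  # *a.NA.nas
ananas_final_error = predicts_late_positivity(1, 3)   # *a.na.NAS
```

Under this rule *a.NA.nas is predicted to elicit the positivity and *a.na.NAS is not, matching the pattern reported above.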

In a mismatch negativity (MMN) study on stress in Hungarian, Honbolygó et al. (Honbolygó, Csépe & Rago, 2004) presented a disyllabic word with stress on the first syllable (BAnán, ‘banana’) as the standard and either a word differing in the onset consonant (PAnán) or a word differing in stress (baNÁN) as the deviant. Both deviants elicited a mismatch response at fronto-central sites, suggesting that stress plays a role in word recognition in Hungarian.

7 A full stop denotes syllable boundaries here.

8 σ denotes a syllable and ( ) a foot here.

The EEG literature on the effects of stress is sparser and more controversial than the behavioural literature. Both Friedrich et al. (Friedrich, Alter & Kotz, 2001) and Domahs et al. (Domahs, Wiese, Bornkessel-Schlesewsky & Schlesewsky, 2008) employed the same task: participants were to judge the correctness of word stress. While Friedrich et al. (2001) found no effect whatsoever of incorrect stress assignment, Domahs et al. (2008) obtained an enhanced P300 for incorrect stress paired with incorrect foot structure, and a negativity effect for the absence of initial stress. Although they employed the same task, the studies differ in several respects. First, stimulus preparation and presentation differ markedly. While Friedrich et al. (2001) artificially imposed the same pitch contours on all items and presented them in isolation, Domahs et al. (2008) had the incorrectly stressed items deliberately pronounced as such and embedded them in carrier sentences. Furthermore, participants in the study by Domahs et al. (2008) knew in advance which word they were going to hear and could anticipate a certain stress pattern. It also has to be noted that Friedrich et al. (2001) did not take foot structure into consideration, whereas Domahs et al. (2008) found EEG effects for incorrect stress only in cases where foot structure was additionally violated. Also note that Friedrich et al. (2001) always referenced against the nose, while Domahs et al. (2008) referenced against the mastoids. This makes it difficult to compare the effects found. In a cross-modal priming experiment, Friedrich et al. (Friedrich, Kotz, Friederici & Alter, 2004) found a P350 effect for incorrect stress/pitch information. While segmental mismatches caused less negative amplitudes in anterior as well as posterior regions, pitch mismatches led only to posterior P350 effects. Further, the missing segment-pitch interaction suggests that pitch is processed independently of segmental information. There was no N400 effect for incorrect pitch assignment. The MMN also proved sensitive to stress information (Honbolygó, Csépe & Rago, 2004).

In sum, correctness judgments as well as priming studies show effects of stress in ERPs. There is no consistent ERP effect of stress across these studies, which could easily be due to the different stimulus materials, designs and reference electrodes used.

As for the comparison of behavioural and EEG studies on stress, the complete prevention of activation, or even inhibition, of items known from many behavioural studies (Cutler and van Donselaar, 2001; Cooper, Cutler & Wales, 2002; Soto-Faraco, Sebastian-Gallés & Cutler, 2001; van Donselaar, Koster & Cutler, 2005) has yet to be shown in EEG experiments. The P350 effect in the study of Friedrich et al. (Friedrich, Kotz, Friederici & Alter, 2004) is comparable to the reduced activation for incorrect stress in the experiments of Cooper et al. (Cooper, Cutler & Wales, 2002). In both cases, only the first syllable was given as prime.

The results of Domahs et al. (Domahs, Wiese, Bornkessel-Schlesewsky & Schlesewsky, 2008) suggest enhanced processing loads for incorrectly stressed words. They do not directly allow inferences about the activation status of the words in the mental lexicon. Similarly, the MMN results (Honbolygó, Csépe & Rago, 2004) indicate that stress was monitored and a mismatch was detected, but do not provide insight into the effects of stress mismatch on lexical activation.

4.2.2.3 Processing of Tone: Behavioural Experiments

Turning to lexical tone, several studies have examined the processing of tones in Chinese and Japanese. Similar to the behavioural studies on stress processing mentioned above, Cutler and Otake (1999) conducted an auditory repetition priming study including minimal accent pairs of bimoraic Japanese words (e.g. ame, once with a high-low (HL) contour, once with a low-high (LH) contour). The prime preceded the target either immediately or with up to three intervening items. They obtained faster lexical decisions for words that were identical to their primes in segmental as well as tonal pattern (e.g. ame HL – ame HL) than in a segmentally unrelated control condition (e.g. eki HL – ame HL). Segmentally identical words differing in accent (e.g. ame LH – ame HL) did not show priming, but caused inhibition when analyzed across subjects. This led to the assumption that suprasegmental features are as powerful as segmental features in restricting the search space in the Japanese mental lexicon. The same was shown by Sekiguchi and Nakajima (1999), who employed cross-modal priming with either full words or word fragments as primes. A prosodically incongruent homophone could not preactivate the target word. In a later experiment using full-word cross-modal priming, Sekiguchi (2006) obtained differential effects depending on word familiarity. Stimuli were pairs of prosodically different Japanese homophones, one member of each pair being clearly more familiar than the other. If the target was more familiar than its prosodically different homophone, it was preactivated by the prime with the differing tonal contour. However, if the target was less familiar than the other pair member, it was not preactivated by the prosodically different item. More familiar words thus do not seem to be sensitive to tonal information, while less familiar words do. These results depend on relative familiarity differences between pair members rather than on absolute familiarity ratings.

In a target monitoring task in Mandarin Chinese (Ye and Connine, 1999), participants were to indicate by pressing a button whether they heard a certain target vowel with a certain target tone (e.g. vowel a with tone 2, i.e. high-rising) or some other non-target stimulus that differed in either vowel quality or tone. All vowels were embedded in existing CV syllables. Reaction times showed that correct classification as a non-target took longer if the item differed from the actual target in tone (e.g. vowel a with tone 4, i.e. high-falling) rather than in vowel quality (e.g. vowel i with tone 2). This advantage of segments over tone was explained by the relatively late acoustic availability of tone as compared to vowel information. There was no difference in error rates.

Studies by Cutler and Chen (1997) on Cantonese Chinese showed similar advantages for segmental over tonal information. For a lexical decision task, seven disyllabic pseudowords were generated from one disyllabic word by changing one or several aspects of the second syllable’s onset consonant, vowel or tone. Most errors were made if the pseudoword differed from the real word in tone only. Differences in segments or in a combination of segments and tone caused fewer mistakes. Furthermore, the more distinct the incorrect tone was from the tone of the real word, the fewer mistakes were made. That is, if the difference was only in tone, a pseudoword could be mistaken for a real word, particularly if the tones were very similar. Similarly, in a same-different judgment task, the slowest reaction times and highest error rates were obtained for word pairs that differed only in tone (Cutler and Chen, 1997).

In sum, behavioural studies on tone processing show a different pattern of results depending on whether or not the design employed priming. Priming studies, using repetition as well as cross-modal (fragment) priming, have shown that tone in Asian languages is very salient. An incorrect tone prevents lexical access (cf. Cutler and Otake, 1999; Sekiguchi and Nakajima, 1999; Sekiguchi, 2006). However, priming effects of incorrect tonal information were not compared to effects of subtle segmental differences between prime and target. This comparison was made only in target monitoring, plain lexical decision and same-different judgment tasks (cf. Ye and Connine, 1999; Cutler and Chen, 1997). All of these non-priming methods showed a smaller impact of tone than of segments on word recognition. It is not clear whether this disadvantage of tone is due to the relatively late acoustic availability, as compared to segmental information, or due
