NEUROLINGUISTIC EVIDENCE FOR THE REPRESENTATION AND PROCESSING OF TONAL AND SEGMENTAL INFORMATION IN THE MENTAL LEXICON

Dissertation submitted for the academic degree of Doctor of Natural Sciences (Doktor der Naturwissenschaften) at the Universität Konstanz, Department of Psychology

submitted by Dipl.-Psych. Verena Felder

Date of oral examination: 13.07.2009
First examiner: Prof. Carsten Eulitz
Second examiner: Prof. Aditi Lahiri

Konstanzer Online-Publikations-System (KOPS)


THANK YOU

Aditi Lahiri, for supervising me in a way no one else could. You inspired me, taught me how to do psycholinguistic experiments and introduced me to the world of linguistic thought. You believed in me, sometimes more than I did. I lack the words to express my gratitude for the immense support I received from you, scientifically and personally.

Carsten Eulitz, for your supervision. Thank you for the freedom you allowed me in developing and following my own ideas and interests. Thank you also for the brainstorming and your quick thinking in methodological and statistical discussions.

Claudia Friedrich, Elisabet Jönsson-Steiner and Mathias Scharinger, for the fruitful cooperation in designing and conducting some of the reported experiments. It was a great pleasure to work with you!

Mathias Scharinger and Allison Wetterlin for helpful comments on earlier versions of this thesis.

My colleagues Frank Zimmerer and Mathias Scharinger, who shared F550 with me for three and a half years and had to experience my ups and downs. Thank you for the marvellous time, the many scientific discussions, private exchange and the whistling nose.

My colleagues Nicole Altvater-Mackensen, Alex Bobrov, Sonia Cornell, Ronny Hannemann, Silvia Lipski, Muna Pohl, Winny Schlee and Eva Smolka for helpful discussions and personal support.

Willi Nagl, for his ever-open door and immediate help with statistical questions.

Your salary should be doubled.

Patrick Fissler and Ramona Grutschnig, the most reliable and talented student assistants (Hiwis) ever.


My parents Angelika and Helmut Felder, for their support of any of my decisions.

My partner Gernot Segelbacher, for the right question in the right moment and for introducing the taste of wilderness into my reality.

THANK YOU


TABLE OF CONTENTS

0 ABSTRACT
0.1 ZUSAMMENFASSUNG
CHAPTER 1 INTRODUCTION
  1.1 The Process of Word Recognition
  1.2 Models of Speech Perception and Word Recognition
    1.2.1 The TRACE Model of Speech Perception
    1.2.2 Shortlist
    1.2.3 Distributed Cohort Model
    1.2.4 The Featurally Underspecified Lexicon Model
CHAPTER 2 METHODOLOGICAL ASPECTS
  2.1 The Concept of Priming and Lexical Decision
  2.2 Cross-Modal Fragment Priming
  2.3 Reaction Time Data
  2.4 Electroencephalography
  2.5 Event-Related Potentials in Cross-Modal Fragment Priming
CHAPTER 3 PROCESSING AND REPRESENTATION OF VOWELS IN THE MENTAL LEXICON
  3.1 Introduction
    3.1.1 Experimental Findings on the Role of Segments in Lexical Activation
    3.1.2 Graded Activation in Lexical Access
    3.1.3 Inhibition in Lexical Access?
    3.1.4 Comparison of the Findings of Soto-Faraco and Colleagues (2001) and Friedrich (2005)
  3.2 Experiment 1: Behavioural Study on Lexical Activation in Cross-Modal Fragment Priming
    3.2.1 Methods
      3.2.1.1 Stimulus Material
      3.2.1.2 Stimulus Production and Presentation
      3.2.1.3 Experimental Design
      3.2.1.4 Participants
    3.2.2 Results
      3.2.2.1 Response Time
      3.2.2.2 Response Accuracy
    3.2.3 Discussion of Experiment 1
  3.3 Experiment 2: EEG Study on Lexical Activation in Cross-Modal Fragment Priming
    3.3.1 Methods
      3.3.1.1 Stimulus Material
      3.3.1.2 Experimental Design
      3.3.1.3 Data Acquisition and Analysis
      3.3.1.4 Participants
    3.3.2 Results
      3.3.2.1 Behavioural Results
      3.3.2.2 Event Related Potentials
    3.3.3 Discussion of Experiment 2
    3.3.4 General Discussion
  3.4 Going Beyond the Surface – What Is Represented?
    3.4.1 Empirical Evidence for the FUL Model
      3.4.1.1 Underspecified Representation of Vowels in the Mental Lexicon
      3.4.1.2 Underspecified Representation of Consonants in the Mental Lexicon
    3.4.2 Alternative Explanations
  3.5 Experiment 3: EEG Study on Underspecified Lexical Representation of Vowels
    3.5.1 Methods
      3.5.1.1 Stimulus Material
      3.5.1.2 Stimulus Production and Presentation
      3.5.1.3 Experimental Design
      3.5.1.4 Data Acquisition and Analysis
      3.5.1.5 Participants
    3.5.2 Results
      3.5.2.1 Behavioural Results
      3.5.2.2 Event Related Potentials
    3.5.3 Discussion of Experiment 3
      3.5.3.1 N400 versus P350 Effect
      3.5.3.2 Direction of the Asymmetry in the Global Data Analysis and Selective Impact of Height Information
      3.5.3.3 Word- versus Pseudoword-Fragments as Primes
  3.6 Chapter Summary
CHAPTER 4 PROCESSING AND REPRESENTATION OF TONES IN SWEDISH
  4.1 Introduction to Tone in Scandinavian Languages
    4.1.1 Tonal Word Accents in Stockholm Swedish
    4.1.2 Linguistic Approaches to Swedish Tones
      4.1.2.1 Privative Approach to Swedish Tones
      4.1.2.2 Equipollent Approach to Swedish Tones
    4.1.3 Analyses of Swedish Accent Assignment
      4.1.3.1 Complexity of Accent Assignment
      4.1.3.2 Analysis by Riad (1998, 2003): Accent 2 is Lexically Marked
      4.1.3.3 Analysis by Bruce and Gussenhoven (1977, 1999, 2005): Both Accents are Lexically Marked
      4.1.3.4 Analysis by Lahiri et al. (2005): Accent 1 is Lexically Specified
    4.1.4 Danish Stød
  4.2 Psycholinguistic Experiments on Suprasegmental Information
    4.2.1 Localization of Tone Processing
    4.2.2 Processing of Stress and Tone
      4.2.2.1 Processing of Word Stress: Behavioural Experiments
      4.2.2.2 Processing of Word Stress: Electrophysiological Experiments
      4.2.2.3 Processing of Tone: Behavioural Experiments
      4.2.2.4 Processing of Tone: Electrophysiological Experiments
      4.2.2.5 Summary of Experimental Findings on Stress and Tone
  Experiments on Tone-Processing in Stockholm Swedish
  4.3 Experiment 1: Forced Choice Experiment
    4.3.1 Introduction
    4.3.2 Methods
      4.3.2.1 Stimulus Material
      4.3.2.2 Stimulus Production
      4.3.2.3 Experimental Design
      4.3.2.4 Participants
    4.3.3 Results
      4.3.3.1 Response Accuracy
      4.3.3.2 Response Time
    4.3.4 Discussion of Experiment 1
  4.4 Experiment 2: Validating the Accent Specification Hypothesis: Segmental Coarticulation or Tonal Influence?
    4.4.1 Methods
      4.4.1.1 Stimulus Production
      4.4.1.2 Experimental Design
      4.4.1.3 Participants
    4.4.2 Results
    4.4.3 Discussion of Experiment 2
    4.4.4 Discussion of Experiments 1 and 2
  4.5 Experiment 3: EEG Experiment
    4.5.1 Introduction
    4.5.2 Methods
      4.5.2.1 Stimulus Material
      4.5.2.2 Experimental Design
      4.5.2.3 Data Acquisition and Analysis
      4.5.2.4 Behavioural Post-test
      4.5.2.5 Participants
    4.5.3 Results
    4.5.4 Discussion of Experiment 3
      4.5.4.1 Effects of Segmental Information
      4.5.4.2 Relative Impact of Tonal as Compared to Segmental Information
        4.5.4.2.1 Impact of Tone in Case of a Segmental Mismatch
        4.5.4.2.2 Impact of Tone in Case of a Segmental Match – ACC2 Target Words
        4.5.4.2.3 Impact of Tone in Case of a Segmental Match – ACC1 Target Words
        4.5.4.2.4 Implications for Tonal versus Segmental Processing
      4.5.4.3 Relevance of Specification in the Mental Lexicon for Tone Processing
        4.5.4.3.1 Impact of Specification on Target Processing
        4.5.4.3.2 Interaction of Specification Set with Prime Information
  4.6 Chapter Summary
CHAPTER 5 GENERAL DISCUSSION
  5.1 Questions, Their Answers, and New Questions
  5.2 Lexically Underspecified Representations
  5.3 Segmental versus Suprasegmental Information Processing
  5.4 Lexical Access in the Light of Behavioural versus EEG Data
  5.5 The P350 Effect
    5.5.1 A Short History of the P350 Effect
    5.5.2 P350 Effect and Semantic Priming
      5.5.2.1 Behavioural Semantic Priming Experiments on Segment Processing
      5.5.2.2 P350-like Effects in Semantic Priming Studies
      5.5.2.3 Semantic Cross-Modal Fragment Priming Study
6 REFERENCES
7 APPENDICES
  Appendix A: Testwords of Experiments 1 and 2 of Chapter 3
  Appendix B: Analyses for all ROIs in both Time Windows of Experiment 2
  Appendix C: Testwords of Experiment 3 of Chapter 3
  Appendix D: Testwords of Experiments 1, 2 & 3 of Chapter 4
  Appendix E: Extract Lengths Used in Experiment 2, Chapter 4
  Appendix F: Prestudy for Experiment 2, Chapter 4
  Appendix G: Table of Figures
  Appendix H: Table of Tables


0 ABSTRACT

In this dissertation we examined several aspects of language processing and representation in the mental lexicon. Chapter 1 gives an introduction to psycholinguistic models of speech processing and representation. Chapter 2 introduces the methods used in the later experiments.

Chapter 3 deals with the processing and representation of segmental information. First, we investigated the mechanisms at work in lexical selection, particularly the question of whether there is active inhibition between competing lexical candidates. At the same time, we examined whether behavioural and electrophysiological data reflect similar processes and can be compared directly. In a behavioural and in an ERP cross-modal fragment priming experiment, disyllabic prime fragments preceded trisyllabic target words. In the identity condition the fragments were identical to the beginning of the target (e.g. ano– ANORAK), in the related condition they deviated from the target in the vowel of the second syllable (e.g. ana– ANORAK), and in the control condition they were completely unrelated to the target (e.g. paste– ANORAK). Whereas the behavioural results showed blocking of activation in the related condition (i.e. no difference between the related and the control condition), target-locked ERP amplitudes differentiated between the three conditions in terms of graded activation in the P350 and N400 components. Response-locked ERPs again revealed blocking of the related condition, similarly to the behavioural data. There was no evidence of lexical inhibition. We concluded that ERP data reflect earlier stages of speech processing than behavioural data.

In a third cross-modal fragment priming experiment we tested the FUL model's (Lahiri & Reetz, 2002) assumption of featurally underspecified lexical representations, focusing on vowels and the assumed underspecification of the feature [CORONAL]. We used the same conditions as described above, this time with monosyllabic prime fragments. Half of the target words had a coronal vowel in their first syllable, half a dorsal one. Our contention was that a target with a coronal vowel (e.g. TENNIS) could be preactivated by both a prime fragment with a coronal (e.g. ten-) and one with a dorsal (e.g. tan-) vowel, whereas a target word with a dorsal vowel (e.g. TANNE) would only be activated by a dorsal vowel (e.g. tan-). Unexpectedly, the ERP results pointed in the opposite direction. These results not only contradict the assumptions of the FUL model; they are also not anticipated by any other theory of lexical representation and processing. We give suggestions for further research in this direction.


In Chapter 4, we go beyond the segmental level to the suprasegmental level of word accent representation in Swedish. First, we conducted a forced choice experiment in which participants heard the first syllable of a disyllabic Swedish noun (e.g. ham-) and then had to decide which of two visually presented words (e.g. HAMBO HAMPA) this syllable was taken from. The two words were identical in the segmental information of the first syllable and differed only in tone (Accent 1 versus Accent 2). We tested the theory of Lahiri, Wetterlin and Jönsson-Steiner (2005) that the Accent 1 contour can be lexically represented while Accent 2 is assigned by default. We hypothesized that words specified for ACC1 should be identified faster as targets of their preceding ACC1 fragments than ACC1 and ACC2 words that are not specified for accent when preceded by their respective syllables. This is because the lexical specification leads to a match between the tonal information in the signal and the tonal information in the lexicon, whereas in the case of an unspecified representation no decision on tone can be made on the basis of the lexical representation alone. Reaction times confirmed this hypothesis.

Second, we conducted an EEG experiment with cross-modal fragment priming. A visual target word was preceded by an auditory syllable that was identical to its own first syllable in segments and tone, in segments only, in tone only, or in neither segments nor tone. We were interested in effects of lexical specification on word accent processing and in the relative importance of segmental versus suprasegmental information. For ACC2 target words, the P350 as well as the N400 effect distinguished between a full match in segments and accent between prime and target, a match in segments but a difference in accent, and a difference in segments irrespective of accent. For ACC1 target words, we did not obtain a typical P350 effect, but an N400 effect that differentiated between cases of a match in segments and accent and cases of a difference in segments or accent. This shows that in Swedish segments carry more weight in lexical access than accent, but accent still plays a role. Concerning effects of lexical specification, targets that were specified for ACC1 showed less pronounced amplitudes than unspecified ACC1 and ACC2 targets in certain ROIs. This resembles the pattern of the response times in the forced choice experiment and points to a special status of specified Accent 1 words.

Chapter 5 provides a general discussion of all data and gives proposals for future experiments.


0.1 ZUSAMMENFASSUNG

In the present dissertation we examined aspects of language processing and representation in the mental lexicon. Chapter 1 provides an introduction to psycholinguistic models of speech processing. Chapter 2 briefly introduces the methods applied later.

Chapter 3 deals with the processing and representation of segmental information. In a first step we investigated the mechanisms of lexical selection, specifically the question of whether activated lexical word candidates actively suppress one another. At the same time, we examined whether behavioural and electroencephalographic data reflect similar processes and can be compared directly. In a behavioural and an electroencephalographic cross-modal fragment priming experiment, we presented disyllabic auditory prime fragments followed by a trisyllabic target word. In the identity condition the fragment corresponded to the beginning of the target word (e.g. ano– ANORAK), in the related condition the second vowel of the fragment did not match the target word (e.g. ana– ANORAK), and in the control condition there was no similarity between fragment and target word (e.g. paste– ANORAK). The behavioural data showed no preactivation in the related condition. Target-locked event-related potentials in the EEG differentiated between the three conditions in the form of graded amplitudes of the P350 and N400 components. Response-locked potentials, in contrast, agreed with the behavioural data and showed no activation in the related condition. We found no evidence of active suppression and concluded that ERPs reflect earlier stages of speech processing than behavioural data.

In a third cross-modal fragment priming experiment we tested the assumptions of the FUL model (Lahiri & Reetz, 2002) concerning the underspecified representation of phonological features in the mental lexicon, specifically the underspecification of the feature [CORONAL] in vowels. The experiment had the same structure as the one above, but the prime fragments were monosyllabic and half of the target words had a coronal vowel in their first syllable, the other half a dorsal one. We assumed that a target word with a coronal vowel (e.g. TENNIS) can be preactivated both by a fragment with a coronal vowel (e.g. ten-) and by one with a dorsal vowel (e.g. tan-), whereas a target word with a dorsal vowel (e.g. TANNE) is activated only by a fragment with a dorsal vowel (e.g. tan-). Unexpectedly, the ERP data pointed in the opposite direction. This result not only contradicts the predictions of the FUL model; it is also not predicted by any other theory of speech processing. We offer suggestions for further research in this direction.

In Chapter 4 we go beyond the level of segmental processing and investigate suprasegmental processing using word accents in Swedish. First we report a forced choice experiment in which participants heard the first syllable of a Swedish word (e.g. ham-) and then had to decide to which of two visually presented words (e.g. HAMBO HAMPA) this syllable belonged. The two visual words were identical in the segmental information of the first syllable and differed only in word accent (Accent 1 versus Accent 2). We tested the theory of Lahiri, Wetterlin and Jönsson-Steiner (2005) that only the contour of Accent 1 can be represented in the lexicon, and assumed that words specified for Accent 1 can be assigned to their corresponding Accent 1 fragments faster than Accent 1 and Accent 2 words that are not specified for accent can be assigned to their corresponding fragments. The reason is that lexical specification can confirm the accent information in the signal, so that such a word is recognized faster than words with unspecified accent, which cannot contribute to the decision. The measured reaction times confirmed these assumptions.

In addition, we conducted a further EEG experiment with cross-modal fragment priming. A visual target word followed either an auditory syllable that was identical to the first syllable of the target word, a syllable with the same segmental structure but a different word accent, a syllable with different segments but the same accent, or a syllable with different segments and a different accent. We were interested in the effects of lexical specification on the processing of word accents, and in the weight given to word accents as compared to segments. For target words with Accent 2, the P350 and N400 effects distinguished between a match between prime and target in segments and accent, a match in segments only, and no match in segments. For target words with Accent 1 there was no P350 component, but an N400 that differentiated between cases of a match in segments and accent and cases with a difference in one of the two factors.

From this we conclude that in Swedish, segments have more influence on speech processing than word accents, but that the latter nevertheless play a role. Concerning the effects of lexical specification, target words with specified Accent 1 showed less pronounced ERP amplitudes than unspecified target words. This resembles the pattern of reaction times in the forced choice experiment and points to a special status of specified Accent 1 words.

Chapter 5 offers a general discussion of all data and points to necessary future experiments.


CHAPTER 1 INTRODUCTION

Despite liking ice cream a lot, one day I decided to stop eating it. A few days later I was innocently sitting in the kitchen when Freud decided to visit me. He came in, opened the fridge and asked: 'Willst du auch ein Ei zum Frühstück?' ([vɪlst duː aux aɪn aɪ tsʊm fryːʃtʏk]; 'Do you also want an egg for breakfast?'). I got very mad at him because my deprived brain had heard: 'Willst du auch ein Eis mit Früchten?' ([vɪlst duː aux aɪn aɪs mɪt fryçtn]; 'Do you also want some ice cream with fruit?'). What had gone wrong? His speech was as clear as always (which is, as we will see below, as clear as mud) and my ears worked well. It was the thing between my ears that parsed egg into 'ice cream' and breakfast into 'fruit'. Although such misperceptions occur rather rarely, they tell us that there is more to language than a simple one-to-one mapping of the speech signal onto some kind of mental representation of words and meaning.

Today we have a rather good understanding of the functioning of the motor and sensory organs involved in speech production and perception. The challenge that remains is to figure out what goes on right before, during and after speech is uttered and perceived. It is the functioning of the brain that challenges psychologists; and where language meets the brain, linguists meet psychologists, and they end up in the thrilling field of psycholinguistics.

During the past decades psycholinguists have come up with a vast number of theories concerning both speech production and perception. In the following I will confine myself to speech perception. While listening to others, words seem to reach us as clear-cut entities. However, the stream of acoustic energy that hits our ear cannot be neatly divided into clearly separable pieces, each of them representing one word: the physical speech signal is quasi-continuous. Imagine you hear the phrase 'the catalogue in a library', with no pause between the words. Apart from thinking about a catalogue in a library, your brain will have to activate representations of a cat, a log, a lie, an eye, etc., because all these words are embedded in the speech stream as well (Norris & McQueen, 2008), and yet the brain ends up with the correct percept seemingly on the fly.

Furthermore, the speech signal of the same phonetic segment varies within and across different words. Among other things, it is modulated by phonological context, syllabification and stress pattern. In addition, speech is coarticulated: while producing one sound, the vocal tract already prepares for the next segment. That is, a segment is not invariant; its form is determined by neighbouring segments, not only within a given word, but also across word boundaries. As a consequence, a word cannot be regarded as an ever-constant entity, but surfaces in many different forms.

Other factors that influence the speech signal are the position of the words in the prosodic structure of a sentence, the speaker’s sex, age and social status, dialectal variation, speaking rate and style, background noise, etc. For more details see McQueen (2004).

1.1 The Process of Word Recognition

The miracle of how the brain perceives language is far from being solved, but in any event there needs to be a process that converts the acoustic speech stream into more abstract entities that are compared to a mental representation of words and their phonological form. The listener extracts segmental as well as suprasegmental (e.g. stress and tone) information from the speech stream. As already mentioned, segmental information is highly variable in that there are countless possible realizations of one single segment that have to be mapped onto some kind of discrete mental representation for successful recognition. Theories of language processing differ widely in the exact kind of information that is extracted from the signal and in the concreteness or abstractness of the mental representation of segmental information. They range from minimalistically abstract representations of phonological features (e.g. Lahiri & Reetz, 2002), to more fully specified strings of phonemes (McClelland & Elman, 1986; Norris, 1994), to very detailed, episodic representations including speaker information (Goldinger, 1998). The locus of these representations is called the mental lexicon, which, in psycholinguistic models, is thought to store "those aspects of the representation of lexical form that participate directly in the process of recognising spoken words, allowing the listener to identify the sequence of lexical items being produced by a given speaker" (Lahiri & Marslen-Wilson, 1991; p. 246).¹ The information extracted from the incoming speech signal accesses the mental lexicon and activates the lexical entries of all word-representations that are compatible with this input. For instance, the segmental information of the string cat activates the lexical entries of the items cat, catalogue, category, catastrophe etc., i.e. of any word beginning with cat-. As the string proceeds and turns into cata-, items like catalogue and catastrophe are further activated because they are in line with the input information. Cat also stays active for the moment, because the following -a- may well be the beginning of the next word rather than a continuation of the current item. Words like category, in contrast, lose their lexical activation because they no longer match the input stream. The latter case is called a mismatch in lexical access. From this it follows that multiple candidates are activated simultaneously in spoken word recognition. The better the fit between a word's lexical representation and the input signal, the stronger this item is activated. All activated items ultimately compete for lexical selection, the decision that this was the item in question. This process is easy if the signal matches only one or very few lexical items, and it gets increasingly complex as the number and frequency of similar-sounding words increases (Vitevitch & Luce, 1998). Early on in the word recognition process, semantic, contextual and syntactic information is also considered and adds to segmental and suprasegmental information (for a comprehensive review of aspects of language processing see McQueen, 2007).

¹ It may well be an oversimplification to think of the mental lexicon as a clear-cut entity in the human brain. Brain structures around the Sylvian fissure of the left hemisphere, including early auditory and somatomotor cortices, are involved in the representation of phonemic word forms. When it comes to the conceptual representation of word meaning, we need to think within a larger network, or rather multiple networks, depending on the concept and category that is represented (Damasio, Grabowski, Tranel, Hichwa & Damasio, 1996; Caramazza, 1996).

Psycholinguistic models differ in several aspects of this depicted process of word recognition, including the number and kind of levels of representation, the kind of information that is extracted from the signal and represented in the mental lexicon, the mechanisms of competition between activated items, the consequences of a mismatch between signal and representation, the relative importance of semantic and syntactic information during lexical activation and selection, bottom-up versus top-down connections between levels of processing, and many more. In the following sections we will briefly discuss three current psycholinguistic models of language processing and their assumptions, so as to outline the different approaches to speech processing with their strengths and weaknesses. Then we focus on a fourth, the Featurally Underspecified Lexicon model (Lahiri & Reetz, 2002), which is presented in more detail and will be further discussed and tested in Chapters 3 and 4. For comprehensive reviews of models of speech perception see Klatt (1989) and Gaskell (2007).
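To make the incremental narrowing of the cohort concrete, here is a minimal sketch in Python; the tiny word list, the plain prefix test and all names are our own illustrative simplifications, not part of any of the models discussed below.

```python
# Illustrative sketch of cohort narrowing; a toy lexicon and a simple prefix
# test stand in for the feature-based matching of real models.
LEXICON = ["cat", "catalogue", "category", "catastrophe", "cap", "rat"]

def cohort(heard_so_far: str, lexicon=LEXICON) -> list[str]:
    """All words whose stored form is still compatible with the input."""
    return [word for word in lexicon if word.startswith(heard_so_far)]

for heard in ["c", "ca", "cat", "cata"]:
    print(f"{heard!r:7} -> {cohort(heard)}")
# 'cata' drops 'category' (a mismatch) but keeps 'catalogue' and
# 'catastrophe'; a real model would also keep 'cat' alive, since the -a-
# may be the onset of a following word rather than part of this one.
```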

1.2 Models of Speech Perception and Word Recognition

1.2.1 The TRACE Model of Speech Perception

The TRACE model (McClelland & Elman, 1986) is organized hierarchically into three levels for the processing of features, phonemes and words, respectively. The feature level is divided into seven dimensions along which phonemes can be described, for instance the dimensions vocalic, consonantal and voiced. Speech input at a given point in time activates feature nodes of certain dimensions on this feature level. These feature nodes pass on their activation to the next level of processing, the phoneme level, where there are separate detectors for all phonemes. These detectors are activated by the bottom-up input if they are consistent with the given feature information, and they also project back onto the feature level via top-down connections. Finally, at the word level, there is a unit for every word. Any word node that incorporates a phoneme that is currently active at the phoneme level receives activation. This procedure is repeated for numerous consecutive time slices during the perception of a word, and the model remembers the trace of the activation patterns. This way evidence for the target word can accumulate so that one word eventually remains as the most activated one. This process of lexical selection is enhanced by competition within each of the three levels. Mutually inconsistent units within one level (for example the dimensions vocalic and consonantal at the feature level) compete with each other via inhibitory connections. There are no inhibitory connections between levels of processing, only within them. All connections are bidirectional, including the excitatory connections between levels, allowing for bottom-up as well as top-down processing.

Despite accounting for many phenomena in human language processing, the model has several drawbacks. It does not take word frequency into account, although the authors note that frequency can have powerful effects and could in principle be incorporated into the model. The model also duplicates all units and representations, with their current activation pattern, at every time slice of word processing. This leads to a massive number of multiple lexical representations and is only feasible with a limited number of lexical items; the TRACE model would not work efficiently with a mental lexicon as large as the human one. Still, TRACE successfully simulates phenomena of human speech perception such as lexical cohort effects, categorical perception, effects of context and formant transitions, effects of phonotactic rules and coarticulation, and many more.
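As an illustration of the interactive-activation principle behind TRACE (bottom-up excitation plus within-level inhibition), consider the following sketch; the update rule and all parameter values are invented for demonstration and do not reproduce TRACE's published dynamics.

```python
import numpy as np

def update(activation, bottom_up, excitation=0.5, inhibition=0.2, decay=0.1):
    """One toy interactive-activation step for competing units at one level."""
    rivals = activation.sum() - activation        # within-level competition only
    new = activation + excitation * bottom_up - inhibition * rivals - decay * activation
    return np.clip(new, 0.0, 1.0)                 # keep activations bounded

words = np.array([0.3, 0.3, 0.1])      # e.g. word units for cat, catalogue, dog
evidence = np.array([0.4, 0.4, 0.0])   # current phoneme supports the first two
for _ in range(5):
    words = update(words, evidence)
print(words)   # supported candidates grow while the unsupported one decays
```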

1.2.2 Shortlist

The Shortlist model (Norris, 1994) consists of two stages of word processing. In the first stage of lexical search, a phoneme is provided as input into the system and all words are activated that have this phoneme in their onset. Of these, a small number of best-matching candidates enter into a candidate set, the shortlist. As the next phoneme is presented, the lexical search procedure is repeated, i.e. all words starting with the current phoneme are activated, and the words in the shortlist are compared to the previous and current phoneme input; they receive more activation if they match the input and lose activation if they mismatch it. The shortlist is continuously updated with new and better matching candidates, and their activation score is adjusted depending on how well a given item in the shortlist matches the phoneme input. This procedure is very similar to the TRACE model in that representations of words are activated via phonemes and their activation is weighted according to their goodness of fit with the input. Also, for each new time slice or phoneme, the procedure is repeated. However, Shortlist avoids the problem of reduplicating the whole mental lexicon several times by selecting only a few items for further analysis, namely those that entered the shortlist. Once a word is included in this candidate set it stays there until it is replaced by a word candidate with a higher score.

In the second stage of word processing, the word candidates in the shortlist are wired into a constraint satisfaction network. In this network, words with overlapping phonological information are connected via inhibitory links and compete with each other. The amount of inhibition between two candidates is proportional to the number of phonemes by which they overlap. Furthermore, the influence of one word on the other candidates depends on its bottom-up activation, determined by its goodness of fit to the input. In contrast to TRACE, the activation of a candidate can also be suppressed below zero. The word with the highest final score is selected.

There are no top-down connections in Shortlist. One point of criticism of the TRACE model is that, for instance at the phoneme level, it cannot be determined whether a phoneme unit is active due to bottom-up input from the feature level or due to top-down activation from the word level. Further, Shortlist avoids massive amounts of duplicated data by processing only a few items in the candidate set, rather than the whole lexicon, during each time step. However, if a word has many neighbours, the correct item might stay out of the shortlist in the beginning and thus not gain in activation. It is proposed that words with higher frequency could be selected into the shortlist first, thereby accounting for frequency effects.

TRACE and Shortlist are both interactive activation networks, meaning that each word or lexical candidate is represented by a single node that receives more activation if it matches the input signal and loses activation if it mismatches the input or is inhibited by a competitor. These models manage to determine the correct target word, but they cannot model human behaviour directly, i.e. their processing time is not proportional to human response time or accuracy measures in experiments. Therefore Norris and McQueen (2008) have further developed Shortlist into Shortlist B, which abandons absolute activation levels and works with the concepts of likelihood and probability instead. The architecture of Shortlist B is equivalent to Shortlist, with a few exceptions. The input into the system is no longer a phoneme, since this is oversimplified in light of the high variability within one phoneme over time and across contexts. Instead, the input into Shortlist B is the probability of being a certain phoneme, computed for three time slices per segment. This probability is compared to the lexical entries, i.e. the system computes the likelihood of each word given the string of phoneme probabilities in the input. Furthermore, the words are ranked by their prior probability of occurrence, independent of the input, which is determined by word frequency. Another major difference is that there is no inhibition among lexical candidates. Instead, the path probabilities of all lexical hypotheses are continuously compared, and the system either selects the item with the highest probability or terminates the search process once the likelihood of one word exceeds a predetermined threshold.

The new model allows the authors to model human response times; it takes effects of word frequency, neighbourhood frequency and neighbourhood density into account; it can detect word boundaries in continuous speech and revise decisions on prior words as new input is perceived; and it can cope with unclear or distorted input and explain asymmetries in speech perception, i.e. the fact that one phoneme may be misperceived as another phoneme more often than vice versa.
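The probabilistic core of Shortlist B can be illustrated with a toy Bayesian ranking: a frequency-based prior per word is combined with the likelihood of the phoneme-probability input given that word. The numbers, the mismatch floor and the independence assumption across slots are ours, purely for illustration.

```python
import math

# Toy Bayesian word ranking in the spirit of Shortlist B (Norris & McQueen,
# 2008); all numbers are invented, and the likelihood is simplified to a
# product over independent phoneme slots.
priors = {"comma": 0.5, "collar": 0.5}                 # from word frequency
slot_probs = [{"k": 0.9}, {"o": 0.8}, {"m": 0.7}]      # input so far: "com..."
forms = {"comma": ["k", "o", "m"], "collar": ["k", "o", "l"]}

def posterior(word):
    likelihood = math.prod(slot.get(ph, 0.05)          # small floor for mismatches
                           for slot, ph in zip(slot_probs, forms[word]))
    return priors[word] * likelihood

scores = {w: posterior(w) for w in forms}
total = sum(scores.values())
print({w: round(s / total, 3) for w, s in scores.items()})
# 'comma' wins on likelihood; no inhibition between candidates is needed.
```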

1.2.3 Distributed Cohort Model

The third model to be discussed here is the Cohort model (Marslen-Wilson & Welsh, 1978; Marslen-Wilson, 1987; Marslen-Wilson, 1993; Gaskell & Marslen-Wilson, 1997; Gaskell & Marslen-Wilson, 1999). Core assumptions of this model have changed over time, and we will focus on its most recent version, the Distributed Cohort Model (DCM) (Gaskell & Marslen-Wilson, 1997; Gaskell & Marslen-Wilson, 1999). TRACE and Shortlist are localist models, in that there is a separate word node for every item that is represented in the lexicon. In contrast, the DCM relies on a distributed representation of words. The model is built on a simple recurrent network (Elman, 1991) that receives phonetic features as input from the speech stream and projects them via hidden units onto parallel output levels of semantic and phonological processing. However, these levels do not consist of separate word nodes; they contain abstract and distributed representations of form and meaning. In other words, all items are represented on the same nodes, and two words that overlap in form or meaning will activate the same nodes, the difference lying in the weighting and exact pattern of node activation. As long as only the beginning of a word has been perceived, many words are compatible with this input information, and their simultaneous activation causes a blend of the relevant distributed representations. All matching items will be activated to some extent, but none can be recognized because the information of all representations is blended. As more and more of the word is perceived, this blend is refined to represent the reduced set of words that still match the input, and once only one possible form remains, the word can be recognized. The term "activation" has a very different meaning in a distributed model than in a localist one. In localist models activation is reflected in the score of a word node, while in the DCM activation refers to the closeness of the actual blend to the relatively stable pattern of the distributed representation of a given item across all nodes. As in Shortlist B, there is no need for inhibitory connections between word nodes, because competition works in terms of interference between multiple distributed representations. Since this model has no intervening prelexical level, it preserves subphonemic detail throughout lexical access. Further, it learns to bias its output during states of ambiguity towards more frequent word candidates.
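A toy vector example may help to see what a "blend" of distributed representations means; the four-dimensional patterns and the cosine measure below are our own stand-ins for the DCM's learned representations.

```python
import numpy as np

# Invented distributed patterns; each word is a stable activation vector.
patterns = {
    "captain": np.array([1.0, 0.0, 1.0, 0.0]),
    "capital": np.array([1.0, 0.0, 0.0, 1.0]),
    "dolphin": np.array([0.0, 1.0, 1.0, 1.0]),
}

def blend(candidates):
    """Simultaneous activation of all compatible words = average pattern."""
    return np.mean([patterns[w] for w in candidates], axis=0)

def activation(state, word):
    """Closeness of the current blend to a word's own pattern (cosine)."""
    v = patterns[word]
    return float(state @ v / (np.linalg.norm(state) * np.linalg.norm(v)))

state = blend(["captain", "capital"])            # input so far: "cap..."
print({w: round(activation(state, w), 2) for w in patterns})
# Both cap- words are equally, but only moderately, close: the blend matches
# neither fully, so neither can be recognized until more input arrives.
```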

The model we will be concerned with in more detail below, the Featurally Underspecified Lexicon (FUL) model (Lahiri & Reetz, 2002), was developed on the basis of the original Cohort model (Marslen-Wilson, 1987; Lahiri & Marslen-Wilson, 1992), which also incorporated simultaneous activation of multiple word candidates (the cohort of activated words) and used phonetic features as input that mapped directly onto the word representations without an intermediate prelexical level. More strongly than previous models, the FUL model aims at explaining how language is represented and processed in the human brain; not necessarily in terms of its structure and computational mechanisms, but regarding the kind and content of its lexical representations and the basic assumptions about lexical access. There also exists an automatic speech recognition system based on the assumptions of the FUL model (Reetz, 1998, 1999; Lahiri, 1999; Lahiri & Reetz, 2002), but we do not provide a detailed description of its implementation, because we are primarily interested in the model's assumptions about language processing and representation in the human brain. Since our studies on segmental (Chapter 3) and suprasegmental (Chapter 4) processing were based on the FUL model, its basic assumptions are described in more detail in the next section.


1.2.4 The Featurally Underspecified Lexicon Model

As the original Cohort model (Marslen-Wilson, 1987; Lahiri & Marslen-Wilson, 1992) was a localist rather than a distributed model, the Featurally Underspecified Lexicon (FUL) model (Lahiri, 1999; Lahiri, 2000; Lahiri & Reetz, 2002; Lahiri & Reetz, in press) likewise states that each morpheme has a single, unique lexical representation. This representation consists of hierarchically structured features that make up the segments of the words. There are no clear-cut boundaries between segments, as these are not present in uttered speech either. During speech processing, the perception system extracts rough acoustic features from the signal and transforms them into phonological features. These are mapped directly onto the featural word representations at the lexical level. There is no conversion into segments and therefore no intermediate representation. The strength of the FUL model is that it represents lexical items in a way that copes with a great deal of variance in the acoustic signal, and it is supported by linguistic theories in that it can explain diachronic and synchronic language phenomena. This is accomplished by not fully specifying all possible features of the phonemes. That is, for each phoneme, the mental lexicon stores sufficient features to clearly identify it and distinguish it from all other phonemes. However, features that are redundant and can be derived by rule, or features that can vary, for instance due to segmental or prosodic context or dialectal and speaker characteristics, are not stored in the mental lexicon: they are underspecified. Exactly which features are represented is determined by universal properties and language-specific requirements. Therefore, the same segment can have different lexical representations in different languages (Lahiri & Marslen-Wilson, 1992; Winkler, Lehtokoski, Alku, Vainio, Czigler, et al., 1999), and the representation of a segment can undergo change as the language itself changes over time (Ghini, 2001).

In the process of word recognition, all features are extracted from the speech signal, regardless of whether they are represented in the mental lexicon or not. These features are then compared to the lexical entries of all morphemes in the mental lexicon, and all items that are compatible with the extracted feature information are activated. Usually, models of language processing distinguish between a match and a mismatch in lexical access. A match means that the feature in the signal is the same as the feature in the lexical representation. A mismatch occurs if a certain feature is extracted, but the lexicon contains a different feature that is incompatible with the extracted one. In case of a match, the lexical candidate receives activation, while in case of a mismatch, the lexical item is not activated or, if it has been activated before, it is removed from the cohort of possible word candidates. The FUL model extends this binary logic into a ternary matching logic and adds the case of a nomismatch. A nomismatch arises either if no feature is extracted from the signal although features are stored in the lexicon, or if a feature is extracted from the signal but not represented in the lexicon, i.e. there is an empty slot. In case of a nomismatch, the item stays in the cohort of possible candidates, but receives less activation than in case of a full match between signal and representation. A scoring formula allows ranking of the candidates according to their goodness of fit:

Score = (number of matching features)² / [(number of features from the signal) × (number of features in the lexicon)]

For example, the place of articulation of a phoneme can be labial, coronal or dorsal. As phonemes are perceived, the respective features [LABIAL], [CORONAL] and [DORSAL] are extracted from the acoustic signal. However, only the features [LABIAL] and [DORSAL] are assumed to be represented in the mental lexicon, while the feature [CORONAL] is underspecified² and the slot for place of articulation thus stays empty in the mental representation. For example, when a labial phoneme (e.g. /b/) is perceived, the feature [LABIAL] is extracted from the signal and mapped onto the mental lexicon. It will match the lexical representation of a /b/ with a specified labial place of articulation. It will mismatch the representation of a /g/, which is specified for dorsal place of articulation. When a labial /b/ is mapped onto the representation of a coronal /d/, this leads to a nomismatch, because the feature [CORONAL] is not stored in the mental representation and consequently the feature [LABIAL] from the signal is mapped onto an empty place-of-articulation slot in the mental lexicon. A labial /b/ therefore activates both the lexical representation of a /b/ and that of a /d/, the latter to a lesser extent than the former due to fewer matching features. In the opposite case, when we perceive a /d/, the feature [CORONAL] is extracted from the signal and mapped onto the lexicon. It will mismatch both the representation of a /b/ and that of a /g/, because the feature [CORONAL] is incompatible with the respective features [LABIAL] and [DORSAL] in the mental lexicon. The feature [CORONAL] is not represented in the mental lexicon, and hence there is a nomismatch in terms of place of articulation between a /d/ in the signal and a /d/ in the lexical representation. Consequently, there is an asymmetry in lexical activation: a coronal phoneme can only activate lexical entries of other coronal phonemes, while a non-coronal phoneme can activate its own representation as well as the representations of coronal phonemes. With this ternary matching logic, the system over-generates possible word candidates, but it still removes impossible ones from the cohort. Since it is not the segments per se that are stored, but their abstract phonological features, some feature information in the incoming speech signal can be missing or influenced by phonological context, and the listener is still able to correctly identify the input word. This is very convenient in the case of place assimilation that leads to surface variation, something that frequently happens in speech production. In regressive place assimilation, the place feature of a coronal phoneme is assimilated to a following non-coronal place of articulation. For instance, the coronal /n/ in 'Where could Mr. Bean be?' is often pronounced as a labial /m/ because it is followed by the labial /b/: 'Where could Mr. Beam be?'. The reverse is not usually true; that is, a non-coronal like the /m/ in the utterance 'lame duck' would not assimilate to the following coronal (*'lane duck'). Since the coronal /n/ in 'Mr. Bean' is unspecified for place, the lexical representation can be activated by both 'Mr. Bean' and 'Mr. Beam'. A fully specified system, such as TRACE, Shortlist or the DCM, would register a mismatch in the latter case and would need two separate lexical representations for the two surface forms. A model like Shortlist B is able to cope with assimilation as it works on the basis of probabilities and prior experience. Note that in the FUL model all coronal phonemes are underspecified for place of articulation, not only those that are subject to surface variation, such as word-final coronal consonants. This means that the underspecification of the feature [CORONAL] is not entirely experience-based. Other theories assume a more graded form of lexical underspecification, where only coronal phonemes with non-coronal surface variants are underspecified in a given morpheme (Inkelas, 1994).

² Several phenomena lead to the assumption that the feature [CORONAL] is not specified at the lexical level. [CORONAL] seems to be the default place of articulation in many languages; a coronal sound is far more likely to assimilate to non-coronal places of articulation than vice versa; coronal consonants are phonotactically less restrictive (they allow for more combinations than consonants with other places of articulation); and within one language, coronal sounds can split up into several contrastive phonemes with different places of articulation (palatoalveolar, palatal, retroflex). See also Lahiri (2000) and Steriade (1995).
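To make the ternary matching logic and the scoring formula above concrete, here is a small sketch restricted to the place feature; it is our own illustration under the assumptions stated in the text, not the implementation of the FUL recognition system (Reetz, 1998).

```python
# Sketch of FUL's ternary place matching plus the scoring formula above.
# Only place of articulation is modelled; None marks the underspecified
# [CORONAL] slot, which is extracted from the signal but never stored.
LEXICAL_PLACE = {"b": "LABIAL", "g": "DORSAL", "d": None}

def place_match(signal_feature, stored_feature):
    if stored_feature is None:
        return "nomismatch"        # empty lexical slot tolerates any input
    if signal_feature == stored_feature:
        return "match"
    return "mismatch"              # incompatible features: candidate removed

def score(n_match, n_signal, n_lexicon):
    """FUL score: (matching features)^2 / (signal features * lexicon features)."""
    return n_match ** 2 / (n_signal * n_lexicon)

for feat, phon in [("LABIAL", "b"), ("LABIAL", "d"), ("LABIAL", "g"),
                   ("CORONAL", "d"), ("CORONAL", "b")]:
    print(f"[{feat}] onto /{phon}/: {place_match(feat, LEXICAL_PLACE[phon])}")
# [LABIAL] matches /b/, nomismatches /d/, mismatches /g/; [CORONAL]
# nomismatches /d/ but mismatches /b/ -- the asymmetry described above.
```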

What exactly is stored in the mental lexicon according to the FUL model? The lexicon contains phonological, morphological, semantic and syntactic information for each word. Only the information about features is used to find word candidates in the lexicon; all additional information helps to exclude unlikely candidates at a higher level of processing. These higher-level processes do not wait until they are fed with a few remaining word candidates; they operate in parallel with the basic feature-mapping procedure right from the beginning of word perception (Lahiri, 1999; Lahiri & Reetz, 2002). However, the FUL model is most explicit about the phonological aspects of the mental lexicon, and we will restrict our descriptions to these.

In the lexicon a segment is represented with a root node and its hierarchically structured relevant features. “This hierarchical representation reflects the fact that phonological processes consistently affect certain subsets of features and not others. Individual features or subsets of features are functionally independent units and are capable of acting independently. (…) Features are organised into functionally related groups dominated by abstract class nodes (such as place). The phonological features are the terminal nodes, and the entire feature structure is dominated by the root node (made up of the major class features like [CONSONANTAL] and [SONORANT]) which corresponds to the traditional notion of a single segment” (Lahiri, 1999, p. 251). The feature tree of the FUL model is depicted in Figure 1.1.

[Figure 1.1: Feature geometry following the FUL model. Taken from Lahiri & Reetz (in press).]

The place features are split into three independent nodes: an articulator node containing Place of Articulation, a Tongue Height node and a Tongue Root node. In the study on the processing of vowels in Chapter 3 we are particularly interested in Place of Articulation, but Tongue Height is also shown to play a role. It is still a matter of debate in linguistics whether vowels and consonants can be defined and processed in the same way (Chomsky & Halle, 1968; Sagey, 1986; Clements & Hume, 1995; Halle, Vaux & Wolfe, 2000; Lahiri & Evers, 1991; Lahiri & Reetz, 2002, in press). In the FUL model the root node distinguishes consonants from vowels, but both share the same place features. All features are monovalent, meaning that they are either present or absent; no negative feature values are assigned (Lahiri & Reetz, 2002).

The FUL model has been successfully tested on many phonological and morphological phenomena (Ghini, 2001; Lahiri & Reetz, 2002; Obleser, Lahiri & Eulitz, 2003, 2004; Eulitz & Lahiri, 2003, 2004; Wheeldon & Waksler, 2004; Lahiri, Wetterlin & Jönsson-Steiner, 2005, 2006; Felder, 2006; Scharinger, 2006; Kabak, 2007; Friedrich, Eulitz & Lahiri, 2008; Hannemann, 2008; Scharinger & Zimmerer, 2009; Wetterlin, 2009; Zimmerer, 2009; Cornell, Lahiri & Eulitz, subm.; Felder, Jönsson-Steiner, Eulitz & Lahiri, subm.). Some of the empirical evidence for and against the FUL model will be reviewed in Chapter 3. Before that, we consider methodological aspects in Chapter 2, particularly those methods that are used in the experiments reported later. Chapter 3 then reports two experiments on the mechanisms of lexical access, particularly the question of whether there are inhibitory links between word candidates, as some models predict, and one experiment on the FUL model's hypothesis of lexical underspecification in the case of vowels, thereby testing a case that is not typically experienced in everyday language use. Chapter 4 extends this work to the suprasegmental level of lexical processing, investigating word accents in Swedish.


CHAPTER 2 METHODOLOGICAL ASPECTS

From the vast range of methods and experimental paradigms in psycholinguistic research, our studies mostly used the so-called cross-modal fragment priming design and recorded reaction time and accuracy data and/or electrical brain responses in lexical decision tasks. The following sections give a short overview of these methods.

2.1 The Concept of Priming and Lexical Decision

Priming paradigms are employed in virtually all fields of psychological research and are not restricted to psycholinguistics. Generally speaking, one assumes that by processing one piece of information, related concepts are also activated to some extent, or our behaviour is influenced. For instance, in social psychology, the politeness of people or their judgements about other people can be shaped by a simple word-list task including polite versus impolite (or positive versus negative) items (Bargh, Chen & Burrows, 1996; Higgins, Rholes & Jones, 1977). In psycholinguistics, semantic priming (among others) is a well-examined phenomenon (for reviews of priming in psycholinguistics see Zwitserlood, 1996; Tabossi, 1996; Nicol, 1996; Drews, 1996). If we hear rose, this will activate the higher-order concept of flower; if we hear peak, this will activate the synonym summit in the mental lexicon. Semantic priming also works with associates, that is, two words that are often mentioned together, like cats and dogs, or with antonyms, such as sweet and sour. That a word like peak activates the word summit in the mental lexicon is inferred, among other things, from reaction time data. A common task in reaction time experiments is the so-called lexical decision task (see Goldinger, 1996 for a review): participants are presented with words and pseudowords (i.e. words that follow the phonotactic rules of the language in question and could in principle be existing words, but are not) and are asked to decide whether what they just saw or heard was a real word or not. They indicate their lexical decisions by pressing one of two respective buttons. Innumerable experiments have shown that participants are faster to decide that summit is a real word if they were presented with peak before, as compared to a case where they were presented with some unrelated word like cake before the word summit. Also, across all trials, responses are usually more accurate for semantically related prime-target pairs (e.g. peak-summit) than for unrelated pairs (e.g. cake-summit). We call the word that is meant to preactivate another word (e.g. peak) the prime word, and the word that is presented next and has to be decided upon (e.g. summit) the target word. The gain in response speed in the case of semantically related prime-target words is explained by faster word processing in the brain; that is, the word summit is processed faster because it was preactivated by its synonym peak. In terms of the Distributed Cohort Model (Gaskell & Marslen-Wilson, 1997; Gaskell & Marslen-Wilson, 1999; see Chapter 1) this is a straightforward assumption, because at the semantic level of representation the word peak activates the same node pattern as the word summit would.

2.2 Cross-Modal Fragment Priming

Words in the mental lexicon are not only activated via semantic relations, but also, and primarily, via segmental information. As we saw in Chapter 1, in speech perception the lexical representations of all those words that match the input stream are activated. For instance, if we hear cat, the lexical representations of cat, catalogue, category, catastrophe, etc. are activated, and these words are classified faster as existing words in a lexical decision task than if we had heard rat, or mug, or anything else before. In a fragment priming design, a fragment of a word (e.g. com-) is presented, usually auditorily, and the target is a complete word that either matches the segmental information of the prime fragment (e.g. comma, comment, commander, etc.) or not. This paradigm can be used to investigate the impact that a difference between signal and representation has on lexical access. For instance, it can be assessed how harmful a mismatch in one segment is for lexical access by comparing the reaction time in response to a complete match (e.g. com- comma) to the reaction time for a mismatch in one segment (e.g. con- comma) and to the reaction time for something completely different (e.g. tun- comma). Alternatively, the impact of stress information can be assessed if segments are kept constant but the stress pattern differs, for instance by using admiral (with stress on the first syllable) as target word and preceding it either with admi- taken from that very word as prime fragment or with admi- taken from the word admiration (with stress on the penultimate syllable). If stress information is used in lexical access, we expect slower reaction times in the latter case.

In cross-modal fragment priming, prime and target are presented in different modalities. Usually the prime fragment is presented auditorily via loudspeakers or headphones, and the target word is presented visually on a screen. We assume that in this way the priming mechanism taps into modality-independent representations in the mental lexicon, and that priming effects are not due to intra-modal comparisons of acoustic or visual form (see for example Marslen-Wilson, 1990).


2.3 Reaction Time Data

In cross-modal fragment priming, reaction times are usually defined as the time span between the onset of the visual target word on the screen and the pressing of the response button for lexical decision. In lexical decision, responses to words are usually faster than responses to pseudowords, and within the word targets, responses to successfully preactivated/primed targets are faster than responses to targets that are unrelated to the preceding prime fragment. Reaction time data give valuable insight into phenomena of human language processing. One drawback, however, is that they deliver only the final product of all processes that have taken place between target presentation and button press. Psycholinguists have come up with various reaction time paradigms that allow them to tap into different stages of processing, but reaction time data still provide no tool for online measurement of speech processing. Such online information is provided by electroencephalography (EEG), the measurement of electrical brain activity.

2.4 Electroencephalography

The EEG derives from summated postsynaptic electrical potentials generated by cells in the cortex. These potentials are registered with electrodes attached to the scalp (Davidson, Jackson & Larson, 2000). Only sufficiently large amounts of electrical brain activity can be measured on top of the scalp: at least 10,000 brain cells have to be aligned in the same direction and fire simultaneously and synchronously in order to produce a signal that can be detected by an EEG system (Seifert, 2005). From these signals, event-related brain potentials (ERPs) can be calculated. ERPs are regarded as a manifestation of brain activity that is time-locked to internal or external events, i.e. Potentials that are Related to an Event. In cross-modal fragment priming this event is the target word. That is, we are interested in the neuronal activity in response to a certain target word and assume that this response can be modulated by the preceding prime fragment. The brain's response to a single event/target word will not be visible in the signal because the surrounding brain activity causes too much noise. ERPs are therefore obtained by averaging many brain responses to stimuli in the same priming condition. In this way, the non-systematic noise is cancelled out while the systematic brain responses that are triggered by the target words emerge. An ERP consists of several positive and negative peaks in the signal, which differ from each other in scalp distribution, polarity and latency. Consequently, an ERP can be regarded as an aggregate of a number of ERP components. A component is commonly taken to reflect the tendency of parts of the ERP waveforms to covary in response to specific experimental manipulations (Fabiani, Gratton & Coles, 2000).
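The averaging logic is easy to state in code. The following is a minimal sketch with simulated data and hypothetical array shapes; a real pipeline would add filtering, artifact rejection and baseline correction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 100, 32, 600    # e.g. 600 ms at 1000 Hz

# Simulated single trials: a small stimulus-locked deflection buried in noise.
deflection = np.zeros(n_samples)
deflection[320:400] = -2.0                         # e.g. a P350-like window
trials = rng.normal(0, 20, (n_trials, n_channels, n_samples)) + deflection

# Averaging across trials cancels non-time-locked noise and leaves the ERP.
erp = trials.mean(axis=0)
print(erp.shape)                                   # (32 channels, 600 samples)
```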

2.5 Event-Related Potentials in Cross-Modal Fragment Priming

In cross-modal fragment priming studies, several components can be observed. We focus on two of them, the so-called P350 and the N400 component. The P350 usually peaks between 320 and 400 ms after the onset of the visual target word and differentiates matching from mismatching or unrelated prime-target pairs (Friedrich, Kotz, Friederici & Alter, 2004; Friedrich, Kotz, Friederici & Gunter, 2004). Although the P suggests that this component has a positive polarity, it is in fact a negative peak that is lateralized to left anterior regions of the brain. The amplitude of the P350 component is consistently more negative for matching prime-target pairs (e.g. com- comma) as compared to unrelated pairs (e.g. tun- comma). If the prime deviates from the target in only one segment (e.g. con- comma), amplitudes of the P350 fall in between those of the fully matching and the completely different condition. Importantly, the amplitude of the P350 has been shown to be sensitive to fine-grained segmental differences between prime fragment and target word, and also to the representation of the target word in the mental lexicon as it is assumed by the FUL model. That is, the amplitudes reflect the predicted asymmetry in priming between coronal and non-coronal signals and representations outlined in Chapter 1. For instance, if one perceives con- as prime fragment, the feature [CORONAL] is extracted from the /n/ in the signal and mismatches the feature [LABIAL] in the representation of the /m/ in comma. Consequently, the P350 amplitude in this condition is similar to the control condition (e.g. tun- comma) and less negative than in the matching condition (e.g. com- comma). However, if the target word is cannon and the prime fragment is cam-, the feature [LABIAL] is extracted from the /m/ in cam- and mapped onto the underspecified representation of the /n/ in cannon. This results in a nomismatch. Consequently, both fragments, can- and cam-, activate the target word cannon; their P350 amplitudes pattern together and are more negative than in the control condition (tun- cannon) (compare Friedrich, Eulitz & Lahiri, 2008).

Another component that is reliably found in fragment priming is the N400, a negative-going component over posterior electrode sites that peaks between 400 and 600 ms after target onset. The N400 amplitude is more negative for unrelated as compared to related words. While the P350 is assumed to reflect automatic lexical activation, the N400 appears to be related to the effort needed in processing a single word in the context of a preceding sentence or priming situation (Friedrich,
