
Agreement in Sentence Comprehension

4.3 Attraction Errors in Sentence Comprehension

4.3.3 Non-Intervening Attraction

4.3.5.1 Method

Participants

Forty-eight students of the University of Konstanz participated in the experiment.

All were native speakers of German and naive with respect to the aims of the experiment. They either received course credit or were paid for their participation.

Materials

Forty sentences like (15) were created. For later analyses, one sentence had to be excluded because of a typo. All sentences were introduced by a matrix clause followed by an embedded clause. The subject of this embedded clause was a complex NP headed by the controller and modified by the distractor, which was a genitive phrase. Distractor and verb were separated by an adverbial. The embedded verb occurred in clause-final position and thus immediately before the judgment. Ungrammatical sentences were constructed by changing the number feature of the verb, resulting in an agreement violation. Each sentence appeared in eight conditions. Three experimental factors were manipulated, each having two values. The factor Controller varied the number specification of the controller NP, which was either singular or plural, whereas the factor Distractor varied the number specification of the modifier NP, which either matched or mismatched the controller in number. Finally, the factor Grammaticality refers to the grammaticality of the whole sentence (grammatical versus ungrammatical), which was manipulated by changing the number specification of the embedded verb: a singular verb was replaced by a plural verb and vice versa. Table 4.1 presents an example sentence in all its versions; for the complete material, see the appendix.

Table 4.1: Example of Sentences in Experiment 1.

Matrix clause: Ich erinnere mich daran (‘I remember that’)

Singular Controller

Match       dass die Oma des Kindes ziemlich lautstark geklatscht hat
            that the granny of-the child fairly vociferously applauded has

Mismatch    dass die Oma der Kinder ziemlich lautstark geklatscht hat
            that the granny of-the children fairly vociferously applauded has

‘that the granny of the child(ren) applauded fairly vociferously’

Plural Controller

Match       dass die Omas der Kinder ziemlich lautstark geklatscht haben
            that the grannies of-the children fairly vociferously applauded have

Mismatch    dass die Omas des Kindes ziemlich lautstark geklatscht haben
            that the grannies of-the child fairly vociferously applauded have

‘that the grannies of the child(ren) applauded fairly vociferously’

Note. Ungrammatical sentences were derived by changing the number specification of the final verb: hat was replaced by haben and vice versa.

From the experimental sentences, eight stimulus lists were generated, each containing an equal number of sentences per condition but each sentence in only one of its eight versions. The experimental sentences within these lists were randomized for each participant individually. The stimulus lists were interspersed in a larger list of 108 filler sentences representing a wide range of syntactic constructions. The filler items mostly served as experimental items in other experiments and included grammatical as well as ungrammatical sentences.
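The list construction described above corresponds to a Latin-square rotation. The following Python sketch illustrates one way such lists could be built; it is not the procedure actually used for the experiment, and the function name and the rotation rule are assumptions made for illustration. Only the numbers of items and conditions come from the text.

```python
from collections import Counter

N_ITEMS = 40       # experimental sentences (from the text)
N_CONDITIONS = 8   # 2 x 2 x 2 design (from the text)


def build_lists(n_items=N_ITEMS, n_conditions=N_CONDITIONS):
    """Return n_conditions lists; each maps item index -> condition index."""
    lists = []
    for list_index in range(n_conditions):
        # Rotating the condition by the list index is the Latin-square step
        # (an assumed rule; the thesis does not specify the assignment).
        assignment = {
            item: (item + list_index) % n_conditions
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists


lists = build_lists()

# Within one list, every condition occurs equally often (40 / 8 = 5 items).
per_condition = Counter(lists[0].values())

# Across the eight lists, an item appears once in each of its eight versions.
versions_of_item_0 = sorted(lst[0] for lst in lists)
```

The per-participant randomization of trial order would be a separate step, for instance by shuffling the items of the assigned list together with the fillers.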

Procedure

Sentences were presented visually using the DMDX software developed by K. Forster and J. Forster at Monash University and the University of Arizona (Forster and Forster, 2003). Participants were seated in front of a computer monitor. They were told that sentences would be presented on the screen and that their task was to judge the grammaticality of each sentence as quickly and as accurately as possible. Participants initiated each trial by pressing the space-bar, which triggered the display of three fixation points in the center of the screen for 1,050 milliseconds. Thereafter, the sentence appeared on the screen word by word, with each word appearing at the same position (mid-screen). Each word was visible for 225 milliseconds plus an additional 25 milliseconds for each character to compensate for length effects. There was no interval between words. Immediately after the last word of a sentence, three red question marks appeared on the screen, signaling to the participants that they were now to make their judgment. Figure 4.4 illustrates the procedure. Participants indicated their judgment by pressing either the left or the right shift key on a computer keyboard. They used their right hand to indicate that a sentence was grammatical and their left hand to indicate that it was ungrammatical. If participants did not respond within 2,000 milliseconds, a red warning line "zu langsam" (too slow) appeared on the screen and the trial ended.
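The timing rule for the word-by-word presentation can be made concrete with a small Python sketch. The two constants are taken from the text; the function names are hypothetical, and counting every character of a word (including any attached punctuation) is an assumption, since the text does not say how punctuation was treated.

```python
BASE_MS = 225      # fixed duration per word (from the text)
PER_CHAR_MS = 25   # additional duration per character (from the text)


def word_duration_ms(word):
    """Display time for one word: 225 ms plus 25 ms per character."""
    return BASE_MS + PER_CHAR_MS * len(word)


def presentation_schedule(sentence):
    """Return (word, onset_ms, duration_ms) triples with no gap between words."""
    schedule = []
    onset = 0
    for word in sentence.split():
        duration = word_duration_ms(word)
        schedule.append((word, onset, duration))
        onset += duration  # the next word starts when the previous one ends
    return schedule


embedded_clause = "dass die Oma des Kindes ziemlich lautstark geklatscht hat"
schedule = presentation_schedule(embedded_clause)
total_ms = sum(duration for _, _, duration in schedule)
```

Under these assumptions, the three-letter verb hat is shown for 300 ms, and the nine words of the embedded clause from Table 4.1 take 3,250 ms in total.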

Prior to the experimental session, participants received practice trials to ensure that they had understood the task. The practice items included both grammatical and ungrammatical sentences, but no agreement violation. During the practice trials but not during the experimental session participants received feedback as to the correctness of their judgments.

4.3.5.2 Results

Figure 4.4: Illustration of the speeded-grammaticality procedure used in Experiment 1. Three fixation points (1,050 ms) are followed by the sentence, presented word by word (225 ms + 25 ms per character for each word; example: Ich weiß, dass die Oma angerufen hat.), and then by the prompt for response (???).

Data in Experiment 1 and all subsequent experiments were analyzed using R, version 2.10.1 (R Development Core Team, 2009),¹² and in particular the languageR library (Baayen, 2010; see also Baayen, 2008), which makes use of the lme4 package (Bates, 2005; Bates and Maechler, 2010). The analyses use mixed models, which have been argued to be superior to traditional analyses of variance because they make it possible to consider simultaneously the contributions of all factors, including participants and items and covariates bound to them (cf. Baayen et al., 2008). Each model includes participants and items as random effects to avoid the language-as-a-fixed-effect fallacy (Clark, 1973). The models’ validity was assessed by means of likelihood ratio tests comparing the models with fixed effects to corresponding null models including only the random effects. For all models reported in this thesis, this test verified a significant difference between the model including fixed effects and the null model.

¹² R is open-source software for statistical computing and graphics, cf. http://www.r-project.org/.
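The likelihood ratio test mentioned above compares twice the difference in log-likelihood against a chi-square distribution. The Python sketch below shows the arithmetic only; it is not the R code used for the thesis. The null-model log-likelihood is a hypothetical placeholder, and the seven degrees of freedom assume that the full model adds exactly the seven fixed-effect terms of the 2 × 2 × 2 design.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        q, k = math.exp(-half), 2              # closed form for df = 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1   # closed form for df = 1
    # Recurrence: Q(x | k + 2) = Q(x | k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def likelihood_ratio_test(loglik_null, loglik_full, df):
    """Return the statistic 2 * (LL_full - LL_null) and its chi-square p-value."""
    statistic = 2.0 * (loglik_full - loglik_null)
    return statistic, chi2_sf(statistic, df)


# loglik_full is the value reported for the judgment model (Table 4.3);
# loglik_null is HYPOTHETICAL and serves only to illustrate the computation.
statistic, p_value = likelihood_ratio_test(
    loglik_null=-545.0, loglik_full=-531.0, df=7
)
```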

Judgment data were analyzed using mixed logit models—generalized linear mixed models suited for binomially distributed outcomes like binary judgments (cf. Jaeger, 2008; Dixon, 2008). The analysis was run using R’s lmer function with family specified as binomial to allow for a response variable with binary outcome as constituted by the binary judgments in the experiment. In order to run the analyses, responses were scored as either 1 (correct judgments) or 0 (erroneous judgments). For each experimental factor, a contrast was defined and centered.

The model analysis yields estimates, standard errors, z-values and p-values for each fixed effect and interaction as well as variance and standard deviation for each random effect. Furthermore, lmer returns the quasi-log-likelihood of the model which is computed using Laplace approximation.
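Centered contrasts for a 2 × 2 × 2 design can be illustrated as follows. This Python sketch assumes a ±0.5 coding and an arbitrary choice of which level receives the positive value; the thesis states only that a contrast was defined and centered for each factor, so both choices are assumptions, as are the names used.

```python
from itertools import product

# Assumed coding: which level is positive is chosen here for illustration only.
CODES = {
    "grammaticality": {"grammatical": -0.5, "ungrammatical": 0.5},
    "controller": {"singular": -0.5, "plural": 0.5},
    "distractor": {"match": -0.5, "mismatch": 0.5},
}


def design_row(grammaticality, controller, distractor):
    """Return the seven fixed-effect predictors for one condition."""
    g = CODES["grammaticality"][grammaticality]
    c = CODES["controller"][controller]
    d = CODES["distractor"][distractor]
    # Interaction predictors are products of the centered main-effect codes.
    return {
        "G": g, "C": c, "D": d,
        "GxC": g * c, "GxD": g * d, "CxD": c * d,
        "GxCxD": g * c * d,
    }


conditions = list(product(*(levels.keys() for levels in CODES.values())))
rows = [design_row(*condition) for condition in conditions]

# In a balanced design every predictor column sums to zero (is centered).
column_sums = {name: sum(row[name] for row in rows) for name in rows[0]}
```

With this coding and a balanced design, the standard error of a product term is about twice that of the terms one order below it, which is at least consistent with the SE column reported for the judgment model (.18, .36, .72).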

Response times were analyzed using linear mixed-effects models with participants and items included as random effects and the experimental factors as fixed effects. Again, R’s lmer function was used. Prior to analysis, response times more than 2 standard deviations above the participant’s mean were replaced by the cut-off value (the value equal to 2 standard deviations above the mean). Homogeneity and normality were checked by visual inspection of the residual plots (plotting residuals against fitted values). The model provides estimates, standard errors and t-values for each fixed effect and interaction. Corresponding p-values were computed using R’s pvals.fnc function, which estimates p-values based on Markov chain Monte Carlo (MCMC) sampling (cf. Baayen, 2008; Baayen et al., 2008).
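The replacement of extreme response times by a per-participant cut-off can be sketched in Python as follows. The function name and the response times are invented for illustration, and the use of the sample standard deviation is an assumption; the text does not say which estimator was used.

```python
from statistics import mean, stdev


def trim_by_participant(trials, n_sd=2.0):
    """Cap each response time at the participant's mean + n_sd * SD.

    trials: list of (participant, rt_ms) pairs; returns a list of the same shape.
    """
    by_participant = {}
    for participant, rt in trials:
        by_participant.setdefault(participant, []).append(rt)
    # One cut-off per participant, computed from that participant's own trials.
    cutoffs = {
        participant: mean(rts) + n_sd * stdev(rts)
        for participant, rts in by_participant.items()
    }
    # Values above the cut-off are replaced by the cut-off, not discarded.
    return [(p, min(rt, cutoffs[p])) for p, rt in trials]


# Invented response times (ms) for a single hypothetical participant.
raw = [480, 500, 520, 540, 560, 580, 600, 620, 640, 1500]
trimmed = trim_by_participant([("p01", rt) for rt in raw])
```

In this invented example only the 1,500 ms response exceeds the cut-off and is replaced; all other values are unchanged.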

Judgments

Percentages of correct judgments in Experiment 1 are shown in Table 4.2. Table 4.3 summarizes the estimates, standard errors (SE), z-values and corresponding p-values for each fixed effect and interaction in a mixed logit model that includes participants and items as random effects and Grammaticality, Controller and Distractor as fixed effects. The quasi-log-likelihood value is provided in the table’s caption.

Table 4.2: Percentages of Correct Judgments in Experiment 1

                 Singular Controller       Plural Controller
                 Match       Mismatch      Match       Mismatch      Mean
grammatical      94 (1.6)    82 (2.5)      91 (1.9)    92 (1.8)      90
ungrammatical    95 (1.4)    93 (1.7)      91 (1.9)    91 (1.8)      93
Mean             95          87            91          92

Note. Values enclosed in parentheses are standard errors.

Table 4.3: Summary of Fixed Effects in the Mixed Logit Model for Judgments in Experiment 1 (Log-Likelihood = −531)

                                         Estimate   SE     z        p
(Intercept)                              2.64       .14    17.97    < .001 **
Grammaticality                           .33        .18    1.82     .068 +
Controller                               −.13       .18    −0.72    .472
Distractor                               −.38       .18    −2.10    .036 *
Grammaticality×Controller                −.74       .36    −2.06    .040 *
Grammaticality×Distractor                .41        .36    1.16     .248
Controller×Distractor                    1.06       .36    2.94     .003 **
Grammaticality×Controller×Distractor     −.87       .72    −1.22    .224

+ p < .1, * p < .05, ** p < .01
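To make the columns of Table 4.3 easier to read, the following Python sketch converts a z-value into a two-sided p-value under the standard normal distribution and converts the intercept from log-odds into a probability. It is a reading aid rather than the analysis itself: the function names are hypothetical, and because the reported values are rounded, recomputed p-values can differ from the table in the last digit.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def inverse_logit(log_odds):
    """Convert log-odds into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# z-values as reported in Table 4.3 for three of the terms.
reported_z = {
    "Controller": -0.72,
    "Distractor": -2.10,
    "Controller x Distractor": 2.94,
}
p_values = {term: round(two_sided_p(z), 3) for term, z in reported_z.items()}

# The intercept of 2.64 log-odds expressed as a probability of a correct judgment.
intercept_probability = inverse_logit(2.64)
```

For these three terms the recomputed values agree with the table (.472, .036 and .003). The intercept corresponds to a probability of roughly .93; in a mixed model this value is conditional on the random effects, so it need not equal the observed mean accuracy exactly.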

The pattern arising from Table 4.2 suggests an interaction involving all three factors: Plural sentences yielded comparable error rates across conditions, whereas singular sentences show a mixed pattern, with grammatical sentences with a plural modifier (= mismatching distractor) deviating from the rest by a substantial drop in accuracy. Yet, the mixed logit model failed to find a significant interaction of Controller, Grammaticality and Distractor. Instead, the model found a significant interaction of Controller and Distractor as well as a main effect of Distractor. The factor Distractor had an effect in singular sentences but not in plural sentences. For singular sentences, judgment errors were more common when controller and distractor mismatched in number. In addition, the model indicates a significant interaction of Grammaticality and Controller and a marginally significant main effect of Grammaticality. On average, participants produced more judgment errors for grammatical sentences than for ungrammatical sentences. That is, they erroneously rejected a grammatical sentence more often than they failed to notice a real violation. However, this effect is absent in plural sentences. In fact, it is substantial only in singular sentences with a plural distractor. In singular sentences with a singular distractor it is hardly visible. Still, the interaction involving all three factors is not significant.

Although the numbers in Table 4.2 suggest that participants produced more errors in plural sentences, the model failed to find a significant main effect of Controller. The mean percentages of correct judgments did not differ between singular sentences and plural sentences (both 91%). This statement has to be qualified by the Controller×Distractor interaction. The drop in accuracy in singular sentences with a plural distractor may have obscured a small effect of Controller.

To assess the distractor effect across pairs of conditions, I ran a mixed logit model with participants and items as random effects and a grand factor as fixed effect. The grand factor has eight levels corresponding to the eight conditions that result from crossing the original experimental factors. I then specified contrasts for each pair of match and corresponding mismatch condition. The analyses confirm the mismatch effect for grammatical sentences with a singular subject (estimate = −1.328, SE = .34, z = −3.94, p < .001). The effect is absent in all other conditions (ungrammatical sentences with a singular subject: estimate = −.477, SE = .41, z = −1.15, p = .25; grammatical sentences with a plural subject: estimate = −.163, SE = .34, z = .48, p = .63; ungrammatical sentences with a plural subject: estimate = −.141, SE = .34, z = .42, p = .67).

Judgment Times

Response times more than 2 standard deviations above the participant’s mean were replaced by the cut-off value (the value equal to 2 standard deviations above the mean). The resulting mean response times for correct judgments are shown in Table 4.4. Since only a few judgment errors were made, further analyses included only response times for correct judgments. Response times were analyzed using a linear mixed-effects model with participants and items as random effects and Grammaticality, Controller and Distractor as fixed effects. Table 4.5 summarizes the fixed effects.

Table 4.4: Mean trimmed response times for correct judgments (in ms) in Experiment 1

                 Singular Controller       Plural Controller
                 Match       Mismatch      Match       Mismatch      Mean
grammatical      597 (20)    687 (27)      558 (19)    591 (20)      606
ungrammatical    586 (19)    671 (23)      642 (21)    698 (25)      648
Mean             591         678           600         644

Note. Values enclosed in parentheses are standard errors.

Table 4.5: Summary of Fixed Effects in the Linear Mixed-Effects Model for Response Times in Experiment 1 (Log-Likelihood = −11867)

                                         Estimate   SE      t        pMCMC
(Intercept)                              631.47     29.96   21.08    < .001 **
Grammaticality                           44.19      12.01   3.68     < .001 **
Controller                               −10.27     12.00   −.86     .390
Distractor                               66.55      12.00   5.55     < .001 **
Grammaticality×Controller                94.28      24.03   3.92     < .001 **
Grammaticality×Distractor                6.15       24.00   .26      .803
Controller×Distractor                    −38.38     23.99   −1.60    .110
Grammaticality×Controller×Distractor     13.16      48.00   .27      .766

Note. Only response times for correct responses entered the analyses.

** p < .01

The model revealed a significant main effect of Distractor: Judgment times were shorter when the genitive modifier shared the number specification of the controller. Furthermore, the model indicates a significant main effect of Grammaticality: On average, grammatical sentences were judged faster than ungrammatical ones. Yet, this grammaticality effect is present only in sentences with a plural controller. For this sentence type, we see an increase of about 100 ms in ungrammatical sentences. For sentences with a singular controller, response times hardly differ; numerically, grammatical sentences have even longer response times than ungrammatical sentences. The Controller×Grammaticality interaction is significant. As a main effect, Controller is not significant. The factor Distractor, in contrast, was significant as a main effect. Numerically, the penalty for a mismatching distractor was higher in sentences with a singular subject; statistically, however, the Controller×Distractor interaction did not reach significance, though it approached marginal significance.

4.3.5.3 Discussion

Experiment 1 has two main results. First, the German genitive construction is vulnerable to attraction, visible in higher error rates and prolonged response times for correct judgments. Second, as evident in Figure 4.5, attraction in the German genitive construction exhibits the singular–plural asymmetry known from the PP-modifier construction in other languages, e.g., English. If the head noun and therefore the entire subject NP is singular, a plural modifier increases the incidence of judgment errors. Singular modifiers lack a corresponding effect: If the head noun is plural, error rates are the same in sentences with a singular modifier and in sentences with a plural modifier. The pattern in response times is less conclusive. Numerically, attraction penalties are visible for both singular distractors and plural distractors. Statistically, the critical interaction between Controller and Distractor just missed marginal significance. Note that a clear-cut interpretation of response times in a speeded-grammaticality judgment experiment is difficult because an increase may reflect several distinct things. In the context of attraction, prolonged response times may indicate difficulty in dealing with conflicting information (attraction) or just higher processing load due to higher semantic complexity: the computation of a 1:1 relation in the match conditions (the granny of the child, the grannies of the children) might be easier than that of a 1:n relation (the granny of the children) or an n:1 relation (the grannies of the child). The plausibility and/or naturalness of these relations may vary as well, thereby contributing to the inconclusiveness of the response time pattern.

The inclusion of grammatical and ungrammatical sentences complicates the interpretation of response times even further. In grammatical sentences, a correct judgment results if the parser ignores the distracting number information or revises an initial error because this error led to an agreement violation. In ungrammatical sentences, in contrast, revision is unlikely as a source for an increase in response times for correct judgments: an error would obscure an actual violation and hence fail to trigger reanalysis. For the moment, I will not pursue the discussion of response times and concentrate on the error pattern instead.

For error rates, Experiment 1 attests plural attraction and fails to find singular attraction. This singular–plural asymmetry in Experiment 1 contrasts with the study by Hölscher and Hemforth (2000), who found a reversed asymmetry (singular attraction but no plural attraction). As explained earlier, the use of main clauses in Hölscher and Hemforth (2000) together with the decision to use an eye-tracking procedure makes it possible that parafoveal reading underlies the reversed asymmetry. Parafoveal reading may have hidden plural attraction, and it may have created the illusion of singular attraction. Experiment 1 avoided this potential confound by using embedded and therefore verb-final clauses in combination with an end-of-sentence judgment task. I therefore conclude that modifier attraction in German sentence comprehension is indeed restricted to plural distractors, as it is in German sentence production (Hartsuiker et al., 2003; Hemforth and Konieczny, 2003; Hölscher and Hemforth, 2000; Pfau, 2003, 2009).

Figure 4.5: Experiment 1: Attraction rates, defined as error rate difference between a mismatch condition and the corresponding match condition, and attraction penalties, defined as response time increase in a mismatch condition relative to the corresponding match condition. Panels: Attraction Rate (error-rate difference) and Attraction Penalty (response-time difference in ms), each for grammatical and ungrammatical sentences with singular and plural subjects.
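The attraction rates and attraction penalties defined for Figure 4.5 can be recomputed from the cell means in Tables 4.2 and 4.4. The Python sketch below does this arithmetic; the function names are hypothetical, and because the tabled means are rounded, the results may deviate slightly from the values plotted in the figure.

```python
# Percent correct (Table 4.2) and mean response times in ms (Table 4.4),
# keyed by (grammaticality, controller number, distractor relation).
PERCENT_CORRECT = {
    ("grammatical", "singular", "match"): 94,
    ("grammatical", "singular", "mismatch"): 82,
    ("grammatical", "plural", "match"): 91,
    ("grammatical", "plural", "mismatch"): 92,
    ("ungrammatical", "singular", "match"): 95,
    ("ungrammatical", "singular", "mismatch"): 93,
    ("ungrammatical", "plural", "match"): 91,
    ("ungrammatical", "plural", "mismatch"): 91,
}
RESPONSE_TIME_MS = {
    ("grammatical", "singular", "match"): 597,
    ("grammatical", "singular", "mismatch"): 687,
    ("grammatical", "plural", "match"): 558,
    ("grammatical", "plural", "mismatch"): 591,
    ("ungrammatical", "singular", "match"): 586,
    ("ungrammatical", "singular", "mismatch"): 671,
    ("ungrammatical", "plural", "match"): 642,
    ("ungrammatical", "plural", "mismatch"): 698,
}


def attraction_rate(grammaticality, controller):
    """Error-rate difference in percentage points: mismatch minus match."""
    match = 100 - PERCENT_CORRECT[(grammaticality, controller, "match")]
    mismatch = 100 - PERCENT_CORRECT[(grammaticality, controller, "mismatch")]
    return mismatch - match


def attraction_penalty(grammaticality, controller):
    """Response-time increase in ms: mismatch minus match."""
    return (RESPONSE_TIME_MS[(grammaticality, controller, "mismatch")]
            - RESPONSE_TIME_MS[(grammaticality, controller, "match")])


rates = {key[:2]: attraction_rate(*key[:2]) for key in PERCENT_CORRECT}
penalties = {key[:2]: attraction_penalty(*key[:2]) for key in RESPONSE_TIME_MS}
```

On the rounded cell means, the attraction rate is 12 percentage points for grammatical sentences with a singular subject and between −1 and 2 points in the other three cells, while the penalties range from 33 ms to 90 ms.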

The parallelism of attraction in production and comprehension suggests a common source, e.g., feature percolation that applies to an asymmetric representation of number: The plural feature of a plural distractor, which is explicitly represented, percolates, turns the subject into a plural NP and creates the illusion of an agreement violation; the singular feature of a singular distractor lacks an explicit representation, hence cannot percolate and therefore fails to corrupt the subject representation. Such an explanation leaves open why attraction failed to affect the error rate in ungrammatical sentences. If percolation applied, it would hide the actual agreement violation. As a result, participants should produce a judgment error, contrary to what is found in Experiment 1.

Admittedly, the unexpected difference between grammatical and ungrammatical sentences is problematic for any account. Though the reanalysis account proposed by Wagers et al. (2009) takes grammaticality explicitly into account, it predicts the opposite pattern: According to the reanalysis account, attraction occurs during reanalysis of ungrammatical sentences. I postpone a detailed discussion to after the reanalysis account is introduced.

4.4 Processes of Agreement Computation