
Agreement in Sentence Production

3.2 Agreement Errors

3.2.2 How To Study Agreement Errors

The processing of agreement can be investigated by observing the language behavior of healthy as well as impaired individuals and by running experiments using various techniques. In both cases, agreement errors, whether occurring spontaneously or elicited by the experimental stimuli, shed light on the processes underlying the computation of agreement.

Although speech error corpora are a valuable source for psycholinguistic research, as acknowledged above, they also have their limitations. At least traditional corpora, collected by writing down a speech error once it is heard, do not meet the criteria of validity and reliability (Ferber, 1995; Meyer, 1992). Corpora of this sort run the risk of random effects due to selective collecting and misinterpretations or mishearings. As a result, such corpora are not representative of the frequency distribution of the various speech error types. In particular, more subtle errors might be overlooked or go unheard. Recent advances in corpus linguistics have made larger corpora available that are not confounded by the aforementioned shortcomings. Nevertheless, at least two problems remain. First, some errors might simply be too rare to be found with a sufficient number of instances in a corpus. Second, without knowledge of the intended message, some speech errors are not identifiable because they are obscured by ambiguity. In this case, the resulting sentence is grammatical but does not convey the intended message. Agreement errors are particularly prone to this effect because agreement itself can disambiguate. Take, for instance, relative clause attachment. Producing a verb with the wrong number specification forces an attachment that does not fit the intended message, but it does not result in an ungrammatical sentence either. The agreement error is therefore hard to detect. Even when the error becomes evident, we often cannot locate it without knowing what the speaker intended to say. Guessing the speaker’s intention might be easy in the case of erroneous word exchanges as in (10), but it is less straightforward or even impossible in cases like (11) (repeated from above) and (12).

(10) I left the briefcase in my cigar.

(Garrett, 1980: 188)

(11) * We argue that the discrepancy between the results obtained with the two tasks inform us about the relative time-course of phrase structure building and agreement processing in sentence comprehension.

(12) * The made-up ungrammaticalities that the linguist use will rarely be encountered by the parser.

(Foster, 2004: 201; author’s emphasis)

In sentences (11) and (12), the embedded subject (discrepancy/linguist) and the corresponding verb (inform/use) do not agree in number. There are two ways to correct the agreement error, and both are equally reasonable. One could either replace the singular form of the subject with the corresponding plural form or keep the singular subject and change the verb into a singular verb. In both cases, we would have to add a single morpheme (and, due to an oddity of English, it would be the suffix -s in both cases).

(13) a. We argue that the discrepancy between the results obtained with the two tasks informs us . . .

b. We argue that the discrepancies between the results obtained with the two tasks inform us . . .

(14) a. The made-up ungrammaticalities that the linguists use will rarely be encountered by the parser.

b. The made-up ungrammaticalities that the linguist uses will rarely be encountered by the parser.

The choice of the appropriate correction of (12) depends on the writer’s intentions. If he or she had one particular linguist in mind who is known to construct unlikely ungrammatical sentences, (14b) would be the correct form. If the writer is talking about linguists in general (or at least several of them), (14a) would be the correct version. For (11), however, even knowing what the authors refer to is of no help. At least to me, (13a) and (13b) have the same meaning.

Foster (2004) notes that in her dataset, errors of the type shown in (12) can always be corrected by changing the verb into a singular verb. She claims this to be in line with a heuristic proposed by Genthial et al. (1994) for automatic error correction. Genthial and colleagues make two assumptions. First, they assume that it is more likely that something is missing in the erroneous sentence; second, they assume that the error is more likely to appear in the rightmost element. The first assumption does not help in the case of (12), since the omission could have taken place either on the agreement controller linguist(s) or on the agreement target use(s). The second assumption could be grounded in a view of language production according to which the speaker is more confident about earlier items.

An alternative view, however, would be to say that the speaker is more confident about feature specifications which relate to the meaning she or he wants to express. The speaker might lack such confidence for features which are not interpretable. Feature specifications which are purely formal reflexes, e.g., due to agreement requirements, might therefore be more vulnerable to errors. Under this assumption, speakers would be confident about the number specification of nouns but less confident about the number specification of verbs. Thus, (12) would be an instance of incorrect agreement at the verb. Consider what happens if we reverse the order of controller and target, as in (15).

(15) * Do the linguist use made-up ungrammaticalities that will rarely be encountered by the parser?

Again, we could either correct the plural verb to a singular verb or the singular subject to a plural subject. Which correction to choose is hard to say as long as we do not know what the speaker is talking about. According to the second assumption of Genthial et al. (1994), the verb has the right form and the noun has to be corrected. According to the proposal made above that interpretable features are less vulnerable to speech errors, the verb has to be changed whereas the noun is correct. Without further knowledge about the intended utterance, we cannot decide between these two options. Although heuristics like the one proposed by Genthial et al. (1994) might be useful for automatic text processing, they are of no help for the study of agreement processing, first because they presuppose what we want to find out and second because we might come to wrong conclusions based on wrong assumptions about the speakers’ intentions.
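The contrast between the two heuristics for (12) and (15) can be made concrete in a small sketch. Everything in it is an illustrative assumption: the class `Item`, the function names, and the word positions are hypothetical, and the code is neither Genthial et al.'s implementation nor Foster's. It only encodes "blame the rightmost element" and "blame the element with the uninterpretable feature" as stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One agreeing element (hypothetical representation)."""
    role: str      # "controller" (the noun) or "target" (the verb)
    form: str      # surface form as it appears in the erroneous sentence
    number: str    # "sg" or "pl"
    position: int  # word index in the sentence, counted from the left


def rightmost_heuristic(controller: Item, target: Item) -> Item:
    """Blame the element that comes later in the string."""
    return max((controller, target), key=lambda item: item.position)


def interpretability_heuristic(controller: Item, target: Item) -> Item:
    """Blame the target, whose number is a purely formal reflex of agreement."""
    return target


def compare(controller: Item, target: Item) -> dict:
    """Report which role each heuristic would correct."""
    return {
        "rightmost": rightmost_heuristic(controller, target).role,
        "interpretability": interpretability_heuristic(controller, target).role,
    }


# (12) "The made-up ungrammaticalities that the linguist use ..."
C12 = Item("controller", "linguist", "sg", 5)
T12 = Item("target", "use", "pl", 6)

# (15) "Do the linguist use ...": the finite verb now precedes the noun.
C15 = Item("controller", "linguist", "sg", 2)
T15 = Item("target", "do", "pl", 0)

if __name__ == "__main__":
    print("(12):", compare(C12, T12))
    print("(15):", compare(C15, T15))
```

In this sketch the two heuristics pick the same element for (12) and different elements for (15), which is the situation described above: the data alone cannot tell us which one is right.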

Even in cases where we can be relatively sure about the intended grammatical utterance, we can only speculate about the source of the agreement errors. Let us come back once more to two examples given in chapter 1 (repeated below) and assume for the sake of the argument that the subject occurs as intended by the speaker.

‘The standard cases which Chomsky always considers are such ...’

(17) * Firstly, the reference to Chomsky’s notions of E-Language (External(ised) Language) and I-language (Internal(ised) Language) make clear that we acknowledge these two aspects of language.

We could describe the agreement error in (16) either by saying that the verb occurs in the singular despite the plural specification of the subject or by saying that the verb agrees with the relative clause subject Chomsky instead of agreeing with the matrix subject Standardfälle. The first description suggests an explanation according to which the plural specification of the subject got lost and the verb ends up with the default singular. The second description suggests that the intervening NP causes some interference. The error in (17) can be described in a similar way. This time, the verb occurs in the plural although the corresponding subject is a singular NP. Again, we can speculate whether the speaker just produced the wrong verb form, i.e. omitted the inflectional affix, or whether the error is related to the intervening plural NP.

The complexity of the examples suggests that the error arises from a difficulty in keeping track of the subject and its number specification. In addition to complexity, the intervening NPs might have a more specific effect on the computation of agreement. Note that in all three examples the subject NP and the intervening NPs differ in their number specifications. In order to decide whether this is just an accident or a prerequisite for agreement errors, we need a database containing enough instances of agreement errors to check for possible factors. Unfortunately for the researcher, but fortunately for the hearer, speakers only rarely produce an agreement error. Accordingly, agreement errors are rare in corpora. Data sparseness makes it difficult to come up with reliable statistics. Fromkin’s Speech Error Database contains only 9 instances of an agreement error (among 30 morphosyntactic errors regarding the verb).⁶ Statistical statements are further hampered by the objections discussed earlier. This problem is strikingly illustrated by Fromkin’s database when the proportion of agreement errors in the current database is compared with earlier stages. While the number of recorded speech errors increased impressively, the absolute number of agreement errors remained virtually constant. The appendix in Fromkin (1973) lists 868 speech errors including 7 agreement errors,⁷ whereas the database currently (as of 11.12.08) contains 8673 speech errors including 9 agreement errors, that is, roughly 0.8% versus roughly 0.1% of the recorded errors. This discrepancy demonstrates how dangerous it is to draw statistical conclusions from a manually collected error corpus. Nowadays, large electronic corpora are available. But the investigation of speech errors in such a corpus faces the problem of finding appropriate search keys. Finding an agreement error requires at least a morphosyntactically tagged corpus or, even better, a treebank that enables searching for mismatches between the feature specifications of controller and target. Corpora of this type are still rather small, too small given the rarity of agreement errors.
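To illustrate what a search for feature mismatches between controller and target amounts to, here is a minimal sketch. It assumes a hypothetical, already annotated input in which each subject-verb pair comes with number tags; the record type `AgreementPair`, the function `find_mismatches`, and the toy data (built from the examples above) are invented for illustration and do not correspond to any particular treebank format or query tool.

```python
from typing import Iterable, List, NamedTuple


class AgreementPair(NamedTuple):
    """One controller-target pair with its annotated number features (hypothetical format)."""
    sentence_id: str
    controller: str
    controller_number: str  # "sg" or "pl"
    target: str
    target_number: str      # "sg" or "pl"


def find_mismatches(pairs: Iterable[AgreementPair]) -> List[AgreementPair]:
    """Return all pairs whose controller and target disagree in number."""
    return [p for p in pairs if p.controller_number != p.target_number]


# Toy annotations for examples (11)-(13); the tags are assigned by hand here.
toy_pairs = [
    AgreementPair("11", "discrepancy", "sg", "inform", "pl"),
    AgreementPair("13a", "discrepancy", "sg", "informs", "sg"),
    AgreementPair("13b", "discrepancies", "pl", "inform", "pl"),
    AgreementPair("12", "linguist", "sg", "use", "pl"),
]

if __name__ == "__main__":
    for pair in find_mismatches(toy_pairs):
        print(pair.sentence_id, pair.controller, pair.target)
```

The sketch presupposes exactly what is scarce in practice: reliable number annotation on both elements and a reliable link between them. Note also that it flags only the mismatch; it says nothing about which of the two elements carries the error.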

The elicitation of agreement errors under experimentally controlled conditions is therefore a more appropriate means of studying agreement processing, in particular because running an experiment allows testing specific hypotheses. By controlling the experimental conditions, the experimenter can restrict the possible errors that participants produce and the possible sources of those errors.

Psycholinguists have developed a variety of elicitation techniques that allow for investigating language production under experimentally controlled conditions (for an overview see Baars, 1992). Most elicitation techniques put some time pressure on the speaker, thereby increasing the likelihood of an error. Furthermore, they create some competition between alternative output plans. This competition aspect is most obvious in the case of tongue twisters, which involve phonological-motoric competition. For the study of agreement errors, the most frequently used procedure is a sentence completion task. Participants read or hear a sentence fragment and possibly some further word (e.g., the agreement target in its base form). Their task is then to repeat the fragment and complete it in such a way that a grammatical sentence results. Competition occurs naturally since the speaker has to choose the correct form of the agreement target. Further competition is often induced by the presence of a distractor that does not match the agreement controller in number (or some other agreement feature). The distractor makes the choice of the correct form of the agreement target more difficult and gives rise to so-called attraction errors, which will be discussed in turn. In the corresponding control conditions, this influence is neutralized by having controller and distractor match in number.
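The logic of such a design, with mismatch conditions and matching control conditions, can be sketched as follows. This is a simplified illustration under stated assumptions: number is the only manipulated feature, it has just two values, and the names `build_conditions` and `score_response` as well as the response labels are hypothetical rather than taken from any published procedure.

```python
from itertools import product

NUMBERS = ("sg", "pl")


def build_conditions():
    """Cross controller number with distractor number (a 2 x 2 design)."""
    conditions = []
    for controller_number, distractor_number in product(NUMBERS, NUMBERS):
        conditions.append({
            "controller": controller_number,
            "distractor": distractor_number,
            # Matching number serves as the control condition.
            "type": "control" if controller_number == distractor_number else "mismatch",
        })
    return conditions


def score_response(condition, verb_number):
    """Classify a completion by the number marking of the produced verb."""
    if verb_number == condition["controller"]:
        return "correct"
    if condition["type"] == "mismatch":
        # With two number values, a wrong verb here necessarily carries
        # the distractor's number: a candidate attraction error.
        return "attraction_candidate"
    return "other_agreement_error"


if __name__ == "__main__":
    for condition in build_conditions():
        for verb_number in NUMBERS:
            print(condition, verb_number, score_response(condition, verb_number))
```

Comparing error rates in the mismatch conditions with those in the control conditions is what allows the distractor's contribution to be separated from errors that occur regardless of it.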

⁶ http://www.mpi.nl/services/mpi-archive/fromkins-db-folder/ (as of 11.02.2008)

⁷ It has to be noted that the appendix contained only selected items; the actual number of speech errors was higher, but substantially lower than in the current database.