
As explained at the beginning, the main aim of this chapter is to test out the GPH on the Catalan modal verb caldre in two comparable parallel sub-corpora from the COVALT corpus: English-Catalan and French-Catalan. The corpus used will be both parallel and comparable, as data will also be retrieved from a component of Catalan non-translations. Caldre is used as a starting-point for the analysis because it may be said to be a unique item for the English-Catalan but not for the French-Catalan language pair. English does not have a syntactic counterpart for caldre, as possible candidates, such as the expression it + take + X (as in It takes a lot of courage to rise to that challenge), are not frequent or grammaticalised to the same extent as caldre is.¹ French, on the other hand, has the verb falloir, as seen above, which also conveys obligation or necessity and typically occurs in impersonal constructions. As seen in §3, both caldre and falloir fall under the category of relative impersonal verbs, which cuts across all Romance languages, and take the same kind of complements to a large extent – the only difference being that falloir cannot take a noun phrase as a complement. They share the other two complements (infinitive and that-clause), and that is the basis of their syntactic similarity. Formal similarity comes under many guises. The most obvious one is phonological or graphological similarity, especially when it concerns two words with a common origin, e.g. English hound and German Hund. But there may be formal similarity at other levels, such as that of syntax.

¹ The query [lemma=“it”][lemma=“take”] in the ST component of the English-Catalan sub-corpus in COVALT yields 44 matches, 15 of which are false positives, the remaining 29 often featuring a time complement, as in It took three days to… The normalised frequency (f) of this construction is 0.024 per 1,000 words. Just for the sake of comparison, the normalised frequency of must as an indicator of obligation and intrinsic necessity is 0.55 – over 20 times as high as the frequency of it + take. (This value is based on a projection of the results yielded by the manual analysis of a random sample of 300 instances, out of the total 993 matches found for the query [lemma=“must”].)

Catalan caldre and French falloir are not cognates, but they share two syntactic patterns in addition to their semantic common ground. The rationale behind using an item that is unique for a certain language pair but not for another is the same as in Hareide (2017), which serves here as a source of methodological inspiration:

the basic contrastive fact around which the study pivots may well give rise to different configurations of factors related to salience and connectivity that may impact translation outcomes.

One of the pre-requisites for this methodology is having two parallel corpora that can be regarded as comparable in all relevant respects, i.e. textual genre, date and place of publication, and type of readership. The COVALT corpus fulfils such a requirement. COVALT (Valencian Corpus of Translated Literature) is a multilingual corpus made up of the translations into Catalan of narrative works originally written in English, French, and German published in the autonomous region of Valencia from 1990 to 2000, together with their corresponding source texts. The English-Catalan sub-corpus comprises 36 English source texts, amounting to 1,201,757 words, and their corresponding target texts in Catalan (1,343,631 words). The French-Catalan sub-corpus comprises 21 French source texts, amounting to 551,869 words, and their corresponding target texts (566,998 words). COVALT also includes non-translated components for both target languages, Catalan and Spanish. The Catalan non-translated component is a set of narrative works originally written in Catalan intended to be comparable to the translated component in all relevant respects: place of publication (Valencian Community), date of publication (1990-2000), language (Catalan) and genre (narrative fiction). The non-translated component amounts to 1,551,521 tokens. These corpora were compiled at the Translation and Communication Department, Universitat Jaume I (Castelló, Spain) and can be accessed for research purposes upon request (http://www.covalt.uji.es).

Before formulating hypotheses, we need at least some basic information on the relative salience of the main verbal indicators of obligation and intrinsic necessity in the three languages involved in this study. Since salience is operationalised as frequency (as will be seen later on), corpus data will be used when suitable.

Grammars tell us that both caldre and haver de + infinitive are central as regards the expression of obligation and necessity in Catalan, and the same applies to must and have (got) to for English; but which member of the pair is the more frequent? In the component of Catalan non-translations in COVALT, caldre occurs 735 times, with a normalised frequency of 0.47 per 1,000 words, whereas haver de + infinitive features 1,924 occurrences, with a normalised frequency of 1.24 per 1,000 words. Haver de + infinitive is about 2.5 times as frequent as caldre (in terms of normalised frequency) and we may assume, therefore, that it is more salient.
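The normalised frequencies just quoted can be reproduced with a minimal Python sketch. The function name per_thousand is hypothetical and introduced only for illustration; the raw counts and the size of the non-translated component are the figures reported above.

```python
def per_thousand(raw_count: int, corpus_size: int) -> float:
    """Normalised frequency: occurrences per 1,000 words."""
    return raw_count / corpus_size * 1000


# Size of the Catalan non-translated component, as reported above.
NON_TRANSLATED_TOKENS = 1_551_521

caldre = per_thousand(735, NON_TRANSLATED_TOKENS)
haver_de = per_thousand(1_924, NON_TRANSLATED_TOKENS)

print(round(caldre, 2), round(haver_de, 2))  # 0.47 1.24
```

Normalising by corpus size is what makes the figures comparable across components of different lengths.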

In the case of English, it would not make much sense to compare corpus frequencies because the meanings of must and have to do not overlap to such an extent as caldre and haver de in Catalan. As seen in §3, must is the central verb for strong obligation and have (got) to for strong intrinsic necessity (Radden & Dirven 2007). We can rely on these assumptions in order to formulate hypotheses.

It must also be borne in mind that the absence of obligation is usually conveyed by the negative forms of have to and need (to) – not by the negative form of must, which expresses negative obligation, i.e. prohibition.

As to French, we saw above that the main modal or semi-modal verbs conveying obligation and intrinsic necessity are devoir and falloir. As in the case of English, it would not make much sense here to compare corpus frequencies of these two verbs (e.g. in the ST component of the French-Catalan sub-corpus in COVALT) because their meanings overlap only to a certain extent. Devoir is more polysemous than falloir, as it also conveys epistemic necessity and futurity, and it would be necessary to discard these meanings manually. Lewis (2015) reports on a previous study by Labbé & Labbé (2013) according to which falloir is much more frequent than devoir in spoken and literary French, whereas the opposite is true for a corpus of presidential speeches. Lewis claims that her own results from a corpus-based analysis of political speeches in English and French are consistent with Labbé and Labbé’s findings.

We may therefore assume that: a) haver de + infinitive is a more salient indicator of obligation and intrinsic necessity than caldre in Catalan; b) the main prima facie equivalents of caldre (and haver de) in English (must and have to) and French (falloir) are also salient in their respective modality networks; and c) patterns of connectivity between caldre and those prima facie equivalents will be stronger for French than for English. The first two assumptions were justified in the previous paragraph. The third assumption is based on the formal similarity between caldre and falloir, i.e. on their syntactic overlap (explained above), which is not paralleled by caldre and any of its English equivalents. On the basis of these assumptions, the following three hypotheses can be formulated:

1. caldre will be under-represented in the English-Catalan sub-corpus when compared to Catalan non-translations, as neither factor 1 (magnetism) nor factor 3 (high degree of connectivity) will be at play – factor 2 (gravitational pull) being the only factor that might pull towards over-representation;

2. caldre will be either over- or under-represented in the French-Catalan sub-corpus when compared to Catalan non-translations, depending on which factor prevails (gravitational pull and a high degree of connectivity will pull towards over-representation whereas magnetism will pull towards under-representation);

3. caldre will be significantly more frequent in the French-Catalan than in the English-Catalan sub-corpus, as over-representation will be favoured by two factors (gravitational pull and a high degree of connectivity) in the former and only one (gravitational pull) in the latter.

The method employed to verify these hypotheses will consist of the following steps:

1. data retrieval with CQP (Corpus Query Processor), a tool that allows corpora to be queried on the basis of regular expressions containing words, lemmas and part-of-speech tags. Both the translated components of the English-Catalan and French-Catalan sub-corpora and the Catalan non-translated component will be queried on the lemma caldre;

2. manual sifting in order to tell apart true from false positives. Corpus queries usually yield matches that do not conform to the criteria the analyst had in mind. If false positives are not removed, the data on which quantification draws will be distorted;

3. quantification + testing for significance. Raw and relative frequencies of caldre in the three components mentioned in step 1 will be established and tested for significance;

4. searching for triggers (i.e. ST segments matching the query word) of caldre in the English and French STs;

5. searching for TT segments matching the main triggers of caldre. Query matches will be thinned if their number proves unmanageable. Thinning is the standard method used by CQP for random sampling, and it can be based on a raw figure or a percentage;

6. manual sifting (again), in order to tell apart true from false positives;

7. establishing degrees of connectivity between ST and TT items. The measure to be used for that purpose will be introduced below;


8. repeating the whole process for haver de (the main alternative to caldre in Catalan, as seen above) in the English-Catalan and French-Catalan sub-corpora and the Catalan non-translated component.
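Steps 3 and 5 above can be sketched in Python under some loudly stated assumptions. The list of steps does not name the significance test, so the log-likelihood statistic (G2), a common choice for comparing frequencies across two corpora, is used here purely as an assumption. The function names log_likelihood and thin are hypothetical, thin only imitates the idea of random thinning rather than reproducing CQP's own implementation, and the count of 500 for a translated component is an invented placeholder, not a result.

```python
import math
import random


def log_likelihood(count_a, size_a, count_b, size_b):
    """Log-likelihood (G2) for one item's frequency in two corpora."""
    total = count_a + count_b
    pooled = size_a + size_b
    g2 = 0.0
    for observed, size in ((count_a, size_a), (count_b, size_b)):
        expected = size * total / pooled
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def thin(matches, amount, seed=0):
    """Random sample of matches, by raw figure (int) or proportion (float)."""
    k = round(len(matches) * amount) if isinstance(amount, float) else amount
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    kept = sorted(rng.sample(range(len(matches)), k))
    return [matches[i] for i in kept]  # keep the original corpus order


# Figures reported earlier for the Catalan non-translated component.
NON_TRANSLATED = (735, 1_551_521)
# HYPOTHETICAL count for a translated component; not a result from the study.
TRANSLATED = (500, 1_343_631)

g2 = log_likelihood(*TRANSLATED, *NON_TRANSLATED)
significant = g2 > 3.84  # critical value for p < 0.05 with 1 degree of freedom

# Thinning to a raw figure, e.g. 300 instances out of 993 matches.
sample = thin(list(range(993)), 300)
```

Passing a float to thin (for instance 0.1) samples a percentage of the matches instead of a raw figure, mirroring the two options mentioned in step 5.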

As these steps suggest, for the big picture to emerge as regards patterns of salience and connectivity in the two language pairs it is necessary to go beyond the initial pivot of the study (caldre) and look at the main nodes in the monolingual and bilingual networks of which caldre is a part. This kind of analysis is extremely time-consuming. The big picture may not be the full picture, but it is hoped it will include enough relevant information not only to test the hypotheses but also to understand why they are confirmed or refuted.

Before moving on to results and discussion, the thorny question of the relationship between frequency, on the one hand, and salience and connectivity, on the other, must be addressed. Schmid (2010) poses the question in the most explicit possible manner when he wonders whether frequency in text instantiates entrenchment in the cognitive system. Entrenchment is defined as “the degree to which the formation and activation of a cognitive unit is routinized and automated” (Schmid 2010: 115). It is fostered by repetitions of cognitive events. Schmid (2010: 116) refers to the “considerable body of evidence from psycholinguistic experiments suggesting that frequency is one major determinant of the ease and speed of lexical access and retrieval”, and goes on to argue that, since speed of access and retrieval correlates with routinisation, “this indeed supports the idea that frequency and entrenchment co-vary” (Schmid 2010: 116). But this is not as straightforward as it seems.

Drawing on previous authors, Schmid (2010: 116) claims that “it is not frequency of use as such that determines entrenchment, but frequency of use with regard to a specific meaning or function, in comparison with alternative expressions of that meaning or function”. The former type of frequency is called absolute and the latter relative. Schmid observes that, even though the correlation between frequency and cognitive significance is far from unproblematic, cognitively-oriented corpus linguists “try to correlate the frequency of occurrence of linguistic phenomena (as observed in corpora) with their salience or entrenchment in the cognitive system” (Schmid 2010: 101). Indeed, it seems difficult to proceed otherwise. Schmid’s caveats are very much in place in methodological terms, but he provides no alternative to frequency as an operationalisation of salience and entrenchment, as no direct access to the cognitive system seems to be available at present. Halverson advocates a mixed-methods approach with different types of data (elicitation data and analysis of keystroke logs) in addition to corpus data, but, regardless of the type of data under scrutiny, both salience and entrenchment are operationalised as frequency. The same procedure will be followed here, even though most analyses (as in Halverson 2017) will be based on relative rather than absolute frequency.
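The distinction between absolute and relative frequency can be illustrated with a short Python sketch. It assumes, for the sake of illustration only, that caldre and haver de + infinitive are the sole alternatives for the meaning in question; a fuller analysis would need every competing expression. The function name relative_frequencies is hypothetical, and the counts are those reported earlier for the Catalan non-translated component.

```python
def relative_frequencies(counts):
    """Share of each expression among the alternatives for one meaning."""
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()}


# Absolute (raw) frequencies in the Catalan non-translated component.
# ASSUMPTION: only these two alternatives are taken into account.
counts = {"caldre": 735, "haver de + infinitive": 1924}

shares = relative_frequencies(counts)
for item, share in shares.items():
    print(f"{item}: {share:.2f}")
```

Unlike a raw count, each share depends on how often the competing expressions occur, which is what makes it relative in the sense quoted above.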