Testing based on the CEFR – A psychometric approach

Katharina Hagenfeld

2.  Testing based on the CEFR – A psychometric approach

The development of the CEFR dates back to the 1970s, when a paradigm shift in language teaching and education took place (cf. North 2007). As opposed to traditional, teacher-centered teaching methods such as the Grammar-translation method,² more learner-centered and communicative approaches such as Communicative Language Teaching (CLT) and Task-based Language Teaching (TBLT) arose. The CEFR claims to reflect these changes by describing an action-oriented approach³ to language use (under which language acquisition is subsumed, cf. CoE 2001: 21) that hypothesizes the development of L2 proficiency to be based on the use of communication and communicative strategies and activities (CoE 2001: 9). The emphasis on communication and communicative acts is reflected in the framework, which seeks to provide “a common basis for the elaboration of language syllabuses, curriculum guidelines, examinations, textbooks, etc. across Europe” (CoE 2001: 1). The CEFR itself is thus not meant to test language proficiency as such but to provide a basis on which tests can be designed and administered across European member states. The approach to language assessment suggested in the framework is proficiency testing with rating scales that originate in psychometric studies.

2.  For a discussion on the history of language teaching and traditional teaching methods, see Keßler & Plesser (2011).

3.  The action-oriented approach refers to the notion in the CEFR that language learners and users are social agents and that every communicative act is socially founded (cf. CoE 2011: 21).

Psychometric approaches to language testing and linguistic profiling 137

Psychometric language testing evolved out of the scientific field of psychology in order to provide objective measures for subjective items such as personality traits, attitudes and academic achievements (cf. Michell 1999; Kaplan & Saccuzzo 2010). The assumed objectivity is achieved through the use of questionnaires and scales (cf. Stevens 1946) that describe an item, such as a personality trait, and can thus be matched to the perceived reality of the person to be tested. In the case of the CEFR, the matter to be tested is language proficiency. In using an action-oriented approach, the European framework defines language proficiency as based on a number of competences which “[…] are the sum of knowledge, skills and characteristics that allow a person to perform actions” (CoE 2006: 9). There are general competences which are “[…] not specific to language, but which are called upon for actions of all kinds, including language activities” (CoE 2006: 9). The competences are subdivided into several language skills. These communicative skills are described in the global scale (Figure 1), which is “arranged in three bands – A1 and A2 (basic user), B1 and B2 (independent user), C1 and C2 (proficient user)” (Little 2008: 4). Each level provides descriptors of the skills that need to be attained in order to reach it.
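The band structure of the global scale can be made concrete with a small sketch. It only encodes what the quotation from Little (2008: 4) states, namely six levels grouped into three bands; the names `Band`, `band_of` and `is_at_least` are illustrative assumptions and not part of the CEFR or of any official tool.

```python
from enum import Enum


class Band(Enum):
    """The three bands of the global scale (labels as quoted from Little 2008: 4)."""
    BASIC = "basic user"
    INDEPENDENT = "independent user"
    PROFICIENT = "proficient user"


# The six levels, ordered from lowest to highest.
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Each band covers two adjacent levels.
BAND_OF_LEVEL = {
    "A1": Band.BASIC,
    "A2": Band.BASIC,
    "B1": Band.INDEPENDENT,
    "B2": Band.INDEPENDENT,
    "C1": Band.PROFICIENT,
    "C2": Band.PROFICIENT,
}


def band_of(level: str) -> Band:
    """Return the band a level belongs to (hypothetical helper)."""
    return BAND_OF_LEVEL[level.upper()]


def is_at_least(level: str, target: str) -> bool:
    """True if `level` is at or above `target` on the ordered scale (hypothetical helper)."""
    return CEFR_LEVELS.index(level.upper()) >= CEFR_LEVELS.index(target.upper())


if __name__ == "__main__":
    print(band_of("B1").value)
    print(is_at_least("C1", "B2"))
```

Note that the sketch treats the levels as a purely ordinal scale: it can say that one level is higher than another, but not by how much.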

The global scale is supposed to provide points of orientation for teachers and curriculum planners (cf. CoE 2001). Additionally, the Council of Europe provides scales for communicative tasks at different levels, such as oral/written production and comprehension, as well as self-assessment scales. Apart from the broad benefits the CEFR has brought about, such as encouraging cooperation between educational institutions all over Europe, formulating a common ground of criteria for qualifications in the area of language and providing access to cultural manifestations (CoE 2001: 17), it faces extensive criticism when it comes to being the basis for language testing. The following section illustrates major points of critique but makes no claim to completeness; it rather provides a brief overview of the points relevant for this study.

2.1  Critique as regards psychometric testing and the CEFR

Psychometric testing itself has been subject to extensive critique for a number of reasons, of which the following four points will be discussed further: (1) rating scales operate within human limitations; (2) they do not measure directly but through introspection, i.e. post factum, which implies a threat to objectivity. In relation to language testing, the following points are criticized: (3) the concept of language that is to be measured is not defined clearly enough to be readily operationalized for testing purposes. As for the CEFR, much work has been put into the careful formulation of descriptor items based on sociological and philosophical ideas, but (4) it still lacks a comprehensive theory of language and its acquisition.

Proficient User

C2  Can understand with ease virtually everything heard or read. Can summarise information from different spoken and written sources, reconstructing arguments and accounts in a coherent presentation. Can express him/herself spontaneously, very fluently and precisely, differentiating finer shades of meaning even in more complex situations.

C1  Can understand a wide range of demanding, longer texts, and recognise implicit meaning. Can express him/herself fluently and spontaneously without much obvious searching for expressions. Can use language flexibly and effectively for social, academic and professional purposes. Can produce clear, well-structured, detailed text on complex subjects, showing controlled use of organisational patterns, connectors and cohesive devices.

Independent User

B2  Can understand the main ideas of complex text on both concrete and abstract topics, including technical discussions in his/her field of specialisation. Can interact with a degree of fluency and spontaneity that makes regular interaction with native speakers quite possible without strain for either party. Can produce clear, detailed text on a wide range of subjects and explain a viewpoint on a topical issue giving the advantages and disadvantages of various options.

B1  Can understand the main points of clear standard input on familiar matters regularly encountered in work, school, leisure, etc. Can deal with most situations likely to arise whilst travelling in an area where the language is spoken. Can produce simple connected text on topics which are familiar or of personal interest. Can describe experiences and events, dreams, hopes & ambitions and briefly give reasons and explanations for opinions and plans.

Basic User

A2  Can understand sentences and frequently used expressions related to areas of most immediate relevance (e.g. very basic personal and family information, shopping, local geography, employment). Can communicate in simple and routine tasks requiring a simple and direct exchange of information on familiar and routine matters. Can describe in simple terms aspects of his/her background, immediate environment and matters in areas of immediate need.

A1  Can understand and use familiar everyday expressions and very basic phrases aimed at the satisfaction of needs of a concrete type. Can introduce him/herself and others and can ask and answer questions about personal details such as where he/she lives, people he/she knows and things he/she has. Can interact in a simple way provided the other person talks slowly and clearly and is prepared to help.

Figure 1. Global Scale, taken from the Manual for LTD (CoE 2012)

With regard to (1), trained raters use descriptive scales in order to, in the case of language testing, assign learners a language proficiency level. Much work was put into the development of assessment criteria grids that help analysts with their rating. However, rating scales work only as well as the person who uses them. Biases have been reported at the cultural level (Rohrmann 2007: 1), in the ambiguous interpretation of descriptor items, in how harshly or leniently raters score with regard to overall performance, traits (Schaefer 2008) and subject groups (Wigglesworth 1993), as well as with regard to accent familiarity (Winke, Gass, & Myford 2013), among others. Irrespective of how strongly or weakly these factors influence ratings, they cannot be fully eradicated: raters are sensitive to one factor or another, as no person is fully objective.⁴
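How harshly or leniently one rater scores relative to another can be illustrated with a minimal sketch. It is not a procedure from the studies cited above: the ratings are invented, the function names are hypothetical, and treating the six CEFR levels as equally spaced ranks is a simplifying assumption made only for illustration.

```python
from statistics import mean

# CEFR levels ordered from lowest to highest; ranks 0..5 are an
# illustrative simplification (the scale itself is ordinal).
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def _check_same_length(rater_a, rater_b):
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same performances")


def severity_gap(rater_a, rater_b):
    """Mean signed rank difference (A minus B) over the same performances.

    A positive value means rater A assigns higher levels on average,
    i.e. is the more lenient of the two.
    """
    _check_same_length(rater_a, rater_b)
    return mean(LEVELS.index(a) - LEVELS.index(b) for a, b in zip(rater_a, rater_b))


def exact_agreement(rater_a, rater_b):
    """Proportion of performances to which both raters assign the same level."""
    _check_same_length(rater_a, rater_b)
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


if __name__ == "__main__":
    # Invented ratings of five learner performances by two raters.
    rater_a = ["B1", "B2", "A2", "C1", "B1"]
    rater_b = ["A2", "B2", "A2", "B2", "B1"]
    print(severity_gap(rater_a, rater_b))
    print(exact_agreement(rater_a, rater_b))
```

Such figures describe only that two raters differ, not why; the sources of bias listed above would have to be investigated separately.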

(2) Indirect measures are generally perceived as being less concrete than direct measurements. They are usually based on (self-)reports or questionnaires about a behavior, a skill or the like. The crux here is that the item to be measured is assessed through retrospection or introspection. The question remains whether it can be determined that what the rater perceives reflects reality. Thus, the dependence of the test result on the raters’ opinions, judgments and beliefs forms a major drawback in terms of objectivity.

As with psychological variables, (3) language itself is not a concept that is easily defined and operationalized.

The matter to be tested is language. Language is built of sounds, intonation, stress, morphemes, words, and arrangements of words having meanings that are linguistic and cultural. […] They are integrated in the total skills of speaking, listening, reading and writing […] all of which do not advance evenly. (Lado 1961: 25)

The broad scope of the concept of language makes it hard to determine and test all of its properties.

Since there is, as yet, no universally accepted and operationalized definition of language proficiency (cf. Pienemann & Keßler 2007: 247), the development of respective tests relies on the definition of the test administrator. With regard to the CEFR (4), language is defined as action-oriented: learners are seen as subjects who operate in varying social contexts and who have to carry out varying social activities (CoE 2001: 21). In order to be able to act creatively with language, they thus have to acquire certain communicative competences and strategies (CoE 2001: 21). Harsch (2005: 26) criticizes the vague definition of the term language as well as the CEFR’s equation of the terms language use and language acquisition (ibid.: 65). How can we test something without knowing what it is that we want to test? Much effort has been put into the development of theories of language use and language acquisition that are hardly touched upon in the CEFR. This means that (4) a comprehensive theory or approach behind the CEFR cannot be found.

However, it is based on principles that go back to the philosopher Dell Hymes, who hypothesized that the development of an individual is promoted by the acquisition of different competencies while completing everyday activities (cf. Hymes 1974). The broad, action-oriented scope of the term language that the CEFR seeks to cover adds to the difficulty of finding an appropriate theoretical foundation. Thus the CEFR maintains quite a low profile in this regard. Since the framework has such far-reaching influence, it is claimed here that it is important to constantly reflect on it and enhance it by including current trends and findings in (second) language acquisition research. Having outlined major points of critique as regards the CEFR, the next section focuses on language testing based on a theory of language development, i.e. Processability Theory.

4.  However, Wigglesworth (1993) found that raters tend to respond to feedback and are willing to change their behavior in order to achieve more objective scores.

3.  Assessing interlanguage development with Rapid Profile and