
CLARIN

In document: Final Report (pages 50-54)

PART 1 – The Empirical Picture

4.3 CLARIN

Case Overview

What does the project do mainly? CLARIN is a large-scale pan-European collaborative effort to create, coordinate and make language resources and technology available and readily usable.7

Motivations for setting it up: The project builds upon a strong existing network of researchers, digital tools and technologies, and data archives. The project was submitted under the ESFRI scheme in accordance with its criteria. The aim of CLARIN is to enable the latest developments and initiatives to support and develop language resources, and to broaden both the remit and the impact of current and proposed initiatives among network partners.

Main goals of the project: CLARIN will offer scholars the tools to allow computer-aided language processing, addressing one or more of the multiple roles language plays (i.e. carrier of cultural content and knowledge, instrument of communication, component of identity and object of study) in the Humanities and Social Sciences.

Project maturity: CLARIN is a nascent project, currently in the preparatory phase. This will be followed by a construction period of 5 years.

Project funding: Secured from the EC in the first instance (€4.1m); subsequent phases depend on national investment from partner countries.

Organizational Structure

Size and composition: CLARIN is a true pan-European project, with partners in almost every European country. Partners include language resource archives, academic departments within universities, and academies of science. Total number of members: 156. Number of countries involved: 32.

Governance: The project is managed by a multi-tiered structure, comprising a Scientific Board consisting of high-level scientists, a Strategic Coordination Board consisting of representatives appointed by the funding agencies, an Executive Board consisting of 8 experts covering the required expertise and each leading a Work Package, an International Advisory Board to give advice to the Executive, Scientific and Strategic Coordination Boards on issues of common interest, and National groups to define an appropriate national coordination structure.

Managing internal and external relations

Management of the project: Management of the project is distributed between a Scientific Board, a Strategic Coordination Board, an Executive Board (composed of senior work package leaders and liaison staff), an International Advisory Board and CLARIN members. Work flows between these management structures are well developed and clear.

Users: Ensuring the continued development of the user community is a high priority for the CLARIN team. Because the large majority of the current CLARIN community consists of providers rather than users, CLARIN will have to work hard to establish firm and structural liaisons with potential users so that the resource will actually be used.

User recruitment: See above. The problem of user recruitment is one that the CLARIN team are aware of, but until the project is more mature, little action can be taken. CLARIN staff are actively seeking collaborative opportunities in order to build upon existing expertise (in recruiting users) and active communities.

7 Number of informants: 2, totaling 125 minutes.

Drivers and barriers to adoption: The main driver to adoption will be CLARIN’s provision of persistent services that are secure and provide easy access to language processing resources.

At present, in order to perform even simple language processing tasks, one needs to find an appropriate program (to do translation, summarization, or extraction of information, etc.), download the program, make sure it is compatible with the computer that will execute it, understand the form of input it takes, download the data (e.g. novels, newspapers, corpora, videos), and convert the data to the correct format for the program, all before one can get started. For most researchers outside computer science, at least one of these tasks will currently be an insurmountable barrier. CLARIN will provide resources for processing language, the data to be processed, as well as appropriate guidance, advice and training, and will be accessible over a distributed network from the user's desktop.

One potential barrier to adoption is that European countries have to opt in to the organisational structure and provide funding for the construction and operational phases of CLARIN. For researchers in countries that do not join the consortium, there may not be full access to all services.

Challenges in interdisciplinary collaboration: One of the major challenges is to co-ordinate efforts spread across such a wide and varied group of partners. The preparatory phase is intended to address the legal, organizational, financial and technical challenges to building infrastructures in this field.

Collaboration with other organizations: CLARIN has partners responsible for collaborating not only with other ESFRI preparatory phase projects such as DARIAH, but also with existing and recent language projects, and with other European infrastructure projects such as Europeana. CLARIN has dedicated work packages looking at collaboration with other organizations, including researching similar efforts in North America and the potential for linking up with such efforts. CLARIN has recently initiated CHAIN, the Coalition of Humanities and Arts Initiatives and Networks, to work together with centerNet, DARIAH, Project Bamboo and the Alliance of Digital Humanities Organizations (ADHO) to ensure interoperability of the shared services that these initiatives are developing.

Technology

Main technologies, resources and services: An overview of the available resources, technologies and services follows.

Processing: Incorporates advanced multi-lingual language processing technology that supports cultural and linguistic integration. Incorporates, and contributes to, Semantic Web technology to overcome the structural and semantic encoding problems.

Network: Includes Data Grid technology to connect the repositories.

Data/storage: Builds on ideas launched by the Digital Library community to create Live Archives, and will further such initiatives.

Role of technology development: Ensuring interoperability of existing language processing technologies and data sets is an important part of CLARIN’s work.

Data sharing: Existing digital resources held in repositories across the network will be made available through a new framework that will allow common resource discovery procedures and common metadata formats and procedures. The framework will also provide existing tools as web services, in a Grid environment, where currently disparate resources can be used together. No privacy/security issues are foreseen. There are, however, legal and ethical considerations regarding language resources that contain copyrighted material and potentially sensitive material relating to people and communities.

Interoperability with similar or connecting infrastructures: CLARIN will rely on existing and emerging technologies to guarantee interoperability of language resources. The project team is confident that these technologies will meet its needs. The main challenge here is to make sure at this early stage that the project avoids conflicting standards and competing services.

Contribution

Main contributions of project: The CLARIN project seeks to build upon and reinforce a network of researchers, language tools and technologies and digital archives to widen both the usage and impact of these resources within and outside the field. It will marry currently disparate tools and datasets, creating interoperability that will significantly enhance the types of research undertaken and widen access to language resources generally. In creating networks with other e-Humanities infrastructures such as the DARIAH project, CLARIN is committed to maximizing the reach of its resources.

Challenges: The main challenge is the timely development of the resource, currently on target, and the continued national and international support for this development. A further challenge pointed to by project personnel is the development of a robust user community. As the project is at an early stage of development, this second factor is likely to be addressed more directly as the resource develops.

Informants’ recommendations to policy makers

CLARIN has recently issued a statement regarding copyright which proposes a research exception to European copyright law to allow researchers to make use of digital materials covered by copyright for educational and academic research purposes within a secure research infrastructure.

SWOT analysis

Table 4-5: CLARIN strengths and weaknesses

Strengths Weakness

Long-term funding
Strengths: Project has already secured some national funding for the next phase of the project, so potential further investment is likely.
Weaknesses: Long-term funding has not been secured.

Sustainability
Strengths: Project is currently in a preparatory phase, with clear objectives for next phase (building) and beyond.

User recruitment
Strengths: Project is well integrated in target user communities and has a well researched user engagement plan.
Weaknesses: Immaturity of project means that no measures have yet been tested.

Involvement of current users
Strengths: N/A
Weaknesses: N/A

Organizational bedding

The project is well established within multiple institutions and a number of overlapping academic communities.

This project is very much a ‘bottom-up’ effort, signifying strong commitment from the institutions involved.

Institutionalized links
Strengths: Yes. CLARIN have worked hard to research and integrate themselves within similar projects and infrastructures. Co-operation is further secured through dedicated liaison personnel.

External use of software, tools
Strengths: N/A
Weaknesses: N/A

Table 4-6: CLARIN opportunities and threats

Opportunities Threats

Funding of member organizations

Multiple participating organizations so difficult to say, although this could be seen as an advantage – being anchored to so many organizations reduces the threat to the project by unstable funding in one or more partner institutions.

Technology monitoring

Yes. The project has considerable accumulated experience in technology development, and knowledge of potential and actual technologies currently outside the project which may be useful.

Competition with other infrastructures or technologies

Competition is not strong – collaboration is extremely strong. CLARIN seeks to integrate itself within and consolidate existing efforts rather than compete with them.

Security risks N/A N/A

Change of user communities and fields

Not known. Not known.
