
FEDORA @ AWI Fedora User Meeting Copenhagen, Denmark 28 September, 2005



Ana Macario, Computer Center

Overview

• AWI and its research scope
• SOA at AWI
• Rationale for choosing FEDORA
• Long-term issues

About AWI

1980

Establishment of the institute in Bremerhaven as a foundation under public law; AWI is one of 15 centers belonging to the Helmholtz Association

To date

- Budget: 103 million euros
- 800 employees

Funding

- 90% Federal Ministry of Education and Research (BMBF)
- 8% Bremen state
- 1% Brandenburg and Schleswig-Holstein states

Our mission

• Wadden Sea Station Sylt
• Biologische Anstalt Helgoland
• Alfred-Wegener-Institut für Polar- und Meeresforschung, Bremerhaven
• Research Unit Potsdam

To contribute to polar and marine research in order to advance insights into the changeability of the global environment and the earth system

Research platforms

Primary data:

• observations acquired on diverse research platforms, long time-series monitoring (observatories)
• numerical models
• laboratory experiments
• photographs, maps/charts

Publications

Events

Intellectual property rights - Technology transfer

Simplified Overview (2004)

[Architecture diagram. Recoverable labels:]

• Data stores, each with backups: relational databases; directory; file and storage systems
• Relational databases: PANGAEA/WDC-MARE, meteorology, oceanography, diatom collections, GIS, Polarstern expeditions
• Directory: people, organizational units
• Further content labels: publications, events, technology transfer, expeditions
• File and storage systems: publications full-text, model runs, large datasets
• Metadata standards shown: ISO 19115, Dublin Core, Internet2 eduPerson/eduOrg
• Middleware services (examples: directory services, MapServer)
• Web-based interfaces (examples: searching primary datasets, publications, expeditions, etc.)
• AuthN & AuthZ

“Staging”

• Versioning and traceability relevant to scientists (data calibration, validation, processing, etc.)
• Distributed data storage
• “Role”-tailored access policy to assure data rights
• Spatial, temporal and thematic search/visualization (GIS mapping services)

“Publication”

• Long-term archival of quality-controlled digital objects in the IR
• IR exposed via OAI-PMH and SOAP
• Export functionality to international agencies (GCMD, NGDC, NOAA, GBIF, etc.)

PI turns in post-print

PI removes data access restrictions

In practice… Fedora as “active workspace”

Why did AWI choose to test FEDORA?

• Flexible, extensible digital object model
• Open source; good documentation and tutorials
• Allows metadata descriptions other than the Dublin Core record; relevant for geo-referenced objects (ISO 19115), biodiversity objects (Darwin Core), objects of type people (Internet2/eduPerson), organizational units (Internet2/eduOrg), etc.
• Able to distribute load and object storage among several IR instances (“Virtual Repository” concept)
• Standards compliant: XML storage, OAI-PMH and web services
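
For orientation, the Dublin Core record that the richer schemas above go beyond is a flat list of repeatable elements. The sketch below reads such a record with the Python standard library. The sample record is hypothetical (it is not taken from the AWI repository); only the two namespace URIs are the standard oai_dc and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical oai_dc record, for illustration only.
SAMPLE_RECORD = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Hypothetical sediment core dataset</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:type>Dataset</dc:type>
</oai_dc:dc>"""


def dc_fields(xml_text):
    """Return a dict mapping each Dublin Core element name to its list of values."""
    root = ET.fromstring(xml_text)
    fields = {}
    for element in root:
        # Keep only children in the Dublin Core element namespace.
        if element.tag.startswith("{" + DC_NS + "}"):
            name = element.tag.split("}", 1)[1]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


if __name__ == "__main__":
    for name, values in dc_fields(SAMPLE_RECORD).items():
        print(name, values)
```

Geo-referenced or biodiversity objects need structure (bounding boxes, taxonomic ranks) that this flat element list cannot express, which is why support for additional metadata datastreams matters.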

Why did AWI choose to test FEDORA? (cont.)

• Promising scalability; Fedora@AWI currently archives 15,000 objects
• Object preservation through content versioning; includes an audit trail record for preserving event history
• XML ingest/export assures interoperability with existing in-house information systems
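
The XML ingest path can be illustrated with a minimal FOXML-style document generated from an in-house record. This is a sketch under assumptions: the element and attribute names follow FOXML 1.0 as commonly documented and should be checked against the schema shipped with the Fedora release in use; the function name, PID and field values are hypothetical, and the output has not been validated by a Fedora server.

```python
import xml.etree.ElementTree as ET

FOXML_NS = "info:fedora/fedora-system:def/foxml#"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("foxml", FOXML_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def build_foxml(pid, label, dc_fields):
    """Build a minimal FOXML-style object with one inline Dublin Core datastream.

    dc_fields is a list of (element_name, value) pairs, e.g. [("title", "...")].
    """
    obj = ET.Element(f"{{{FOXML_NS}}}digitalObject", {"PID": pid})

    # Object-level properties (only the label is set in this sketch).
    props = ET.SubElement(obj, f"{{{FOXML_NS}}}objectProperties")
    ET.SubElement(props, f"{{{FOXML_NS}}}property", {
        "NAME": "info:fedora/fedora-system:def/model#label",
        "VALUE": label,
    })

    # Inline XML datastream (control group "X") holding the DC record.
    datastream = ET.SubElement(obj, f"{{{FOXML_NS}}}datastream", {
        "ID": "DC", "STATE": "A", "CONTROL_GROUP": "X",
    })
    version = ET.SubElement(datastream, f"{{{FOXML_NS}}}datastreamVersion", {
        "ID": "DC.0", "MIMETYPE": "text/xml", "LABEL": "Dublin Core",
    })
    content = ET.SubElement(version, f"{{{FOXML_NS}}}xmlContent")
    dc = ET.SubElement(content, f"{{{OAI_DC_NS}}}dc")
    for name, value in dc_fields:
        ET.SubElement(dc, f"{{{DC_NS}}}{name}").text = value

    return ET.tostring(obj, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical PID and metadata values.
    print(build_foxml("demo:1", "Example dataset", [("title", "Example dataset")]))
```

Further datastreams (for example an ISO 19115 record) would be added as additional datastream elements in the same way.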

Simplified Overview (2005)

[Architecture diagram. Recoverable labels:]

• Frontend / Backend split
• Backend sources: directory and file systems (publications, events, technology transfer, people, organizational units; 15,000 objects); Sybase BLOBs (PANGAEA/WDC-MARE); Sybase relational (PANGAEA/WDC-MARE; 245,000 objects); backups
• Ingest paths: FOXML ingest; WDC-specific XML
• Fedora Repository System interfaces: Manage (SOAP), Access (SOAP), Search (SOAP), OAI Provider (HTTP)
• A second Search (SOAP) and OAI Provider (HTTP) pair also appears in the diagram
• OAI Harvester (PKP)

SOAP client

[Three slides of SOAP client screenshots; images not preserved.]

A few technical remarks on Fedora 2.0...

• Web services APIs are great; suggested improvements:
  - findObjects: browsing the result list backwards is not yet possible; totalNumberOfResults is missing
  - addDatastream: could file uploads be done with SOAP attachments?
• Timestamp resolution in milliseconds has raised problems in the “conformance tests” at www.openarchives.org
• “DeletedRecords” set to “Transient” in order to allow for incremental harvesting by “modified date”
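
The timestamp remark can be made concrete. OAI-PMH datestamps are expressed at day or seconds granularity (YYYY-MM-DDThh:mm:ssZ), so a repository timestamp carrying milliseconds has to be truncated before it is exposed or used in a `from` argument. The following is a minimal sketch of that truncation together with an incremental ListRecords request. The function names are hypothetical; the base URL is the one given on the contact slide and may no longer resolve.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Base URL taken from the contact slide; treat it as historical.
BASE_URL = "http://web.awi-bremerhaven.de/fedora/oai"


def to_oai_datestamp(timestamp):
    """Format an aware datetime at OAI-PMH seconds granularity (UTC, no fractions)."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    utc = timestamp.astimezone(timezone.utc).replace(microsecond=0)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")


def incremental_harvest_url(base_url, last_harvest, metadata_prefix="oai_dc"):
    """Build a ListRecords request for records modified since last_harvest."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": to_oai_datestamp(last_harvest),
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # A modification time with millisecond resolution, as stored by the repository.
    modified = datetime(2005, 9, 28, 12, 30, 45, 123000, tzinfo=timezone.utc)
    print(to_oai_datestamp(modified))
    print(incremental_harvest_url(BASE_URL, modified))
```

With deleted records declared as transient, a harvester issuing such `from`-based requests can also see deletions that occurred since its last run, as long as the repository still retains them.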

Next steps ...

• Set up new services: naming, full-text indexing & search, large-scale content ingestion (bulk load) together with metadata
• Metadata transformation services as “disseminators”; relevant for data supply to external service providers (e.g., NGDC, GCMD, NOAA, GBIF)
• Set up collections (and respective granularity policies); relevant for object-to-object relationship metadata

DC-hardwired relation

[Diagram of the OAI-PMH data model. Recoverable labels:]

• Resource
• Item, with OAI-PMH identifier (“DOI”)
• OAI-PMH records in three formats: Dublin Core, Pangaea-specific, ISO 19115
• Metadata scope labels: descriptive + administrative metadata (twice), descriptive metadata (once)

DC metadata:

• <dc.source>: locator for content
• <dc.relation>: locator for publication(s)

Dataset-to-publication relationship metadata should be expressed in RDF/XML and placed in the “Relations datastream”.
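
A dataset-to-publication relationship expressed in RDF/XML might look like the output of the sketch below. It is an illustration under assumptions: the predicate `dcterms:isReferencedBy` is one plausible choice from DCMI Terms, not one named in the slides; the PIDs and the function name are hypothetical; whether a given Fedora release accepts this predicate in its relations datastream would need to be checked.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def dataset_to_publication_rdf(dataset_pid, publication_pids):
    """Serialize 'dataset isReferencedBy publication' statements as RDF/XML."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    # The subject is the dataset object, addressed by its info:fedora URI.
    description = ET.SubElement(
        rdf,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": f"info:fedora/{dataset_pid}"},
    )
    # One statement per related publication object.
    for pid in publication_pids:
        ET.SubElement(
            description,
            f"{{{DCTERMS_NS}}}isReferencedBy",
            {f"{{{RDF_NS}}}resource": f"info:fedora/{pid}"},
        )
    return ET.tostring(rdf, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical PIDs.
    print(dataset_to_publication_rdf("demo:dataset1", ["demo:pub1", "demo:pub2"]))
```

Compared with a locator string in <dc.relation>, explicit statements of this kind can be loaded into a triple store and queried in both directions.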

[Architecture diagram. Recoverable labels:]

• Frontend / Backend split
• Backend sources: directory and file systems (people, organizational units, publications, events, technology transfer; 15,000 records); Sybase BLOBs (PANGAEA/WDC-MARE); Sybase relational (PANGAEA/WDC-MARE); backups
• FOXML ingest (shown twice, once labelled “2006:”)
• Fedora Repository System interfaces: Manage (HTTP/SOAP), Access (HTTP/SOAP), Search (HTTP/SOAP), OAI Provider (HTTP)
• A second Search (HTTP/SOAP) and OAI pair also appears in the diagram
• OAI Harvester (PKP)
• Testing triple store query performance

We need the XACML-based module in order to add “live” data!

Long-term issues for AWI

• Benchmarking for large numbers of files; we fear a scalability breakpoint related to the size of the filesystem-based LLStorage area
• Out-of-the-box web-based client, relevant for “acceptance” by other Helmholtz centers
• Fine-grained access control policies and Shibboleth-based AuthN; relevant in the DataGRID context
• Support for sets

Long-term issues for AWI (cont.)

• Federation model
• Collaboration and support infrastructure
  - disseminators for specific visualization services (e.g. NetCDF data and Live Access Server, GIS data and OpenMapServer); relevant for DataGRID
  - ECLIPSE project to facilitate plug-in development?
  - Google strategy
  - Seminars, tutorials for “advanced” FEDORA users

Thanks for your attention!

Photo: L. Tadday

Ana Macario, Computer Center
Alfred Wegener Institute for Polar and Marine Research, Bremerhaven, Germany

fedora-admin@awi-bremerhaven.de
http://www.awi-bremerhaven.de
http://web.awi-bremerhaven.de/fedora/oai
