
(1)

WDC-MARE - World Data Center for Marine Environmental Sciences
http://www.wdc-mare.org/

PANGAEA - Publishing Network for Geoscientific and Environmental Data
http://www.pangaea.de/

Michael Diepenbroek (marum, Bremen University)
Hannes Grobe (AWI - Alfred Wegener Institute, Bremerhaven)
Uwe Schindler (marum, Bremen University)

PANGAEA® - platform for an ICSU World Data Center as a networked publication and library system for geoscientific data

(2)

DFG/DINI Berlin 2008, Michael Diepenbroek www.pangaea.de

Network of ICSU WDCs

• Nuclear Radiation: Tokyo, Japan
• WDC Co-ordination Offices: Washington DC, USA; Beijing, China
• Meteorology: Asheville NC, USA; Beijing, China; Obninsk, Russia
• Oceanography: Obninsk, Russia; Silver Spring MD, USA; Tianjin, China
• Paleoclimatology: Boulder CO, USA
• Marine Geology and Geophysics: Boulder CO, USA; Moscow, Russia
• Remotely Sensed Land Data: Sioux Falls SD, USA
• Renewable Resources and Environment: Beijing, China
• Recent Crustal Movements: Ondrejov, Czech Republic
• Airglow: Mitaka, Japan
• Astronomy: Beijing, China
• Atmospheric Trace Gases: Oak Ridge TN, USA
• Aurora: Tokyo, Japan
• Cosmic Rays: Toyokawa, Japan
• Geology: Beijing, China
• Human Interactions in the Environment: Palisades NY, USA
• Ionosphere: Tokyo, Japan
• Earth Tides: Brussels, Belgium
• Geomagnetism: Copenhagen, Denmark; Edinburgh, UK; Kyoto, Japan; Colaba, India
• Glaciology: Boulder CO, USA; Cambridge, UK; Lanzhou, China
• Marine Environmental Sciences: Bremen, Germany
• Rotation of the Earth: Obninsk, Russia; Washington DC, USA
• Satellite Information: Greenbelt MD, USA
• Rockets and Satellites: Obninsk, Russia
• Seismology: Denver CO, USA; Beijing, China
• Solar Radio Emission: Nagano, Japan
• Space Science: Beijing, China
• Space Science Satellites: Kanagawa, Japan
• Solar Activity: Meudon, France
• Soils: Wageningen, The Netherlands
• Sunspot Index: Brussels, Belgium
• Solar Terrestrial Physics: Boulder CO, USA; Didcot Oxon, UK; Moscow, Russia; Haymarket, Australia
• Solid Earth Geophysics: Beijing, China; Boulder CO, USA; Moscow, Russia

(3)

World Data Center for Marine Environmental Sciences

Biogeochemistry, Circulation, and Life of Present and Past Oceans

Operated by: Centre for Marine Environmental Sciences (MARUM) at Bremen University and the Alfred Wegener Institute for Polar and Marine Research (AWI)

Summary of Data Held: The WDC collects, scrutinizes, and disseminates data related to global change in the fields of environmental oceanography, marine geology, paleoceanography, and marine biology. It focuses on georeferenced data using the information system PANGAEA. The WDC stores and handles numeric, string, and image data. Users can retrieve data through the Internet via different gateways. Input is accepted in electronic form; specifics can be discussed with the WDC staff.

User Services: The WDC for Marine Environmental Sciences offers data management services, in particular project data management and data publication. It maintains an inventory of site and sampling locations for all related fields. It provides hosting and mirroring of electronic journals and serves software products for analysis, visualization, and transformation of data.

Visitors are welcome.

(4)

Why do we need data publishers and data libraries?

- Good scientific practice

- Prerequisite for the verification of research results

- Good availability of scientific data fosters complex and large scale approaches in research

- Reusing data is more effective than reproducing it

(5)

Supporting policies

• Good scientific practice in research and scholarship (ESF 2000)

• Open access for all kinds of research material (Berlin Declaration 2003)

• "Peer review"-like procedures for quality assurance of scientific data (OECD 2004 & 2007)

(6)

PANGAEA® - services & activities

• Project data management

• Data publication

• Data infrastructures (networking)

- Final report for data management of projects
- Accompanied by CD/DVD with data and local search engine
- Editorial environment for preparation of data and metadata
- Citable data sets, referenced with Digital Object Identifiers (DOI)
- Data portals, networking data centers
- Networking observatories (sensor networks)
- Fostering SDI standards (GEOSS, INSPIRE)
- Controlled vocabularies

Projects: IODP, HERMES, CARBOOCEAN, EUR-OCEANS, ESONET / EMSO

More than 60 European and international projects since 1995

(7)

PANGAEA® – resources

[Diagram: the Bremen network connects AWI, MARUM, and the Univ. Bremen Computer Center to the internet (www.pangaea.de, www.wdc-mare.org). Staff: project & data managers; technical & scientific organization.]

(8)

PANGAEA® - technical architecture

[Diagram components: RDB (Sybase ASE) on hard disk + tape (silo); middleware; editorial system; web server (www.pangaea.de, wiki.pangaea.de); PANGAEA search engine; Sybase IQ warehouse with IQ interface]

(9)

Archiving & publishing scientific data

or: how to make data available to science?

(10)

Effects of technical developments

[Image timeline: 1875 – Glomar Challenger – 2008]

(11)

Effects of technical developments

Lifetime of storage media (years):

Hard disk: 5
CD / DVD: 20
Tape: 30
Paper: > 100
Papyrus: > 1000
Rosetta Stone

(12)

Global increase in publications in empirical sciences

[Chart: publications and data, 1970 to 2010; the data curve is marked with a question mark]

(13)

What are the prerequisites for publishing scientific data?

• Citable data sets and persistent identifiers (DOI)

• Peer review for scientific data
- Completeness of data set description
- Validity of methods used
- Data values (precision, sequence, ranges etc.)
- Including specific QA/QC procedures

• Long-term archiving facilities
- Clear commission as data libraries (e.g. ICSU World Data Center)
- Data management infrastructure, expertise, and manpower
- Long-term commitment and funding

• User-friendly and reliable systems for retrieval and distribution of data

www.pangaea.de
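The checks on data values named above (precision, sequence, ranges) can be sketched as a small automated pre-screen. This is only an illustration under stated assumptions, not PANGAEA's actual QA/QC procedure: the names `QCReport` and `check_series`, the depth axis, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class QCReport:
    """Indices of suspicious entries (hypothetical report structure)."""
    out_of_range: List[int]      # values outside the plausible range
    sequence_breaks: List[int]   # positions where depth does not increase
    excess_precision: List[int]  # values with more decimals than declared


def check_series(
    depths: Sequence[float],
    values: Sequence[float],
    valid_range: Tuple[float, float],
    max_decimals: int,
) -> QCReport:
    low, high = valid_range
    # Range check: e.g. a percentage must lie between 0 and 100.
    out_of_range = [i for i, v in enumerate(values) if not low <= v <= high]
    # Sequence check: the depth axis must be strictly increasing.
    sequence_breaks = [
        i for i in range(1, len(depths)) if depths[i] <= depths[i - 1]
    ]
    # Precision check: flag values that change when rounded to the
    # declared number of decimals.
    excess_precision = [
        i for i, v in enumerate(values) if round(v, max_decimals) != v
    ]
    return QCReport(out_of_range, sequence_breaks, excess_precision)


# Made-up CaCO3 (%) profile with one error of each kind.
report = check_series(
    depths=[0.0, 0.1, 0.2, 0.2, 0.4],
    values=[12.5, 101.0, 48.25, 30.0, 7.5],
    valid_range=(0.0, 100.0),
    max_decimals=1,
)
print(report)
```

A screen like this only flags candidates; judging the validity of methods and the completeness of the description remains an editorial task.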

(14)

Data management as an editorial and publishing process

[Workflow diagram.
Data sources: projects, institutes, PIs; existing data, mass data, scientific primary data.
Roles: project management, work package leader, data curator, data librarian.
Steps: data provision & scientific quality control; upload, update, editing; technical harmonization & quality control; monitoring; editorial & review; publication with DOI; distribution & access.
Center: data management in a digital archive, library & publisher.
Outlets: libraries, journals, portals, search engines, scientific community.]

(15)

Digital Object Identifiers (DOI) - a way to get data published & citable

[Diagram: International DOI Foundation (http://www.doi.org/overview/) -> DOI registry for scientific data (http://www.std-doi.de) -> agents (publishers, long-term data archives) for oceans, atmosphere, and models, each with quality management -> data sets. Registered data sets appear in library catalogues, the Science Citation Index, and Google Scholar.]

DOI = No. of registry + acronym of archive + ID, e.g. "doi:10.1594/PANGAEA/80967"
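The DOI composition given on this slide (registry number + archive acronym + ID) can be illustrated with a short parser. This is a minimal sketch: `DataDoi` and `parse_data_doi` are hypothetical names, and the pattern assumes the acronym and ID are separated by either "/" (as in the example on this slide) or "." (as in the DOIs listed on the data types slide).

```python
import re
from typing import NamedTuple


class DataDoi(NamedTuple):
    registry: str    # number of the registry (DOI prefix), e.g. "10.1594"
    archive: str     # acronym of the archive, e.g. "PANGAEA"
    dataset_id: str  # numeric ID of the data set


# Optional "doi:" label, registry prefix, archive acronym, then the ID.
_DOI_PATTERN = re.compile(r"^(?:doi:)?(10\.\d+)/([A-Za-z]+)[./](\d+)$")


def parse_data_doi(doi: str) -> DataDoi:
    match = _DOI_PATTERN.match(doi.strip())
    if match is None:
        raise ValueError(f"not a data set DOI of the expected form: {doi!r}")
    registry, archive, dataset_id = match.groups()
    # DOI names are case-insensitive, so the acronym is normalized.
    return DataDoi(registry, archive.upper(), dataset_id)


print(parse_data_doi("doi:10.1594/PANGAEA/80967"))
print(parse_data_doi("doi:10.1594/pangaea.103958"))
```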

(16)

Data management costs in PANGAEA

(estimated costs in Euro per data set in the geo- and biosciences)

- Data archiving & publication: 150
  (acquisition, documentation, processing, archiving & publication)

- Post-publication curation work: 15
  (corrections, improvements, restructuring)

- Technical infrastructure and staff of the information system: 40
  (computers, storage media, networks, administration)

- Development of the information system: 20
  (incl. ongoing extensions, improvements)

- Total for ~5 data sets per publication: 1,100

For comparison:

- Preparation of a publication: 12,000

- Data production: 120,000
  (incl. costs for expeditions & laboratories)

Conclusions:

- The costs for acquiring new data sets are more than 2/3 of the total data management costs

- Data management costs are only 1 to 1.5% of the total costs
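The arithmetic behind these estimates can be retraced in a few lines. The figures are taken from the slide; the variable names are illustrative, and treating "total costs" as data management plus publication preparation plus data production is an assumption, since the source does not say how the quoted percentage was derived.

```python
# Estimated costs in Euro per data set, as listed on the slide.
per_data_set = {
    "archiving_and_publication": 150,
    "post_publication_curation": 15,
    "technical_infrastructure": 40,
    "system_development": 20,
}
DATA_SETS_PER_PUBLICATION = 5
PUBLICATION_PREPARATION = 12_000
DATA_PRODUCTION = 120_000

cost_per_data_set = sum(per_data_set.values())
cost_per_publication = cost_per_data_set * DATA_SETS_PER_PUBLICATION

# Share of data management spent on acquiring and publishing new data sets.
acquisition_share = per_data_set["archiving_and_publication"] / cost_per_data_set

# Assumed definition of total costs (not stated in the source).
total_costs = cost_per_publication + PUBLICATION_PREPARATION + DATA_PRODUCTION
management_share = cost_per_publication / total_costs

print(f"per data set:      {cost_per_data_set} EUR")
print(f"per publication:   {cost_per_publication} EUR (slide rounds to 1,100)")
print(f"acquisition share: {acquisition_share:.1%}")
print(f"management share:  {management_share:.2%}")
```

Under this assumed cost base the management share comes out slightly below the 1 to 1.5% quoted on the slide, so the slide's range should be read as a rounded estimate.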

(17)

Content

www.pangaea.de

(18)

Data types in PANGAEA

[Figure: downcore profiles of IRD (grav/10 cm3), Sand (%), CaCO3 (%), TOC (%), Radio (%/sand), and Smect (%/clay) plotted against Age (kyr; max. 233.55 kyr) for cores PS1389-3, PS1390-3, PS1431-1, PS1640-1, and PS1648-1]

[Map: grain size classes (KOLP A, KOLP B, KOLP DIN, KOEHN, KOEHN2) and geochemistry with world vector shore line, 54°0' to 55°30' latitude and 11° to 15° longitude; scale 1:2695194 at latitude 0°. Source: Baltic Sea Research Institute, Warnemünde.]

• Profiles -> doi:10.1594/pangaea.103958

• Time series -> doi:10.1594/pangaea.323487

• Sea bed photos -> doi:10.1594/PANGAEA.319877

• Distributed samples -> doi:10.1594/pangaea.51749

• Complex data -> doi:10.1594/PANGAEA.108079

• Air photos -> doi:10.1594/PANGAEA.323540

• Audio record -> doi:10.1594/PANGAEA.339110
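Each DOI above can be turned into a resolvable link through the public DOI resolver at https://doi.org/. The sketch below only builds the URLs and makes no network request; `doi_to_url` and `EXAMPLE_DOIS` are hypothetical names introduced for illustration.

```python
from urllib.parse import quote

# Data type -> DOI, copied from the list on this slide.
EXAMPLE_DOIS = {
    "Profiles": "doi:10.1594/pangaea.103958",
    "Time series": "doi:10.1594/pangaea.323487",
    "Sea bed photos": "doi:10.1594/PANGAEA.319877",
    "Distributed samples": "doi:10.1594/pangaea.51749",
    "Complex data": "doi:10.1594/PANGAEA.108079",
    "Air photos": "doi:10.1594/PANGAEA.323540",
    "Audio record": "doi:10.1594/PANGAEA.339110",
}


def doi_to_url(doi: str) -> str:
    """Strip an optional 'doi:' label and prepend the resolver address."""
    name = doi.strip()
    if name.lower().startswith("doi:"):
        name = name[4:]
    # Keep the slash between prefix and suffix; escape anything unusual.
    return "https://doi.org/" + quote(name, safe="/")


for data_type, doi in EXAMPLE_DOIS.items():
    print(f"{data_type}: {doi_to_url(doi)}")
```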

(19)

Statistics (2/2008)

Total number of data sets: ~573,000

Data items: ~4.1 billion

[Chart legend: unclassified, Atmosphere, Corals, Ice, Sediment, Water]

(20)

Networking

One-stop shopping for reliable and usable data

(21)

Data Driven Science

[Figure labels: wireless, cabled; time axis 2000 to 2010]

(22)

GEOSS Global Earth Observation System of Systems

The missing link !?

(23)

PANGAEA® – standard interfaces for metadata

[Diagram.
Source: PANGAEA RDB (data management & long-term archiving), feeding a marshaller, XSLT, and an index.
Metadata formats: ISO19xxx, STD-DOI, Dublin Core, DIF, DIF/FGDC, Darwin Core, ISO690, gml, kml.
Protocols: WS (SOAP/WSDL), WFS (OGC), OGC catalogue service, OAI-PMH, DiGIR.
Frontends / portals: PANGAEA (+GE + UNM), WDC portal, GeoPortal.Bund®, EUR-OCEANS, CARBOOCEAN, IODP, D-GRID.
Catalogues and harvesters: TIB National Library (DOI registration via WS (SOAP/WSDL), DOI registry), compiled catalogues, Google, Scientific Commons, HGF (Fedora), GCMD, OBIS, GBIF.]
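As an illustration of one interface named on this slide, the sketch below builds an OAI-PMH `ListRecords` request for Dublin Core metadata and extracts titles and identifiers from a response. It is a sketch under assumptions: the endpoint `https://example.org/oai` is a placeholder rather than PANGAEA's actual address, the sample response and its identifier are made up, and the helper names are hypothetical. Only the verb, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None) -> str:
    """Build the harvesting request; base_url is supplied by the caller."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_dc(xml_text: str) -> List[Dict[str, str]]:
    """Pull title and identifier out of each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        dc = record.find(".//oai_dc:dc", NS)
        if dc is None:  # deleted records carry no metadata block
            continue
        records.append({
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            "identifier": dc.findtext("dc:identifier", default="", namespaces=NS),
        })
    return records


# Made-up response for illustration; not real harvested metadata.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical sediment profile</dc:title>
          <dc:identifier>doi:10.9999/EXAMPLE.1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://example.org/oai"))
print(extract_dc(SAMPLE_RESPONSE))
```

A harvester run by a portal or library catalogue would fetch the URL, follow resumption tokens for large result sets, and index the extracted fields; those steps are omitted here.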

(24)

PANGAEA® – dissemination of data and metadata via portal networks
