
(1)

A Path to filled Archives

or

‘Hey dude, how bumpy is it going to be?’

Dirk Fleischer

dfleischer@ifm-geomar.de

eSciDoc-Days

October 2011

(2)

Complaining, complaining...

ILLUSTRATIONS BY J. H. VAN DIERENDONCK

In 2003, the University of Rochester in New York launched a digital archive designed to preserve and share dissertations, preprints, working papers, photographs, music scores: just about any kind of digital data the university’s investigators could produce. Six months of research and marketing had convinced the university that a publicly accessible online archive would be well received. At the time of the launch, the university librarians were worried that a flood of uploaded data might swamp the available storage space.

Six years later, the US$200,000 repository lies mostly empty.

Researchers had been very supportive of the archive idea, recalls Susan Gibbons, vice-provost and dean of the university’s River Campus Libraries, especially as the alternative was to keep on scattering their data and dissertations across an ever-proliferating array of unintegrated computers and websites. “So we spent all this money, we spent all this time, we got the software up and running, and then we said, ‘OK, here it is. We’re ready. Give us your stuff’,” she says. “And that’s where we hit the wall.” When the time came, scientists couldn’t find their data, or didn’t understand how to use the archive, or lamented that they just didn’t have any more hours left in the day to spend on this business.

As Gibbons and anthropologist Nancy Fried Foster observed in their 2005 postmortem¹, “The phrase ‘if you build it, they will come’ does not yet apply to IRs [institutional repositories].”

A similar reality check has greeted other data-sharing efforts. Most researchers happily embrace the idea of sharing. It opens up observations to independent scrutiny, fosters new collaborations and encourages further discoveries in old data sets (see pages 168 and 171). But in practice those advantages often fail to outweigh researchers’ concerns. What will keep work from being scooped, poached or misused? What rights will the scientists have to relinquish? Where will they get the hours and money to find and format everything?

Some communities have been quite open to sharing, and their repositories are bulging with data. Physicists, mathematicians and computer scientists use arXiv.org, operated by Cornell University in Ithaca, New York; the International Council for Science’s World Data System holds data for fields such as geophysics and biodiversity; and molecular biologists use the Protein Data Bank, GenBank and dozens of other sites. The astronomy community has the International Virtual Observatory Alliance, geoscientists and environmental researchers have Germany’s Publishing Network for Geoscientific & Environmental Data (PANGAEA), and the Dryad repository recently launched in North Carolina for ecology and evolution research.

But those discipline-specific successes are the exception rather than the rule in science. All too many observations lie isolated and forgotten on personal hard drives and CDs, trapped by technical, legal and cultural barriers, a problem that open-data advocates are only just beginning to solve.

One of those advocates is Mark Parsons at

Empty archives

Most researchers agree that open access to data is the scientific ideal, so what is stopping it happening? Bryn Nelson investigates why many researchers choose not to share.

“We got the software up and running and said ‘Give us your stuff’. That’s when we hit the wall.”

Susan Gibbons


NATURE|Vol 461|10 September 2009

NEWS FEATURE DATA SHARING


Bryn Nelson

Nature 461, 160-163 (2009) http://dx.doi.org/10.1038/461160a

Data sharing: Empty archives

“So we spent all this money, we spent all this time, we got the software up and running, and then we said, ‘OK, here it is. We’re ready. Give us your stuff.’ And that’s where we hit the wall!”

(3)

Complaining, complaining...

Science online survey: funding, expertise and data requests

Is there sufficient funding for your lab or research group for data curation?
NO: 80.3%
Shares shown for "YES, sufficient" and "YES, but not sufficient": 8.8%, 10.9%

Do you have the necessary expertise in your lab or group to analyze your data in the way you want?
Answer options: YES; YES, through collaborators; NO; No special skills needed
Shares shown: 34.4%, 16.1%, 23.0%, 26.5%

Have you asked colleagues for data related to their published papers?
NO, never: 23.6%
YES, once: 12.5%
YES, 1–10: 55.8%
YES, >10: 8.1%

If you answered yes, have the appropriate data been provided?
YES: 48.7%
Sometimes: 47.6%
NO: 3.7%

Respondent comments:

"There are many tales of early archaeologists burning wood from the ruins to make coffee. If we fail to curate the environmental archives we collect from nature at public expense, we essentially repeat those mistakes."

"The next few years [particularly in medicine] the volume of data we need to analyze will expand exponentially."

CREDIT: M. TWOMBLY/SCIENCE; SOURCE: SCIENCE ONLINE SURVEY

www.sciencemag.org SCIENCE VOL 331 11 FEBRUARY 2011 693

CONTENTS (SPECIAL SECTION)

News
694 Rescue of Old Data Offers Lesson for Particle Physicists
696 Is There an Astronomer in the House?
698 May the Best Analyst Win

Perspectives
700 Climate Data Challenges in the 21st Century, J. T. Overpeck et al.
703 Challenges and Opportunities of Open Data in Ecology, O. J. Reichman et al.
705 Changing the Equation on Scientific Data Visualization, P. Fox and J. Hendler
708 Challenges and Opportunities in Mining Neuroscience Data, H. Akil et al.
712 The Disappearing Third Dimension, T. Rowe and L. R. Frank
714 Advancing Global Health Research Through Digital Technology and Sharing Data, T. Lang
717 More Is Less: Signal Processing and the Data Deluge, R. G. Baraniuk
719 Ensuring the Data-Rich Future of the Social Sciences, G. King
721 Metaknowledge, J. A. Evans and J. G. Foster
725 Access to Stem Cells and Data: Persons, Property Rights, and Scientific Progress, D. J. H. Mathews et al.
728 On the Future of Genomic Data, S. D. Kahn

See also:

Editorial
649 Making Data Maximally Available, B. Hanson, A. Sugden, and B. Alberts

News Focus
662 What Would You Do?, J. Couzin-Frankel
666 Will Computers Crash Genomics?, E. Pennisi
669 Drag-and-Drop Virtual Worlds, R. Service

Books
676 Bounds and Vision, M. A. Porter

Policy Forum
678 Measuring the Results of Science Investments, J. Lane and S. Bertuzzi

Science Express Research Article*
The World’s Technological Capacity to Compute, Store, and Communicate Information, M. Hilbert and P. López

Science Signaling*
Conquering the Data Mountain, N. R. Gough and M. B. Yaffe
Effective Representation and Storage of Mass Spectrometry–Based Proteomic Data Sets for the Scientific Community, J. V. Olsen and M. Mann
The Potential Cost of High-Throughput Proteomics, F. M. White
Integrating Multiple Types of Data for Signaling Research: Challenges and Opportunities, H. S. Wiley
Setting the Standards for Signal Transduction Research, J. Saez-Rodriguez et al.
Visual Representation of Scientific Information, B. Wong

Science Translational Medicine*
Power to the People: Participant Ownership of Clinical Trial Data, S. F. Terry and P. F. Terry
Electronic Consent Channels: Preserving Patient Privacy Without Handcuffing Researchers, R. H. Shelton

Science Careers*
More Than Words: Biomedical Ontologies Provide New Scientific Opportunities, C. Wald
Surfing the Tsunami, E. Pain
Sharing Data in Biomedical and Clinical Research, K. Travis

*These items, plus a related podcast and online discussion, are available at www.sciencemag.org/special/data/


Science online survey: data storage, reuse and size

Where do you archive most of the data generated in your lab or for your research?
It is not stored: 0.5%
Our lab: 50.2%
University servers: 38.5%
Community repository: 7.6%
Other: 3.2%

Respondent comment: "Even within a single institution there are no standards for storing data, so each lab, or often each fellow, uses ad hoc approaches."

How often do you access or use data sets from the published literature for your original research papers? From archival databases?
Answer options for both questions: Often; Half the time; Rarely
Shares shown: 22.6%, 21.4%, 56.0% and 22.8%, 21.6%, 55.6%

What is the size of the largest data set that you have used or generated in your research?
>1 TB: 7.6%
100 GB–1 TB: 12.1%
Shares shown for "<1 GB" and "1–100 GB": 32.0%, 48.3%

CREDIT: M. TWOMBLY/SCIENCE; SOURCE: SCIENCE ONLINE SURVEY

INTRODUCTION

Challenges and Opportunities

11 FEBRUARY 2011 VOL 331 SCIENCE www.sciencemag.org 692

Scientific innovation has been called on to spur economic recovery; science and technology are essential to improving public health and welfare and to inform sustainability; and the scientific community has been criticized for not being sufficiently accountable and transparent. Data collection, curation, and access are central to all of these issues. For this reason, Science has joined with colleagues from our sister publications Science Signaling, Science Translational Medicine, and Science Careers to provide a broad look at the issues surrounding the increasingly huge influx of research data. The entire collection is compiled online at www.sciencemag.org/special/data/. As you will discover, two themes appear repeatedly: Most scientific disciplines are finding the data deluge to be extremely challenging, and tremendous opportunities can be realized if we can better organize and access the data.

Our authors explore data issues that apply to specific fields as well as challenges shared between fields. These articles clearly show that the challenges are difficult and growing. We have recently passed the point where more data is being collected than we can physically store (see Hilbert et al., published online). This storage gap will widen rapidly in data-intensive fields. Thus, decisions will be needed on which data to archive and which to discard. A separate problem is how to access and use these data. Many data sets are becoming too large to download. Even fields with well-established data archives, such as genomics, are facing new and growing challenges in data volume and management. And even where accessible, much data in many fields is too poorly organized to enable it to be efficiently used.

To delve deeper into these issues, Science polled our peer reviewers from last year about the availability and use of data. We received about 1700 responses, representing input from an international and interdisciplinary group of scientific leaders. About 20% of the respondents regularly use or analyze data sets exceeding 100 gigabytes, and 7% use data sets exceeding 1 terabyte. About half of those polled store their data only in their laboratories (not an ideal long-term solution). Many bemoaned the lack of common metadata and archives as a main impediment to using and storing data, and most of the respondents have no funding to support archiving.

Many of the responders indicated that they seek or would like additional help in analyzing the data that they had collected. If we can use and reuse scientific data better, the opportunities, as indicated in many examples in this special section, are myriad. Large integrated data sets can potentially provide a much deeper understanding of both nature and society and open up many new avenues of research. And they are critical for addressing key societal problems, from improving public health and managing natural resources intelligently to designing better cities and coping with climate change.

To realize these opportunities, many of the articles in this collection speak of changing the culture of science and the practices of scientists, as well as recognizing the growing responsibility for much better data stewardship. Several of the pieces illustrate steps toward these goals. But it is clear that organized effort and leadership are needed from funders, societies, journals, educators, and individual scientists, and from society at large.

We hope that this collection spurs additional thinking and catalyzes new efforts in dealing with these critical issues. As a start, we invite you to share your thoughts at talk.sciencemag.org, where you can also contribute to our poll.

SCIENCE STAFF


M. Twombly/Science - Science online survey Yael Fitzpatrick, using www.wordle.net

Science, Vol. 331, Issue 6018

(4)

Data Sharing?

JAN HEIN VAN DIERENDONCK

NATURE COVER GRAPHIC VOL. 461(2)

“The phrase ‘if you build it, they will come’ does not yet apply to institutional repositories.”

A similar reality check has greeted other data-sharing efforts.

Most researchers happily embrace the idea of sharing.

It opens up observations to independent scrutiny, fosters new collaborations and encourages further discoveries in old data sets.

But in practice those advantages often fail to outweigh researchers’ concerns.

(5)

The Carrot Crusade

W. Michener, 2011 (D-Lib Magazine Vol. 17)

Data Conservancy

A Blueprint for Research Libraries

(6)

The Carrot Crusade

(7)

The Carrot Crusade

We will bring your data to the world, but before this there is something we would like you to do:

wrap them in blue and yellow paper

put green stickers on it (stars are best, but flowers are also okay)

don’t forget the purple ribbon around it

(8)

What’s wrong here?

NATURE GEOSCIENCE | VOL 4 | SEPTEMBER 2011 | www.nature.com/naturegeoscience 575

commentary

A path to filled archives

Dirk Fleischer and Kai Jannaschk

Reluctance to deposit data is rife among researchers, despite broad agreement on the principle of data sharing. More and better information will reach hitherto empty archives, if professional support is given during data creation, not in a project’s final phase.

Professionally managed, permanent data archives are essential to ensure the preservation and reusability of data. The idea of data deposition is supported by publishers and funding agencies around the world. Scientists, too, are generally in favour: they appreciate acknowledgement of (and future reference to) their hard-won data. Yet many repositories are almost empty¹. Clearly, there are discrepancies between scientists’ attitudes to the principle of data sharing and their actions when it is time to deposit their data.

There are reasons for this gap¹. When the time of deposition comes (usually either at the end of a project or on publication of the results), the data are often scattered between various storage media, not uniformly formatted, and insufficiently tagged with metadata. As a result, deposition requires a substantial amount of effort, at a time when scientists really want to think about their next research question. Ongoing activities on data discovery and access that aim to innovate data reusability in the geosciences, such as the Data Observation Network for Earth project², do not appropriately address the issue of capturing data.

We argue that most scientists view data deposition in remote archives as a burden³, because it is too far removed from their daily routine. Scientists need and want professional and locally supported systems to store their data in a structured and reusable form. Support for scientists in this way, right at the beginning of the data life cycle, can avoid the discrepancy between the principles and actions of data sharing. If raw data and their derivatives are recorded in a professional manner during collection and analysis, the task of data deposition can be automated. It will then need only a mouse click by the scientist to initiate formal deposition, and not the laborious work of days. In such an environment, local data managers become data navigators, rather than curators.

Scaling up

From the point of view of funding agencies and publishers (the main parties interested in data reusability and accessibility), data deposition at the end of the project or at the time of publication is sufficient. But data sharing is likely to evolve into a mandatory part of the research and publication process in the near future⁴,⁵.

If so, data pathways must be organized in a way that can be scaled up without a vast drain on resources. In the present system, projects and their data managers are focused on one dedicated data repository. As a result, data managers provide individual support to scientists who wish to deposit their data, for example, by converting scientific data files into the format required by the chosen repository. This kind of curation is inflexible and very time consuming⁶, and it requires personal communication between the scientist, data manager and repository staff for quality assurance of the metadata.

The human interaction in the data pathway creates unacceptable bottlenecks: only an automated process can turn around the full quantity of data that are generated and published. The curation system simply will be overwhelmed if all data are to be submitted.

The nagging problem

An analysis of data management requirements within the Cluster of Excellence: Future Ocean in Kiel⁷ revealed that researchers strongly desire reliable personal communication with local data curators³. They do not favour support by remote data managers: scientists like to be in charge of their data. (Think about it: would you give your children to someone you barely know?) Our survey, confirmed by an independent study³, included personal

WWW.GLYNGOODWIN.CO.UK

© 2011 Macmillan Publishers Limited. All rights reserved

Figure: the data pathway. Labels: a scientist bringing new data; transfer via the Internet, CD, DVD, USB stick or e-mail; a data curator/scientist transforming data; transformed data ready for import; a data center importing tool; storage; database administrator/data librarian.

http://www.glyngoodwin.co.uk/

Nature Geoscience 4, 575–576, (2011)

http://dx.doi.org/10.1038/ngeo1248

(9)

The Bottleneck!

700 publications per year, 3 to 4 days per import

260 working days per year

(700 × 3) / 260 = 2100 / 260 ≈ 8.08

(700 × 4) / 260 = 2800 / 260 ≈ 10.77

8 to 11 Data Managers

If it could be done in TWO days: (700 × 2) / 260 = 1400 / 260 ≈ 5.38

you still need 5 to 6 Data Managers
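The estimate on this slide can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name `managers_needed` is made up for illustration, and it assumes each import occupies one data manager full time, as the slide's figures imply.

```python
def managers_needed(publications_per_year, days_per_import, working_days_per_year=260):
    """Full-time data managers required if every publication needs one import.

    Assumption (not from the slide): imports are handled one at a time per
    manager, with no overhead, holidays or parallel work.
    """
    person_days = publications_per_year * days_per_import
    return person_days / working_days_per_year


if __name__ == "__main__":
    # The slide's scenarios: 2, 3 and 4 working days per import.
    for days in (2, 3, 4):
        print(f"{days} days per import: {managers_needed(700, days):.2f} data managers")
```

Halving the time per import only halves the head count, so the workload still scales linearly with the number of publications.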

(10)

Scientific Road Trip

Data Creation/Acquisition

Data Analysis

Interpretation/Publication

Images: Ian Hampton: flickr (car crash), jezart: flickr (intersection), tonylanciabeta: flickr (race car)

(11)

Publication Output

Relative growth of publication output from 1994-1999 to 2000-2004 by disciplines (SCOPUS)

(Source: D. Tunger 2009, Forschungszentrum Jülich)

Chart comparing International and Germany, axis from 0 to 85. Disciplines shown: Biochemistry, Health, Geoscience, Agriculture/Biology, Articles in general, Chemistry, Computer Science, Physics, Material Science, Engineering Science, Mathematics, Energy Science.

(12)

Publication Output

Number of new scientific articles in Science Citation Index per year

(Source: D. Tunger 2009, Forschungszentrum Jülich)

Chart: Number of natural-science articles in the SCI, 1973 to 2003. Series: International (axis "number of articles worldwide", 0 to 1,200,000) and Germany (axis "number of articles Germany", 0 to 120,000).

Figure 2: Number of scientific articles in the Science Citation Index (SCI) database

The central question of this dissertation is whether a trend-monitoring system for the natural sciences can be developed through improved processing of existing data and information services. The aim of this system is to improve the supply of information to scientific projects, so that new scientific currents can be recognized as early as possible.

Human action is fundamentally directed toward the future and oriented toward particular goals. In place of certain knowledge about the future there are the expectations of individuals. These rest on information of a prognostic nature (Rieser, 1980, p. 11).

The main focus of this dissertation is the bibliometric study of the development and perception of scientific topics, but also the bibliometric study of scientific institutions or of scientists themselves.

When assessing technological development, patentability must not obscure the fact that about 85% of all products or projects that reach market maturity fail (Schnabel, 2004, p. 1). Schnabel (2004) gives the following example: the microwave oven was developed as early as around 1950 and went

(13)

What’s next?

If this is all going to happen we definitely need technical support to reduce human interactions!


(14)

Projects need to take action

Data Archive

Image:NASA/Goddard Space Flight Center

Research Projects

Institutions

(15)

Projects need to take action

Data Archive

Image:NASA/Goddard Space Flight Center

Research Site (shown five times)

Data Provenance Information

Retrievability Usability

Institutions

(16)

Why Research Sites?

Personal and short communication between scientist and data center staff

Sustainability of trusted personal cooperation

Scientific record of the performed research history for a site (university, institutes, etc.)

Data capturing at the point of origin

Collecting the unpublished data

Capturing data and meta information at the time of data creation

Storing the analytic procedures as provenance information

Publishing data with a one-click solution from one structured data source to another
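The points above (capture metadata at creation, keep the analysis steps as provenance, publish with one click) could be sketched as follows. This is a minimal illustration, not the system used in Kiel: the class `CapturedDataset`, its fields and the JSON package format are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class CapturedDataset:
    """Hypothetical record that carries data, metadata and provenance together."""
    name: str
    creator: str
    values: list
    # Metadata is captured at the time of data creation, not at deposition.
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    provenance: list = field(default_factory=list)

    def apply(self, step_name, func):
        """Run an analysis step on every value and record it as provenance."""
        self.values = [func(v) for v in self.values]
        self.provenance.append(step_name)
        return self

    def deposit_package(self):
        """Serialize everything in one call (the 'one click'); format is assumed."""
        return json.dumps(asdict(self), sort_keys=True)


# Usage with made-up values: two temperatures converted from kelvin to Celsius.
record = CapturedDataset(name="temperature", creator="example scientist",
                         values=[271.15, 272.15])
record.apply("kelvin_to_celsius", lambda k: round(k - 273.15, 2))
package = record.deposit_package()
```

Because the record already holds its metadata and processing history, producing the deposit package needs no further input from the scientist.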

(17)

Leg Name

Event

Modeling what you do

(18)

Human activities

Start Leg: Assign Person; Assign Port; Assign Date/Time; Event, Event 2, ...; Assign Port; Assign Date/Time

Start Ship Event: Assign Person; Assign Date/Time; Assign Location (decimal degrees)

Leg Name: M77/3

Station Number:

Description:

Gear Name:

Repetitions: 1, 2, 3, ..., 14, ...
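One way to model the activities on this slide as data structures is sketched below. The class and field names are assumptions derived from the slide labels (leg, ship event, person, port, date/time, location in decimal degrees, station number, gear name); the split into a start and end port and all example values except the leg name M77/3 are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ShipEvent:
    """A single event during a leg, located in decimal degrees."""
    person: str
    time: datetime
    latitude_deg: float
    longitude_deg: float
    station_number: Optional[int] = None
    gear_name: str = ""
    description: str = ""

    def __post_init__(self):
        # Reject coordinates outside the valid decimal-degree ranges.
        if not -90.0 <= self.latitude_deg <= 90.0:
            raise ValueError("latitude must be within [-90, 90]")
        if not -180.0 <= self.longitude_deg <= 180.0:
            raise ValueError("longitude must be within [-180, 180]")


@dataclass
class Leg:
    """A cruise leg with a start, an optional end, and its events."""
    name: str
    person: str
    start_port: str
    start_time: datetime
    end_port: Optional[str] = None
    end_time: Optional[datetime] = None
    events: List[ShipEvent] = field(default_factory=list)

    def add_event(self, event: ShipEvent) -> int:
        """Append an event and return its repetition number (1, 2, 3, ...)."""
        self.events.append(event)
        return len(self.events)


# Usage: the leg name comes from the slide; everything else is invented.
leg = Leg(name="M77/3", person="example scientist",
          start_port="example port", start_time=datetime(2011, 1, 1, 8, 0))
repetition = leg.add_event(
    ShipEvent(person="example scientist", time=datetime(2011, 1, 2, 12, 30),
              latitude_deg=-12.5, longitude_deg=-77.25, gear_name="example gear")
)
```

Recording each activity as a typed object at the moment it happens is what lets the person, time and position be filled in once and reused for every repetition.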

(19)

Thank you!

Thanks to:

Andreas, Carsten, Hela and Pina

Kai Jannaschk, B. Thalheim

Computer Science Department Kiel

Funding Projects

Dirk Fleischer

dfleischer@ifm-geomar.de
