A Path to filled Archives
or
"Hey dude, how bumpy is it going to be?"
Dirk Fleischer
dfleischer@ifm-geomar.de
eSciDoc-Days
October 2011
Complaining, complaining...
ILLUSTRATIONS BY J. H. VAN DIERENDONCK
In 2003, the University of Rochester in New York launched a digital archive designed to preserve and share dissertations, preprints, working papers, photographs, music scores – just about any kind of digital data the university’s investigators could produce. Six months of research and marketing had convinced the university that a publicly accessible online archive would be well received. At the time of the launch, the university librarians were worried that a flood of uploaded data might swamp the available storage space.
Six years later, the US$200,000 repository lies mostly empty.
Researchers had been very supportive of the archive idea, recalls Susan Gibbons, vice-provost and dean of the university’s River Campus Libraries – especially as the alternative was to keep on scattering their data and dissertations across an ever-proliferating array of unintegrated computers and websites. “So we spent all this money, we spent all this time, we got the software up and running, and then we said, ‘OK, here it is. We’re ready. Give us your stuff’,” she says. “And that’s where we hit the wall.” When the time came, scientists couldn’t find their data, or didn’t understand how to use the archive, or lamented that they just didn’t have any more hours left in the day to spend on this business.
As Gibbons and anthropologist Nancy Fried Foster observed in their 2005 postmortem¹, “The phrase ‘if you build it, they will come’ does not yet apply to IRs [institutional repositories].”
A similar reality check has greeted other data-sharing efforts. Most researchers happily embrace the idea of sharing. It opens up observations to independent scrutiny, fosters new collaborations and encourages further discoveries in old data sets (see pages 168 and 171). But in practice those advantages often fail to outweigh researchers’ concerns. What will keep work from being scooped, poached or misused? What rights will the scientists have to relinquish? Where will they get the hours and money to find and format everything?
Some communities have been quite open to sharing, and their repositories are bulging with data. Physicists, mathematicians and computer scientists use arXiv.org, operated by Cornell University in Ithaca, New York; the International Council for Science’s World Data System holds data for fields such as geophysics and biodiversity; and molecular biologists use the Protein Data Bank, GenBank and dozens of other sites. The astronomy community has the International Virtual Observatory Alliance, geoscientists and environmental researchers have Germany’s Publishing Network for Geoscientific & Environmental Data (PANGAEA), and the Dryad repository recently launched in North Carolina for ecology and evolution research.
But those discipline-specific successes are the exception rather than the rule in science. All too many observations lie isolated and forgotten on personal hard drives and CDs, trapped by technical, legal and cultural barriers – a problem that open-data advocates are only just beginning to solve.
One of those advocates is Mark Parsons at
Empty archives
Most researchers agree that open access to data is the scientific ideal, so what is stopping it happening? Bryn Nelson investigates why many researchers choose not to share.
“We got the software up and running and said ‘Give us your stuff’. That’s when we hit the wall.”
— Susan Gibbons
NATURE|Vol 461|10 September 2009
NEWS FEATURE DATA SHARING
Bryn Nelson
Nature 461, 160-163 (2009) http://dx.doi.org/10.1038/461160a
Data sharing: Empty archives
"So we spent all this money, we spent all this time,
we got the software up and running,
and then we said,
'OK, here it is. We're ready. Give us your stuff'.
And that's where we hit the wall!"
Complaining, complaining...
Is there sufficient funding for your lab or research group for data curation?
Answer options: YES, sufficient; YES, but not sufficient; NO
Shares shown: 8.8%, 10.9%, 80.3%
There are many tales of early archaeologists burning wood from the ruins to make coffee. If we fail to curate the environmental archives we collect from nature at public expense, we essentially repeat those mistakes.
The next few years [particularly in medicine]
the volume of data we need to analyze will expand exponentially.
Do you have the necessary expertise in your lab or group to analyze your data in the way you want?
Answer options: YES; YES, through collaborators; NO; No special skills needed
Shares shown: 34.4%, 16.1%, 23.0%, 26.5%
Have you asked colleagues for data related to their published papers?
Answer options: NO, never; YES, once; YES, 1–10; YES, >10
Shares shown: 23.6%, 12.5%, 55.8%, 8.1%
If you answered yes, have the appropriate data been provided?
Answer options: YES; Sometimes; NO
Shares shown: 48.7%, 47.6%, 3.7%
CREDIT: M. TWOMBLY/SCIENCE; SOURCE: SCIENCE ONLINE SURVEY
www.sciencemag.org SCIENCE VOL 331 11 FEBRUARY 2011 693
CONTENTS
News
694 Rescue of Old Data Offers Lesson for Particle Physicists
696 Is There an Astronomer in the House?
698 May the Best Analyst Win
Perspectives
700 Climate Data Challenges in the 21st Century (J. T. Overpeck et al.)
703 Challenges and Opportunities of Open Data in Ecology (O. J. Reichman et al.)
705 Changing the Equation on Scientific Data Visualization (P. Fox and J. Hendler)
708 Challenges and Opportunities in Mining Neuroscience Data (H. Akil et al.)
712 The Disappearing Third Dimension (T. Rowe and L. R. Frank)
714 Advancing Global Health Research Through Digital Technology and Sharing Data (T. Lang)
717 More Is Less: Signal Processing and the Data Deluge (R. G. Baraniuk)
719 Ensuring the Data-Rich Future of the Social Sciences (G. King)
721 Metaknowledge (J. A. Evans and J. G. Foster)
725 Access to Stem Cells and Data: Persons, Property Rights, and Scientific Progress (D. J. H. Mathews et al.)
728 On the Future of Genomic Data (S. D. Kahn)
See also:
Editorial
649 Making Data Maximally Available (B. Hanson, A. Sugden, and B. Alberts)
News Focus
662 What Would You Do? (J. Couzin-Frankel)
666 Will Computers Crash Genomics? (E. Pennisi)
669 Drag-and-Drop Virtual Worlds (R. Service)
Books
676 Bounds and Vision (M. A. Porter)
Policy Forum
678 Measuring the Results of Science Investments (J. Lane and S. Bertuzzi)
Science Express Research Article*
The World’s Technological Capacity to Compute, Store, and Communicate Information (M. Hilbert and P. López)
Science Signaling*
Conquering the Data Mountain (N. R. Gough and M. B. Yaffe)
Effective Representation and Storage of Mass Spectrometry–Based Proteomic Data Sets for the Scientific Community (J. V. Olsen and M. Mann)
The Potential Cost of High-Throughput Proteomics (F. M. White)
Integrating Multiple Types of Data for Signaling Research: Challenges and Opportunities (H. S. Wiley)
Setting the Standards for Signal Transduction Research (J. Saez-Rodriguez et al.)
Visual Representation of Scientific Information (B. Wong)
Science Translational Medicine*
Power to the People: Participant Ownership of Clinical Trial Data (S. F. Terry and P. F. Terry)
Electronic Consent Channels: Preserving Patient Privacy Without Handcuffing Researchers (R. H. Shelton)
Science Careers*
More Than Words: Biomedical Ontologies Provide New Scientific Opportunities (C. Wald)
Surfing the Tsunami (E. Pain)
Sharing Data in Biomedical and Clinical Research (K. Travis)
SPECIAL SECTION
*These items, plus a related podcast and online discussion, are available at www.sciencemag.org/special/data/
Published by AAAS
Where do you archive most of the data generated in your lab or for your research?
Answer options: It is not stored; Our lab; University servers; Community repository; Other
Shares shown: 0.5%, 50.2%, 38.5%, 7.6%, 3.2%
Even within a single institution there are no standards for storing data, so each lab, or often each fellow, uses ad hoc approaches.
How often do you access or use data sets from the published literature for your original research papers? From archival databases?
Answer options for both questions: Often; Half the time; Rarely
Shares shown: 22.6%, 21.4%, 56.0% and 22.8%, 21.6%, 55.6%
What is the size of the largest data set that you have used or generated in your research?
Answer options: >1 TB; 100 GB–1 TB; <1 GB; 1–100 GB
Shares shown: 7.6%, 12.1%, 32.0%, 48.3%
CREDIT: M. TWOMBLY/SCIENCE; SOURCE: SCIENCE ONLINE SURVEY
INTRODUCTION
Challenges and Opportunities
11 FEBRUARY 2011 VOL 331 SCIENCE www.sciencemag.org 692
SCIENTIFIC INNOVATION HAS BEEN CALLED ON TO SPUR ECONOMIC recovery; science and technology are essential to improving public health and welfare and to inform sustainability; and the scientific community has been criticized for not being sufficiently accountable and transparent. Data collection, curation, and access are central to all of these issues. For this reason, Science has joined with colleagues from our sister publications Science Signaling, Science Translational Medicine, and Science Careers to provide a broad look at the issues surrounding the increasingly huge influx of research data. The entire collection is compiled online at www.sciencemag.org/special/data/. As you will discover, two themes appear repeatedly: Most scientific disciplines are finding the data deluge to be extremely challenging, and tremendous opportunities can be realized if we can better organize and access the data.
Our authors explore data issues that apply to specific fields as well as challenges shared between fields. These articles clearly show that the challenges are difficult and growing. We have recently passed the point where more data is being collected than we can physically store (see Hilbert et al., published online). This storage gap will widen rapidly in data-intensive fields. Thus, decisions will be needed on which data to archive and which to discard. A separate problem is how to access and use these data. Many data sets are becoming too large to download. Even fields with well-established data archives, such as genomics, are facing new and growing challenges in data volume and management. And even where accessible, much data in many fields is too poorly organized to enable it to be efficiently used.
To delve deeper into these issues, Science polled our peer reviewers from last year about the availability and use of data. We received about 1700 responses, representing input from an international and interdisciplinary group of scientific leaders. About 20% of the respondents regularly use or analyze data sets exceeding 100 gigabytes, and 7% use data sets exceeding 1 terabyte. About half of those polled store their data only in their laboratories – not an ideal long-term solution. Many bemoaned the lack of common metadata and archives as a main impediment to using and storing data, and most of the respondents have no funding to support archiving.
Many of the responders indicated that they seek or would like additional help in analyzing the data that they had collected. If we can use and reuse scientific data better, the opportunities, as indicated in many examples in this special section, are myriad. Large integrated data sets can potentially provide a much deeper understanding of both nature and society and open up many new avenues of research. And they are critical for addressing key societal problems – from improving public health and managing natural resources intelligently to designing better cities and coping with climate change.
To realize these opportunities, many of the articles in this collection speak of changing the culture of science and the practices of scientists, as well as recognizing the growing responsibility for much better data stewardship. Several of the pieces illustrate steps toward these goals. But it is clear that organized effort and leadership are needed from funders, societies, journals, educators, and individual scientists – and from society at large.
We hope that this collection spurs additional thinking and catalyzes new efforts in dealing with these critical issues. As a start, we invite you to share your thoughts at talk.sciencemag.org, where you can also contribute to our poll. – SCIENCE STAFF
M. Twombly/Science - Science online survey Yael Fitzpatrick, using www.wordle.net
Science, Vol. 331, Issue 6018
Data Sharing?
JAN HEIN VAN DIERENDONCK
NATURE COVER GRAPHIC VOL. 461(2)
"The phrase 'if you build it, they will come' does not yet apply to institutional repositories."
A similar reality check has greeted other data-sharing efforts.
Most researchers happily embrace the idea of sharing.
It opens up observations to independent scrutiny, fosters new collaborations and encourages further discoveries in old data sets.
But in practice those advantages often fail to outweigh researchers' concerns.
The Carrot Crusade
W. Michener, 2011 (D-Lib Magazine Vol. 17)
Data Conservancy
A Blueprint for Research Libraries
The Carrot Crusade
We will bring your data to the world, but
before this there is something we would like you to do:
wrap them in blue and yellow paper
put green stickers on them (stars are best, but flowers are also okay)
don't forget the purple ribbon around them
What's wrong here?
NATURE GEOSCIENCE | VOL 4 | SEPTEMBER 2011 | www.nature.com/naturegeoscience 575
commentary
A path to filled archives
Dirk Fleischer and Kai Jannaschk
Reluctance to deposit data is rife among researchers, despite broad agreement on the principle of data sharing. More and better information will reach hitherto empty archives, if professional support is given during data creation, not in a project’s final phase.
Professionally managed, permanent data archives are essential to ensure the preservation and reusability of data. The idea of data deposition is supported by publishers and funding agencies around the world. Scientists, too, are generally in favour: they appreciate acknowledgement of (and future reference to) their hard-won data. Yet many repositories are almost empty¹. Clearly, there are discrepancies between scientists’ attitudes to the principle of data sharing and their actions when it is time to deposit their data.
There are reasons for this gap¹. When the time of deposition comes – usually either at the end of a project or on publication of the results – the data are often scattered between various storage media, not uniformly formatted, and insufficiently tagged with metadata. As a result, deposition requires a substantial amount of effort, at a time when scientists really want to think about their next research question. Ongoing activities on data discovery and access that aim to innovate data reusability in the geosciences, such as the Data Observation Network for Earth project², do not appropriately address the issue of capturing data.
We argue that most scientists view data deposition in remote archives as a burden³, because it is too far removed from their daily routine. Scientists need and want professional and locally supported systems to store their data in a structured and reusable form. Support for scientists in this way, right at the beginning of the data life cycle, can avoid the discrepancy between the principles and actions of data sharing. If raw data and their derivatives are recorded in a professional manner during collection and analysis, the task of data deposition can be automated. It will then need only a mouse click by the scientist to initiate formal deposition, and not the laborious work of days. In such an environment, local data managers become data navigators, rather than curators.
Scaling up
From the point of view of funding agencies and publishers – the main parties interested in data reusability and accessibility – data deposition at the end of the project or at the time of publication is sufficient. But data sharing is likely to evolve into a mandatory part of the research and publication process in the near future⁴,⁵.
If so, data pathways must be organized in a way that can be scaled up without a vast drain on resources. In the present system, projects and their data managers are focused on one dedicated data repository. As a result, data managers provide individual support to scientists who wish to deposit their data, for example, by converting scientific data files into the format required by the chosen repository. This kind of curation is inflexible and very time consuming⁶, and it requires personal communication between the scientist, data manager and repository staff for quality assurance of the metadata.
The human interaction in the data pathway creates unacceptable bottlenecks: only an automated process can turn around the full quantity of data that are generated and published. The curation system simply will be overwhelmed if all data are to be submitted.
The nagging problem
An analysis of data management requirements within the Cluster of Excellence: Future Ocean in Kiel⁷ revealed that researchers strongly desire reliable personal communication with local data curators³. They do not favour support by remote data managers: scientists like to be in charge of their data. (Think about it: would you give your children to someone you barely know?) Our survey, confirmed by an independent study³, included personal
WWW.GLYNGOODWIN.CO.UK
© 2011 Macmillan Publishers Limited. All rights reserved
Diagram labels: A scientist bringing new data (via the Internet, CD, DVD, USB stick or e-mail); a data curator/scientist transforming data; transformed data ready for import; a data center importing tool; database administrator/data librarian; storage; DATA
http://www.glyngoodwin.co.uk/
Nature Geoscience 4, 575–576, (2011)
http://dx.doi.org/10.1038/ngeo1248
The Bottleneck!
700 publications per year
3-4 days per import
260 working days per year
700 × 3 = 2100 days; 2100 / 260 ≈ 8.08
700 × 4 = 2800 days; 2800 / 260 ≈ 10.77
8 to 11 Data Managers
If it could be done in TWO days (700 × 2 / 260 ≈ 5.38)
you still need 5-6 Data Managers
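The staffing estimate on this slide can be reproduced with a few lines of Python. This is only a minimal sketch of the slide's arithmetic: the function name `managers_needed` is made up for illustration, and it assumes, as the slide implicitly does, that one data manager works on one import at a time for all 260 working days.

```python
def managers_needed(publications_per_year: int,
                    days_per_import: float,
                    working_days_per_year: int = 260) -> float:
    """Full-time data managers needed to import the data of every publication.

    Hypothetical helper: total import days per year divided by the
    working days one person has available per year.
    """
    return publications_per_year * days_per_import / working_days_per_year


# Figures from the slide: 700 publications, 2 to 4 days per import.
for days in (2, 3, 4):
    fte = managers_needed(700, days)
    print(f"{days} days per import: {fte:.2f} full-time data managers")
```

Because the result scales linearly with the days per import, halving the curation time only halves the head count, which is why the slide argues for reducing human interaction rather than speeding it up.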
Scientific Road Trip
Data Creation/Acquisition
Data Analysis
Interpretation/Publication
Images: Ian Hampton: flickr (Car crash), jezart: flickr (Intersection), tonylanciabeta: flickr (race car)
Publication Output
Relative growth of publication output from 1994-1999 to 2000-2004 by disciplines (SCOPUS)
Source: D. Tunger 2009 (Forschungszentrum Jülich)
Bar chart categories: Biochemistry, Health, Geoscience, Agriculture/Biology, Articles in general, Chemistry, Computer Science, Physics, Material Science, Engineering Science, Mathematics, Energy Science
Axis ticks: 0 to 85 in steps of 5
Series: International, Germany
Publication Output
Number of new scientific articles in Science Citation Index per year
Source: D. Tunger 2009 (Forschungszentrum Jülich)
Chart title: Number of natural-science articles in the SCI
Left axis: number of articles worldwide (0 to 1,200,000); right axis: number of articles in Germany (0 to 120,000); horizontal axis: 1973 to 2003
Series: International, Germany
Figure 2: Number of scientific articles in the Science Citation Index (SCI) database
The central question of this dissertation is whether improved processing of existing data and information services can be used to develop a trend-monitoring system for the natural sciences. The aim of this system is to improve the supply of information to scientific projects, so that new scientific trends can be recognized as early as possible.
Human action is fundamentally directed toward the future and oriented toward particular goals. In place of certain knowledge about the future come the expectations of individuals. These rest on information of a prognostic kind (Rieser, 1980, p. 11).
The main focus of this dissertation is the bibliometric study of the development and perception of scientific topics, but also the bibliometric study of scientific institutions or of scientists themselves.
In assessing technological development, patentability must not obscure the fact that about 85% of all products or projects that reach market maturity fail (Schnabel, 2004, p. 1). Schnabel (2004) gives the following example: the microwave oven was developed as early as around 1950 and went
What's next?
If this is all going to happen we definitely need technical support to reduce human interactions!
Projects need to take action
Data Archive
Image:NASA/Goddard Space Flight Center
Research Projects
Institutions
Projects need to take action
Data Archive
Image:NASA/Goddard Space Flight Center
Research Site (×5)
Data Provenance Information
Retrievability Usability
Institutions
Why Research Sites?
Personal and short communication between scientist and data center staff
Sustainability of trusted personal cooperation
Scientific record of the performed research history for a site (university, institutes, etc.)
Data capturing at the point of origin
Collecting the unpublished data
Capturing data and meta information at the time of data creation
Storing the analytic procedures as provenance information
Publishing data with a one-click solution from one structured data source to another
Modeling what you do
Human activities
Leg (Leg Name)
Start Leg: Assign Person, Assign Port, Assign Date/Time; Event; Event 2; Assign Port, Assign Date/Time
Start Ship Event: Assign Person, Assign Date/Time, Assign Location (decimal degrees)
Leg Name: M77/3
Station Number:
Description:
Gear Name:
1 2 3
4 5 6
7 8 9
10 11 12
13 14 ...
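The activity model sketched on these slides (a leg with a person, a port and a date/time, containing ship events that each have a person, a date/time and a location in decimal degrees) could be captured as a small data structure. The following is a minimal illustration only, not the system presented in the talk: all class and field names are hypothetical, the second port/date pair is read here as the arrival of the leg, and the person, ports, dates and gear in the usage lines are made-up placeholders. Only the leg name M77/3 comes from the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Location:
    """Position in decimal degrees (the 'Assign Location' step)."""
    latitude: float
    longitude: float

    def __post_init__(self) -> None:
        # Reject values that cannot be valid decimal-degree coordinates.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude must be within [-90, 90]")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude must be within [-180, 180]")


@dataclass
class PortCall:
    """A port together with a date/time (the 'Assign Port' steps)."""
    port: str
    time: datetime


@dataclass
class ShipEvent:
    """One station or gear deployment recorded during a leg."""
    person: str
    time: datetime
    location: Location
    gear_name: str = ""
    description: str = ""
    station_number: Optional[int] = None  # set by Leg.add_event


@dataclass
class Leg:
    """A cruise leg that collects ship events at the point of origin."""
    name: str
    person: str
    departure: PortCall
    arrival: Optional[PortCall] = None  # assumed meaning of the second port
    events: List[ShipEvent] = field(default_factory=list)

    def add_event(self, event: ShipEvent) -> ShipEvent:
        # Number stations consecutively in the order they are recorded.
        event.station_number = len(self.events) + 1
        self.events.append(event)
        return event


# Usage with placeholder values (only the leg name is taken from the slide).
leg = Leg(name="M77/3", person="Placeholder Scientist",
          departure=PortCall("Placeholder Port A", datetime(2000, 1, 1, 8, 0)))
first = leg.add_event(ShipEvent(person="Placeholder Scientist",
                                time=datetime(2000, 1, 2, 12, 30),
                                location=Location(-12.5, -77.25),
                                gear_name="Placeholder gear"))
print(leg.name, first.station_number, first.location)
```

Recording events as structured objects while the work happens is what would make a later one-click export to an archive feasible, since no manual reformatting step remains.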