Research Collection
Educational Material
Research Data Management Training for IAC
Author(s): Petrus, Ana; Töwe, Matthias
Publication Date: 2017-02-03
Permanent Link: https://doi.org/10.3929/ethz-b-000296488
Rights / License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
Research Data Management Training for IAC
Why manage your research data?
© “The Carrot and the Stick Set” (4.9.2018) by Frits Ahlefeldt
Why spend time and effort on this?
So you can work efficiently and effectively
Save yourself time and reduce frustration
Facilitate collaboration in your group
Highlight patterns or connections that might otherwise be missed
Because your data is precious – and costly
To enable data re-use and sharing – even for yourself
To meet funders’ and institutional requirements
You are the experts who can make informed decisions!
You as the authors are responsible for the work you deliver
To meet funders’ and institutional requirements
As group leaders you will be held responsible for complying with your institution’s and funders’ regulations
You oversee students’ and staff members’ work and need to rely on them following good scientific practice
You may be able to influence the discussion in your community, in your institution and with funders
ETH regulations, intellectual property, privacy and access rights
Recent Overview
Guidelines for Research Integrity
At the ETH Zurich research is founded on intellectual honesty. Researchers […] are committed to scientific integrity and truthfulness in research and peer review.
https://www.ethz.ch/content/dam/ethz/main/research/pdf/forschungsethik/Broschure.pdf
Article 11. Collection, documentation and storage of primary data
All steps in the treatment of primary data (statistical analyses, reorganizations, etc.) must be documented in a form appropriate to the discipline in question (e.g. laboratory logs, other data carriers) in such a way as to ensure that the results obtained from the primary data can be reproduced completely.
The project management is responsible for data management (data collection, storage, data access, compliance with data protection requirements, etc.). In particular, it should ensure that, following completion of the project, the data and materials are retained for the period prescribed in the discipline, and are duly destroyed within the period prescribed by law, if appropriate.
Roles and Responsibilities
Project Members:
Adhere to the principles of good scientific practice and the Guidelines for Research Integrity at ETH.
Document all steps in the treatment of primary data so that results are reproducible.
Project Manager:
Is responsible for the execution of a scientific project and for its data management (data collection, storage, data access, compliance with data protection requirements...).
Ensures that all research project participants are aware of the guidelines.
Determines, together with the professor, which departed project members should retain access to the primary data or materials.
Licences / Open Access / Creative Commons
Compliance Guide
[…] all ETH members […] are required to integrate the general conditions and internal directives into the work process.
The following Compliance Guide is designed to serve as an orientation tool. […]
To facilitate implementation, each point is supplemented with further information and contact persons available for consultation.
https://rechtssammlung.sp.ethz.ch/Dokumente/133_en.pdf
Intellectual Property Rights
Intellectual Property Rights (IPR) can be defined as rights acquired over any work created or invented with the intellectual effort of an individual. These rights are time-limited and do not require explicit assertion.
Owners of works are granted exclusive rights
to publish the work
to license the work’s distribution to others
to sue in case of unlawful or deceptive copying
It may be possible to waive these rights through a public declaration similar to a licence.
“Creative Commons” (4.9.2018) by Michael Porter CC BY-NC-ND 2.0
What you need to consider
Respect the rights of others
Third parties
Individuals you work with
Licence shouldn’t clash with any previous licences (e.g. when collaborating on models)
In case of doubt: seek permission even when a CC-licence is assigned
Note that according to ETH law, ETH reserves most immaterial rights in works by its employees. When in doubt, contact ETH transfer (www.transfer.ethz.ch)
Make sure you keep sufficient rights
E.g. for Open Access Publishing (green path)
Licensing research data
Outlines pros and cons of each approach and gives practical advice on how to implement your licence
Creative Commons limitations
NC (Non-Commercial): What counts as commercial?
SA (Share Alike): Reduces interoperability
ND (No Derivatives): Severely restricts use
Horizon 2020 guidelines point to
OR
Best practices for personal data management
“MGB Grau Blau WP” (4.9.2018) by Bidgee/ CC BY-SA 3.0
GARBAGE IN, GARBAGE OUT!
Open standards (non proprietary)
If the format is proprietary, convert the data to an open format; if that is not possible, include a data viewer
Well documented
Widely used and supported by many tools
Uncompressed (or at least losslessly compressed)
Unencrypted
When in doubt, keep original and create a copy in an open or exchange format
Don’t rely on file extensions
Preferences for file formats
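The advice not to rely on file extensions can be illustrated with a short check of a file's leading bytes. This is a minimal sketch, not a replacement for a dedicated format identification tool: the function name `sniff_format` and the small signature table are assumptions made for this example, and the table covers only a handful of formats. HDF5 signatures may also sit at later offsets in the file, which this sketch does not check.

```python
# Leading byte signatures ("magic numbers") of a few common formats.
# The selection is illustrative only and far from complete.
SIGNATURES = [
    (b"\x89HDF\r\n\x1a\n", "HDF5 (also used by NetCDF-4)"),
    (b"CDF\x01", "NetCDF classic"),
    (b"CDF\x02", "NetCDF 64-bit offset"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP container (also docx, xlsx, odt)"),
    (b"\x1f\x8b", "gzip"),
]


def sniff_format(path):
    """Return a format name based on the first bytes of the file, or None."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return None
```

Calling `sniff_format` on a file named `results.txt` that actually contains gzip data would report gzip, regardless of the extension.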
This does not mean you «must not» keep data in other formats
Just be aware that proprietary or undocumented formats (even your own!) might cause trouble in the future
Think about adding an alternative format (yes, redundantly) for a proprietary one…
…and add any context information you yourself would like to have on your own formats in a few years' time, either in a readme file, an accompanying document or as metadata
Note
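One possible way to keep such context information next to the data is a small sidecar file. The sketch below is only an illustration: the function name `write_sidecar`, the `.readme.json` suffix and the field names are all assumptions for this example, not an ETH convention. A plain-text readme would serve the same purpose.

```python
import json
from datetime import date
from pathlib import Path


def write_sidecar(data_file, description, format_notes, created_with):
    """Write <data_file>.readme.json next to the data file and return its path."""
    data_file = Path(data_file)
    record = {
        "file": data_file.name,
        "description": description,    # what the data is and how it was produced
        "format_notes": format_notes,  # layout, units, byte order, known quirks
        "created_with": created_with,  # software and version that wrote the file
        "documented_on": date.today().isoformat(),
    }
    sidecar = data_file.with_name(data_file.name + ".readme.json")
    sidecar.write_text(
        json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return sidecar
```

The sidecar carries the full name of the data file, so the two stay recognisably paired when they are copied or archived together.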
"A story told in file names“ from
Does this remind you of something?
Keep in mind, this is just a suggestion.
Generally, keep stuff together that belongs together.
For further file and folder organisation tips, see:
http://www.data.cam.ac.uk/data-management-guide/organising-your-data
http://www.wur.nl/en/Expertise-Services/Data-Management-Support-Hub/Browse-by-Subject/Organising-files-and-folders.htm
http://datalib.edina.ac.uk/mantra/organisingdata/
Try this instead…
Context information needed
How to run the model
Which software is needed (incl. compilers)
Which hardware is needed
Always preserve a reference output together with the code, to ensure reproducibility
When writing a publication, preserve the model code, input and output needed to reproduce the results/figures presented in the article
Save aggregated data if possible (e.g. daily or monthly means, depending on time resolution); saving raw output for every timestep is not always necessary.
Preservation of (climate) models and data
«Gulf Stream Sea Surface Currents and Temperatures» (12.9.2018) by NASA Goddard Space Flight Center /CC BY
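The context information listed for model preservation (how to run, software, hardware, reference output) could be captured in a small manifest stored with the code. The following is a minimal sketch under stated assumptions: the function names, the manifest layout and the choice of SHA-256 are made up for this example. It records only the Python version and platform, so compilers and libraries would have to be added by hand. A bitwise checksum is also stricter than many models can satisfy across compilers or hardware, in which case a tolerance-based comparison may be more appropriate.

```python
import hashlib
import json
import platform
import sys
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(manifest_path, run_command, reference_outputs):
    """Record how the model was run, on what system, and checksums of its outputs."""
    manifest = {
        "run_command": run_command,
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "reference_outputs": {
            Path(p).name: sha256_of(p) for p in reference_outputs
        },
    }
    Path(manifest_path).write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest


def changed_outputs(manifest_path, output_dir):
    """Return names of outputs that are missing or differ from the reference."""
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    changed = []
    for name, expected in manifest["reference_outputs"].items():
        candidate = Path(output_dir) / name
        if not candidate.is_file() or sha256_of(candidate) != expected:
            changed.append(name)
    return changed
```

After a later re-run, an empty list from `changed_outputs` indicates that the new output matches the preserved reference output byte for byte.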
Long-term preservation
What does long term mean?
Different time horizons and purposes
Keeping data for at least ten years to ensure accountability if results are challenged (as defined in the ETH “Guidelines for Research Integrity”)
Potentially unlimited retention of data with permanent value (e.g. long running series of observational data)
Permanent retention of published data which is considered as part of the scientific record and is expected to remain available just like articles and journals are
In general “long term” signifies any time period which spans technological changes in the way data is being used
Old files?
How old is your oldest file? Is it in use?
Age of files need not be a problem in itself…
…but obsolescence is
How much are you willing to invest to recover unreadable files?
Files in regular use will not become obsolete unnoticed
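To answer "How old is your oldest file?" for a given folder, a short script can list the files with the earliest modification time. This is only a sketch: the function name `oldest_files` is an assumption for this example, and modification times can be reset by copying or restoring files, so the result is an indication rather than proof of a file's true age.

```python
from datetime import datetime, timezone
from pathlib import Path


def oldest_files(root, count=5):
    """Return (ISO date, path) pairs for the `count` least recently modified files."""
    entries = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            entries.append((path.stat().st_mtime, path))
    # Sort by modification time only, oldest first.
    entries.sort(key=lambda item: item[0])
    return [
        (datetime.fromtimestamp(mtime, tz=timezone.utc).date().isoformat(), path)
        for mtime, path in entries[:count]
    ]
```

Files that turn up at the top of this list and are no longer in regular use are natural candidates for checking whether they can still be opened.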
ETH Services
ETH Data-Archive (http://www.library.ethz.ch/Digital-Curation)
Not for mass storage and active data
DOI registration (http://www.library.ethz.ch/DOI-Desk-EN)
Open Access (http://www.library.ethz.ch/en/Open-Access) including payment of Article Processing Charges with a range of publishers
ETH E-Collection (http://e-collection.library.ethz.ch/index.php?lang=en)
ETH E-Citations (http://e-citations.ethbib.ethz.ch/index.php?lang=en)
Will be merged into «research collection» and allow publication of documents and data
Services at ETH Library
IT Services
Storage provisioning, usually via your IT Support Group
HSM (Hierarchical Storage Management) https://www.ethz.ch/services/en/it-services/catalogue/storage/nas.html
LTS (Long-Term Storage) https://www.ethz.ch/services/en/it-services/catalogue/storage/lts.html
openBIS ELN-LIMS https://openbis-eln-lims.ethz.ch/
ETH-Transfer https://www.ethz.ch/en/the-eth-zurich/organisation/staff-units/eth-transfer.html
IT services and ETH transfer
Training courses and workshops on information research, reference management, data management, scientific writing and open access by the ETH-Library:
http://www.library.ethz.ch/en/Services/Training-courses-guided-tours
Courses offered by the ETH Information Center for Chemistry/Biology/Pharmacy:
http://www.infozentrum.ethz.ch/en/whats-up/events/
Further topics on demand
Trainings
Thank you!
Digital Curation, ETH-Bibliothek, Rämistrasse 101, 8092 Zurich
www.library.ethz.ch/Digital-Curation
data-archive@library.ethz.ch
Dr. Ana Sesartic, 044 632 73 76, ana.sesartic@library.ethz.ch
Dr. Matthias Töwe, 044 632 60 32, matthias.toewe@library.ethz.ch