Research Data Management Training for IAC


(1)

Research Collection

Educational Material

Research Data Management Training for IAC

Author(s): Petrus, Ana; Töwe, Matthias

Publication Date: 2017-02-03

Permanent Link: https://doi.org/10.3929/ethz-b-000296488

Rights / License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International


(3)

Why manage your research data?

© “The Carrot and the Stick Set” (4.9.2018) by Frits Ahlefeldt

(4)

Why spend time and effort on this?

So you can work efficiently and effectively

Save yourself time and reduce frustration

Facilitate collaboration in your group

Highlight patterns or connections that might otherwise be missed

Because your data is precious – and costly

To enable data re-use and sharing – even for yourself

To meet funders’ and institutional requirements

You are the experts who can make informed decisions!

You as the authors are responsible for the work you deliver

To meet funders’ and institutional requirements

As group leaders you will be held responsible for complying with your institution’s and funders’ regulations

You oversee students’ and staff members’ work and need to rely on them following good scientific practice

You may be able to influence the discussion in your community, in your institution and with funders

(5)

ETH regulations, intellectual property, privacy and access rights

(6)

Recent Overview

(7)

Guidelines for Research Integrity

At the ETH Zurich research is founded on intellectual honesty. Researchers […] are committed to scientific integrity and truthfulness in research and peer review.

https://www.ethz.ch/content/dam/ethz/main/research/pdf/forschungsethik/Broschure.pdf

(8)

Article 11. Collection, documentation and storage of primary data

All steps in the treatment of primary data (statistical analyses, reorganizations, etc.) must be documented in a form appropriate to the discipline in question (e.g. laboratory logs, other data carriers) in such a way as to ensure that the results obtained from the primary data can be reproduced completely.

The project management is responsible for data management (data collection, storage, data access, compliance with data protection requirements, etc.). In particular, it should ensure that, following completion of the project, the data and materials are retained for the period prescribed in the discipline, and are duly destroyed within the period prescribed by law, if appropriate.

(9)

Roles and Responsibilities

• Project Members:

Adhere to the principles of good scientific practice and the Guidelines for Research Integrity at ETH.

All steps in the treatment of primary data must be documented, and results must be reproducible.

• Project Manager:

Responsible for the execution of a scientific project and for data management (data collection, storage, data access, compliance with data protection requirements, ...).

Ensures that all research project participants are aware of the guidelines.

Determines, together with the professor, which departed project members should retain access to the primary data or materials.

(10)

Licences / Open Access / Creative Commons

(11)

Compliance Guide

[…] all ETH members […] are required to integrate the general conditions and internal directives into the work process.

The following Compliance Guide is designed to serve as an orientation tool. […]

To facilitate implementation, each point is supplemented with further information and contact persons available for consultation.

https://rechtssammlung.sp.ethz.ch/Dokumente/133_en.pdf

(12)

Intellectual Property Rights

Intellectual Property Rights (IPR) can be defined as rights acquired over any work created or invented with the intellectual effort of an individual. These rights are time-limited and do not require explicit assertion.

Owners of works are granted exclusive rights

to publish the work

to license the work’s distribution to others

to sue in case of unlawful or deceptive copying

It may be possible to waive these rights through a public declaration similar to a licence.

(13)

Image: “Creative Commons” (4.9.2018) by Michael Porter, CC BY-NC-ND 2.0

(14)

What you need to consider

• Respect the rights of others

Third parties

Individuals you work with

Licence shouldn’t clash with any previous licences (e.g. when collaborating on models)

• In case of doubt: seek permission even when a CC licence is assigned

• Note that according to ETH law, ETH reserves most immaterial rights in works by its employees. When in doubt, contact ETH transfer (www.transfer.ethz.ch)

• Make sure you keep sufficient rights

E.g. for Open Access publishing (green path)

(15)

Licensing research data

Outlines pros and cons of each approach and gives practical advice on how to implement your licence

Creative Commons limitations:

NC (Non-Commercial): What counts as commercial?

SA (Share Alike): Reduces interoperability

ND (No Derivatives): Severely restricts use

Horizon 2020 guidelines point to

OR

(16)

Best practices for personal data management

“MGB Grau Blau WP” (4.9.2018) by Bidgee / CC BY-SA 3.0

GARBAGE IN, GARBAGE OUT!

(17)

Preferences for file formats

Open standards (non-proprietary)

If proprietary, convert or, if that is not possible, include a data viewer

Well documented

Widely used and supported by many tools

Uncompressed (or at least losslessly compressed)

Unencrypted

When in doubt, keep the original and create a copy in an open or exchange format

Don’t rely on file extensions
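One way to act on "don't rely on file extensions" is to look at the first bytes of a file (its "magic number") instead of its name. The following is a minimal sketch using only the Python standard library. The function name `sniff_format` and the small signature table are illustrative assumptions, and the table is far from complete.

```python
# Leading bytes ("magic numbers") of a few common formats.
# Illustrative selection only; real tools know far more signatures.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"PK\x03\x04": "ZIP container (also used by docx, xlsx, odt)",
    b"\x1f\x8b": "gzip",
    b"\x89HDF\r\n\x1a\n": "HDF5 (also used by NetCDF-4)",
    b"CDF\x01": "NetCDF classic",
}


def sniff_format(path):
    """Guess a file's format from its first bytes; return None if unknown."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    for magic, name in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return None
```

A file called `data.txt` that starts with `%PDF-` would be reported as PDF, whatever its extension claims. Plain text formats such as CSV have no signature and come back as `None`, so this check complements documentation and does not replace it.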

(18)

Note

This does not mean you «must not» keep data in other formats

Just be aware that proprietary or undocumented formats (even your own!) might cause trouble in the future

Think about adding an alternative format (yes, redundantly) for a proprietary one…

…and add any context information you yourself would like to have on your own formats in a few years’ time, in a readme file, an accompanying document or as metadata

(19)

"A story told in file names“ from

Does this remind you of something?

(20)

Try this instead…

Keep in mind, this is just a suggestion.

Generally, keep things together that belong together.

For further file and folder organisation tips, see:

http://www.data.cam.ac.uk/data-management-guide/organising-your-data

http://www.wur.nl/en/Expertise-Services/Data-Management-Support-Hub/Browse-by-Subject/Organising-files-and-folders.htm

http://datalib.edina.ac.uk/mantra/organisingdata/
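As a minimal sketch of what a consistent naming convention might look like in practice, the helper below builds file names from a date, a project label, a short description and a version number. The pattern `<date>_<project>_<description>_v<NN>.<ext>` and the function name `build_name` are assumptions made for illustration; they are not a scheme prescribed by this training or by the guides linked above.

```python
import re
from datetime import date


def slug(text):
    """Lower-case the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_name(project, description, version, extension, day):
    """Build a sortable file name: date first, zero-padded version last."""
    return (
        f"{day.isoformat()}_{slug(project)}_{slug(description)}"
        f"_v{version:02d}.{extension}"
    )


# Hypothetical example values.
print(build_name("IAC", "Aerosol run", 2, "nc", date(2017, 2, 3)))
# prints 2017-02-03_iac_aerosol-run_v02.nc
```

Putting the ISO date first makes an alphabetical listing chronological, and the zero-padded version number keeps `v02` ahead of `v10`.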

(21)

Preservation of (climate) models and data

Context information needed

How to run the model

Which software is needed (incl. compilers)

Which hardware is needed

• Always preserve a reference output together with the code, to ensure reproducibility

• When writing a publication, preserve the model code, input and output needed to reproduce the results/figures presented in the article

• Save aggregated data if possible (e.g. daily or monthly means, depending on time resolution); saving raw output for each timestep is not always necessary.

«Gulf Stream Sea Surface Currents and Temperatures» (12.9.2018) by NASA Goddard Space Flight Center /CC BY
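Two of the points above lend themselves to a short illustration: checking a fresh model run against a preserved reference output, and reducing per-timestep output to daily means. This is a minimal sketch using only the Python standard library. The function names, the tolerance values and the representation of output as plain lists of numbers or `(timestamp, value)` pairs are assumptions; real model output would usually be read from files such as NetCDF with dedicated libraries.

```python
import math
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def matches_reference(output, reference, rel_tol=1e-9, abs_tol=0.0):
    """True if both sequences have equal length and agree within tolerance."""
    if len(output) != len(reference):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(output, reference)
    )


def daily_means(records):
    """Aggregate (timestamp, value) pairs into a {date: mean value} mapping."""
    buckets = defaultdict(list)
    for stamp, value in records:
        buckets[stamp.date()].append(value)
    return {day: fmean(values) for day, values in sorted(buckets.items())}


# Hypothetical values, for illustration only.
run = [
    (datetime(2018, 9, 12, 0), 1.0),
    (datetime(2018, 9, 12, 12), 3.0),
    (datetime(2018, 9, 13, 0), 5.0),
]
print(daily_means(run))
```

A tolerance-based comparison is used because bit-identical output is often not achievable across compilers or hardware, which is one reason the software and hardware context should be recorded alongside the reference output.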

(22)

Long-term preservation

(23)

What does long term mean?

Different time horizons and purposes

Keeping data for at least ten years to ensure accountability if results are challenged (as defined in the ETH “Guidelines for Research Integrity”)

Potentially unlimited retention of data with permanent value (e.g. long running series of observational data)

Permanent retention of published data which is considered as part of the scientific record and is expected to remain available just like articles and journals are

In general “long term” signifies any time period which spans technological changes in the way data is being used


(24)

Old files?

How old is your oldest file? Is it in use?

Age of files need not be a problem in itself…

…but obsolescence is

How much are you willing to invest to recover unreadable files?

Files in regular use will not become obsolete unnoticed
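To answer "how old is your oldest file?" for a given directory tree, a small script can list the files with the oldest modification times. This is a minimal sketch using only the Python standard library. The function name `oldest_files` is an assumption, and modification time is only a rough proxy: it shows when a file was last changed, not whether it is still in use or still readable.

```python
import os
from datetime import datetime


def oldest_files(root, count=5):
    """Return up to `count` (path, modified) pairs, oldest modification first."""
    found = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                found.append((os.path.getmtime(path), path))
            except OSError:
                continue  # skip files that vanished or cannot be accessed
    found.sort()
    return [(path, datetime.fromtimestamp(mtime)) for mtime, path in found[:count]]


if __name__ == "__main__":
    for path, modified in oldest_files("."):
        print(modified.date(), path)
```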

(25)

ETH Services

(26)

Services at ETH Library

ETH Data-Archive (http://www.library.ethz.ch/Digital-Curation)

Not for mass storage and active data

DOI registration (http://www.library.ethz.ch/DOI-Desk-EN)

Open Access (http://www.library.ethz.ch/en/Open-Access) including payment of Article Processing Charges with a range of publishers

ETH E-Collection (http://e-collection.library.ethz.ch/index.php?lang=en)

ETH E-Citations (http://e-citations.ethbib.ethz.ch/index.php?lang=en)

Will be merged into «research collection» and allow publication of documents and data

(27)

IT services and ETH transfer

IT Services

Storage provisioning, usually via your IT Support Group

HSM (Hierarchical Storage Management) https://www.ethz.ch/services/en/it-services/catalogue/storage/nas.html

LTS (Long-Term Storage) https://www.ethz.ch/services/en/it-services/catalogue/storage/lts.html

openBIS ELN-LIMS https://openbis-eln-lims.ethz.ch/

ETH-Transfer https://www.ethz.ch/en/the-eth-zurich/organisation/staff-units/eth-transfer.html

(28)

Trainings

• Training courses and workshops on information research, reference management, data management, scientific writing and open access by the ETH-Library:

http://www.library.ethz.ch/en/Services/Training-courses-guided-tours

• Courses offered by the ETH Information Center for Chemistry/Biology/Pharmacy:

http://www.infozentrum.ethz.ch/en/whats-up/events/

• Further topics on demand

(29)

Thank you!

Digital Curation

ETH-Bibliothek

Rämistrasse 101

8092 Zurich

www.library.ethz.ch/Digital-Curation

data-archive@library.ethz.ch

Dr. Ana Sesartic, 044 632 73 76, ana.sesartic@library.ethz.ch

Dr. Matthias Töwe, 044 632 60 32, matthias.toewe@library.ethz.ch
