Introduction

The OpenAIRE (**Open A**ccess **I**nfrastructure for **R**esearch in **E**urope) Guidelines were established to support the Open Access/Open Science strategy of the European Commission and to meet the requirements of the OpenAIRE infrastructure. This new version of the Guidelines has a broader scope, in line with the expanded aims of the OpenAIRE initiative and its infrastructure. The Guidelines are intended to guide repository managers in exposing open access and non-open access publications to the OpenAIRE infrastructure, together with funding information where applicable.

Aim

The OpenAIRE Guidelines for Data Archive Managers 3.0 provide orientation for data repository managers to define and implement their local data management policies when exposing metadata for data products, according to the requirements of OpenAIRE (Open Access Infrastructure for Research in Europe). The guidelines indicate how to make dataset products citable, so that they become first-class citizens of an interlinked Open Science scholarly communication ecosystem. Adhering to the guidelines significantly increases the exposure, visibility, and re-use of repository content.

By implementing the OpenAIRE Guidelines, data archive managers are facilitating the creation of enhanced publications and building the stepping-stones for a linked data infrastructure for research.

According to the Content Acquisition Policies (CAP, 10.5281/zenodo.1446408) of the OpenAIRE infrastructure, metadata from data archives can be included and shown in the OpenAIRE Research Graph without any restrictions.

OpenAIRE is happy to assist in adherence to these guidelines.

Rationale

The goal of the OpenAIRE guidelines for datasets is to give datasets immediate visibility as “citable research products”, based on the current state of the art in scholarly communication, while pointing the way towards “good dataset citation practices”. Research datasets are currently available from the following kinds of scholarly communication repositories:

  • Institutional repositories: dataset descriptions are currently provided as Dublin Core metadata records
  • Data repositories: dataset descriptions are currently provided as DataCite metadata records

The guidelines aim to make these repositories readily compliant so that they can start exposing dataset entities to discovery and citation services. To that end, the guidelines should be endorsed by the community (e.g. include properties that reflect the needs of dataset citation), should not impose high effort on sources (e.g. by mandating citation metadata that sources do not have), and should recommend best practices (e.g. by marking further citation metadata as recommended or optional). Accordingly, the guidelines have been defined with a pragmatic approach: they keep mandatory properties to a minimum, focus on properties for citation (attribution and access), and disregard discover-for-reuse properties. Any property can still be added in the future to reflect changes that should, and hopefully will, occur on the repository side and in the behaviour of scientists who create, share, cite, and re-use research datasets.

The guidelines take inspiration from the following initiatives on dataset description and citation:

Acknowledgements & Contributors

Editors

  • Andreas Czerniak (Bielefeld University Library, Germany, orcid.org/)
  • Aenne Loehden (Bielefeld University Library, Germany, orcid.org/)

Experts & Reviewers

Versions

  • 3.0-Draft June 2020, Updated to DataCite Metadata Schema v4.3 (10.14454/f2wp-s162)
  • 2.0 April 2014, Updated to DataCite Metadata Schema v3.0
  • 1.0 December 2012, Initial document

Citation