ELIXIR Nodes help establish global standards for citation of biomedical data

A key element in long-term, consistent data citation and identification

EMBL-EBI has established a stable system for the identification and citation of life science data using Persistent Identifiers (PIDs). The system not only enables researchers to easily reference their data, but also provides a variety of supporting web services. It is built upon a high-quality curated registry containing several hundred life science data collections. Following a recently completed ELIXIR Implementation Study, the registry now includes all relevant databases operated by individual ELIXIR Nodes.

Defending against dead links

Having a dedicated service to check all reference hyperlinks and keep them up to date has many advantages. Dr Sarala Wimalaratne, Project Lead at EMBL-EBI, explains: “From time to time, hyperlinks to life science data records may change, for example due to technical updates or institutional changes. [The service] keeps track of all those changes and provides stable identification of life science data through the latest URLs used to access the data.”

This is very useful to researchers, as they can be confident that their stored reference links will continue to point to the right data source. It also helps developers of bioinformatics tools and database providers, both of whom need to maintain up-to-date cross-references. In interconnected networks of such cross-referenced systems, a single broken link can compromise the whole network; the service helps avoid ‘dead ends’ in networks of linked data.
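The indirection described above can be illustrated with a minimal sketch. It assumes a registry that maps a collection prefix to the provider's current URL pattern; the `registry` dictionary, the `resolve` function and the `example.org` URLs are all hypothetical and do not reflect the service's actual implementation.

```python
# Hypothetical registry: collection prefix -> URL template for the provider's
# current location. The example.org URLs are placeholders, not real endpoints.
registry = {
    "uniprot": "https://provider.example.org/entry/{id}",
}


def resolve(stored_reference: str) -> str:
    """Turn a 'prefix:identifier' reference into the URL currently on record."""
    prefix, _, local_id = stored_reference.partition(":")
    if not local_id or prefix not in registry:
        raise KeyError(f"cannot resolve {stored_reference!r}")
    return registry[prefix].format(id=local_id)


# A researcher stores this reference once and never edits it again.
stored_reference = "uniprot:P04150"
before = resolve(stored_reference)

# The provider moves its records; only the registry entry is updated.
registry["uniprot"] = "https://new-provider.example.org/records/{id}"
after = resolve(stored_reference)

print(before)
print(after)
```

The stored reference is unchanged across the move; only the curated registry entry needed updating, which is why a single maintained registry can protect many downstream cross-references at once.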

Nick Juty, from Manchester University (ELIXIR UK), says: “All entries in [the registry] are carefully curated to a high standard, collating all the necessary information to unambiguously and accurately identify individual data records. This is a continuous process, requiring the addition of new resources, whilst maintaining and updating existing records.”

“The ELIXIR Implementation Study helped integrate resources in ELIXIR Nodes into [the registry]. Researchers as well as scientific journals can now use a consistent citation scheme for any resource within the ELIXIR ecosystem,” says Jerry Lanfear, ELIXIR Chief Technical Officer. “This will make collaboration and linking of ELIXIR resources much easier. It also helps establish standards in data identification across the life science community,” adds Lanfear.

Global standards in data citation

Another goal of the ELIXIR Implementation Study was to improve and harmonise existing data citation practices in scientific literature and on the web at large. In collaboration with the team at the California Digital Library (CDL), the EMBL-EBI team developed a global approach for the formal citation of research data.

The citation system is based on compact identifiers: an easy-to-read, easy-to-process format that combines a unique prefix indicating an individual archive with a locally assigned identifier (e.g. uniprot:P04150). A compact identifier points to the same record through either EMBL-EBI’s or CDL’s resolving system. For this system to work globally, EMBL-EBI and CDL established a namespace registry with an easy-to-use form for requesting new prefixes, and clear governance and maintenance rules to resolve all references to the right data collections.
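As a rough illustration of the prefix-plus-identifier structure, the sketch below splits a compact identifier into its two parts and builds a resolver link from it. The syntax rule encoded in the regular expression, the names `CompactIdentifier`, `parse_compact_identifier` and `resolver_url`, and the two `example.org` resolver addresses are assumptions made for this example; the real resolvers' addresses and validation rules are not given in the text.

```python
import re
from typing import NamedTuple


class CompactIdentifier(NamedTuple):
    prefix: str      # indicates the archive, e.g. "uniprot"
    accession: str   # locally assigned identifier, e.g. "P04150"


# Assumed syntax: a lowercase prefix, a colon, then a non-empty local identifier.
# Splitting on the first colon lets the local identifier contain colons itself.
_PATTERN = re.compile(r"^(?P<prefix>[a-z0-9._]+):(?P<accession>\S+)$")


def parse_compact_identifier(text: str) -> CompactIdentifier:
    """Split 'prefix:accession' into its parts, rejecting malformed input."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a compact identifier: {text!r}")
    return CompactIdentifier(match["prefix"], match["accession"])


def resolver_url(resolver_base: str, cid: CompactIdentifier) -> str:
    """Append the compact identifier to a resolver's base address."""
    return f"{resolver_base.rstrip('/')}/{cid.prefix}:{cid.accession}"


# Placeholder resolver addresses (hypothetical, for illustration only).
RESOLVERS = ("https://resolver-a.example.org", "https://resolver-b.example.org/")

cid = parse_compact_identifier("uniprot:P04150")
urls = [resolver_url(base, cid) for base in RESOLVERS]
print(cid)
print(urls)
```

The same compact identifier is handed unchanged to either resolver, which is what allows two independently operated services to lead to the same record as long as they share one prefix registry.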

This new approach was developed by an international team and has been presented in a recent paper in Nature Scientific Data, with lead authorship by EMBL-EBI and CDL staff. It will benefit not only authors, but also scientific journals and other publishers in the life sciences.

Nature Scientific Data announced today that it will be “taking advantage of the resolver services offered by [EMBL-EBI and CDL] to provide more standardized and predictable links for biomedical datasets that have accession identifiers.”

Source: ELIXIR
