Federated European infrastructure for genomics data

(DIGITAL-2021-CLOUD-AI-01-FEI-DS-GENOMICS) - FEDERATED EUROPEAN INFRASTRUCTURE FOR GENOMICS DATA

Programme: Digital Europe Programme (DIGITAL)
Call: Cloud Data and TEF EU

Topic description

Expected outcome:

Outcomes and deliverables

The following elements shall be delivered by the selected project:

  • Deployment of an interoperable, FAIR-compliant and secure federated infrastructure and data governance needed to enable sustainable cross-border linkage of genomic data sets in compliance with the relevant legal, ethical, quality and interoperability requirements and agreed standards.
  • Platform enabling the application of appropriate high-end computing, AI and simulation resources to analyse the data.
  • Support for the establishment or upgrade of the necessary local infrastructure, and for the creation, extension and adaptation (e.g. FAIRification) of genomic datasets.
  • Business model including an uptake strategy explaining the motivation and incentives for all stakeholders at the different levels (national, European, global) to support the data infrastructure towards its sustainability, including data controllers (biobanks, hospitals/municipalities, research institutes, patients), data users (clinicians, researchers, policymakers, companies), service providers (e.g. IT industry, biotech industry), healthcare systems and public authorities at large.
  • Coordination support for the multi-country project ‘Genome of Europe’.
  • Comprehensive communication strategy.

Objective:

The objective is the creation of a technical infrastructure combined with governance mechanisms that will secure easy, cross-border access to key datasets in the targeted area. In particular, the aim of this topic is to achieve sustainable cross-border linkage of and access to a multitude of genomic and related phenotypic, clinical and other datasets across Europe based on the progress achieved in the context of the 1+ Million Genomes initiative (1+MG).[1] Authorised data users, such as clinicians, researchers and innovators, will be able to advance our understanding of genomics for more precise and faster clinical decision-making, diagnostics, treatments and predictive medicine, and for improved public health measures that will benefit citizens, healthcare systems and the overall economy. The resulting genomic data infrastructure will be aligned with the developments under the European health data space, including relevant projects supported under the EU4Health Programme. In order to maximise the societal benefits of health data use, the genomic data infrastructure should be supported by advanced IT tools and capacities, e.g. AI, HPC, cloud, blockchain and trust solutions, as appropriate for enabling secure access to and distributed analysis of complex datasets. Moreover, the measure will support the creation, extension and adaptation (e.g. FAIRification) of genomic datasets.

Scope:

This action will support the deployment of the infrastructure needed to make harmonised European genomic data resources and linked clinical, phenotypic and other information, where available and relevant, securely findable and accessible across national borders. The data infrastructure may combine existing data platforms via FAIR-compliant interfaces and should facilitate the extension or upgrade of existing datasets and the creation of new ones. To ensure maximum data protection, the data will in principle be analysed using distributed data analysis and AI learning techniques, while fully taking into account the applicable data protection requirements and the EU’s international obligations.
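
As an illustration only, distributed analysis of the kind described above can be sketched as follows: each node computes a summary over its own data, and only those summaries are combined centrally. The `Node` class, the function names and the variant identifier are hypothetical and are not part of the call text or of any 1+MG specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical national node; individual-level genotypes never leave it."""
    name: str
    # variant ID -> alternate-allele copies (0, 1 or 2) per individual
    genotypes: Dict[str, List[int]]

    def local_counts(self, variant: str) -> Tuple[int, int]:
        """Return (alternate allele count, total allele count) for one variant."""
        calls = self.genotypes.get(variant, [])
        return sum(calls), 2 * len(calls)


def federated_allele_frequency(nodes: List[Node], variant: str) -> Optional[float]:
    """Combine per-node summary counts; no individual-level data is transferred."""
    alt_total = 0
    allele_total = 0
    for node in nodes:
        alt, alleles = node.local_counts(variant)
        alt_total += alt
        allele_total += alleles
    return alt_total / allele_total if allele_total else None


# Made-up example data for two nodes.
nodes = [
    Node("node-a", {"variant-1": [0, 1, 2, 1]}),
    Node("node-b", {"variant-1": [0, 0, 1, 0, 0, 1]}),
]
frequency = federated_allele_frequency(nodes, "variant-1")
print(frequency)
```

A real deployment would add authentication, disclosure controls on small counts and audit logging; the sketch only shows the data flow in which summaries, not records, cross borders.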

The infrastructure is expected to provide a federated network of connected genomic databases deploying trust mechanisms (security and privacy by design), enabling data discovery, querying and use of appropriate computing capacities for distributed data analysis and providing a minimum of core services to facilitate the operation of the federated network. The federated system will comprise nodes with datasets, a common platform offering a (meta)data catalogue, a single entry point for data queries and output delivery, links to high performance computing capacities and data access control, secure authorisation and authentication services, and access to other relevant services. Data transfer across national borders and/or central storage, if and when needed, may take place (only on a voluntary basis) in accordance with applicable EU and national legal requirements.
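
The components listed above (metadata catalogue, single entry point, access control, nodes) could fit together roughly as in the following minimal sketch. All class names, fields and the handler signature are assumptions made for illustration; the call text does not prescribe any particular interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class CatalogueEntry:
    """Hypothetical metadata record describing one dataset held at a node."""
    node: str
    dataset: str
    data_types: Set[str]


@dataclass
class EntryPoint:
    """Hypothetical single entry point: discovery, access check, query dispatch."""
    catalogue: List[CatalogueEntry] = field(default_factory=list)
    # user ID -> datasets the user has been authorised to query
    authorised: Dict[str, Set[str]] = field(default_factory=dict)
    # node name -> function that runs a query locally and returns an aggregate
    handlers: Dict[str, Callable[[str, str], int]] = field(default_factory=dict)

    def discover(self, data_type: str) -> List[CatalogueEntry]:
        """Data discovery: search metadata only, never the data itself."""
        return [e for e in self.catalogue if data_type in e.data_types]

    def query(self, user: str, data_type: str, question: str) -> Dict[str, int]:
        """Send the query to every matching dataset the user may access."""
        allowed = self.authorised.get(user, set())
        results: Dict[str, int] = {}
        for entry in self.discover(data_type):
            if entry.dataset in allowed:
                results[entry.dataset] = self.handlers[entry.node](entry.dataset, question)
        return results


# Made-up catalogue, permissions and node responses.
entry_point = EntryPoint(
    catalogue=[
        CatalogueEntry("node-a", "cohort-a", {"WGS", "phenotype"}),
        CatalogueEntry("node-b", "cohort-b", {"WES"}),
        CatalogueEntry("node-c", "cohort-c", {"WGS"}),
    ],
    authorised={"researcher-1": {"cohort-a", "cohort-b"}},
    handlers={
        "node-a": lambda dataset, question: 120,
        "node-b": lambda dataset, question: 45,
        "node-c": lambda dataset, question: 300,
    },
)
result = entry_point.query("researcher-1", "WGS", "count carriers of a variant")
print(result)
```

In this sketch the catalogue reveals that `cohort-c` exists, but the query is not dispatched to it because the user holds no authorisation for that dataset, which separates discoverability from access.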

The system connecting harmonised data sources should provide at least the following functionalities: data discoverability, data reception/access, secure interfaces (APIs), data access management, and data processing (analysis). It should be based on FAIR principles, including common interoperability standards and mechanisms. Therefore, the implementation of the genomic data infrastructure should build on the progress achieved and agreements made within the 1+ Million Genomes initiative (and outcomes of the Horizon 2020 CSA project ‘Beyond 1 Million Genomes’). It should also take into account, link to and use the outcomes of other relevant H2020 projects, in particular exploring specific use cases/disease areas, and also collaborate and synergise with the related projects and partnerships supported under Horizon Europe.

The project shall, throughout its lifetime, inform and consult the representatives of Member States. It shall lead to defining a sustainable business model and setting up a coordination entity that will supervise the activities, run and maintain the system and its services, ensure the necessary agreements within the project and with Member States, and monitor the implementation of such agreements. The governance structures shall ensure that the rights and duties of both public and private participants are duly respected.

The action should support the coordination of the multi-country project ‘Genome of Europe’, launched in the context of the 1+MG initiative, towards the creation of a European network of harmonised national genomic reference cohorts representative of the European population.

The project is expected to engage with patients, health professionals, the public and other stakeholders to explain that data is used transparently and responsibly, and raise awareness of the expected benefits for European patients and citizens. To this end, it should design and implement a comprehensive communication strategy covering the relevant stakeholders and communication channels.

The project selected for the deployment of this data infrastructure will have to make provisions for gradually becoming fully compliant with the European Data Spaces Technical Framework. It will also have to coordinate and collaborate with other projects participating in the deployment of the data space and the Data Spaces Support Centre in order to allow integration of existing standards and to ensure interoperability and portability across infrastructure, applications and data.

Furthermore, the project implementing the genomic data infrastructure is encouraged to cooperate with the Testing and Experimentation Facility for Health (see section 2.3.2.2) to define European test and training data sets and to provide support in their establishment.

Implementation: European Commission

[1] https://ec.europa.eu/digital-single-market/en/european-1-million-genomes-initiative

Keywords

Algorithms, distributed, parallel and network algo; Cloud computing; Genomics, comparative genomics, functional genomic; Clinical bioinformatics; Population genetics; Privacy; Biomedical software; Data curation; Translational bioinformatics; High-performance computing (HPC); Health information; Pharmacology, pharmacogenomics, drug discovery and; eHealth; Personalised medicine; Bioinformatics, biocomputing, and DNA and molecula; Genomics; High performance computing; Trust; Artificial intelligence, intelligent systems, mult; Medical biotechnology related ethics; Biobanks; Digital Services and Platforms; Health data

Tags

Multi-party computing; 1+ Million Genomes; Whole Exome Sequencing (WES); Whole Genome Sequencing (WGS); Distributed learning; 1+MG; Federated data analysis
