Selected IMI projects and their datasets
As there are more than 100 IMI/IMI2 projects and even more data sources from the pharmaceutical industry, an important task is to identify data sets representing the main data types for initial FAIRification.
FAIRplus selected four IMI projects whose datasets will be FAIRified early on in the project to demonstrate its impact and to establish procedures for assessing the maturity of data in terms of FAIRness.
The first four IMI projects each represent a different indication area, stage of the drug discovery process and level of IMI project maturity:
The ND4BB-TRANSLOCATION IMI consortium (www.nd4bb.eu) operated between 2013 and 2018 and involved 5 EFPIA and 20 Public partners. Its goal was to develop new insights into the molecular determinants of antibiotic drug influx and efflux in Gram-negative bacteria. It also developed a platform (the ND4BB InfoCentre) to facilitate information sharing in antibiotic research.
Within the InfoCentre, data on historical successes and failures in antibacterial research and development are combined with information drawn from public databases and new insights generated within the ND4BB consortia on the fundamental biophysical determinants of compound transport mechanisms. The IMI InfoCentre was developed to coordinate the disclosure and combined analyses of previously confidential information. This information was provided primarily by participating organisations across the New Drugs for Bad Bugs (ND4BB) programme. Initial FAIRification efforts will focus on a highly annotated database covering the antimicrobial and transport-related properties of clinical antibiotics and tool compounds. Additional data sets cover PK and efficacy studies for antibiotics from EFPIA partners.
The OncoTrack consortium (www.oncotrack.eu) was active from 2011 to 2016 with 8 EFPIA and 11 Public partners. The aim of the project was to establish innovative approaches for biomarker assessment in the peripheral circulation in order to assess the potential of circulating tumor cells, cancer stem cells and/or nucleic acids as a surrogate for invasive biopsies.
OncoTrack was the first large study to provide extensive molecular characterization of a patient donor cohort, encompassing all the disease stages and to include the establishment of matched in vivo and in vitro models. Data sets for FAIRification included whole exome (in some cases whole genome) sequencing, transcript sequencing and methylome analysis of clinical tumour samples. Confirmatory genome sequencing and transcriptome analyses were performed on xenograft and cell culture models.
In addition, drug response data for a panel of 16 therapeutic agents in the xenograft and cell culture models was generated. Related proteome data is available from both multiplex-MS methods and RPA studies. Possible scientific and societal impact from systematic reuse of the OncoTrack data includes the potential for development of new diagnostic procedures and the wider implementation of the developed biological models in the translational research process.
The eTOX consortium (www.etoxproject.eu) was composed of 17 Public and 13 EFPIA partners and was active from 2010 to 2016. The focus of the consortium was to create a platform for legacy data sharing from industrial and public sources in order to improve the efficiency of drug safety testing. Of special interest was the wealth of high-quality toxicology data in the archives of the pharmaceutical companies, which, through the development of innovative strategies and novel software tools, could be interrogated to enable early prediction of the potential side effects of new drug candidates on the basis of integrative approaches.
In practice, this led to the development of a large number of models by diverse organizations, their integration into a single system (eTOXsys) and their adoption by the pharmaceutical industry. Strict procedures and tools were established for enabling the harmonized development, documentation, verification and implementation of the models, as well as their versioning and maintenance. The project developed new tools including eTOXlab, a flexible modelling framework, ADAN, a method for the assessment of the model applicability domain, and specific protocols for model documentation (OECD QSAR).
In the context of the FAIRplus pilot, a test database covering toxicological data from around 100 compounds will be analysed. This will act as a starting point for the eventual FAIRification of the entire eTOX data resource, which covers over 8.8 million pre-clinical data points from 8,196 pre-clinical studies on nearly 2,000 chemicals. Ultimately, it is planned that these FAIR data approaches will be integrated into the data management procedures of the successor IMI project, eTRANSAFE.
RESOLUTE is a public-private partnership (www.re-solute.eu) which started in 2018, with 13 partners from academia and industry. The consortium aims to trigger an escalation in the appreciation and intensity of research on solute carriers (SLCs) and to establish SLCs as a tractable target class for medical research and development.
By coupling an inclusive "open-access ethos" for data, results, techniques and reagents with the highest-possible quality of research output, RESOLUTE expects to accelerate the pace and increase the impact of research in the field of SLCs, to the global benefit of basic academic research through to applied research in SMEs and pharmaceutical companies. Data types will include RNA-Seq and quantitative gene expression analyses of various cell lines, and SLC-target family-wide cell viability screen measurements using sgRNA libraries.
As a newly commenced project, the RESOLUTE consortium, through its cooperation with FAIRplus, will seek to implement FAIR methods in its data management approach from the earliest stages of its operation.