Update on OpenTrialsFDA: finalist for the Open Science Prize

In May, the OpenTrialsFDA team (a collaboration between Erick Turner, Dr. Ben Goldacre and the OpenTrials team at Open Knowledge) was selected as a finalist for the Open Science Prize. This global science competition is focused on making both the outputs from science and the research process broadly accessible to the public. Six finalists will present their final prototypes at an Open Science Prize Showcase in early December 2016, with the ultimate winner to be announced in late February or early March 2017.

As the name suggests, OpenTrialsFDA is closely related to OpenTrials, a project funded by The Laura and John Arnold Foundation that is developing an open, online database of information about the world’s clinical research trials. OpenTrialsFDA will work on increasing access, discoverability and opportunities for re-use of a large volume of high-quality information currently hidden in user-unfriendly Food and Drug Administration (FDA) drug approval packages (DAPs).

The FDA publishes these DAPs as part of the general information on drugs via its data portal known as Drugs@FDA. These documents contain detailed information about the methods and results of clinical trials, and are less prone to bias than reports of clinical trials in academic journals. This is because FDA reviewers require adherence to the outcomes and analytic methods prespecified in the original trial protocols, so, in contrast to most journal editors, they are unforgiving of practices such as post hoc switching of outcomes and changes to the planned statistical analyses. These review packages also often report on clinical trials that have never been published.


A more complete picture: contrasting the journal version of antidepressant trials with the FDA information (image: Erick Turner, adapted from http://bit.ly/1XKLjzp)

However, despite their high value, these FDA documents are notoriously difficult to access, aggregate, and search. The website itself is not easy to navigate, and much of the information is stored in PDFs or, for older drugs, in non-searchable image files. As a consequence, the documents are rarely used by clinicians and researchers. OpenTrialsFDA will work on improving this situation, so that valuable information that is currently hidden away can be discovered, presented, and used to properly inform evidence-based treatment decisions.

The team has started to scrape the FDA website, extracting the relevant information from the PDFs through a process of OCR (optical character recognition). A new OpenTrialsFDA interface will be developed to explore and discover the FDA data, with application programming interfaces (APIs) allowing third-party platforms to access, search, and present the information, thus maximising discoverability, impact, and interoperability. In addition, the information will be integrated into the OpenTrials database, so that for any trial for which a match exists, users can see the corresponding FDA data.
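To make the matching step more concrete, here is a minimal sketch of one way OCR output could be linked to trial records. It is an illustration only, not the team's actual method: it assumes that matching is done on ClinicalTrials.gov registry identifiers ("NCT" followed by eight digits) appearing in the extracted text, and the function names and input shapes are hypothetical. Older approval packages may not contain such identifiers at all, and OCR errors (for example, a letter "O" in place of a zero) would need separate handling.

```python
import re

# ClinicalTrials.gov identifiers have the form "NCT" followed by 8 digits.
# Allow one optional space after "NCT", which OCR output sometimes introduces.
NCT_PATTERN = re.compile(r"\bNCT\s?(\d{8})\b", re.IGNORECASE)


def find_trial_ids(ocr_text):
    """Return the sorted, de-duplicated NCT identifiers found in OCR text."""
    return sorted({"NCT" + m.group(1) for m in NCT_PATTERN.finditer(ocr_text)})


def match_documents(documents, known_trials):
    """Map each known trial ID to the FDA documents that mention it.

    documents: dict of document name -> OCR text (hypothetical input shape)
    known_trials: set of trial IDs already in the database (hypothetical)
    """
    matches = {}
    for name, text in documents.items():
        for trial_id in find_trial_ids(text):
            if trial_id in known_trials:
                matches.setdefault(trial_id, []).append(name)
    return matches
```

With a lookup of this kind, a trial page would only show FDA data when at least one document maps to its identifier, which corresponds to the "for any trial for which a match exists" condition above.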

