Data Infrastructure

Experimental Medicine Application Platform (EMAP)

EMAP is a translational data science platform built in and for the NHS and has been specifically created to support research. It contains over 100 million health data items from UCLH, with 500 items added every minute, and has been developed as a non-operational “mirror” of a subset of UCLH data (historical and live). The underpinning aim is to ensure that no clinical data are corrupted or destroyed during the interaction between the research process and the hospital’s systems and that the systems are not compromised (for instance, if they are interrupted or slowed down by research enquiries).

Today, the typical way for a researcher to access hospital data is to extract it from the hospital into the outside world. This introduces privacy risks, as the data leave the protected environment of the NHS. EMAP reverses this process. By providing a software environment within the hospital, we enable research to happen inside the NHS, so that patient data never have to leave.

Currently, EMAP includes demographics, vital signs, lab tests and more, and consists of:

  • a live data warehouse
  • a secure research environment giving ready access to modern data science and software development tools
  • access to powerful and flexible compute facilities.

Data are transmitted live into EMAP from Epic, UCLH’s new electronic health record system (EHRS), by means of HL7 interfaces (a set of international standards for transfer of clinical and administrative data between software applications used by various healthcare providers), supplemented by standard ETL (“extract, transform, load”) pipelines where appropriate. Additionally, data from legacy systems have been transferred into a “data lake”, a repository that holds vast amounts of raw data until needed.
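To give a flavour of what an HL7 feed looks like to a developer, the sketch below reads a single observation message. It assumes the pipe-delimited HL7 version 2 format, which the text above does not specify. The sample message, the identifiers in it and the function names are all invented for illustration and are not taken from EMAP.

```python
# Hypothetical HL7 v2 observation message. Every value here is invented.
# Segments are separated by carriage returns and fields by "|".
SAMPLE_MESSAGE = "\r".join([
    "MSH|^~\\&|EHR|HOSPITAL|WAREHOUSE|HOSPITAL|20240101120000||ORU^R01|MSG0001|P|2.5",
    "PID|1||12345678^^^HOSPITAL^MRN",
    "OBX|1|NM|HR^Heart rate||72|/min|||||F",
])


def parse_segments(message):
    """Group the fields of each segment by segment name (MSH, PID, OBX, ...)."""
    segments = {}
    for line in message.split("\r"):
        if not line:
            continue
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


def extract_observations(message):
    """Return one flat record per OBX (observation) segment."""
    segments = parse_segments(message)
    # PID-3 holds the patient identifier; its components are separated by "^".
    patient_id = segments["PID"][0][3].split("^")[0]
    observations = []
    for obx in segments.get("OBX", []):
        # OBX-3 is "code^description", OBX-5 the value, OBX-6 the units.
        code, _, name = obx[3].partition("^")
        observations.append({
            "patient_id": patient_id,
            "code": code,
            "name": name,
            "value": obx[5],
            "units": obx[6],
        })
    return observations


if __name__ == "__main__":
    for record in extract_observations(SAMPLE_MESSAGE):
        print(record)
```

A production interface would also need to handle escape sequences, repeated fields and message acknowledgements, which is part of what makes these standards demanding to work with directly.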

Data within the warehouse are modelled in a tiered series of databases optimised for different use cases, culminating in data coded using SNOMED-CT (a structured clinical vocabulary for use in an electronic health record), and modelled using the OMOP format to aid interoperability. A FHIR interface is currently in development.
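As a rough illustration of what modelling in the OMOP format involves, the sketch below turns a locally coded observation into a row shaped like the OMOP measurement table. The field names follow the OMOP Common Data Model, but the lookup table, the concept ids in it and the function name are placeholders invented for this example and do not describe EMAP's own pipeline.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical lookup from local source codes to standard concept ids.
# Real ids would come from the OMOP vocabulary tables; these are placeholders.
SOURCE_TO_CONCEPT = {"HR": 1001, "NA": 1002}

# OMOP convention: concept id 0 means "no matching concept".
UNMAPPED_CONCEPT_ID = 0


@dataclass(frozen=True)
class Measurement:
    """A small subset of the columns in the OMOP measurement table."""
    person_id: int
    measurement_concept_id: int
    measurement_date: date
    value_as_number: float
    measurement_source_value: str


def to_measurement(person_id, source_code, value, when):
    """Map a locally coded observation onto an OMOP-style measurement row."""
    concept_id = SOURCE_TO_CONCEPT.get(source_code, UNMAPPED_CONCEPT_ID)
    return Measurement(
        person_id=person_id,
        measurement_concept_id=concept_id,
        measurement_date=when,
        value_as_number=float(value),
        # The original code is kept so that unmapped rows can be revisited later.
        measurement_source_value=source_code,
    )


if __name__ == "__main__":
    print(to_measurement(1, "HR", 72, date(2024, 1, 1)))
```

Keeping the source value alongside the standard concept id means that nothing is lost when a local code has no mapping yet.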

EMAP is one of the workstreams of the INFORM project, initially funded by the UCLH Charity and supported by the UCLH/UCL BRC. The development of EMAP at UCLH has been led by Dr Mark White (UCLH Chief Technology Officer) in conjunction with Drs Steve Harris, Tim Bonnici, Dave Brealey and Niall MacCallum.

Why was EMAP developed?

The potential for data-driven products – analytics, algorithms and apps – to improve health care is widely recognised. However, despite an explosion in the number of digital health papers being published, very few innovations make it out of the lab to the bedside. Similarly, the creative potential of frontline NHS staff remains largely untapped because of the difficulty of using data for innovation within their own hospitals.

A number of barriers stand in the way:

  • Routinely-collected clinical data are “siloed”, “dirty” and difficult to access.
  • Modern tools and compute are not routinely available within the NHS firewall, where the data rest. Exporting the data means grappling with the increasingly thorny issues of patient privacy, information security and data governance.
  • Live data to drive real-time apps and clinical decision support are not readily available. Even where they are, using them requires in-depth knowledge of the hospital infrastructure and arcane messaging standards. This makes it hard for innovators to develop applications and evaluate their real-world effect with the same rigour with which we expect drugs to be evaluated.
  • There is no safe space for experimentation using live data. The need to protect the operational integrity of clinical information systems is at odds with the innovation process, which inevitably involves iteration, blind alleys and mistakes.

Beyond the technological barriers, there are skills and data-literacy barriers. The staff delivering patient care day in, day out are well placed to understand where digitally-enabled change would be useful, but they may not have the knowledge or skills to translate their ideas into effective and sustainable technological innovation.