Improving clinical trials with machine learning

Machine learning could improve our ability to determine whether a new drug works in the brain, potentially enabling researchers to detect drug effects that would be missed entirely by conventional tests, according to a study published today in Brain.

Study author Dr Parashkev Nachev said: “Current statistical models are too simple. They fail to capture complex biological variations across people, discarding them as mere noise. We suspected this could partly explain why so many drug trials work in simple animals but fail in the complex brains of humans. If so, machine learning capable of modelling the human brain in its full complexity may uncover treatment effects that would otherwise be missed.”

To test the concept, the research team looked at large-scale data from stroke patients, extracting the complex anatomical pattern of brain damage caused by the stroke in each patient. In the process, they created the largest collection of anatomically registered images of stroke ever assembled. As an index of the impact of stroke, they used gaze direction, objectively measured from the eyes as seen on head CT scans taken upon hospital admission and on MRI scans typically done one to three days later.

The team then simulated a large-scale meta-analysis of a set of hypothetical drugs, to see whether treatment effects of different magnitudes that would have been missed by conventional statistical analysis could be identified with machine learning. For example, given a drug treatment that shrinks a brain lesion by 70%, the team tested for a significant effect using both conventional (low-dimensional) statistical tests and high-dimensional machine learning methods.

The machine learning technique took into account the presence or absence of damage across the entire brain, treating the stroke as a complex “fingerprint”, described by a multitude of variables.

First author Tianbo Xu said: “Stroke trials tend to use relatively few, crude variables, such as the size of the lesion, ignoring whether the lesion is centred on a critical area or at the edge of it. Our algorithm learned the entire pattern of damage across the brain instead, employing thousands of variables at high anatomical resolution. We used well-established methods of machine learning, teaching the algorithm on subsets of data and then testing its performance on other subsets it had not seen.”
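The idea of training on some subsets of the data and testing on others can be illustrated with a minimal k-fold cross-validation sketch. This is not the study's code: the binary "lesion maps", the rule that the outcome depends on a single voxel, and the helper names (`k_fold_indices`, `fit_best_voxel`, `cross_validate`) are all assumptions made for illustration.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_best_voxel(maps, labels):
    """Toy model: pick the voxel whose damage agrees most often with the outcome."""
    def agreement(voxel):
        return sum(m[voxel] == y for m, y in zip(maps, labels))
    return max(range(len(maps[0])), key=agreement)

def predict(voxel, lesion_map):
    """Predict the outcome from damage at the chosen voxel."""
    return lesion_map[voxel]

def cross_validate(maps, labels, k=5, seed=0):
    """Fit on k-1 folds, score on the held-out fold, and repeat for every fold."""
    scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held_out_set = set(held_out)
        train = [i for i in range(len(labels)) if i not in held_out_set]
        voxel = fit_best_voxel([maps[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(voxel, maps[i]) == labels[i] for i in held_out)
        scores.append(hits / len(held_out))
    return scores

# Synthetic data (assumption): 100 patients, 8 voxels, outcome set by voxel 3 alone.
rng = random.Random(1)
maps = [[rng.randint(0, 1) for _ in range(8)] for _ in range(100)]
labels = [m[3] for m in maps]

scores = cross_validate(maps, labels)
print(scores)  # held-out accuracy for each of the 5 folds
```

Because each score is computed only on patients the model did not see while fitting, it estimates how well the learned pattern generalises rather than how well it memorises.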

The advantage of the machine learning approach was particularly strong when looking at interventions that reduce the volume of the lesion itself. With conventional low-dimensional models, the intervention would need to shrink the lesion by 78.4% of its volume for the effect to be detected in a trial more often than not, while the high-dimensional model would more than likely detect an effect when the lesion was shrunk by only 55%.
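What it means for an effect to be detected "more often than not" can be sketched with a small Monte Carlo simulation: run many simulated trials at a given amount of lesion shrinkage and count how often a significance test fires. The sketch below is a toy under stated assumptions, not the study's model. The uniform lesion volumes, the Gaussian noise term, the two-sample z-test and every parameter value are hypothetical.

```python
import random
import statistics

def trial_detects_effect(shrink, n_per_arm, noise_sd, rng, z_crit=1.96):
    """Simulate one two-arm trial; return True if a two-sided z-test is significant."""
    def outcome(treated):
        volume = rng.uniform(0.5, 1.5)            # hypothetical lesion volume, arbitrary units
        if treated:
            volume *= 1.0 - shrink                # the drug shrinks the lesion by this fraction
        return volume + rng.gauss(0.0, noise_sd)  # variability the model leaves unexplained

    control = [outcome(False) for _ in range(n_per_arm)]
    treated = [outcome(True) for _ in range(n_per_arm)]
    diff = statistics.mean(control) - statistics.mean(treated)
    std_err = ((statistics.variance(control) + statistics.variance(treated)) / n_per_arm) ** 0.5
    return abs(diff / std_err) > z_crit

def detection_rate(shrink, n_trials=200, n_per_arm=50, noise_sd=2.0, seed=0):
    """Fraction of simulated trials in which the effect is detected."""
    rng = random.Random(seed)
    hits = sum(trial_detects_effect(shrink, n_per_arm, noise_sd, rng) for _ in range(n_trials))
    return hits / n_trials

def smallest_detectable_shrink(candidates, **kwargs):
    """Smallest candidate shrinkage detected in more than half of simulated trials."""
    for shrink in sorted(candidates):
        if detection_rate(shrink, **kwargs) > 0.5:
            return shrink
    return None

candidates = [i / 10 for i in range(11)]  # 0%, 10%, ..., 100% shrinkage
print(smallest_detectable_shrink(candidates, noise_sd=2.0))
print(smallest_detectable_shrink(candidates, noise_sd=1.0))
```

In this toy, lowering `noise_sd` stands in for a model that explains more of the variation between patients, and it tends to lower the shrinkage needed for detection. That is only loosely analogous to the gap between the low-dimensional and high-dimensional models reported in the study.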

Dr Nachev said: “Conventional statistical models will miss an effect even if the drug typically reduces the size of the lesion by half, or more, simply because the complexity of the brain’s functional anatomy—when left unaccounted for—introduces so much individual variability in measured clinical outcomes. Yet saving 50% of the affected brain area is meaningful even if it doesn’t have a clear impact on behaviour. There’s no such thing as redundant brain.”

The researchers say their findings demonstrate that machine learning could be invaluable to medical science, especially when the system under study—such as the brain—is highly complex.

Dr Nachev said: “The real value of machine learning lies not so much in automating things we find easy to do naturally, but in formalising very complex decisions. Machine learning can combine the intuitive flexibility of a clinician with the formality of the statistics that drive evidence-based medicine. Models that pull together thousands of variables can still be rigorous and mathematically sound. We can now capture the complex relationship between anatomy and outcome with high precision.”

The study was funded by the BRC and Wellcome.

Visit Brain to read “High-dimensional therapeutic inference in the focally damaged human brain” in full.