Real-World Applications and Experiences of AI/ML Deployment for Drug Discovery

After listening to Rob Young’s talk at the Chemist in Industry, “Fads and Fashion in Drug Discovery”, in which he asked whether artificial intelligence (AI) is all that it is made out to be in the drug discovery process, I started to investigate further. I came across Pitt et al.’s “Real-World Applications and Experiences of AI/ML Deployment for Drug Discovery”. In the paper, they discuss their own and others’ experiences of integrating AI and machine learning (ML) into drug discovery pipelines and the methods that have had the biggest impact. There are nine main sections: machine representations of chemical space; machine learning; generative design; protein modelling; active learning; synthetic tractability and retrosynthesis prediction; safety assessment; computational pipelines; and AI in the context of medicinal chemistry projects.

For machine representations of chemical space, they discuss autoencoders, transfer learning, and recurrent neural networks. Pitt et al. also describe an in-house architecture they have created that combines quantitative structure-activity relationship (QSAR) modelling and deep generative chemistry in the same latent space.

They also discuss how ML is used to predict activity; absorption, distribution, metabolism, excretion, and toxicity (ADMET); and physicochemical properties. The models, however, are only as good as the input data, and this is something that we too have found.

The main discussion within the generative design section is around REINVENT and how it can be used to optimise physicochemical and ADMET properties; however, postprocessing of the output is still needed.

AlphaFold is always the main topic when examining AI and ML in the protein modelling world, so of course it is Pitt et al.’s main focal point. I am excited to see what work will continue in this area to keep improving protein modelling, especially for protein-ligand models.

Active learning is an ML strategy in which the model is continually updated as new data arrive, so it fits naturally with the design-make-test-analyse (DMTA) cycle. It also offers the potential for multiparameter optimisation. Synthetic information is another potential source of feedback into these models.
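To make the link between active learning and the DMTA cycle concrete, here is a minimal Python sketch. It is not taken from Pitt et al.; everything in it is an assumption for illustration: each compound is reduced to a single made-up descriptor, `assay` is a hypothetical stand-in for the "make" and "test" steps, and the model is a simple nearest-neighbour surrogate rather than anything used in practice.

```python
import random


def assay(x):
    """Hypothetical 'make and test' step: a hidden activity that peaks at x = 0.7."""
    return -(x - 0.7) ** 2


def predict(x, labelled):
    """1-nearest-neighbour surrogate: returns (predicted activity, distance to that neighbour)."""
    nearest_x, nearest_y = min(labelled, key=lambda p: abs(p[0] - x))
    return nearest_y, abs(nearest_x - x)


def active_learning(pool, n_cycles=5, batch_size=2, kappa=1.0, seed=0):
    """Run a toy DMTA-style loop and return the list of (descriptor, activity) pairs tested."""
    rng = random.Random(seed)
    pool = list(pool)

    # Start from two randomly chosen compounds that count as already tested.
    start = rng.sample(pool, 2)
    labelled = [(x, assay(x)) for x in start]
    pool = [x for x in pool if x not in start]

    for _ in range(n_cycles):
        # Design: score untested compounds by predicted activity plus an
        # exploration bonus for being far from anything already tested.
        def score(x):
            y, distance = predict(x, labelled)
            return y + kappa * distance

        batch = sorted(pool, key=score, reverse=True)[:batch_size]

        # Make and test: obtain measurements for the selected batch.
        labelled.extend((x, assay(x)) for x in batch)

        # Analyse: the new data feed into the next round of predictions.
        pool = [x for x in pool if x not in batch]

    return labelled


if __name__ == "__main__":
    candidates = [i / 20 for i in range(21)]
    tested = active_learning(candidates)
    best_x, best_y = max(tested, key=lambda p: p[1])
    print(f"Tested {len(tested)} compounds; best descriptor value found: {best_x}")
```

The `kappa` parameter is an assumed knob that trades off exploiting compounds predicted to be active against exploring regions the model has not yet seen. In a real project the descriptor, model, and selection rule would all be far richer, and the score could combine several properties for multiparameter optimisation.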

The rise of AI and ML hasn’t just been in designing compounds and predicting properties; it can also support the “design” and “make” stages of the DMTA cycle. This is due to the development of computer-aided synthesis planning (CASP) tools. The addition of electronic laboratory notebooks (ELNs) has increased the effectiveness of these tools. Even though the tools can propose synthetic routes, a medicinal chemist still needs to make the final evaluation of their output.

Additionally, the safety of compounds needs to be considered. AI/ML approaches are increasingly being used early on to flag potential risks and help reduce costs.

Pitt et al. then conclude their review by delving into BRADSHAW, a computational pipeline developed at GSK, and discuss how they have an automated DMTA cycle. They then go on to discuss how Evotec has specific research and development groups that focus entirely on AI and ML to help improve medicinal chemistry projects, and introduce the term D2MTL (Design-Decide-Make-Test-Learn), see Figure 1.

Figure 1

AI and ML are powerful tools, and with more organisations investing in AI drug discovery, the future is exciting. However, for these tools to be viable, their output still needs input from, and assessment by, computational and medicinal chemists.

J. Med. Chem. 2025, 68, 2, 851–859