Drug discovery has not yet had its “ChatGPT moment,” according to Arman Zaribafiyan, PhD, head of product, AI simulation and platforms, at SandboxAQ, in an interview with GEN. “We can’t rely only on machine learning models trained on text-based data to solve the world’s most challenging problems. The cure for cancer is not written in Wikipedia,” he said.
SandboxAQ is a spinout from the multinational technology conglomerate Alphabet that seeks to advance large quantitative models (LQMs) for positive impact on society, including drug discovery applications. In contrast to large language models (LLMs), which train on text-based data, LQMs are grounded in physics to simulate real-world systems.
The 2024 publications of Google DeepMind’s AlphaFold3,1 and RoseTTAFold All-Atom,2 developed by the lab of Nobel Laureate David Baker, PhD, marked a technological inflection point that expanded protein structure prediction capabilities from peptide chains to interactions with small molecules, nucleic acids, metal ions, and more.
These structural co-folding models predicted the “pose,” or how ligands bind to their target protein, but had not yet made the advance of predicting binding affinity, or the strength of the interaction. Achieving accurate binding affinity predictions would provide a powerful alternative to resource-intensive experimental screens to cut discovery timelines and save costs.
Small molecules that bind
In June, researchers from the Massachusetts Institute of Technology (MIT), in collaboration with Recursion, the Salt Lake City-based AI drug discovery company that merged with Exscientia in 2024, announced the open-source release of Boltz-2. The model demonstrated binding affinity predictions at unprecedented speed and accuracy, aiming to democratize AI-based small molecule drug discovery across the commercial landscape.
Boltz-2 was the top predictor of binding affinity at the December 2024 Critical Assessment of Protein Structure Prediction 16 (CASP16) competition for benchmarking state-of-the-art models in structural biology. Boltz-2 is also reported to calculate binding-affinity values in just 20 seconds, a thousand times faster than free-energy perturbation (FEP) simulations, the current physics-based computational standard. The model is available under the highly permissive MIT license, which allows commercial drug developers to use the model internally and apply their own proprietary data.
Zaribafiyan emphasizes that the lack of data connecting protein-ligand structural complexes with pharmacokinetics (PK)—how the body affects a drug—and pharmacodynamics (PD)—how drugs impact the body—remains a main bottleneck in AI-based drug discovery. To address this gap, one solution is to generate synthetic training data using computationally predicted structures.
A few weeks after Boltz-2’s release, SandboxAQ, in collaboration with Nvidia, announced the Structurally-Augmented IC50 Repository (SAIR), an open-access repository that leveraged the Boltz series of models to generate computationally folded protein–ligand structures linked to corresponding experimental drug affinity values.
SAIR contains over one million unique protein–ligand pairs and a total of 5.2 million 3D structures curated from the experimental binding affinity databases ChEMBL and BindingDB, which were then computationally folded using Boltz-1x. (Boltz-1x is an augmented version of the biomolecular complex prediction model Boltz-1 that refines structures to respect physical laws and avoid distorted internal geometries.) According to the SAIR technical report, 97% of the Boltz-1x folded structures passed the checks of PoseBusters, an established computational tool that evaluates biophysical plausibility.
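SAIR’s name references IC50, the half-maximal inhibitory concentration that anchors much of the experimental affinity data in databases like ChEMBL and BindingDB. As a minimal sketch (the repository’s actual file format and field names are not assumed here), such affinity values are conventionally compared on a log scale, pIC50, where each unit corresponds to a 10-fold change in potency:

```python
import math

def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar).

    Higher pIC50 means stronger inhibition; one pIC50 unit is a
    10-fold change in potency, which is why heterogeneous affinity
    measurements are usually compared on this log scale.
    """
    ic50_molar = ic50_nm * 1e-9  # nanomolar -> molar
    return -math.log10(ic50_molar)

print(ic50_nm_to_pic50(100.0))  # a 100 nM inhibitor -> pIC50 of 7.0
```

The picomolar binders mentioned later in this article would sit around pIC50 values of 10 or higher on this scale.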
While Boltz-2’s training set includes data from ChEMBL and BindingDB, and by extension, all complexes contained in SAIR, other research groups have reported applying SAIR to their AI drug discovery efforts. According to Zaribafiyan, Technetium Therapeutics is using SAIR to build agentic AI models to identify drug candidates with a focus on oncological and immunological diseases, while researchers from Texas A&M University are generating foundation models that design novel ligands inside the pocket of a protein for therapeutic applications.
“You want to understand the pose if you’re going to make additions or changes to the small molecule once you know that it’s binding, but most of early drug discovery is ‘does your molecule bind or not?’,” weighed in Ian Quigley, PhD, CEO of Leash Bio, in an interview with GEN. “The ability to predict [binding affinity] is important and we’re grateful that the community is paying more attention.”
Quality data, simple architecture
Leash is a Salt Lake City-based start-up with the mission of filling the small molecule drug discovery data gap. According to Quigley, life science datasets can be riddled with batch effects and technical noise, making it difficult for models to make accurate biological predictions.
“There’s a famous data collection where they gathered horse photos from an equestrian commercial photographer. All the photos contained a watermark showing the name of the photography business in the corner,” explained Quigley. “Turns out if you put the watermark on any photo, an AI model trained on that data will say it’s a horse. Same goes for life sciences. There are many ways that watermarks can show up in your data.”
Quigley argues that generating large, high-quality datasets that screen millions of small molecules against hundreds of protein targets can enable strong predictive performance, even with modest model architectures.
In July, Leash announced Hermes, a small molecule–protein binding prediction model trained exclusively on in-house data generated by the company’s platform. Hermes is not a structural model; it predicts only the likelihood of binding given an amino acid sequence and a Simplified Molecular Input Line Entry System (SMILES) representation of a small molecule. According to Leash, this simplicity enables speed: the company reports that Hermes is 200–500x faster than Boltz-2, with improved predictive performance in benchmarks against competing AI models. Just two months later, Leash unveiled Artemis, a hit expansion tool that leverages Hermes to explore chemical space around a target of interest.
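Hermes itself is proprietary, but the input modality a sequence-only model consumes, a raw amino acid string plus a SMILES string, can be illustrated with a toy stand-in. Everything below (the trigram-hashing featurizer, the scoring function, the example sequence) is a hypothetical sketch for illustration, not Leash’s model or API:

```python
import hashlib
import math

def featurize(text: str, dim: int = 64) -> list:
    """Toy featurizer: hash character trigrams into a fixed-size count vector."""
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        h = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    return vec

def predict_binding(sequence: str, smiles: str) -> float:
    """Illustrative stand-in for a sequence-only binding score in [0, 1]."""
    a, b = featurize(sequence), featurize(smiles)
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 / (1.0 + math.exp(-dot / 100.0))  # squash to a probability-like score

score = predict_binding(
    "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",  # hypothetical protein sequence fragment
    "CC(=O)Oc1ccccc1C(=O)O",              # aspirin, in SMILES notation
)
```

The design point the sketch captures is why such models are fast: with no 3D folding or docking step, inference reduces to featurizing two strings and scoring them.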
Proteins from scratch
While small-molecule drugs continue to be the dominant modality in the drug discovery industry, many experts are turning to the complexity of proteins to tackle their therapeutic problems of interest.
Simon Kohl, PhD, CEO of Latent Labs, is a former Google DeepMind researcher who was involved in the development of AlphaFold2, the 2024 Nobel Prize in Chemistry-winning algorithm for protein structure prediction, from start to finish. Kohl founded Latent Labs with the vision of building frontier models for biology, starting with designing proteins from scratch.
In July, Latent Labs released Latent-X, the company’s first frontier model for de novo protein design. Latent-X achieved strong binding affinities in the picomolar range while testing only 30–100 candidates per target in wet lab experiments, an advance over traditional drug discovery pipelines, which require screening millions of random molecules for hit rates below one percent.
The designs from Latent-X focus on therapeutically relevant mini-binders and macrocycles. Binding affinities were reported to be competitive with the current state-of-the-art protein design models, RFdiffusion,3 and AlphaProteo,4 in head-to-head experimental comparisons.
Kohl emphasizes that the model architecture, which jointly models sequence and structure at the all-atom level, makes Latent-X distinct.
“We’ve released movies where you can see the model make specific hydrogen bonds and pi stacking of aromatic rings. Generating biochemistry directly end-to-end allows us to make superior molecules from the start,” said Kohl in an interview with GEN.
Latent-X is available as a web user interface that is accessible to researchers without a computational background. Kohl highlights that the premise for Latent Labs is to provide a resource to guide pharma companies and academic groups on the hardware and model needs for their protein design workflows.
“Making a technology accessible without the need for expert knowledge and AI infrastructure in this case is true democratization,” emphasized Kohl.
Let’s reprogram
From designing proteins not found in nature to extending healthy lifespan, Retro Biosciences is applying AI models to advance cellular reprogramming for aging research. The company operates across different therapeutic modalities, from cell therapies to small molecules.
“We think of ourselves as a portfolio that invests in varying shots on goal toward that mission,” Joe Betts-LaCroix, PhD, CEO of Retro, told GEN. “There are advantages and disadvantages to different modalities that are very complementary to each other, which makes it robust for Retro as a single company.”

In a collaboration with OpenAI announced in August, Retro designed enhanced variants of the Yamanaka factors, a set of four specific transcription factors (Oct4, Sox2, Klf4, and c-Myc) that can reprogram adult somatic cells into induced pluripotent stem cells (iPSCs). The variants were designed with GPT‑4b micro, a miniature version of GPT‑4o specialized for protein engineering. (GPT-4o is a flagship model from OpenAI that accepts input and produces output in the form of text, audio, image, and video.)
“Protein sequence and structure models learn patterns from raw data, but you can’t prompt them based on what has been known in literature. In reprogramming, there’s approximately 20 years of literature that you don’t need to learn from scratch,” said Rico Meinl, head of Applied AI at Retro, in an interview with GEN.
Proteins redesigned by GPT-4b micro achieved more than 50-fold higher expression of stem cell reprogramming markers than wild-type controls in vitro and demonstrated enhanced DNA-damage-repair capabilities, indicating improved rejuvenation potential.

The model incorporates protein information in the form of textual descriptions, along with co-evolutionary homologous sequences and protein interaction networks, which allows GPT-4b micro to be prompted to generate new sequences with designed properties. Because most of this data is structure-free, the model is adaptable to both structured proteins and proteins with intrinsically disordered regions, including the Yamanaka factors, whose activity depends on transient interactions with diverse binding partners.
From predicting small molecule binding affinity to cellular reprogramming, large-scale biological data combined with new model architectures continue to propel the AI revolution forward. Time will tell if rising computational power will lift the boat of therapeutic potential.
References
- Abramson J, Adler J, Dunger J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024; doi: 10.1038/s41586-024-07487-w.
- Krishna R, Wang J, Ahern W, et al. Generalized biomolecular modeling and design with RoseTTAFold All-Atom. Science. 2024;384(6693):eadl2528; doi: 10.1126/science.adl2528.
- Watson JL, Juergens D, Bennett NR, et al. De novo design of protein structure and function with RFdiffusion. Nature. 2023;620(7976):1089–1100; doi: 10.1038/s41586-023-06415-8.
- Zambaldi V, La D, Chu AE, et al. De novo design of high-affinity protein binders with AlphaProteo. arXiv. 2024; arXiv:2409.08022; doi: 10.48550/arXiv.2409.08022.
