In a move that could reshape drug discovery, researchers at Harvard Medical School (HMS) have designed an artificial intelligence (AI) model they say is capable of identifying treatments that reverse disease states in cells.
Unlike traditional approaches that typically test one protein target or drug at a time in hopes of identifying an effective treatment, the new model, called PDGrapher, focuses on multiple drivers of disease and identifies the genes most likely to revert diseased cells back to healthy function.
The tool also identifies the best single or combined targets for treatments that correct the disease process. The researchers say that by zeroing in on the targets most likely to reverse disease, the new approach could speed up drug discovery and design and unlock therapies for conditions that have long eluded traditional methods.
“Traditional drug discovery resembles tasting hundreds of prepared dishes to find one that happens to taste perfect,” said study senior author Marinka Zitnik, PhD, associate professor of biomedical informatics in the Blavatnik Institute at HMS. “PDGrapher works like a master chef who understands what they want the dish to be and exactly how to combine ingredients to achieve the desired flavor.”
Zitnik is senior author of the team’s published paper in Nature Biomedical Engineering, titled “Combinatorial prediction of therapeutic perturbations using causally inspired neural networks.” In their paper, they noted, “Using insights from causal discovery and geometric deep learning, here we introduce PDGrapher, an approach for the combinatorial prediction of therapeutic targets that can shift gene expression from an initial diseased state to a desired treated state.” The researchers are making PDGrapher available for free.
The traditional drug discovery approach, which focuses on activating or inhibiting a single protein, has succeeded with treatments such as kinase inhibitors—the authors cite imatinib as an example—drugs that block certain proteins used by cancer cells to grow and divide. “Target-driven drug discovery, which has been the dominant approach since the 1990s, focuses on designing highly specific compounds to act against targets, such as proteins or enzymes, that are implicated in disease, often through genetic evidence,” they wrote.
However, Zitnik noted, this discovery paradigm can fall short when diseases are fueled by the interplay of multiple signaling pathways and genes. For example, many breakthrough drugs discovered in recent decades—think immune checkpoint inhibitors and CAR T-cell therapies—work by targeting disease processes in cells. This has resulted in a revival over the last decade of phenotype-driven approaches. In contrast to target-driven drug discovery, phenotype-driven approaches identify disease-counteracting compounds by analyzing the phenotypic signatures that distinguish diseases from healthy states. “Instead of the ‘one drug, one gene, one disease’ model of target-driven approaches, phenotype-driven drug discovery focuses on identifying compounds or, more broadly, perturbagens—combinations of therapeutic targets—that reverse disease phenotypes as measured by assays without predefined targets,” they stated.
The approach enabled by PDGrapher looks at the bigger picture to find compounds that can actually reverse signs of disease in cells, even if scientists don’t yet know exactly which molecules those compounds may be acting on.
PDGrapher is a type of artificial intelligence tool called a graph neural network (GNN). This tool doesn’t just look at individual data points but at the connections that exist between these data points and the effects they have on one another.
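To make the idea of learning from connections concrete, here is a minimal sketch of a single message-passing step, the basic operation behind graph neural networks. It is an illustration only and not PDGrapher's architecture: the four-gene chain, the starting values, and the simple averaging rule are all assumptions chosen for readability.

```python
# Toy gene network (hypothetical): nodes are genes, edges are known interactions.
edges = [("A", "B"), ("B", "C"), ("C", "D")]

# One number per gene stands in for a measured signal; only gene A starts "on".
features = {"A": 1.0, "B": 0.0, "C": 0.0, "D": 0.0}


def build_adjacency(edge_list):
    """Return a dict mapping each node to the set of its neighbors (undirected)."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def message_passing_step(node_values, adjacency):
    """Update each node to the mean of its own value and its neighbors' values.

    Real graph neural networks use learned weights and nonlinearities here;
    plain averaging is an assumption made to keep the sketch short.
    """
    updated = {}
    for node, value in node_values.items():
        pooled = [value] + [node_values[n] for n in sorted(adjacency.get(node, ()))]
        updated[node] = sum(pooled) / len(pooled)
    return updated


adjacency = build_adjacency(edges)
one_hop = message_passing_step(features, adjacency)
two_hop = message_passing_step(one_hop, adjacency)

print(one_hop)
print(two_hop)
```

After one step, the signal from gene A reaches only its direct neighbor B. After two steps it also reaches C, while D, three hops away, is still untouched. Stacking such steps is how a graph model lets information about one gene influence the representation of others nearby in the network.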
In the context of biology and drug discovery, this approach is used to map the relationship between various genes, proteins, and signaling pathways inside cells and predict the best combination of therapies that would correct the underlying dysfunction of a cell to restore healthy cell behavior. Instead of exhaustively testing compounds from large drug databases, the new model focuses on drug combinations that are most likely to reverse disease. “Unlike methods that learn how perturbations alter phenotypes, PDGrapher solves the inverse problem and predicts the perturbagens needed to achieve a desired response by embedding disease cell states into networks, learning a latent representation of these states, and identifying optimal combinatorial perturbations,” the investigators explained.
PDGrapher points to parts of the cell that might be driving disease. Next, it simulates what happens if these cellular parts were turned off or dialed down. The AI model then offers an answer as to whether a diseased cell would return to a healthy state if certain targets were “hit.”
Zitnik added, “Instead of testing every possible recipe, PDGrapher asks: ‘Which mix of ingredients will turn this bland or overly salty dish into a perfectly balanced meal?’”
For their reported study, the researchers trained the tool on a dataset of diseased cells before and after treatment so that it could figure out which genes to target to shift cells from a diseased state to a healthy one. “PDGrapher is trained on a dataset of disease–treated sample pairs to predict therapeutic gene targets that can shift the gene expression phenotype from a diseased to a healthy or treated state,” they continued.
Next, they tested it on 19 datasets spanning 11 types of cancer, using both genetic and drug-based experiments, asking the tool to predict various treatment options for cell samples it had not seen before and for cancer types it had not encountered.
The tool accurately predicted drug targets already known to work but that were deliberately excluded during training to ensure the model did not simply recall the right answers. It also identified additional candidates supported by emerging evidence. The model, in addition, highlighted KDR (VEGFR2) as a target for non-small cell lung cancer, aligning with clinical evidence. “Importantly, PDGrapher has successfully identified KDR among the top 20 predicted targets in chemical-PPI-lung-A549, validating its precision in detecting key therapeutic targets for lung cancer,” the team commented.
It also identified TOP2A, an enzyme already targeted by approved chemotherapies, as a treatment target in certain tumors, adding to evidence from recent preclinical studies that TOP2A inhibition may be used to curb the spread of metastases in non-small cell lung cancer. “Using the predicted target of TOP2A, PDGrapher then identified three drugs, aldoxorubicin, vosaroxin, and doxorubicin hydrochloride, as candidate drugs,” the authors reported. “These drugs were not part of the training dataset of PDGrapher and are in the early stages of clinical development.”
The model showed superior accuracy and efficiency compared with other similar tools. In previously unseen datasets, it ranked the correct therapeutic targets up to 35% higher than other models did and trained up to 25 times faster than comparable AI approaches. “An advantage of PDGrapher is its direct prediction, in contrast to the indirect and computationally intensive approach common in phenotype-driven models,” the investigators added. “It trains up to 25X faster than existing methods, providing a fast approach for identifying therapeutic perturbations and advancing phenotype-driven drug discovery.”
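For readers unfamiliar with how a ranking of predicted targets is scored, the sketch below shows two common checks: the rank position of a known target, and the fraction of known targets that land in the top k. The gene names and scores are placeholders, and these are generic metrics assumed for illustration, not necessarily the ones used in the paper.

```python
def ranked_genes(scores):
    """Sort genes by descending score, breaking ties alphabetically."""
    return sorted(scores, key=lambda gene: (-scores[gene], gene))


def rank_of(target, scores):
    """Return the 1-based rank of a target gene in the predicted ordering."""
    return ranked_genes(scores).index(target) + 1


def recall_at_k(true_targets, scores, k):
    """Return the fraction of true targets that appear among the top k predictions."""
    top_k = set(ranked_genes(scores)[:k])
    return len(top_k & set(true_targets)) / len(true_targets)


# Hypothetical model output: a score per candidate gene (placeholder names).
predicted_scores = {
    "GENE1": 0.91,
    "GENE2": 0.15,
    "GENE3": 0.62,
    "GENE4": 0.40,
    "GENE5": 0.05,
}

# Hypothetical known targets that were withheld during training.
held_out_targets = ["GENE3", "GENE5"]

for gene in held_out_targets:
    print(gene, "rank:", rank_of(gene, predicted_scores))

print("recall@2:", recall_at_k(held_out_targets, predicted_scores, 2))
```

A model that places withheld targets nearer the top of the list earns lower rank numbers and higher recall at small k, which is the sense in which one model can rank correct targets "higher" than another.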
The new approach could optimize the way new drugs are designed, the researchers said. Instead of trying to predict how every possible change would affect a cell and then searching those predictions for a useful drug, PDGrapher directly predicts which specific targets can reverse a disease trait. This makes it faster to test ideas and lets researchers focus on a smaller set of promising targets.
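A quick calculation shows why simulating every possible change becomes impractical once combinations of targets are considered. The figure of 10,000 candidate genes below is an assumed round number for illustration and does not come from the study.

```python
import math

# Assumed size of the candidate pool (illustrative only).
candidate_genes = 10_000

# Number of distinct target sets an exhaustive search would need to simulate.
for set_size in (1, 2, 3):
    combinations = math.comb(candidate_genes, set_size)
    print(f"target sets of size {set_size}: {combinations:,}")

pair_count = math.comb(candidate_genes, 2)
```

Under this assumption, there are already about 50 million possible gene pairs, and the count grows steeply with each additional target. A method that predicts the target set directly avoids evaluating each of these candidates one by one.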
This tool could be especially useful for complex diseases fueled by multiple pathways, such as cancer, in which tumors can outsmart drugs that hit just one target. Because PDGrapher identifies multiple targets involved in a disease, it could help circumvent this problem. “By leveraging causal reasoning and representation learning on gene networks, PDGrapher identifies perturbagens necessary to achieve specific phenotypic changes,” the team pointed out. “This approach enables the direct prediction of therapeutic targets that can reverse disease phenotypes, bypassing the need for exhaustive response simulations across large perturbation libraries … PDGrapher has the potential to improve therapeutic lead design and expand the search space for perturbagens.”
Additionally, the researchers said that after careful testing to validate the model, it could one day be used to analyze a patient’s cellular profile and help design individualized treatment combinations. Because PDGrapher identifies cause-and-effect biological drivers of disease, it could help researchers understand why certain drug combinations work—offering new biological insights that could propel biomedical discovery even further.
The team is currently using this model to tackle brain diseases such as Parkinson’s and Alzheimer’s, looking at how cells behave in disease and spotting genes that could help restore them to health. The researchers are also collaborating with colleagues at the Center for XDP at Massachusetts General Hospital to identify new drug targets and map which genes or pairs of genes could be affected by treatments for X-linked Dystonia-Parkinsonism, a rare inherited neurodegenerative disorder. “Our ultimate goal is to create a clear road map of possible ways to reverse disease at the cellular level,” Zitnik said.