Researchers and physicians alike are interested in identifying biomarkers from minimally invasive samples like blood to aid in disease detection and monitoring. However, so far, most targets observed at the bench don’t translate to clinical success.
According to Mathias Uhlén, a protein scientist at the KTH Royal Institute of Technology, part of the problem lies in how many researchers search for these biomarkers: by comparing a single disease against healthy controls. With this approach, it’s impossible to tell whether a particular protein is specific to the disease or common to several diseases.
Mathias Uhlén directs the Human Protein Atlas Program and studies protein science to advance precision medicine.
To address this issue, he and his collaborators launched a pan-disease analysis of the human blood proteome. “By doing this pan-disease, we think that we can help the community to focus on things which actually are very specific for the diseases,” Uhlén said. “We are just finding incredibly interesting things.”
In total, the researchers studied more than 8,200 people and compared the blood proteomes of healthy individuals and those with 59 different diseases or autoimmune conditions using a high-throughput protein quantification platform.1 The study, published in Science, includes the open-access proteome database.
“They’ve attempted to do something more comprehensive than anyone’s done before, in terms of mapping the secreted proteome,” said Holden Maecker, a cellular immunologist at Stanford University who was not involved with the study.
The team collected blood samples from multiple cohorts, including two different groups of healthy controls: one of adults between 50 and 65 who were sampled multiple times over two years, and one that followed participants from childhood, with samples taken at ages four, eight, 16, and 24.
From their adult cohort, they saw that the proteome of individuals was relatively stable over the sampling period. “One thing which is a little bit surprising is that we find that every individual that we have analyzed has a unique sort of fingerprint,” Uhlén said.
In the childhood longitudinal cohort, they saw that the proteomes of male and female participants were relatively similar until the third collection point. “We can see this incredible explosion of changes during puberty,” Uhlén said. They observed sex-specific differences in protein abundance for several markers. Many of these appeared to stabilize in adulthood.
To create a human disease blood atlas, the team studied the proteomes of more than 6,000 samples from people who had at least one of 59 diseases. These conditions included cardiovascular, metabolic, psychiatric, and autoimmune diseases, as well as cancer. The team first compared individual diseases to their healthy adult cohort to model the traditional type of biomarker study. They identified disease-associated markers for pancreatic cancer and rheumatoid arthritis when they focused on just these diseases. However, when they included their full atlas, many of the identified proteins also showed up in unrelated diseases.
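The reasoning behind that comparison can be sketched in a few lines of Python. This is only a toy illustration, not the team's method: the abundance values, the 1.5-fold threshold, and the function names `elevated_vs` and `specific_to` are all hypothetical and do not come from the study.

```python
from statistics import mean

# Hypothetical abundance values (arbitrary units) for a single protein.
# These numbers are invented for illustration; they are NOT study data.
abundance = {
    "healthy": [1.0, 1.1, 0.9, 1.0],
    "pancreatic cancer": [2.0, 2.2, 1.9, 2.1],
    "rheumatoid arthritis": [2.1, 1.8, 2.0, 2.2],
    "lung cancer": [1.0, 1.2, 0.9, 1.1],
}

def elevated_vs(reference, target, data, min_ratio=1.5):
    """Single-disease view: is the target group's mean at least
    min_ratio times the mean of one reference group?"""
    return mean(data[target]) >= min_ratio * mean(data[reference])

def specific_to(target, data, min_ratio=1.5):
    """Pan-disease view: the target must be elevated against every
    other group, not only against the healthy controls."""
    return all(
        mean(data[target]) >= min_ratio * mean(values)
        for group, values in data.items()
        if group != target
    )

# Looks like a marker when only healthy controls are considered...
single_disease_hit = elevated_vs("healthy", "pancreatic cancer", abundance)
# ...but not once the other diseases are included in the comparison.
pan_disease_hit = specific_to("pancreatic cancer", abundance)
```

In this made-up example, the protein passes the single-disease check against healthy controls but fails the pan-disease check because it is just as abundant in the hypothetical rheumatoid arthritis group.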
Uhlén and his team then explored the influence of age, sex, body mass index, and disease status on the variability in protein abundance in their pan-disease atlas. They showed that, for most proteins, the presence of disease accounted for most of the variability in abundance. However, the variability of several proteins did not map to any of these four factors, likely due to inherent human variability, Maecker said. “There’s so much variability within the population that that is still the biggest factor in the variance of each of these proteins. It’s not disease. It’s human variability,” he said.
The researchers evaluated how the protein profiles varied across different diseases. Although many profiles overlapped, some diseases, including pediatric and liver-related diseases, clustered based on their proteome. “It looks like you could diagnose [these diseases] completely with a set subset of proteins from the study. And that’s impressive,” Maecker said. “It might give us clues to the mechanisms behind those particular diseases.” The team also observed that different diseases, like liver-related and infectious diseases, shared abundant protein profiles, indicating their relevance across diseases.
One thing the pan-disease proteome did not do well was distinguish between different types of common cancers. The team therefore included additional blood samples from biobanks, taken from people diagnosed with breast, ovarian, prostate, colorectal, or lung cancer. After training a machine learning model, the team showed that the algorithm accurately predicted lung, colorectal, and ovarian cancers from the protein profiles.
The team then assessed whether this model could predict cancer from blood samples collected longitudinally before the diagnosis was made. While the algorithm predicted lung cancer from the protein profile years before diagnosis, it only reliably predicted ovarian and colorectal cancer at the time of diagnosis, highlighting the challenge of creating biomarkers for disease detection.
Uhlén and his team hope to study how these resources and high-throughput technologies can be applied further to the early detection of cancer. They are also exploring ways to verify their current results with an independent proteomic technology.2 In the meantime, Uhlén hopes that other researchers can use the data to explore potential disease biomarkers.