PharmiWeb.com - Global Pharma News & Resources
09-Oct-2025

The Impact of Artificial Intelligence on Biomarker Discovery

Summary

Artificial intelligence (AI) is rapidly transforming biomarker discovery by unlocking insights from vast and complex datasets. Hurdle recently reviewed how revolutionary this technology could be, while also highlighting the potential hurdles that still stand in the way of its translation into the clinic.
Editor: Press Office Last Updated: 09-Oct-2025
The Explosion of Data: Bigger, Broader, More Complex
Large cohort studies are no longer a curiosity; they’re foundational. Crucially, these cohorts increasingly link rich electronic health records (EHRs) to multiomic and imaging data, creating longitudinal, clinically anchored datasets that dramatically expand the discovery space for AI-derived biomarkers. For example, the UK Biobank has data for half a million people with long follow-up (over 15 years) covering multiple sequencing modalities, imaging, electronic health records, wearable sensors, and more. Other initiatives like All of Us are aiming to enroll at least one million participants, generating both molecular and digital health data. Meanwhile, the China Kadoorie Biobank and H3Africa aim to enroll over 500,000 and 100,000 participants respectively. These projects would add breadth, including demographics that are under-represented in other projects.

This scale matters because AI models need both depth (many features per person) and breadth (diversity of people, conditions, and data types) to avoid overfitting and ensure generalisability. Many older datasets lacked certain modalities (for example, pathology or imaging) or sufficient diversity, which limits how well biomarker findings translate to other populations.

Understanding Biomarker Categories
Before diving deeper into AI applications, it helps to see how biomarkers fit into the patient journey. Biomarkers are not a single entity: they can act as risk indicators, diagnostic tools, monitors of treatment response, or safety checks for drug toxicity. The figure below maps these categories along the disease continuum, from prevention to diagnosis, treatment, and recurrence.

Models and Modalities: What AI Can Do
AI methods are being applied across many fronts. Classical statistical and machine learning approaches (like elastic net regression) are improving polygenic risk scores. Random forests, trained on millions of genetic variants (≈3.5 million in some cases), are being used to detect Mendelian disorders with greater sensitivity than before. Survival analysis models, particularly Cox proportional hazards models, are being used to develop ProteinScores, composite protein panels that predict ten-year risks of various diseases such as type 2 diabetes, Alzheimer’s, and COPD, as well as all-cause mortality. These models often outperform baseline models using just age, sex, and standard clinical or lifestyle risk factors.

On the unstructured side, deep learning is making strides with imaging (MRI, CT, pathology slides). Convolutional neural networks can detect Alzheimer’s progression from MRI scans. Tools with regulatory clearance, like Brainomix 360 e-ASPECTS, detect early signs of stroke in CT scans that might otherwise be missed by radiologists or other specialists. Transformer-based pathology models, such as Virchow2 trained on 3.1 million histopathology slides, are showing cross-task performance, handling multiple pathology tasks in a single model.

Concrete Biomarkers: What’s Actually Working

Several AI-powered biomarkers stand out for their performance. biomodal has achieved an area under the curve (AUC) of 0.95 for colorectal cancer detection using methylation and hydroxymethylation signals from cell-free DNA, compared with just 0.66 using traditional approaches that conflate these two epigenetic modalities. A prognostic model for high-grade serous ovarian cancer that integrates genomics, transcriptomics, proteomics, and pathology achieved a five-year AUC of 0.911 and a hazard ratio of 18.23 for survival stratification.
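For readers less familiar with the AUC figures quoted above, the following is a minimal Python sketch of what the metric measures: the probability that a randomly chosen case receives a higher biomarker score than a randomly chosen control. The function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration; they are not taken from any of the studies mentioned.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: the fraction of (case, control) pairs in which
    the case has the higher score. Tied pairs count as half a win."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 1 = disease, 0 = healthy; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 11 of 12 case/control pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the gap between 0.66 and 0.95 is substantial. In practice an established library routine would normally be used instead of this pairwise loop, which is written for clarity and not for speed.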
The Gut Microbiome Wellness Index 2 has been validated on 8,069 samples across 54 studies in 26 countries, distinguishing healthy from unhealthy microbiome states.

Consumer-scale applications are also emerging. The Apple Watch’s atrial fibrillation detector, using deep learning on photoplethysmography signals, has FDA clearance for continuous, population-scale monitoring. Similarly, CE-marked tools like Stratipath Breast are bringing AI into clinical workflows, offering improved risk stratification for breast cancer patients.

Why So Few Biomarkers Make It to Clinical Use

Despite these successes, only around 1–2% of published biomarker discoveries ever make it into clinical practice. This is largely due to the cost and complexity of translation: bringing a biomarker from discovery to a validated diagnostic typically takes 3–7 years and costs US$20–100 million. Datasets are often biased toward certain populations, AI models are sometimes black boxes lacking interpretability, and scaling requires robust diagnostic infrastructure that can handle multi-omics assays alongside clinical data. Although there are nearly 950 AI/ML-enabled medical devices with FDA clearance, the majority are still not part of everyday clinical workflows.

What’s Next: From Discovery to Action

AI-powered “Virtual Labs”, systems of AI agents that act as virtual scientists, are beginning to design and test biomarker pipelines, even generating and validating nanobody candidates for SARS-CoV-2. At the same time, wearables and chemical sensors are enabling continuous, at-home biomarker monitoring. Together, these developments point toward a future where healthcare is proactive and personalised, rather than reactive and impersonal.

Conclusion

AI is already delivering measurable gains in biomarker discovery, with certain methods validated in tens of thousands of participants and achieving accuracies above 90%. Some regulatory-approved tools are already in clinical use.
The challenge is no longer whether AI works, but how to make it affordable, equitable, and scalable. Overcoming these barriers will unlock a new era in which biomarkers are not only powerful in theory but also transformative in everyday life.