A new artificial-intelligence model could help shift the focus of medicine from diagnosis to prediction. The model, detailed in a recent publication, is designed to forecast which of more than a thousand diseases a person might develop over their lifetime. That could one day allow doctors to identify high-risk patients long before symptoms appear.
How Delphi-2M Works: Inspired by Language Models
The model, named Delphi-2M, was developed by research teams at the European Molecular Biology Laboratory (EMBL) in Cambridge and the German Cancer Research Centre in Heidelberg. Its design takes a page from the book of large language models (LLMs) like GPT-5, which power chatbots. These LLMs are trained on vast amounts of text to predict the next word in a sequence.
The creators of Delphi-2M applied a similar logic to human health. They reasoned that an AI trained on massive datasets of medical histories could learn to predict the next health event in a person's life. However, a crucial adjustment was needed. Unlike words in a sentence, medical diagnoses are separated by time. To account for this, the researchers modified the model to encode a person's age instead of a word's position, allowing it to understand the temporal gaps between health events.
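To make the age-for-position swap concrete, here is a minimal sketch in Python. It assumes a sinusoidal encoding of the kind used in many transformer models, fed with age in years rather than a token index. The article does not say which functional form Delphi-2M uses, so the function name `age_encoding`, the vector size and the `max_period` constant are all illustrative assumptions.

```python
import math
from typing import List


def age_encoding(age_years: float, dim: int = 8, max_period: float = 10000.0) -> List[float]:
    """Encode an age as a vector of sines and cosines.

    Hypothetical illustration: a standard sinusoidal positional encoding,
    but driven by continuous age instead of an integer position in a sequence.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    encoding = []
    for i in range(dim // 2):
        # Each sine/cosine pair oscillates at a different frequency.
        freq = 1.0 / (max_period ** (2 * i / dim))
        encoding.append(math.sin(age_years * freq))
        encoding.append(math.cos(age_years * freq))
    return encoding


# Two diagnoses made half a year apart get different, but nearby, encodings.
at_40 = age_encoding(40.0)
at_40_and_a_half = age_encoding(40.5)
```

Because age is a continuous number, two events six months apart and two events twenty years apart produce visibly different vectors, which a plain "first, second, third" position index could not express.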
The model was trained on anonymised data from 400,000 people in the UK Biobank, one of the world's most comprehensive biological databases. It learned from the sequence and timing of ICD-10 codes—the standard medical shorthand for diagnoses—representing 1,256 different diseases.
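As a rough illustration of what learning from "the sequence and timing of ICD-10 codes" might involve, the sketch below turns one medical history into next-event training examples. The patient history is invented for illustration, the helper names `build_vocab` and `next_event_pairs` are hypothetical, and the real Delphi-2M data pipeline is not described in the article.

```python
from typing import Dict, List, Tuple

# One health event: (age in years at diagnosis, ICD-10 code).
Event = Tuple[float, str]


def build_vocab(histories: List[List[Event]]) -> Dict[str, int]:
    """Map every ICD-10 code seen in the data to an integer token id."""
    codes = sorted({code for history in histories for _, code in history})
    return {code: idx for idx, code in enumerate(codes)}


def next_event_pairs(history: List[Event], vocab: Dict[str, int]):
    """Split one history into (events so far, next event) training pairs."""
    ordered = sorted(history, key=lambda event: event[0])  # oldest first
    pairs = []
    for k in range(1, len(ordered)):
        context = [(age, vocab[code]) for age, code in ordered[:k]]
        target_age, target_code = ordered[k]
        pairs.append((context, (target_age, vocab[target_code])))
    return pairs


# Invented example: type 2 diabetes (E11), then hypertension (I10),
# then acute myocardial infarction (I21).
example_history = [(52.0, "I10"), (45.5, "E11"), (60.2, "I21")]
vocab = build_vocab([example_history])
pairs = next_event_pairs(example_history, vocab)
```

Each pair asks the same question a language model asks of text: given everything so far, what comes next, and in this case, at what age?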
Testing Accuracy and Future Potential
After its initial training, Delphi-2M was validated on data from another 100,000 individuals in the UK Biobank. To check whether it generalised beyond Britain, researchers then tested it on Denmark's famously thorough health records, using data from 1.9 million Danes dating back to 1978.
To measure performance, scientists used a metric called AUC (the area under the receiver operating characteristic curve), which captures how reliably a model ranks people who go on to receive a diagnosis above those who do not. A perfect score is 1, while 0.5 is no better than random chance. For predicting diagnoses within five years, Delphi-2M scored an average of 0.76 on British data and 0.67 on Danish data. Predictably, its accuracy was higher for conditions that commonly follow a specific prior event (like death after sepsis) and lower for more haphazard events such as viral infections. When looking ten years into the future, the average score dropped to 0.7.
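For readers who want to see what an AUC score means in practice, the sketch below computes it directly from its definition: the share of (patient who got the disease, patient who did not) pairs in which the model gave the first patient the higher risk score, with ties counting as half. The function name `auc` and the toy numbers are illustrative and are not taken from the study.

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve, computed by comparing every positive/negative pair.

    scores: predicted risk for each person (higher means more at risk).
    labels: 1 if the person was later diagnosed, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tie counts as a coin flip
    return wins / (len(positives) * len(negatives))


# Toy data: the model ranks three of the four positive/negative pairs correctly.
toy_score = auc([0.9, 0.3, 0.6, 0.1], [1, 1, 0, 0])
```

On this reading, a score of 0.76 means that, for a randomly chosen pair of one future patient and one person who stays healthy, the model assigns the higher risk to the right person about three times out of four.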
While the results are promising, real-world application in clinics is still years away. The model must undergo extensive clinical trials to see if it genuinely improves patient outcomes. The research team is also working to upgrade Delphi-2M to process more than just diagnosis codes. Future versions could incorporate medical images and genome sequences from the UK Biobank, potentially boosting accuracy further.
A New Frontier in Preventive Medicine and Research
Delphi-2M is not alone in this space. Other models, like Foresight from King's College London (2024) and ETHOS from Harvard University, share similar ambitions of predicting future health. However, Delphi-2M's approach has already yielded valuable insights for biologists. Its predictive patterns reveal which diseases tend to cluster together, hinting at previously unknown biological relationships between conditions.
Ewan Birney, a geneticist at EMBL, expressed great excitement about the possibilities, comparing the feeling to being "a kid in a candy shop." Beyond individual patient care, such models could help health authorities plan budgets by identifying disease areas that may require more resources in the future.
The journey from lab to hospital will be long, but Delphi-2M represents a significant step toward a future where medicine is proactively guided by AI, aiming to prevent disease rather than just treat it.