Microsoft has disclosed the development of an AI system capable of outperforming human doctors in diagnosing complex medical cases.
The company says the breakthrough system creates a “path to medical superintelligence.”
The AI, developed by Microsoft’s artificial intelligence unit led by Mustafa Suleyman, mimics a team of expert physicians tackling challenging diagnostic problems.
When tested with OpenAI’s advanced o3 model, the system successfully solved more than 80% of specially selected cases.
In comparison, practicing physicians working without access to tools or peer support solved only 20% of the cases.
Microsoft stated that the system also demonstrated higher efficiency in ordering medical tests, potentially making it more cost-effective than human clinicians.
Model Mirrors Real-World Clinical Reasoning
Microsoft designed its system to behave like a real-world doctor. Instead of providing direct answers, the AI takes clinical steps, such as asking targeted questions and ordering tests, before arriving at a diagnosis.
As an example, a patient with cough and fever may need blood tests and a chest X-ray before pneumonia is confirmed.
This layered process reflects how clinicians typically handle diagnostic uncertainty.
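The step-by-step workup described above can be pictured with a small sketch. This is a toy illustration only, not Microsoft's system: the case data, the costs, the `run_workup` helper and the `decide` rule are all invented for the example. It shows an agent taking one action at a time, paying for each, and stopping as soon as it has enough evidence to commit to a diagnosis.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """A toy patient case. All findings and costs are invented for illustration."""
    findings: dict  # action name -> result revealed when that action is taken
    costs: dict     # action name -> hypothetical cost in dollars


@dataclass
class Workup:
    """What the agent has learned so far and what it has spent."""
    revealed: dict = field(default_factory=dict)
    total_cost: int = 0


def run_workup(case, plan, decide):
    """Take actions in order until `decide` returns a diagnosis."""
    workup = Workup()
    for action in plan:
        workup.revealed[action] = case.findings[action]
        workup.total_cost += case.costs[action]
        guess = decide(workup.revealed)
        if guess is not None:
            return guess, workup  # stop early: remaining tests are never ordered
    return None, workup


def decide(revealed):
    # Hypothetical rule: commit to pneumonia only once both the blood test
    # and the chest X-ray support it.
    if (revealed.get("blood_test") == "elevated white cells"
            and revealed.get("chest_xray") == "consolidation"):
        return "pneumonia"
    return None


case = Case(
    findings={
        "ask_symptoms": "cough and fever",
        "blood_test": "elevated white cells",
        "chest_xray": "consolidation",
        "ct_scan": "consolidation",
    },
    costs={"ask_symptoms": 0, "blood_test": 50, "chest_xray": 100, "ct_scan": 800},
)

plan = ["ask_symptoms", "blood_test", "chest_xray", "ct_scan"]
guess, workup = run_workup(case, plan, decide)
print(guess, workup.total_cost)
```

In this sketch the agent commits after the chest X-ray, so the expensive CT scan is never ordered. That is the kind of saving the efficiency claim refers to, although the figures here are made up.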
The system was evaluated using more than 300 complex case studies from the New England Journal of Medicine.
These were restructured into interactive challenges to test the AI’s diagnostic reasoning.
Collaboration Across Leading AI Models
To build the diagnostic system, Microsoft integrated several top-performing language models, including those from OpenAI, Meta, Anthropic, Google, and Elon Musk’s xAI, which makes Grok.
An orchestrator acts as a coordinator, guiding these models through the clinical reasoning steps.
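The article does not say how the orchestrator combines the models' outputs. One simple way to picture a coordinator sitting over several models, offered purely as an assumption, is to poll each model and keep the most common answer. The `orchestrate` function and the stand-in models below are hypothetical and make no real API calls.

```python
from collections import Counter


def orchestrate(models, case_summary):
    """Ask every model for a diagnosis; return the most common answer
    and the share of models that agreed with it."""
    answers = [model(case_summary) for model in models.values()]
    top_answer, votes = Counter(answers).most_common(1)[0]
    return top_answer, votes / len(answers)


# Stand-in functions that play the role of separate language models.
models = {
    "model_a": lambda summary: "pneumonia",
    "model_b": lambda summary: "pneumonia",
    "model_c": lambda summary: "bronchitis",
}

diagnosis, agreement = orchestrate(
    models, "cough and fever, consolidation on chest X-ray"
)
print(diagnosis, round(agreement, 2))
```

A real coordinator would presumably do more than vote, for example deciding which question to ask or which test to order next, but the polling pattern shows the basic idea of one component directing several models.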
Suleyman told The Guardian that the system could be nearly error-free within a decade.
“It’s pretty clear that we are on a path to these systems getting almost error-free in the next 5–10 years. It will be a massive weight off the shoulders of all health systems around the world,” he said.
Microsoft Emphasizes Support, Not Replacement
Despite the system’s performance, Microsoft said AI is intended to support, not replace, doctors.
The company noted that clinical work involves more than diagnosis. “Their clinical roles are much broader than simply making a diagnosis.
They need to navigate ambiguity and build trust with patients and their families in a way that AI isn’t set up to do,” it wrote.
Microsoft also addressed the limitations of using existing medical exams, such as the U.S. Medical Licensing Examination, to measure AI performance.
It argued that high AI scores on these tests may exaggerate performance, since the multiple-choice format favors memorization over true clinical reasoning.
Not Yet Ready for Clinical Practice
Microsoft acknowledged that the system is not yet suitable for real-world clinical use. Further testing is needed to measure its effectiveness with more common symptoms and everyday healthcare scenarios.
The company stated that its approach could eventually support patient self-care and enhance clinical decision-making by providing broader and deeper expertise than any single physician.
This article was created with AI assistance.