AI in Healthcare: A Doctor's Test Reveals Strengths and Blind Spots
In a revealing experiment, Dr. Mikhail Varshavski, a board-certified family medicine physician based in New York, tested popular artificial intelligence tools to assess their capabilities in medical contexts. In his YouTube video, he posed questions similar to those a medical student might encounter, highlighting both the potential and the limitations of AI in healthcare.
The Contextual Failure: AI Misses a Critical Detail
Dr. Varshavski asked a basic question: "What are the top three symptoms of cervical cancer?" Multiple large language models, including those from Meta and Google, focused entirely on cancer of the cervix. They failed to consider that "cervical" can also refer to the neck, a significant blind spot. In medicine, context is paramount, and here the AI clearly missed it, underscoring a key weakness in its current applications.
AI's Strengths: Accuracy and Efficiency in Other Areas
Despite this failure, the same AI tools performed admirably on other queries. For instance, when asked about the roles of insulin and calories in weight loss, they provided relevant, accurate answers. This contrast captures the current reality: AI in healthcare is powerful but not universally reliable. On World Health Day, it sparks a crucial debate: can machines replace doctors, or is human judgment still indispensable?
The Rise of AI in Healthcare: From Experiment to Implementation
Artificial intelligence is no longer experimental in medicine; it is actively diagnosing diseases, analyzing medical scans, and answering patient queries on a large scale. Studies analyzing thousands of medical questions show that AI systems often produce highly accurate and professionally structured responses.
- A study analyzing over 7,000 medical queries across the US and Australia found AI responses consistently scored between 7 and 9 out of 10 for clarity, completeness, and factual correctness.
- In some controlled studies, AI has demonstrated higher diagnostic accuracy than doctors. For example, a recent Microsoft analysis found that an AI system correctly diagnosed up to 85.5% of complex medical cases, roughly four times the rate achieved by a group of 21 experienced physicians from the UK and US, while ordering fewer tests.
- At a medical AI competition in Shanghai, AI-assisted teams worked faster than doctors when analyzing chest X-rays, detecting dozens of conditions in a single scan.
Speed is a clear advantage for AI: it does not tire, slow down under pressure, or struggle to process thousands of cases simultaneously. In overburdened healthcare systems, this efficiency offers a major benefit.
Where AI Still Fails: Critical Weaknesses and Concerns
Despite its strengths, AI has critical weaknesses that cannot be overlooked. The biggest issue remains context. In the Shanghai competition, AI-assisted teams overlooked certain diagnoses that human doctors caught, and the human-written reports were more readable, with a warmer tone and better structure.
Deeper concerns include:
- Most studies assess the quality of AI's written responses, including how empathetic they sound, rather than actual patient outcomes, a significant gap that matters for safety.
- AI can provide technically accurate but unsafe advice if it ignores warning signs, misunderstands symptoms, or fails to adapt to a patient's situation, potentially leading to delayed diagnosis or incorrect treatment.
- AI systems are vulnerable to misinformation; a Reuters report found they accepted false medical information up to 47% of the time when it was presented in authoritative formats, such as fake hospital documents.
The Empathy Paradox: AI vs. Human Doctors
Surprisingly, recent research indicates AI may appear more empathetic than doctors in certain situations. A review in the British Medical Bulletin found AI-generated responses were rated as more empathetic nearly 87% of the time. However, this does not mean machines truly feel empathy; the studies evaluated written responses, not real-life interactions, giving AI advantages like unlimited time and no emotional pressure.
This raises an uncomfortable question: why do human doctors sometimes come across as less empathetic? The answer lies in systemic issues. Modern healthcare prioritizes protocols, documentation, and efficiency, with doctors spending significant time on paperwork rather than patient interaction. Burnout is another factor; globally, many doctors report high stress and exhaustion, reducing their emotional capacity.
What AI Cannot Replace: The Human Element in Medicine
Despite advances, aspects of medicine remain deeply human. AI cannot:
- Read unspoken emotions or body language like hesitation or fear.
- Provide physical reassurance, such as holding a patient's hand.
- Fully understand cultural context, personal values, or ethical dilemmas.
- Handle moments like breaking bad news or supporting long-term suffering where presence matters more than information.
Trust is central to healthcare, and AI may struggle to build it, especially in sensitive cases. Additionally, AI systems can inherit biases from training data that may not represent all populations equally.
So, Who Is Better? A Collaborative Future for Healthcare
The answer is not straightforward. AI excels in processing large data quickly, providing structured information, improving efficiency, and assisting in diagnosis. Doctors, however, outperform in understanding context, making judgment calls, communicating with empathy, and building human connections.
In reality, AI and doctors are not direct competitors; they solve different parts of the same problem. Most experts agree AI is unlikely to fully replace doctors. Instead, the future of healthcare is collaborative: AI can handle routine queries, assist with diagnosis, and reduce administrative burdens, freeing doctors to focus on caring for patients, making complex decisions, and providing emotional support. Thus, the biggest opportunity may not be replacing doctors but fixing the system around them.