Deepfake X-rays Deceive Radiologists in New Study
- Radiologists failed to detect synthetic X-rays 59% of the time during initial diagnostic assessments
- Clinician accuracy improved to only 75% even after being warned about AI-generated images
- Multimodal AI models struggled to identify deepfakes, with accuracy ranging from 57% to 85%
Deepfake technology has officially entered the clinical arena, and the initial results are unsettling. A recent study published in the journal Radiology reveals that professional radiologists struggle significantly to distinguish between authentic medical scans and synthetic X-rays generated by AI. Using simple prompts, researchers tasked ChatGPT with creating radiographs that mimicked specific anatomical locations and diseases. The resulting images were so convincing that clinicians failed to identify them as fake nearly 60% of the time during standard diagnostic tasks.
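The paper does not reproduce its prompts, but the workflow it describes (a short text prompt in, a synthetic radiograph out) can be sketched against a public image-generation API. Below is a minimal illustration, assuming the official OpenAI Python SDK; the model choice and prompt wording are assumptions for demonstration, not the study's actual materials.

```python
from openai import OpenAI  # assumes the official OpenAI Python SDK is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical prompt in the spirit of the study's "anatomical location plus
# disease" recipe; the researchers' actual prompts are not reproduced here.
prompt = (
    "A frontal chest X-ray (PA view) showing a right lower lobe pneumonia, "
    "rendered in the style of a clinical radiograph"
)

response = client.images.generate(
    model="dall-e-3",  # hypothetical model choice
    prompt=prompt,
    size="1024x1024",
    n=1,
)

# URL of the generated synthetic image
print(response.data[0].url)
```

The point of the sketch is how low the barrier is: a few lines of code and a plain-language description are enough to request an image of a specific anatomy and pathology.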
Even when the 17 participating radiologists were explicitly warned to look for synthetic images, their accuracy in spotting deepfakes reached only 75%. This detection gap suggests that the human eye, even one trained through years of medical residency, is not a reliable defense against sophisticated generative models. The ease with which these images are produced contrasts sharply with the difficulty of identifying them, creating a potential trust crisis around the integrity of digital healthcare records.
The study also highlights a recursive failure: AI models are currently poor at detecting their own handiwork. Four multimodal models (AI systems designed to interpret both text and visual data simultaneously) showed inconsistent results, with detection accuracy fluctuating between 57% and 85%. As these generative tools become more accessible, the medical community faces a dual challenge: integrating the diagnostic benefits of AI while safeguarding the data that underpins clinical decisions.
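To make the detection side of the experiment concrete, here is a minimal sketch of how one might query a multimodal model to judge whether a radiograph is authentic. It assumes the official OpenAI Python SDK; the model name, prompt, and file name are hypothetical, since the paper does not publish the exact queries used in its evaluation.

```python
import base64

from openai import OpenAI  # assumes the official OpenAI Python SDK is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def classify_xray(path: str) -> str:
    """Ask a multimodal model whether an X-ray looks authentic or AI-generated."""
    # Encode the local image as a base64 data URL so it can be sent inline.
    with open(path, "rb") as f:
        data_url = "data:image/png;base64," + base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="gpt-4o",  # hypothetical choice of multimodal model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Is this chest radiograph an authentic clinical image "
                         "or AI-generated? Answer with one word: REAL or FAKE."},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
    )
    return response.choices[0].message.content.strip()


print(classify_xray("xray_001.png"))  # hypothetical file name
```

A binary prompt like this is the simplest possible probe; the wide 57% to 85% spread the study reports suggests that today's models answer such questions inconsistently, which is exactly the weakness the authors flag.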