AI Can Simplify Reports for Patients

Large language models (LLMs) can produce radiology report versions that are significantly more understandable for patients than the original physician-written reports — but critical errors in open-source models raise an important safety concern. That is the finding of a study published in European Radiology, which tested three different LLMs on the task of simplifying 60 reports across X-ray, CT, MRI, and ultrasound modalities.

AI-simplified reports can help patients understand their imaging exams

The demand for this type of solution is real: patients increasingly want direct access to their imaging exams and reports. However, the technical language of radiological reports makes comprehension nearly impossible for laypersons. And asking the radiologist to draft a second plain-language version is impractical in the current context of workforce overload.

What the Study Found

German researchers tested ChatGPT-4o alongside two open-source LLMs (Llama-3-70B and Mixtral-8x22B) deployed on-premises within their hospitals. The models were instructed to generate summaries at an eighth-grade reading level while preserving essential clinical information.
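
The study does not publish its exact prompt wording, but the instruction it describes — rewrite at an eighth-grade reading level while keeping the clinical content — can be sketched as a simple prompt template. Everything below (the function name, the wording of the instruction, the sample report text) is a hypothetical illustration, not the study's actual prompt:

```python
def build_simplification_prompt(report_text: str) -> str:
    """Hypothetical prompt template mirroring the study's described
    instruction: eighth-grade reading level, clinical content preserved."""
    return (
        "You are a radiologist explaining results to a patient.\n"
        "Rewrite the following radiology report at an eighth-grade "
        "reading level. Preserve all essential clinical information, "
        "avoid medical jargon, and do not add any findings that are "
        "not in the report.\n\n"
        f"Report:\n{report_text}"
    )

# Example use with a short (invented) report impression:
prompt = build_simplification_prompt("No acute cardiopulmonary abnormality.")
```

The resulting string would then be sent to whichever model is in use — a cloud API for ChatGPT-4o, or a local inference server for the on-premises Llama and Mixtral deployments.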

The results were revealing:

  • Original reports scored just 17 on the Flesch readability scale, versus 44–46 for AI-generated versions
  • Understandability jumped from 1.5 to 4.1–4.4 on a five-point scale
  • The two open-source LLMs showed critical error rates of 8.3% to 10%, while ChatGPT-4o had zero critical errors
  • Reading time increased considerably: 15 seconds for originals versus 64–73 seconds for the simplified versions
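
The Flesch scores above come from a standard formula over sentence and word length: 206.835 − 1.015 × (words/sentence) − 84.6 × (syllables/word); higher means easier to read. A minimal sketch of that calculation, using a rough vowel-group heuristic for syllables (production tools use dictionary-based counters), shows why dense radiology phrasing scores so low — the example reports are invented for illustration:

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease:
    206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    n_sent = max(len(sentences), 1)
    n_words = max(len(words), 1)
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (syllables / n_words)

# Invented examples: jargon-heavy impression vs. a plain-language version.
jargon = "Mediastinal lymphadenopathy with bilateral pleural effusions."
plain = ("The scan shows swollen glands in the chest. "
         "There is fluid around both lungs.")

# The plain-language version scores far higher (easier to read).
print(flesch_reading_ease(jargon), flesch_reading_ease(plain))
```

Long polysyllabic terms crammed into short sentences drive the score down, which is exactly the pattern of an original report scoring 17 against 44–46 for the simplified versions.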

On-Premises versus Cloud: A Real Dilemma

The issue of critical errors in open-source models is particularly concerning. Many institutions prefer locally deployed LLMs for patient privacy reasons, avoiding sending clinical data to external servers such as those used by ChatGPT. However, the tested local models demonstrated error rates that could result in patient harm — a trade-off that requires careful evaluation.

For professionals working with DICOM integration in clinical practice, this technology amounts to an additional processing layer that can slot into existing workflows. The key lies in ensuring adequate clinical oversight — especially where PACS systems are coupled with AI — so that a simplified summary never reaches a patient without review.

Looking Ahead

The study points to a future where tasks like report simplification could be delegated to generative AI algorithms — but with the caveat that significant work remains to ensure patient safety and privacy before this becomes routine clinical practice.

Source: The Imaging Wire
