About the Session

Vision-language models (VLMs) are emerging as powerful tools that combine visual understanding with natural language capabilities, offering new opportunities for clinical support, workflow efficiency, and research innovation. This session provides an accessible overview of what VLMs are, how they differ from traditional AI approaches, and where they can add value in imaging environments. Presenters will outline broad clinical applications such as interactive image interpretation, automated description generation, and flexible zero-shot analysis, while also acknowledging common limitations including accuracy concerns, bias, and unpredictable outputs.

From an operational standpoint, the session will highlight key concepts and practical considerations for bringing VLMs into real-world imaging workflows. Topics will include infrastructure needs, system integration, performance evaluation, and ongoing monitoring to ensure safety and reliability. Attendees will gain a foundational understanding of VLM capabilities, typical workflow use cases, and high-level strategies for responsible deployment in clinical and research settings.

Objectives

  • Describe the core functions and distinguishing features of vision-language models in imaging environments.
  • Explain common clinical and operational use cases where VLMs may improve workflow efficiency or decision support.
  • Identify key limitations, risks, and considerations for responsible VLM deployment in clinical and research workflows.

Session Number

2013

Format

Education Session

Learning Topic

Artificial Intelligence (AI), Enterprise Imaging, Machine Learning (ML), Productivity & Workflow
Credit Type

ACCME-MD, ASRT-RT, CAMPEP-MPCEC, SIIM IIP-CIIP

Presented By

Pranav Rajpurkar, PhD

Associate Professor of Biomedical Informatics
Harvard Medical School

Raffi Salibian, MD

Diagnostic Radiologist
UCLA Medical Center

Cody H. Savage, MD

Radiology Resident Physician
University of Maryland Medical Center