Holistic Evaluation of GPT-4V for Biomedical Imaging
- URL: http://arxiv.org/abs/2312.05256v1
- Date: Fri, 10 Nov 2023 18:40:44 GMT
- Title: Holistic Evaluation of GPT-4V for Biomedical Imaging
- Authors: Zhengliang Liu, Hanqi Jiang, Tianyang Zhong, Zihao Wu, Chong Ma, Yiwei
Li, Xiaowei Yu, Yutong Zhang, Yi Pan, Peng Shu, Yanjun Lyu, Lu Zhang, Junjie
Yao, Peixin Dong, Chao Cao, Zhenxiang Xiao, Jiaqi Wang, Huan Zhao, Shaochen
Xu, Yaonai Wei, Jingyuan Chen, Haixing Dai, Peilong Wang, Hao He, Zewei Wang,
Xinyu Wang, Xu Zhang, Lin Zhao, Yiheng Liu, Kai Zhang, Liheng Yan, Lichao
Sun, Jun Liu, Ning Qiang, Bao Ge, Xiaoyan Cai, Shijie Zhao, Xintao Hu, Yixuan
Yuan, Gang Li, Shu Zhang, Xin Zhang, Xi Jiang, Tuo Zhang, Dinggang Shen,
Quanzheng Li, Wei Liu, Xiang Li, Dajiang Zhu, Tianming Liu
- Abstract summary: GPT-4V represents a breakthrough in artificial general intelligence for computer vision.
We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and more.
Results show GPT-4V's proficiency in modality and anatomy recognition but difficulty with disease diagnosis and localization.
- Score: 113.46226609088194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a large-scale evaluation probing GPT-4V's
capabilities and limitations for biomedical image analysis. GPT-4V represents a
breakthrough in artificial general intelligence (AGI) for computer vision, with
applications in the biomedical domain. We assess GPT-4V's performance across 16
medical imaging categories, including radiology, oncology, ophthalmology,
pathology, and more. Tasks include modality recognition, anatomy localization,
disease diagnosis, report generation, and lesion detection. The extensive
experiments provide insights into GPT-4V's strengths and weaknesses. Results
show GPT-4V's proficiency in modality and anatomy recognition but difficulty
with disease diagnosis and localization. GPT-4V excels at diagnostic report
generation, indicating strong image captioning skills. While promising for
biomedical imaging AI, GPT-4V requires further enhancement and validation
before clinical deployment. We emphasize responsible development and testing
for trustworthy integration of biomedical AGI. This rigorous evaluation of
GPT-4V on diverse medical images advances understanding of multimodal large
language models (LLMs) and guides future work toward impactful healthcare
applications.
Related papers
- Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study exhaustively evaluated Gemini, GPT-4, and 4 popular large models across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z)
- Enhancing Medical Task Performance in GPT-4V: A Comprehensive Study on Prompt Engineering Strategies [28.98518677093905]
GPT-4V, OpenAI's latest large vision-language model, has piqued considerable interest for its potential in medical applications.
Recent studies and internal reviews highlight its underperformance in specialized medical tasks.
This paper explores the boundary of GPT-4V's capabilities in medicine, particularly in processing complex imaging data such as endoscopies, CT scans, and MRIs.
arXiv Detail & Related papers (2023-12-07T15:05:59Z)
- GPT-4V(ision) Unsuitable for Clinical Care and Education: A Clinician-Evaluated Assessment [6.321623278767821]
GPT-4V was recently developed for general image interpretation.
Board-certified physicians and senior residents assessed GPT-4V's proficiency across a range of medical conditions.
GPT-4V's diagnostic accuracy and clinical decision-making abilities are poor, posing risks to patient safety.
arXiv Detail & Related papers (2023-11-14T17:06:09Z)
- A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis [87.25494411021066]
GPT-4V's multimodal capability for medical image analysis is evaluated.
GPT-4V excels at understanding medical images and generates high-quality radiology reports.
However, its performance on medical visual grounding needs substantial improvement.
arXiv Detail & Related papers (2023-10-31T11:39:09Z)
- Multimodal ChatGPT for Medical Applications: an Experimental Study of GPT-4V [20.84152508192388]
We critically evaluate the capabilities of the state-of-the-art multimodal large language model, GPT-4 with Vision (GPT-4V).
Our experiments thoroughly assess GPT-4V's proficiency in answering questions paired with images using both pathology and radiology datasets.
Based on accuracy scores, the experiments conclude that the current version of GPT-4V is not recommended for real-world diagnostics.
arXiv Detail & Related papers (2023-10-29T16:26:28Z)
- GPT-4 Vision on Medical Image Classification -- A Case Study on COVID-19 Dataset [58.493596972033195]
This technical report delves into the application of GPT-4 Vision (GPT-4V) in the realm of COVID-19 image classification, leveraging the transformative potential of in-context learning to enhance diagnostic processes.
arXiv Detail & Related papers (2023-10-27T21:28:36Z)
- Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis [59.35504779947686]
This study assesses GPT-4V, OpenAI's newest model, on multimodal medical diagnosis.
Our evaluation encompasses 17 human body systems.
GPT-4V demonstrates proficiency in distinguishing between medical image modalities and anatomy.
It faces significant challenges in disease diagnosis and generating comprehensive reports.
arXiv Detail & Related papers (2023-10-15T18:32:27Z)
- Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation and Diagnosis for COVID-19 [71.41929762209328]
The pandemic of coronavirus disease 2019 (COVID-19) is spreading all over the world.
Medical imaging such as X-ray and computed tomography (CT) plays an essential role in the global fight against COVID-19.
The recently emerging artificial intelligence (AI) technologies further strengthen the power of the imaging tools and help medical specialists.
arXiv Detail & Related papers (2020-04-06T15:21:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.