GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning
- URL: http://arxiv.org/abs/2504.01886v1
- Date: Wed, 02 Apr 2025 16:43:16 GMT
- Title: GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning
- Authors: Yanzhou Su, Tianbin Li, Jiyao Liu, Chenglong Ma, Junzhi Ning, Cheng Tang, Sibo Ju, Jin Ye, Pengcheng Chen, Ming Hu, Shixiang Tang, Lihao Liu, Bin Fu, Wenqi Shao, Xiaowei Hu, Xiangwen Liao, Yuanfeng Ji, Junjun He
- Abstract summary: This paper presents GMAI-VL-R1, a multimodal medical reasoning model enhanced by reinforcement learning (RL) to improve its reasoning abilities. We develop a reasoning data synthesis method, generating step-by-step reasoning data via rejection sampling, which further enhances the model's generalization. Experimental results show that after RL training, GMAI-VL-R1 excels in tasks such as medical image diagnosis and visual question answering.
- Score: 28.911445780180077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in general medical AI have made significant strides, but existing models often lack the reasoning capabilities needed for complex medical decision-making. This paper presents GMAI-VL-R1, a multimodal medical reasoning model enhanced by reinforcement learning (RL) to improve its reasoning abilities. Through iterative training, GMAI-VL-R1 optimizes decision-making, significantly boosting diagnostic accuracy and clinical support. We also develop a reasoning data synthesis method, generating step-by-step reasoning data via rejection sampling, which further enhances the model's generalization. Experimental results show that after RL training, GMAI-VL-R1 excels in tasks such as medical image diagnosis and visual question answering. While the model demonstrates basic memorization with supervised fine-tuning, RL is crucial for true generalization. Our work establishes new evaluation benchmarks and paves the way for future advancements in medical reasoning models. Code, data, and the model will be released at https://github.com/uni-medical/GMAI-VL-R1.
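The rejection-sampling step described in the abstract can be pictured with a short sketch. The snippet below is a minimal illustration and not the released GMAI-VL-R1 pipeline: `generate_reasoning` and `answer_matches` are hypothetical stand-ins for a teacher-model call and an answer verifier, and only samples whose final answer matches the reference label are kept as training data.

```python
# Minimal sketch of rejection sampling for step-by-step reasoning data synthesis.
# `generate_reasoning` and `answer_matches` are hypothetical placeholders, not the
# paper's released code.
import random
from typing import Optional


def generate_reasoning(question: str, image_path: str) -> tuple[str, str]:
    """Placeholder: sample one chain of thought and a final answer from a teacher model."""
    reasoning = f"Step-by-step analysis of {image_path} for: {question}"
    answer = random.choice(["A", "B", "C", "D"])
    return reasoning, answer


def answer_matches(predicted: str, reference: str) -> bool:
    """Placeholder verifier: accept a sample only if its answer equals the reference label."""
    return predicted.strip().upper() == reference.strip().upper()


def synthesize_example(question: str, image_path: str, reference: str,
                       max_tries: int = 8) -> Optional[dict]:
    """Rejection sampling: resample until the reasoning ends in the reference answer,
    or give up after max_tries attempts."""
    for _ in range(max_tries):
        reasoning, answer = generate_reasoning(question, image_path)
        if answer_matches(answer, reference):
            return {"question": question, "image": image_path,
                    "reasoning": reasoning, "answer": answer}
    return None  # discard questions the teacher never answers correctly
```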
Related papers
- ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification [57.22053411719822]
ChestX-Reasoner is a radiology diagnosis MLLM designed to leverage process supervision mined directly from clinical reports.
Our two-stage training framework combines supervised fine-tuning and reinforcement learning guided by process rewards to better align model reasoning with clinical standards.
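For a concrete sense of what a process reward could look like, here is a toy sketch; it is an assumption for illustration and not ChestX-Reasoner's implementation, with `report_findings` standing in for findings mined from the clinical report.

```python
# Toy sketch of a process reward, not ChestX-Reasoner's implementation: a reasoning
# step earns credit only if it mentions a finding mined from the clinical report.
def process_reward(steps: list[str], report_findings: set[str]) -> float:
    """Fraction of reasoning steps grounded in report-derived findings."""
    if not steps:
        return 0.0
    supported = sum(
        any(finding.lower() in step.lower() for finding in report_findings)
        for step in steps
    )
    return supported / len(steps)


# Hypothetical example.
steps = ["The right lower lobe shows consolidation.",
         "Consolidation with air bronchograms suggests pneumonia."]
findings = {"consolidation", "air bronchograms"}
print(process_reward(steps, findings))  # 1.0
```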
arXiv Detail & Related papers (2025-04-29T16:48:23Z)
- OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement [91.88062410741833]
This study investigates whether similar reasoning capabilities can be successfully integrated into large vision-language models (LVLMs).
We consider an approach that iteratively leverages supervised fine-tuning (SFT) on lightweight training data and reinforcement learning (RL) to further improve model generalization.
OpenVLThinker, an LVLM exhibiting consistently improved reasoning performance on challenging benchmarks such as MathVista, MathVerse, and MathVision, demonstrates the potential of our strategy for robust vision-language reasoning.
arXiv Detail & Related papers (2025-03-21T17:52:43Z)
- Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models [6.176432104264649]
Vision-language models (VLMs) have advanced reasoning in natural scenes, but their role in medical imaging remains underexplored.
We introduce Med-R1, a framework exploring reinforcement learning to enhance VLMs' generalizability and trustworthiness in medical reasoning.
Med-R1 is evaluated across eight medical imaging modalities: CT, MRI, Ultrasound, Dermoscopy, Fundus Photography, Optical Coherence Tomography, Microscopy, and X-ray Imaging.
arXiv Detail & Related papers (2025-03-18T06:12:38Z)
- Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning [19.064630697040055]
Reinforcement learning from verifiable rewards (RLVR) has recently gained attention for its ability to elicit self-evolved reasoning from base language models without explicit reasoning supervision.
We introduce Med-RLVR as an initial study of RLVR in the medical domain, leveraging medical multiple-choice question answering (MCQA) data as verifiable labels.
Our results demonstrate that RLVR is not only effective for math and coding but also extends successfully to medical question answering.
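As an illustration of a verifiable reward for MCQA, the sketch below scores a completion against the gold choice with a simple string check; the answer-extraction regex and the binary reward are assumptions for illustration, not the exact Med-RLVR setup.

```python
import re


def mcqa_reward(completion: str, gold_choice: str) -> float:
    """Toy verifiable reward: 1.0 if the completion's stated choice matches the gold
    letter, else 0.0. No learned reward model is needed."""
    match = re.search(r"answer\s*(?:is|[:=])?\s*\(?([A-E])\)?",
                      completion, flags=re.IGNORECASE)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).upper() == gold_choice.upper() else 0.0


# Example usage with a hypothetical completion.
print(mcqa_reward("The lesion is T2-hyperintense, so the answer is (B).", "B"))  # 1.0
```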
arXiv Detail & Related papers (2025-02-27T00:54:38Z)
- MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning [29.84956540178252]
Reasoning is a critical frontier for advancing medical image analysis.
We introduce MedVLM-R1, a medical VLM that explicitly generates natural language reasoning.
MedVLM-R1 boosts accuracy from 55.11% to 78.22% across MRI, CT, and X-ray benchmarks.
arXiv Detail & Related papers (2025-02-26T23:57:34Z)
- Towards a perturbation-based explanation for medical AI as differentiable programs [0.0]
In medicine and healthcare, there is a particular demand for sufficient and objective explainability of the outcome generated by AI models.
This work examines the numerical availability of the Jacobian matrix of deep learning models, which measures how stably a model responds to small perturbations added to the input.
This is a first step towards a perturbation-based explanation, which will assist medical practitioners in understanding and interpreting the response of the AI model in its clinical application.
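A minimal sketch of the input-Jacobian idea described above, assuming a PyTorch toy classifier rather than the paper's model or code: the norms of the Jacobian rows indicate how strongly each output class moves under small input perturbations.

```python
# Minimal sketch (not the paper's code): the Jacobian of a toy classifier with
# respect to its input, whose row norms indicate sensitivity to perturbations.
import torch
from torch.autograd.functional import jacobian

model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(3 * 32 * 32, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 4),  # e.g. four hypothetical diagnostic classes
)
model.eval()

x = torch.randn(1, 3, 32, 32)                      # one toy input image
J = jacobian(lambda inp: model(inp), x)            # shape: (1, 4, 1, 3, 32, 32)
sensitivity = J.flatten(start_dim=2).norm(dim=-1)  # per-class L2 sensitivity, shape (1, 4)
print(sensitivity)
```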
arXiv Detail & Related papers (2025-02-19T07:56:23Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- GENIE-NF-AI: Identifying Neurofibromatosis Tumors using Liquid Neural Network (LTC) trained on AACR GENIE Datasets [0.0]
We propose an interpretable AI approach to diagnose patients with neurofibromatosis.
Our proposed approach outperformed existing models with 99.86% accuracy.
arXiv Detail & Related papers (2023-04-26T10:28:59Z)
- Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining [121.89793208683625]
Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks.
We propose a new paradigm called Medical-knOwledge-enhanced mulTimOdal pretRaining (MOTOR).
arXiv Detail & Related papers (2023-04-26T01:26:19Z)
- Competence-based Multimodal Curriculum Learning for Medical Report Generation [98.10763792453925]
We propose a Competence-based Multimodal Curriculum Learning framework (CMCL) to alleviate data bias and make the best use of available data.
Specifically, CMCL simulates the learning process of radiologists and optimizes the model in a step-by-step manner.
Experiments on the public IU-Xray and MIMIC-CXR datasets show that CMCL can be incorporated into existing models to improve their performance.
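A competence-based curriculum can be sketched as follows; the random difficulty scores and square-root competence schedule are illustrative assumptions, not the CMCL implementation.

```python
# Toy sketch of competence-based curriculum scheduling, not the CMCL code: the model
# samples only from the easiest fraction of the data, and that fraction grows with
# training progress.
import math
import random


def competence(step: int, total_steps: int, c0: float = 0.1) -> float:
    """Square-root competence schedule: starts near c0 and reaches 1.0 at the end."""
    return min(1.0, math.sqrt(c0 ** 2 + (1 - c0 ** 2) * step / total_steps))


def sample_batch(examples, difficulties, step, total_steps, batch_size=4):
    """Sample a batch from the easiest competence(step) fraction of the data."""
    order = sorted(range(len(examples)), key=lambda i: difficulties[i])
    cutoff = max(batch_size, int(competence(step, total_steps) * len(examples)))
    pool = [examples[i] for i in order[:cutoff]]
    return random.sample(pool, min(batch_size, len(pool)))


# Hypothetical usage with random difficulty scores (e.g. report rarity or length).
data = [f"report_{i}" for i in range(100)]
diffs = [random.random() for _ in data]
print(sample_batch(data, diffs, step=10, total_steps=100))
```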
arXiv Detail & Related papers (2022-06-24T08:16:01Z)
- A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict the CC diagnosis of an HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)