MedIQA: A Scalable Foundation Model for Prompt-Driven Medical Image Quality Assessment
- URL: http://arxiv.org/abs/2507.19004v1
- Date: Fri, 25 Jul 2025 07:02:47 GMT
- Title: MedIQA: A Scalable Foundation Model for Prompt-Driven Medical Image Quality Assessment
- Authors: Siyi Xun, Yue Sun, Jingkun Chen, Zitong Yu, Tong Tong, Xiaohong Liu, Mingxiang Wu, Tao Tan,
- Abstract summary: Existing medical IQA methods, however, struggle to generalize across diverse modalities and clinical scenarios. We introduce MedIQA, the first comprehensive foundation model for medical IQA, designed to handle variability in image dimensions, modalities, anatomical regions, and types.
- Score: 26.185840831950063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rapid advances in medical imaging technology underscore the critical need for precise and automated image quality assessment (IQA) to ensure diagnostic accuracy. Existing medical IQA methods, however, struggle to generalize across diverse modalities and clinical scenarios. In response, we introduce MedIQA, the first comprehensive foundation model for medical IQA, designed to handle variability in image dimensions, modalities, anatomical regions, and types. We developed a large-scale multi-modality dataset with plentiful manually annotated quality scores to support this. Our model integrates a salient slice assessment module to focus on diagnostically relevant regions feature retrieval and employs an automatic prompt strategy that aligns upstream physical parameter pre-training with downstream expert annotation fine-tuning. Extensive experiments demonstrate that MedIQA significantly outperforms baselines in multiple downstream tasks, establishing a scalable framework for medical IQA and advancing diagnostic workflows and clinical decision-making.
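The abstract mentions a salient slice assessment module that focuses feature retrieval on diagnostically relevant regions. The paper's actual module is learned; as a rough illustration only, a hypothetical heuristic version might rank slices of a 3D volume by a simple saliency proxy such as intensity variance:

```python
import numpy as np

def select_salient_slices(volume, k=3):
    """Rank axial slices of a 3D volume by intensity variance and return
    the indices of the k most salient ones. A crude heuristic stand-in
    for a learned salient slice assessment module, not the paper's method."""
    scores = np.array([s.std() for s in volume])  # one saliency score per slice
    return np.argsort(scores)[::-1][:k]

# toy volume: 10 slices of 64x64; only slice 7 contains structure
vol = np.zeros((10, 64, 64))
vol[7] = np.random.default_rng(0).normal(size=(64, 64))
print(select_salient_slices(vol, k=1))  # -> [7]
```

In practice such a module would be trained end-to-end with the quality head; the variance proxy here only conveys the idea of scoring and keeping the most informative slices.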
Related papers
- Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
Large Language Models (LLMs) in medicine have enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning. This paper provides the first systematic review of this emerging field. We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
arXiv Detail & Related papers (2025-08-01T14:41:31Z)
- MedGemma Technical Report [75.88152277443179]
We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B. MedGemma demonstrates advanced medical understanding and reasoning on images and text. We additionally introduce MedSigLIP, a medically-tuned vision encoder derived from SigLIP.
arXiv Detail & Related papers (2025-07-07T17:01:44Z)
- MedBookVQA: A Systematic and Comprehensive Medical Benchmark Derived from Open-Access Book [5.318470975871017]
We present MedBookVQA, a systematic and comprehensive multimodal benchmark derived from open-access medical textbooks. We generate 5,000 clinically relevant questions spanning modality recognition, classification, anatomical identification, symptom diagnosis, and surgical procedures. We evaluate a wide array of MLLMs, including proprietary, open-sourced, medical, and reasoning models, revealing significant performance disparities across task types and model categories.
arXiv Detail & Related papers (2025-06-01T06:28:36Z)
- MedAgent-Pro: Towards Evidence-based Multi-modal Medical Diagnosis via Reasoning Agentic Workflow [14.478357882578234]
In modern medicine, clinical diagnosis relies on the comprehensive analysis of primarily textual and visual data. Recent advances in large Vision-Language Models (VLMs) and agent-based methods hold great potential for medical diagnosis. We propose MedAgent-Pro, a new agentic reasoning paradigm that follows the diagnosis principle in modern medicine.
arXiv Detail & Related papers (2025-03-21T14:04:18Z)
- ClinKD: Cross-Modal Clinical Knowledge Distiller For Multi-Task Medical Images [4.353855760968461]
The Cross-Modal Clinical Knowledge Distiller (ClinKD) is designed to enhance image-text alignment and establish more effective medical knowledge transfer mechanisms. ClinKD achieves state-of-the-art performance on several datasets that are challenging for the Med-VQA task.
arXiv Detail & Related papers (2025-02-09T15:08:10Z)
- MedCoT: Medical Chain of Thought via Hierarchical Expert [48.91966620985221]
This paper presents MedCoT, a novel hierarchical expert verification reasoning chain method. It is designed to enhance interpretability and accuracy in biomedical imaging inquiries. Experimental evaluations on four standard Med-VQA datasets demonstrate that MedCoT surpasses existing state-of-the-art approaches.
arXiv Detail & Related papers (2024-12-18T11:14:02Z)
- Medchain: Bridging the Gap Between LLM Agents and Clinical Practice through Interactive Sequential Benchmarking [58.25862290294702]
We present MedChain, a dataset of 12,163 clinical cases that covers five key stages of the clinical workflow. We also propose MedChain-Agent, an AI system that integrates a feedback mechanism and a MCase-RAG module to learn from previous cases and adapt its responses.
arXiv Detail & Related papers (2024-12-02T15:25:02Z)
- Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering [70.44269982045415]
Retrieval-augmented generation (RAG) has emerged as a promising approach to enhance the performance of large language models (LLMs).
We introduce Medical Retrieval-Augmented Generation Benchmark (MedRGB) that provides various supplementary elements to four medical QA datasets.
Our experimental results reveal current models' limited ability to handle noise and misinformation in the retrieved documents.
arXiv Detail & Related papers (2024-11-14T06:19:18Z)
- MD-IQA: Learning Multi-scale Distributed Image Quality Assessment with Semi Supervised Learning for Low Dose CT [6.158876574189994]
Image quality assessment (IQA) plays a critical role in optimizing radiation dose and developing novel medical imaging techniques.
Recent deep learning-based approaches have demonstrated strong modeling capabilities and potential for medical IQA.
We propose a multi-scale distribution regression approach that predicts quality scores by constraining the output distribution.
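The distribution-regression idea can be illustrated in miniature: instead of regressing a single scalar, the network emits a probability distribution over discrete quality levels, and the final score is that distribution's expectation. The bins and logits below are made up for illustration and are not MD-IQA's actual architecture:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def expected_quality(logits, bins):
    """Turn per-bin logits into a probability distribution over discrete
    quality levels and return its expectation as the scalar quality score."""
    p = softmax(logits)
    return float(np.dot(p, bins))

bins = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # hypothetical quality levels
logits = np.array([0.0, 0.0, 2.0, 4.0, 0.0])  # hypothetical network output
print(round(expected_quality(logits, bins), 2))  # -> 3.82
```

Constraining the output to be a distribution (rather than a free scalar) lets training supervise the full shape of the prediction, which is one common motivation for this family of IQA heads.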
arXiv Detail & Related papers (2023-11-14T09:33:33Z)
- PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering [56.25766322554655]
Medical Visual Question Answering (MedVQA) presents a significant opportunity to enhance diagnostic accuracy and healthcare delivery.
We propose a generative-based model for medical visual understanding by aligning visual information from a pre-trained vision encoder with a large language model.
We train the proposed model on PMC-VQA and then fine-tune it on multiple public benchmarks, e.g., VQA-RAD, SLAKE, and Image-Clef 2019.
arXiv Detail & Related papers (2023-05-17T17:50:16Z)
- Evaluating Explainable AI on a Multi-Modal Medical Imaging Task: Can Existing Algorithms Fulfill Clinical Requirements? [42.75635888823057]
A heatmap is a form of explanation that highlights the features important to an AI model's prediction.
It is unknown how well heatmaps perform at explaining decisions on multi-modal medical images.
We propose the modality-specific feature importance (MSFI) metric to tackle this clinically important but technically ignored problem.
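The exact MSFI definition is given in the cited paper; purely as a simplified illustration, a metric in this spirit could measure, per modality, how much of a heatmap's importance mass falls inside that modality's ground-truth feature region, averaged over modalities:

```python
import numpy as np

def msfi_like(heatmaps, masks):
    """Simplified, hypothetical stand-in for a modality-specific feature
    importance score: for each modality, the share of non-negative heatmap
    mass inside that modality's ground-truth mask, averaged over modalities.
    Not the paper's exact MSFI formula."""
    ratios = []
    for h, m in zip(heatmaps, masks):
        h = np.clip(h, 0, None)              # ignore negative attributions
        total = h.sum()
        ratios.append((h * m).sum() / total if total > 0 else 0.0)
    return float(np.mean(ratios))

# two modalities, 4x4 heatmaps; modality 0 is perfectly localized
h0 = np.zeros((4, 4)); h0[0, 0] = 1.0
m0 = np.zeros((4, 4)); m0[0, 0] = 1.0
h1 = np.ones((4, 4))                      # uniform heatmap
m1 = np.zeros((4, 4)); m1[:2, :] = 1.0    # mask covers half the image
print(msfi_like([h0, h1], [m0, m1]))      # -> 0.75
```

A well-localized heatmap that concentrates on the clinically relevant region of each modality scores near 1, while a diffuse or misplaced one scores lower.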
arXiv Detail & Related papers (2022-03-12T17:18:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.