Expert-Guided Explainable Few-Shot Learning for Medical Image Diagnosis
- URL: http://arxiv.org/abs/2509.08007v2
- Date: Thu, 11 Sep 2025 16:30:31 GMT
- Title: Expert-Guided Explainable Few-Shot Learning for Medical Image Diagnosis
- Authors: Ifrat Ikhtear Uddin, Longwei Wang, KC Santosh
- Abstract summary: We propose an expert-guided explainable few-shot learning framework that integrates radiologist-provided regions of interest into model training. We evaluate our framework on two distinct datasets: BraTS (MRI) and VinDr-CXR (Chest X-ray). Our findings demonstrate the effectiveness of incorporating expert-guided attention supervision to bridge the gap between performance and interpretability in few-shot medical image diagnosis.
- Score: 2.7946918847372277
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Medical image analysis often faces significant challenges due to limited expert-annotated data, hindering both model generalization and clinical adoption. We propose an expert-guided explainable few-shot learning framework that integrates radiologist-provided regions of interest (ROIs) into model training to simultaneously enhance classification performance and interpretability. Leveraging Grad-CAM for spatial attention supervision, we introduce an explanation loss based on Dice similarity to align model attention with diagnostically relevant regions during training. This explanation loss is jointly optimized with a standard prototypical network objective, encouraging the model to focus on clinically meaningful features even under limited data conditions. We evaluate our framework on two distinct datasets: BraTS (MRI) and VinDr-CXR (Chest X-ray), achieving significant accuracy improvements from 77.09% to 83.61% on BraTS and from 54.33% to 73.29% on VinDr-CXR compared to non-guided models. Grad-CAM visualizations further confirm that expert-guided training consistently aligns attention with diagnostic regions, improving both predictive reliability and clinical trustworthiness. Our findings demonstrate the effectiveness of incorporating expert-guided attention supervision to bridge the gap between performance and interpretability in few-shot medical image diagnosis.
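As a rough illustration of the joint objective described in the abstract, the sketch below combines a prototypical-network classification loss with a Dice-based explanation loss. It is not the authors' implementation: the attention map is taken as given (the paper derives it with Grad-CAM), the weighting factor `lam` and all function names are hypothetical, and NumPy is used only to show the arithmetic. Actual training would need an autodiff framework.

```python
import numpy as np


def dice_loss(attention, roi_mask, eps=1e-6):
    """Soft Dice loss between an attention map in [0, 1] and a binary ROI mask."""
    attention = np.asarray(attention, dtype=float).ravel()
    roi_mask = np.asarray(roi_mask, dtype=float).ravel()
    intersection = np.sum(attention * roi_mask)
    dice = (2.0 * intersection + eps) / (attention.sum() + roi_mask.sum() + eps)
    return 1.0 - dice


def prototypical_loss(support, support_labels, query, query_labels, n_classes):
    """Cross-entropy over negative squared distances to class prototypes."""
    # Each prototype is the mean support embedding of one class.
    prototypes = np.stack(
        [support[support_labels == c].mean(axis=0) for c in range(n_classes)]
    )
    # Squared Euclidean distances, shape (n_query, n_classes).
    d2 = ((query[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2
    # Numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(query_labels)), query_labels].mean()


def total_loss(cls_loss, expl_loss, lam=0.5):
    """Joint objective; lam is an assumed weight, not a value from the paper."""
    return cls_loss + lam * expl_loss
```

In this sketch an attention map that coincides exactly with the ROI mask gives a Dice loss of zero, so the explanation term only penalizes attention that falls outside, or fails to cover, the expert-marked region.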
Related papers
- Learning to Select Like Humans: Explainable Active Learning for Medical Imaging [8.744178539108267]
We propose an explainability-guided active learning framework that integrates spatial attention alignment into a sample acquisition process. We evaluate the framework using three expert-annotated medical imaging datasets.
arXiv Detail & Related papers (2026-02-10T01:20:37Z)
- Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty Metrics [0.05473229173811305]
Deep learning models have achieved remarkable performance in medical image segmentation. The need for explainability remains critical for ensuring their acceptance and integration in clinical practice. Our approach explored the use of contrast-level Shapley values to assess feature importance.
arXiv Detail & Related papers (2025-12-08T07:06:58Z)
- MIRNet: Integrating Constrained Graph-Based Reasoning with Pre-training for Diagnostic Medical Imaging [67.74482877175797]
MIRNet is a novel framework that integrates self-supervised pre-training with constrained graph-based reasoning. We introduce TongueAtlas-4K, a benchmark comprising 4,000 images annotated with 22 diagnostic labels.
arXiv Detail & Related papers (2025-11-13T06:30:41Z)
- An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection [55.35661671061754]
Tuberculosis remains a critical global health issue, particularly in resource-limited and remote areas. We propose a framework which enhances disease and symptom detection on chest X-rays by integrating two supervised heads and a self-supervised head. Our model achieves an accuracy of 98.85% for distinguishing between COVID-19, tuberculosis, and normal cases, and a macro-F1 score of 90.09% for multilabel symptom detection.
arXiv Detail & Related papers (2025-10-21T17:18:55Z)
- Self-Supervised Anatomical Consistency Learning for Vision-Grounded Medical Report Generation [61.350584471060756]
Vision-grounded medical report generation aims to produce clinically accurate descriptions of medical images. We propose Self-Supervised Anatomical Consistency Learning (SS-ACL) to align generated reports with corresponding anatomical regions. SS-ACL constructs a hierarchical anatomical graph inspired by the invariant top-down inclusion structure of human anatomy.
arXiv Detail & Related papers (2025-09-30T08:59:06Z)
- RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis [56.373297358647655]
Retrieval-Augmented Diagnosis (RAD) is a novel framework that injects external knowledge into multimodal models directly on downstream tasks. RAD operates through three key mechanisms: retrieval and refinement of disease-centered knowledge from multiple medical sources, a guideline-enhanced contrastive loss transformer, and a dual decoder.
arXiv Detail & Related papers (2025-09-24T10:36:14Z)
- Uncertainty-Driven Expert Control: Enhancing the Reliability of Medical Vision-Language Models [52.2001050216955]
Existing methods aim to enhance the performance of Medical Vision Language Model (MedVLM) by adjusting model structure, fine-tuning with high-quality data, or through preference fine-tuning. We propose an expert-in-the-loop framework named Expert-Controlled-Free Guidance (Expert-CFG) to align MedVLM with clinical expertise without additional training.
arXiv Detail & Related papers (2025-07-12T09:03:30Z)
- Vision-Language Models for Acute Tuberculosis Diagnosis: A Multimodal Approach Combining Imaging and Clinical Data [0.0]
This study introduces a Vision-Language Model (VLM) leveraging SIGLIP and Gemma-3b architectures for automated acute tuberculosis (TB) screening. The VLM combines visual data from chest X-rays with clinical context to generate detailed, context-aware diagnostic reports. Key acute TB pathologies, including consolidation, cavities, and nodules, were detected with high precision and recall.
arXiv Detail & Related papers (2025-03-17T14:08:35Z)
- Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z)
- Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty [52.03490691733464]
We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks.
By leveraging subjective logic theory, we explicitly model probability and uncertainty for the problem of medical image segmentation.
DEviS incorporates an uncertainty-aware filtering module, which utilizes the metric of uncertainty-calibrated error to filter reliable data.
arXiv Detail & Related papers (2023-01-01T05:02:46Z)
- Outlier-based Autism Detection using Longitudinal Structural MRI [6.311381904410801]
This paper proposes structural Magnetic Resonance Imaging (sMRI)-based Autism Spectrum Disorder diagnosis via an outlier detection approach.
A Generative Adversarial Network (GAN) is trained exclusively with sMRI scans of healthy subjects.
Experiments reveal that our ASD detection framework performs comparably to the state of the art with far less training data.
arXiv Detail & Related papers (2022-02-21T04:37:25Z)
- IA-GCN: Interpretable Attention based Graph Convolutional Network for Disease prediction [47.999621481852266]
We propose an interpretable graph learning-based model which interprets the clinical relevance of the input features towards the task.
In a clinical scenario, such a model can assist the clinical experts in better decision-making for diagnosis and treatment planning.
Our proposed model outperforms the compared methods, increasing average accuracy by 3.2% for Tadpole, 1.6% for UKBB Gender, and 2% for the UKBB Age prediction task.
arXiv Detail & Related papers (2021-03-29T13:04:02Z)
- Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
- Cross Chest Graph for Disease Diagnosis with Structural Relational Reasoning [2.7148274921314615]
Locating lesions is important in the computer-aided diagnosis of X-ray images.
General weakly-supervised methods have failed to consider the characteristics of X-ray images.
We propose the Cross-chest Graph (CCG), which improves the performance of automatic lesion detection.
arXiv Detail & Related papers (2021-01-22T08:24:04Z)
- Advancing diagnostic performance and clinical usability of neural networks via adversarial training and dual batch normalization [2.1699022621790736]
We let six radiologists rate the interpretability of saliency maps in datasets of X-rays, computed tomography, and magnetic resonance imaging scans.
We found that the accuracy of adversarially trained models was equal to standard models when sufficiently large datasets and dual batch norm training were used.
arXiv Detail & Related papers (2020-11-25T20:41:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.