MedSAE: Dissecting MedCLIP Representations with Sparse Autoencoders
- URL: http://arxiv.org/abs/2510.26411v1
- Date: Thu, 30 Oct 2025 11:58:36 GMT
- Title: MedSAE: Dissecting MedCLIP Representations with Sparse Autoencoders
- Authors: Riccardo Renzulli, Colas Lepoutre, Enrico Cassano, Marco Grangetto
- Abstract summary: We advance mechanistic interpretability in medical vision by applying Medical Sparse Autoencoders (MedSAEs) to the latent space of MedCLIP. Our findings bridge high-performing medical AI and transparency, offering a scalable step toward clinically reliable representations.
- Score: 5.681456272022905
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence in healthcare requires models that are accurate and interpretable. We advance mechanistic interpretability in medical vision by applying Medical Sparse Autoencoders (MedSAEs) to the latent space of MedCLIP, a vision-language model trained on chest radiographs and reports. To quantify interpretability, we propose an evaluation framework that combines correlation metrics, entropy analyses, and automated neuron naming via the MedGEMMA foundation model. Experiments on the CheXpert dataset show that MedSAE neurons achieve higher monosemanticity and interpretability than raw MedCLIP features. Our findings bridge high-performing medical AI and transparency, offering a scalable step toward clinically reliable representations.
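The pipeline the abstract describes, encoding model embeddings with an overcomplete sparse autoencoder and then scoring each neuron's class selectivity with an entropy measure, can be sketched as follows. This is a minimal illustration with randomly initialised weights and toy data, not the authors' implementation; the function names (`sae_forward`, `label_entropy`) and the exact entropy formulation are assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 512-d "image embeddings" for 100 samples with 4 class labels.
d_model, d_sae, n = 512, 2048, 100
embeddings = rng.normal(size=(n, d_model))
labels = rng.integers(0, 4, size=n)

# Randomly initialised SAE weights. In practice these would be trained with
# a reconstruction loss plus an L1 sparsity penalty on the codes.
W_enc = rng.normal(scale=d_model ** -0.5, size=(d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = W_enc.T.copy()
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode into an overcomplete, nonnegative (sparse) code, then reconstruct."""
    codes = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU keeps codes sparse
    recon = codes @ W_dec + b_dec
    return codes, recon

codes, recon = sae_forward(embeddings)

def label_entropy(acts, labels, n_classes=4, eps=1e-9):
    """Entropy of one neuron's activation mass over class labels.
    Low entropy means the neuron fires mostly for one class (monosemantic)."""
    mass = np.array([acts[labels == c].sum() for c in range(n_classes)])
    p = mass / (mass.sum() + eps)
    return -(p * np.log(p + eps)).sum()

# Score the most active SAE neuron; H ranges from 0 (one class) to log(4).
top = codes.sum(axis=0).argmax()
H = label_entropy(codes[:, top], labels)
```

A monosemanticity comparison of the kind the paper reports would compute this score for every SAE neuron and every raw MedCLIP dimension, then compare the two distributions.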
Related papers
- MedProbCLIP: Probabilistic Adaptation of Vision-Language Foundation Model for Reliable Radiograph-Report Retrieval [3.7054279251399507]
This work introduces MedProbCLIP, a probabilistic vision-language learning framework for chest X-ray and radiology report representation learning and bidirectional retrieval. The framework employs multi-view radiograph encoding and multi-section report encoding during training to provide fine-grained supervision for clinically aligned correspondence. It outperforms deterministic and probabilistic baselines, including CLIP, CXR-CLIP, and PCME++, in both retrieval and zero-shot classification.
arXiv Detail & Related papers (2026-02-17T21:20:32Z)
- MedCLM: Learning to Localize and Reason via a CoT-Curriculum in Medical Vision-Language Models [0.11666234644810893]
We introduce MedCLM, an automated pipeline that converts detection datasets into large-scale medical visual question answering (VQA) data with Chain-of-Thought (CoT) reasoning. We propose an Integrated CoT-Curriculum Strategy composed of an Easy stage with explicit lesion boxes for visual grounding, a Medium stage that encourages implicit localization, and a Hard stage for weakly supervised reasoning.
arXiv Detail & Related papers (2025-10-06T04:26:39Z)
- Interpretable Clinical Classification with Kolmogorov-Arnold Networks [70.72819760172744]
Kolmogorov-Arnold Networks (KANs) offer intrinsic interpretability through transparent, symbolic representations. KANs support built-in patient-level insights, intuitive visualizations, and nearest-patient retrieval. These results position KANs as a promising step toward trustworthy AI that clinicians can understand, audit, and act upon.
arXiv Detail & Related papers (2025-09-20T17:21:58Z)
- MedGemma Technical Report [75.88152277443179]
We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B. MedGemma demonstrates advanced medical understanding and reasoning on images and text. We additionally introduce MedSigLIP, a medically-tuned vision encoder derived from SigLIP.
arXiv Detail & Related papers (2025-07-07T17:01:44Z)
- Towards Multi-dimensional Explanation Alignment for Medical Classification [16.799101204390457]
We propose a novel framework called Med-MICN (Medical Multi-dimensional Interpretable Concept Network).
Med-MICN provides interpretability alignment for various angles, including neural symbolic reasoning, concept semantics, and saliency maps.
Its advantages include high prediction accuracy, interpretability across multiple dimensions, and automation through an end-to-end concept labeling process.
arXiv Detail & Related papers (2024-10-28T20:03:19Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- MRC-based Nested Medical NER with Co-prediction and Adaptive Pre-training [0.38498367961730184]
We propose a medical NER model based on Machine Reading Comprehension (MRC), which uses a task-adaptive pre-training strategy to improve the model's capability in the medical field.
Our proposed model outperforms the compared state-of-the-art (SOTA) models.
arXiv Detail & Related papers (2024-03-23T11:14:02Z)
- MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder [26.830574964308962]
We introduce MedFLIP, a Fast Language-Image Pre-training method for Medical analysis.
We explore MAEs for zero-shot learning with crossed domains, which enhances the model's ability to learn from limited data.
Lastly, we validate that using language improves zero-shot performance for medical image analysis.
arXiv Detail & Related papers (2024-03-07T16:11:43Z)
- AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between a Doctor as player and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
- XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z)
- MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation [110.31526448744096]
We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data.
We are building MedPerf, an open framework for benchmarking machine learning in the medical domain.
arXiv Detail & Related papers (2021-09-29T18:09:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.