ProtoMedX: Towards Explainable Multi-Modal Prototype Learning for Bone Health Classification
- URL: http://arxiv.org/abs/2509.14830v2
- Date: Thu, 09 Oct 2025 14:54:12 GMT
- Title: ProtoMedX: Towards Explainable Multi-Modal Prototype Learning for Bone Health Classification
- Authors: Alvaro Lopez Pellicer, Andre Mariucci, Plamen Angelov, Marwan Bukhari, Jemma G. Kerns,
- Abstract summary: ProtoMedX is a multi-modal model that uses both DEXA scans of the lumbar spine and patient records. It achieves 87.58% accuracy in vision-only tasks and 89.8% in its multi-modal variant, both surpassing existing published methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bone health studies are crucial in medical practice for the early detection and treatment of Osteopenia and Osteoporosis. Clinicians usually make a diagnosis based on densitometry (DEXA scans) and patient history. The application of AI in this field is an active area of research. Most successful methods rely on deep learning models that use vision alone (DEXA/X-ray imagery) and focus on prediction accuracy, while explainability is often disregarded and left to post hoc assessments of input contributions. We propose ProtoMedX, a multi-modal model that uses both DEXA scans of the lumbar spine and patient records. ProtoMedX's prototype-based architecture is explainable by design, which is crucial for medical applications, especially in the context of the upcoming EU AI Act, as it allows explicit analysis of model decisions, including incorrect ones. ProtoMedX demonstrates state-of-the-art performance in bone health classification while also providing explanations that can be visually understood by clinicians. Using a dataset of 4,160 real NHS patients, the proposed ProtoMedX achieves 87.58% accuracy in vision-only tasks and 89.8% in its multi-modal variant, both surpassing existing published methods.
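The paper does not reproduce its architecture here, but the core idea of prototype-based classification — assigning a sample to the class of its nearest learned prototype and exposing the per-class distances as the built-in explanation — can be sketched as follows. The joint embedding dimensions, class names, and random prototypes are illustrative assumptions, not ProtoMedX's actual design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint embeddings: image features fused with patient-record
# features (dimension chosen for illustration only).
EMB_DIM = 16
CLASSES = ["normal", "osteopenia", "osteoporosis"]

# One learned prototype per class; in a real model these are trained
# jointly with the encoder, here they are random stand-ins.
prototypes = {c: rng.normal(size=EMB_DIM) for c in CLASSES}

def classify(embedding: np.ndarray) -> tuple[str, dict]:
    """Assign the class of the nearest prototype and return the
    per-class distances, which serve as the explanation."""
    dists = {c: float(np.linalg.norm(embedding - p))
             for c, p in prototypes.items()}
    return min(dists, key=dists.get), dists

# A sample lying exactly on the "osteopenia" prototype is assigned that
# class with distance 0, which a clinician can verify directly.
label, dists = classify(prototypes["osteopenia"].copy())
print(label, dists["osteopenia"])
```

Because the decision is literally "nearest prototype wins", an incorrect prediction can be inspected by looking at which prototype the sample fell closest to, rather than by post hoc attribution.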
Related papers
- A multimodal vision foundation model for generalizable knee pathology [40.03838145472935]
Musculoskeletal disorders represent an urgent demand for precise interpretation of medical imaging. Current artificial intelligence approaches in orthopedics rely on task-specific, supervised learning paradigms. We introduce OrthoFoundation, a multimodal vision foundation model optimized for musculoskeletal pathology.
arXiv Detail & Related papers (2026-01-26T08:14:51Z) - X-Ray-CoT: Interpretable Chest X-ray Diagnosis with Vision-Language Models via Chain-of-Thought Reasoning [0.0]
We propose X-Ray-CoT (Chest X-Ray Chain-of-Thought), a novel framework for intelligent chest X-ray diagnosis and interpretable report generation. X-Ray-CoT simulates human radiologists' "chain-of-thought" by first extracting multi-modal features and visual concepts. It achieves competitive quantitative performance, with a Balanced Accuracy of 80.52% and F1 score of 78.65% for disease diagnosis.
arXiv Detail & Related papers (2025-08-17T18:00:41Z) - A Multi-Site Study on AI-Driven Pathology Detection and Osteoarthritis Grading from Knee X-Ray [0.0]
Bone health disorders like osteoarthritis and osteoporosis pose major global health challenges. This study presents an AI-powered system that analyzes knee X-rays to detect key pathologies. It also grades osteoarthritis severity, enabling timely, personalized treatment.
arXiv Detail & Related papers (2025-03-28T06:41:22Z) - MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis [28.421857904824627]
MiniGPT-Med is a vision-language model derived from large-scale language models and tailored for medical applications.
It is capable of performing tasks such as medical report generation, visual question answering (VQA), and disease identification within medical imagery.
It achieves state-of-the-art performance on medical report generation, surpassing the previous best model by 19% in accuracy.
arXiv Detail & Related papers (2024-07-04T18:21:10Z) - Key Patches Are All You Need: A Multiple Instance Learning Framework For Robust Medical Diagnosis [15.964609888720315]
We propose to limit the amount of information deep learning models use to reach the final classification, by using a multiple instance learning framework.
We evaluate our framework on two medical applications: skin cancer diagnosis using dermoscopy and breast cancer diagnosis using mammography.
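The multiple instance learning idea above — treating an image as a bag of patches and letting only the most informative patches drive the final label — can be sketched as follows. The mean-intensity patch scorer and max-pooling aggregation are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

rng = np.random.default_rng(1)

def score_patch(patch: np.ndarray) -> float:
    """Stand-in instance scorer; a real model would use a CNN head.
    Here, mean intensity acts as a hypothetical lesion score."""
    return float(patch.mean())

def classify_bag(image: np.ndarray, patch: int = 8, thresh: float = 0.7) -> bool:
    """Split the image into non-overlapping patches (instances) and
    label the bag positive if any instance exceeds the threshold."""
    h, w = image.shape
    scores = [score_patch(image[i:i + patch, j:j + patch])
              for i in range(0, h, patch)
              for j in range(0, w, patch)]
    return max(scores) > thresh  # max-pooling over instances

img = rng.uniform(0.0, 0.3, size=(32, 32))  # mostly low-intensity background
before = classify_bag(img)                  # no patch stands out
img[8:16, 8:16] = 0.95                      # one suspicious patch
after = classify_bag(img)                   # that single patch flips the bag
```

The key-patch framing limits how much of the image can influence the decision: flipping the label requires a specific, localizable instance, which also points the clinician at the responsible region.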
arXiv Detail & Related papers (2024-05-02T18:21:25Z) - Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding [72.18719355481052]
We introduce a novel task called Medical Report Grounding (MRG). MRG aims to directly identify diagnostic phrases and their corresponding grounding boxes from medical reports in an end-to-end manner. We propose uMedGround, a robust and reliable framework that leverages a multimodal large language model to predict diagnostic phrases.
arXiv Detail & Related papers (2024-04-10T07:41:35Z) - Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text [56.32427751440426]
In clinical practice such models must not only be accurate, but provide doctors with interpretable and helpful results.
We introduce ProtoPatient, a novel method based on prototypical networks and label-wise attention.
We evaluate the model on two publicly available clinical datasets and show that it outperforms existing baselines.
arXiv Detail & Related papers (2022-10-16T10:12:07Z) - Explainable Deep Learning Methods in Medical Image Classification: A Survey [0.0]
State-of-the-art deep learning models have achieved human-level accuracy on the classification of different types of medical data.
These models are rarely adopted in clinical practice, mainly due to their lack of interpretability.
The black-box-ness of deep learning models has raised the need for devising strategies to explain the decision process of these models.
arXiv Detail & Related papers (2022-05-10T09:28:14Z) - MIMO: Mutual Integration of Patient Journey and Medical Ontology for Healthcare Representation Learning [49.57261599776167]
We propose an end-to-end robust Transformer-based solution, Mutual Integration of patient journey and Medical Ontology (MIMO) for healthcare representation learning and predictive analytics.
arXiv Detail & Related papers (2021-07-20T07:04:52Z) - Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
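The K-nearest neighbor smoothing step described in the entry above can be sketched as averaging each sample's predicted class probabilities with those of its K nearest neighbors in feature space. The feature and probability shapes below are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def knn_smooth(features: np.ndarray, probs: np.ndarray, k: int = 3) -> np.ndarray:
    """Replace each sample's class probabilities with the average over
    itself and its k nearest neighbors in feature space."""
    # Pairwise squared Euclidean distances between feature vectors.
    d = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    # Indices of the k+1 closest samples (each sample is its own nearest).
    idx = np.argsort(d, axis=1)[:, : k + 1]
    # Average the probability vectors of each neighborhood.
    return probs[idx].mean(axis=1)

rng = np.random.default_rng(2)
features = rng.normal(size=(10, 4))          # hypothetical feature vectors
probs = rng.dirichlet(np.ones(3), size=10)   # per-sample class probabilities
smoothed = knn_smooth(features, probs, k=3)  # still valid distributions
```

Averaging valid probability vectors keeps each smoothed row a valid distribution, while pulling noisy single-sample predictions toward their local neighborhood consensus.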
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.