Multi Anatomy X-Ray Foundation Model
- URL: http://arxiv.org/abs/2509.12146v1
- Date: Mon, 15 Sep 2025 17:12:26 GMT
- Title: Multi Anatomy X-Ray Foundation Model
- Authors: Nishank Singla, Krisztian Koos, Farzin Haddadpour, Amin Honarmandi Shandiz, Lovish Chum, Xiaojian Xu, Qing Jin, Erhan Bas,
- Abstract summary: We introduce XR-0, a multi-anatomy X-ray foundation model trained with self-supervised learning. XR-0 achieves state-of-the-art performance on most multi-anatomy tasks and remains competitive on chest-specific benchmarks.
- Score: 7.079609136804425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: X-ray imaging is ubiquitous in radiology, yet most existing AI foundation models are limited to chest anatomy and fail to generalize across broader clinical tasks. In this work, we introduce XR-0, a multi-anatomy X-ray foundation model trained with self-supervised learning on a large, private dataset of 1.15 million images spanning diverse anatomical regions, and evaluate it across 12 datasets and 20 downstream tasks, including classification, retrieval, segmentation, localization, visual grounding, and report generation. XR-0 achieves state-of-the-art performance on most multi-anatomy tasks and remains competitive on chest-specific benchmarks. Our results demonstrate that anatomical diversity and supervision are critical for building robust, general-purpose medical vision models, paving the way for scalable and adaptable AI systems in radiology.
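The abstract does not detail XR-0's architecture or release code, so the following is only a minimal sketch of the standard way a frozen self-supervised encoder is evaluated on a downstream classification task: extract embeddings and fit a linear probe. A generic timm ViT stands in for XR-0, and the dataset variables are placeholders.

```python
# Hypothetical linear-probe evaluation of a frozen self-supervised encoder.
# A generic timm ViT stands in for XR-0, whose weights and API are not public here.
import numpy as np
import timm
import torch
from sklearn.linear_model import LogisticRegression

device = "cuda" if torch.cuda.is_available() else "cpu"
encoder = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=0)
encoder.eval().to(device)

@torch.no_grad()
def embed(images: torch.Tensor) -> np.ndarray:
    """Return pooled embeddings for a batch of (B, 3, 224, 224) images."""
    return encoder(images.to(device)).cpu().numpy()

# train_images / train_labels would come from a downstream X-ray dataset.
train_images = torch.randn(32, 3, 224, 224)   # placeholder batch
train_labels = np.array([0, 1] * 16)          # placeholder binary labels

probe = LogisticRegression(max_iter=1000)
probe.fit(embed(train_images), train_labels)  # frozen features, trainable head
```

Retrieval and other downstream tasks can reuse the same frozen embeddings, which is what makes a single foundation encoder attractive across anatomies.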
Related papers
- A generalizable large-scale foundation model for musculoskeletal radiographs [6.440881664328117]
We present SKELEX, a large-scale foundation model for musculoskeletal radiographs trained using self-supervised learning. The model was evaluated on 12 downstream diagnostic tasks and generally outperformed baselines in fracture detection, osteoarthritis grading, and bone tumor classification. We developed an interpretable, region-guided model for predicting bone tumors, which maintained robust performance on independent external datasets.
arXiv Detail & Related papers (2026-02-03T04:04:45Z) - MORE: Multi-Organ Medical Image REconstruction Dataset [27.136259882514864]
We introduce the Multi-Organ medical image REconstruction dataset, comprising CT scans across 9 diverse anatomies with 15 lesion types. This dataset serves two key purposes: (1) enabling robust training of deep learning models on extensive, heterogeneous data, and (2) facilitating rigorous evaluation of model generalization for CT reconstruction.
arXiv Detail & Related papers (2025-10-30T17:49:49Z) - A Fully Open and Generalizable Foundation Model for Ultrasound Clinical Applications [77.3888788549565]
We present EchoCare, a novel ultrasound foundation model for generalist clinical use. We developed EchoCare via self-supervised learning on our curated, publicly available, large-scale dataset EchoCareData. With minimal training, EchoCare outperforms state-of-the-art comparison models across 10 representative ultrasound benchmarks.
arXiv Detail & Related papers (2025-09-15T10:05:31Z) - CADS: A Comprehensive Anatomical Dataset and Segmentation for Whole-Body Anatomy in Computed Tomography [27.1055374364626]
We present CADS, an open-source framework that prioritizes the systematic integration, standardization, and labeling of heterogeneous data sources for whole-body CT segmentation. At its core is a large-scale dataset of 22,022 CT volumes with complete annotations for 167 anatomical structures. Through comprehensive evaluation across 18 public datasets and an independent real-world hospital cohort, we demonstrate advantages over state-of-the-art approaches.
arXiv Detail & Related papers (2025-07-29T19:58:32Z) - RadFabric: Agentic AI System with Reasoning Capability for Radiology [61.25593938175618]
RadFabric is a multi-agent, multimodal reasoning framework that unifies visual and textual analysis for comprehensive CXR interpretation. The system employs specialized CXR agents for pathology detection, an Anatomical Interpretation Agent to map visual findings to precise anatomical structures, and a Reasoning Agent powered by large multimodal reasoning models to synthesize visual, anatomical, and clinical data into transparent and evidence-based diagnoses.
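RadFabric's agent interfaces are not specified in the summary above; the sketch below only illustrates the general pattern of chaining specialized agents (pathology detection, anatomical mapping, reasoning), and every class and function in it is invented for illustration.

```python
# Illustrative agent chaining only; RadFabric's actual interfaces are not public here.
from dataclasses import dataclass

@dataclass
class Finding:
    pathology: str
    confidence: float
    region: str = "unknown"

def pathology_agent(image_path: str) -> list[Finding]:
    # Placeholder: a real agent would run a CXR classifier or detector on the image.
    return [Finding("pleural effusion", 0.91)]

def anatomy_agent(findings: list[Finding]) -> list[Finding]:
    # Placeholder: map each finding to a precise anatomical structure.
    return [Finding(f.pathology, f.confidence, region="right costophrenic angle") for f in findings]

def reasoning_agent(findings: list[Finding]) -> str:
    # Placeholder: a real agent would call a large multimodal reasoning model.
    lines = [f"{f.pathology} ({f.confidence:.0%}) at {f.region}" for f in findings]
    return "Evidence-based impression:\n" + "\n".join(lines)

print(reasoning_agent(anatomy_agent(pathology_agent("chest_xray.png"))))
```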
arXiv Detail & Related papers (2025-06-17T03:10:33Z) - X-GRM: Large Gaussian Reconstruction Model for Sparse-view X-rays to Computed Tomography [89.84588038174721]
Computed Tomography serves as an indispensable tool in clinical practice, providing non-invasive visualization of internal anatomical structures. Existing CT reconstruction works are limited to small-capacity model architectures and inflexible volume representations. We present X-GRM, a large feedforward model for reconstructing 3D CT volumes from sparse-view 2D X-ray projections.
arXiv Detail & Related papers (2025-05-21T08:14:10Z) - A Continual Learning-driven Model for Accurate and Generalizable Segmentation of Clinically Comprehensive and Fine-grained Whole-body Anatomies in CT [67.34586036959793]
There is no fully annotated CT dataset with all anatomies delineated for training. We propose a novel continual learning-driven CT model that can segment complete anatomies. Our single unified CT segmentation model, CL-Net, segments a clinically comprehensive set of 235 fine-grained whole-body anatomies with high accuracy.
arXiv Detail & Related papers (2025-03-16T23:55:02Z) - Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning [33.9544297423474]
We present RayDINO, a large visual encoder trained by self-supervision on 873k chest X-rays.
We compare RayDINO to previous state-of-the-art models across nine radiology tasks, from classification and dense segmentation to text generation.
Our findings suggest that self-supervision enables patient-centric AI that proves useful in clinical workflows and interprets X-rays holistically.
arXiv Detail & Related papers (2024-05-02T16:59:10Z) - FluoroSAM: A Language-promptable Foundation Model for Flexible X-ray Image Segmentation [11.55858990545478]
FluoroSAM is a language-promptable variant of the Segment Anything Model. It is capable of segmenting myriad anatomical structures and tools based on natural language prompts. We show how FluoroSAM is a key enabler for rich human-machine interaction in the X-ray image acquisition and analysis context.
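FluoroSAM's actual architecture and prompt encoder are not given here, so the toy module below is only meant to show the idea of language-promptable segmentation: a text-prompt embedding conditions a mask decoder over image features. All layer sizes and names are assumptions.

```python
# Toy illustration of text-prompted segmentation; FluoroSAM's real architecture and API differ.
import torch
import torch.nn as nn

class ToyPromptableSegmenter(nn.Module):
    """Minimal stand-in: fuse an image embedding with a text-prompt embedding to predict a mask."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.image_encoder = nn.Conv2d(1, dim, kernel_size=16, stride=16)  # patchify grayscale X-ray
        self.text_proj = nn.Linear(dim, dim)                               # project a prompt embedding
        self.mask_head = nn.Conv2d(dim, 1, kernel_size=1)

    def forward(self, xray: torch.Tensor, prompt_emb: torch.Tensor) -> torch.Tensor:
        feats = self.image_encoder(xray)                                   # (B, dim, H/16, W/16)
        cond = self.text_proj(prompt_emb)[:, :, None, None]                # broadcast over spatial dims
        logits = self.mask_head(feats * cond)                              # prompt-conditioned mask logits
        return torch.sigmoid(nn.functional.interpolate(logits, size=xray.shape[-2:], mode="bilinear"))

model = ToyPromptableSegmenter()
xray = torch.randn(1, 1, 512, 512)   # placeholder fluoroscopy frame
prompt_emb = torch.randn(1, 256)     # placeholder embedding for a prompt like "left femur"
mask = model(xray, prompt_emb)       # (1, 1, 512, 512) soft mask
```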
arXiv Detail & Related papers (2024-03-12T20:11:38Z) - Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
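CheXprompt's exact prompt and scoring rubric are not reproduced in this summary; the snippet below is a generic LLM-as-judge sketch of the same idea, grading a generated report against a reference with a GPT-4-class model. The prompt text and the factuality_score helper are hypothetical.

```python
# Generic LLM-as-judge sketch; the actual CheXprompt prompt and protocol are not reproduced here.
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

def factuality_score(generated_report: str, reference_report: str) -> str:
    """Ask a GPT-4-class model to enumerate factual errors in a draft report versus the reference."""
    prompt = (
        "You are grading a draft chest X-ray report against a reference report.\n"
        f"Reference:\n{reference_report}\n\nDraft:\n{generated_report}\n\n"
        "List each factual error in the draft, then output the total error count on the last line."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

print(factuality_score("No acute cardiopulmonary process.", "Small right pleural effusion."))
```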
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - MUSCLE: Multi-task Self-supervised Continual Learning to Pre-train Deep Models for X-ray Images of Multiple Body Parts [63.30352394004674]
Multi-task Self-supervised Continual Learning (MUSCLE) is a novel self-supervised pre-training pipeline for medical imaging tasks.
MUSCLE aggregates X-rays collected from multiple body parts for representation learning, and adopts a well-designed continual learning procedure.
We evaluate MUSCLE using 9 real-world X-ray datasets with various tasks, including pneumonia classification, skeletal abnormality classification, lung segmentation, and tuberculosis (TB) detection.
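MUSCLE's pre-training objectives and continual-learning schedule are not detailed above, so the loop below is only a schematic of sequential (continual) self-supervised pretraining across body-part datasets, using a toy denoising objective; the encoder, decoder, and data are placeholders.

```python
# Schematic continual pretraining loop; MUSCLE's actual objectives and rehearsal strategy are not reproduced here.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten())
decoder = nn.Linear(32, 64 * 64)  # toy reconstruction head for a 64x64 crop
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)

# One placeholder batch per body part; a real pipeline would iterate full DataLoaders.
body_part_batches = {
    "chest": torch.randn(8, 1, 64, 64),
    "hand": torch.randn(8, 1, 64, 64),
    "knee": torch.randn(8, 1, 64, 64),
}

for body_part, batch in body_part_batches.items():  # sequential (continual) stages
    noisy = batch + 0.1 * torch.randn_like(batch)    # toy self-supervised task: denoising reconstruction
    recon = decoder(encoder(noisy)).view(batch.shape)
    loss = nn.functional.mse_loss(recon, batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"{body_part}: loss={loss.item():.4f}")
```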
arXiv Detail & Related papers (2023-10-03T12:19:19Z) - XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models [72.8965643836841]
We introduce XrayGPT, a novel conversational medical vision-language model. It can analyze and answer open-ended questions about chest radiographs. We generate 217k interactive and high-quality summaries from free-text radiology reports.
arXiv Detail & Related papers (2023-06-13T17:59:59Z) - Self adaptive global-local feature enhancement for radiology report generation [10.958641951927817]
We propose a novel framework, AGFNet, to dynamically fuse global and anatomy-region features to generate multi-grained radiology reports.
Firstly, we extract important anatomy-region features and global features from the input chest X-ray (CXR).
Then, with the region features and the global features as input, our proposed self-adaptive fusion gate module dynamically fuses multi-granularity information.
Finally, the captioning generator produces the radiology report from the multi-granularity features.
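The summary does not give AGFNet's exact gate formulation, so the module below is a minimal sketch of one common way to fuse global and anatomy-region features with a learned, input-dependent gate; the dimensions and names are assumptions.

```python
# Minimal gated-fusion sketch; AGFNet's exact gate design is not reproduced here.
import torch
import torch.nn as nn

class FusionGate(nn.Module):
    """Blend global-image and anatomy-region features with a learned, input-dependent gate."""
    def __init__(self, dim: int = 512):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, global_feat: torch.Tensor, region_feat: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([global_feat, region_feat], dim=-1))  # per-dimension weight in [0, 1]
        return g * global_feat + (1 - g) * region_feat                # gated mixture feeds the report decoder

fuse = FusionGate()
global_feat = torch.randn(4, 512)       # e.g., pooled CXR backbone features
region_feat = torch.randn(4, 512)       # e.g., pooled anatomy-region features
fused = fuse(global_feat, region_feat)  # (4, 512), passed on to the captioning generator
```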
arXiv Detail & Related papers (2022-11-21T11:50:42Z)