Related papers: Pillar-0: A New Frontier for Radiology Foundation Models

Pillar-0: A New Frontier for Radiology Foundation Models

URL: http://arxiv.org/abs/2511.17803v1
Date: Fri, 21 Nov 2025 21:50:34 GMT
Title: Pillar-0: A New Frontier for Radiology Foundation Models
Authors: Kumar Krishna Agrawal, Longchao Liu, Long Lian, Michael Nercessian, Natalia Harguindeguy, Yufu Wu, Peter Mikhael, Gigin Lin, Lecia V. Sequist, Florian Fintelmann, Trevor Darrell, Yutong Bai, Maggie Chung, Adam Yala,
Abstract summary: We introduce Pillar-0, a radiology foundation model pretrained on 42,990 abdomen-pelvis CTs, 86,411 chest CTs, 14,348 head CTs, and 11,543 breast MRIs.<n>Pillar-0 achieves mean AUROCs of 86.4, 88.0, 90.1, and 82.9, outperforming MedGemma (Google), MedImageInsight (Microsoft), Lingshu (Alibaba), and Merlin (Stanford) by 7.8-15.8 AUROC points and ranking best in 87.2% (319/366) tasks.
Score: 41.640120966890954
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Radiology plays an integral role in modern medicine, yet rising imaging volumes have far outpaced workforce growth. Foundation models offer a path toward assisting with the full spectrum of radiology tasks, but existing medical models remain limited: they process volumetric CT and MRI as low-fidelity 2D slices, discard critical grayscale contrast information, and lack evaluation frameworks that reflect real clinical practice. We introduce Pillar-0, a radiology foundation model pretrained on 42,990 abdomen-pelvis CTs, 86,411 chest CTs, 14,348 head CTs, and 11,543 breast MRIs from a large academic center, together with RATE, a scalable framework that extracts structured labels for 366 radiologic findings with near-perfect accuracy using LLMs. Across internal test sets of 14,230 abdomen-pelvis CTs, 10,646 chest CTs, 4,906 head CTs, and 1,585 breast MRIs, Pillar-0 establishes a new performance frontier, achieving mean AUROCs of 86.4, 88.0, 90.1, and 82.9, outperforming MedGemma (Google), MedImageInsight (Microsoft), Lingshu (Alibaba), and Merlin (Stanford) by 7.8-15.8 AUROC points and ranking best in 87.2\% (319/366) tasks. Pillar-0 similarly outperforms all baselines in an external validation on the Stanford Abdominal CT dataset, including Merlin (82.2 vs 80.6 AUROC). Pillar-0 extends to tasks beyond its pretraining, such as long-horizon lung cancer risk prediction, where it improves upon the state-of-the-art Sybil by 3.0 C-index points on NLST, and generalizes with gains of 5.9 (MGH) and 1.9 (CGMH). In brain hemorrhage detection, Pillar-0 obtained a >95 AUROC when using only 1/20th of the data of the next most sample efficient baseline. Pillar-0 and RATE together provide an open, clinically rigorous foundation for building high-performance radiology systems, enabling applications that were previously infeasible due to computational, data, and evaluation constraints.

Related papers

Neural Discrete Representation Learning for Sparse-View CBCT Reconstruction: From Algorithm Design to Prospective Multicenter Clinical Evaluation [64.42236775544579]
Cone beam computed tomography (CBCT)-guided puncture has become an established approach for diagnosing and treating thoracic tumours.<n>DeepPriorCBCT is a three-stage deep learning framework that achieves diagnostic-grade reconstruction using only one-sixth of the conventional radiation dose.
arXiv Detail & Related papers (2025-11-30T12:45:02Z)
Evaluating Large Language Models for Zero-Shot Disease Labeling in CT Radiology Reports Across Organ Systems [1.1373722549440357]
We compare a rule-based algorithm (RBA), RadBERT, and three lightweight open-weight LLMs for multi-disease labeling of chest, abdomen, and pelvis CT reports.<n>Performance was evaluated using Cohen's Kappa and micro/macro-averaged F1 scores.
arXiv Detail & Related papers (2025-06-03T18:00:08Z)
Explainable Anatomy-Guided AI for Prostate MRI: Foundation Models and In Silico Clinical Trials for Virtual Biopsy-based Risk Assessment [3.5408411348831232]
We present a fully automated, anatomically guided deep learning pipeline for prostate cancer (PCa) risk stratification using routine MRI.<n>The pipeline integrates three key components: an nnU-Net module for segmenting the prostate gland and its zones on axial T2-weighted MRI; a classification module based on the DiceedPT Swin Transformer foundation model, fine-tuned on 3D patches with optional anatomical priors and clinical data; and a VAE-GAN framework for generating counterfactual heatmaps that localize decision-driving image regions.
arXiv Detail & Related papers (2025-05-23T14:40:09Z)
Towards a Holistic Framework for Multimodal Large Language Models in Three-dimensional Brain CT Report Generation [42.06416052431378]
2D radiology captioning is incompetent to reflect the real-world diagnostic challenge in the volumetric 3D anatomy. We collected an 18,885 text-scan pairs 3D-BrainCT dataset and applied clinical visual instruction tuning to train BrainGPT models to generate radiology-adherent 3D brain CT reports. Our work embodies a holistic framework that showcased the first-hand experience of curating a 3D brain CT dataset, fine-tuning anatomy-sensible language models, and proposing robust radiology evaluation metrics.
arXiv Detail & Related papers (2024-07-02T12:58:35Z)
Quantifying uncertainty in lung cancer segmentation with foundation models applied to mixed-domain datasets [6.712251433139412]
Medical image foundation models have shown the ability to segment organs and tumors with minimal fine-tuning.<n>These models are typically evaluated on task-specific in-distribution (ID) datasets.<n>We introduce a comprehensive set of computationally fast metrics to evaluate the performance of multiple foundation models trained with self-supervised learning (SSL)<n>SMIT produced the highest F1-score (LRAD: 0.60, 5Rater: 0.64) and lowest entropy (LRAD: 0.06, 5Rater: 0.12), indicating higher tumor detection rate and confident segmentations.
arXiv Detail & Related papers (2024-03-19T19:36:48Z)
Attention-based Saliency Maps Improve Interpretability of Pneumothorax Classification [52.77024349608834]
To investigate chest radiograph (CXR) classification performance of vision transformers (ViT) and interpretability of attention-based saliency. ViTs were fine-tuned for lung disease classification using four public data sets: CheXpert, Chest X-Ray 14, MIMIC CXR, and VinBigData. ViTs had comparable CXR classification AUCs compared with state-of-the-art CNNs.
arXiv Detail & Related papers (2023-03-03T12:05:41Z)
TotalSegmentator: robust segmentation of 104 anatomical structures in CT images [48.50994220135258]
We present a deep learning segmentation model for body CT images. The model can segment 104 anatomical structures relevant for use cases such as organ volumetry, disease characterization, and surgical or radiotherapy planning.
arXiv Detail & Related papers (2022-08-11T15:16:40Z)
Open-radiomics: A Collection of Standardized Datasets and a Technical Protocol for Reproducible Radiomics Machine Learning Pipelines [0.0]
We curated large-scale radiomics datasets based on three open-source datasets; BraTS 2020 for high-grade glioma (HGG) versus low-grade glioma (LGG) classification and survival analysis.<n>We applied our protocol to 369 brain tumor patients (76 LGG, 293 HGG)<n>Unlike binWidth and image normalization, tumor subregion and imaging sequence significantly affected performance of the models.
arXiv Detail & Related papers (2022-07-29T16:37:46Z)
A self-supervised learning strategy for postoperative brain cavity segmentation simulating resections [46.414990784180546]
Convolutional neural networks (CNNs) are the state-of-the-art image segmentation technique. CNNs require large annotated datasets for training. Self-supervised learning strategies can leverage unlabeled data for training.
arXiv Detail & Related papers (2021-05-24T12:27:06Z)
Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes [64.21642241351857]
We curated and analyzed a chest computed tomography (CT) data set of 36,316 volumes from 19,993 unique patients. We developed a rule-based method for automatically extracting abnormality labels from free-text radiology reports. We also developed a model for multi-organ, multi-disease classification of chest CT volumes.
arXiv Detail & Related papers (2020-02-12T00:59:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.