A Synthetic Data-Driven Radiology Foundation Model for Pan-tumor Clinical Diagnosis
- URL: http://arxiv.org/abs/2502.06171v2
- Date: Mon, 20 Oct 2025 16:48:26 GMT
- Title: A Synthetic Data-Driven Radiology Foundation Model for Pan-tumor Clinical Diagnosis
- Authors: Wenhui Lei, Hanyu Chen, Zitian Zhang, Luyang Luo, Qiong Xiao, Yannian Gu, Peng Gao, Yankai Jiang, Ci Wang, Guangtao Wu, Tongjia Xu, Yingjie Zhang, Pranav Rajpurkar, Xiaofan Zhang, Shaoting Zhang, Zhenning Wang,
- Abstract summary: PASTA is a pan-tumor radiology foundation model built on PASTA-Gen.<n> PASTA achieves state-of-the-art performance on 45 of 46 oncology tasks.
- Score: 21.212976987658415
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI-assisted imaging made substantial advances in tumor diagnosis and management. However, a major barrier to developing robust oncology foundation models is the scarcity of large-scale, high-quality annotated datasets, which are limited by privacy restrictions and the high cost of manual labeling. To address this gap, we present PASTA, a pan-tumor radiology foundation model built on PASTA-Gen, a synthetic data framework that generated 30,000 3D CT scans with pixel-level lesion masks and structured reports of tumors across ten organ systems. Leveraging this resource, PASTA achieves state-of-the-art performance on 45 of 46 oncology tasks, including non-contrast CT tumor screening, lesion segmentation, structured reporting, tumor staging, survival prediction, and MRI-modality transfer. To assess clinical applicability, we developed PASTA-AID, a clinical decision support system, and ran a retrospective simulated clinical trial across two scenarios. For pan-tumor screening on plain CT with fixed reading time, PASTA-AID increased radiologists' throughput by 11.1-25.1% and improved sensitivity by 17.0-31.4% and precision by 10.5-24.9%; additionally, in a diagnosis-aid workflow, it reduced segmentation time by up to 78.2% and reporting time by up to 36.5%. Beyond gains in accuracy and efficiency, PASTA-AID narrowed the expertise gap, enabling less-experienced radiologists to approach expert-level performance. Together, this work establishes an end-to-end, synthetic data-driven pipeline spanning data generation, model development, and clinical validation, thereby demonstrating substantial potential for pan-tumor research and clinical translation.
Related papers
- A DeepSeek-Powered AI System for Automated Chest Radiograph Interpretation in Clinical Practice [83.11942224668127]
Janus-Pro-CXR (1B) is a chest X-ray interpretation system based on DeepSeek Janus-Pro model.<n>Our system outperforms state-of-the-art X-ray report generation models in automated report generation.
arXiv Detail & Related papers (2025-12-23T13:26:13Z) - An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection [55.35661671061754]
Tuberculosis remains a critical global health issue, particularly in resource-limited and remote areas.<n>We propose a framework which enhances disease and symptom detection on chest X-rays by integrating two supervised heads and a self-supervised head.<n>Our model achieves an accuracy of 98.85% for distinguishing between COVID-19, tuberculosis, and normal cases, and a macro-F1 score of 90.09% for multilabel symptom detection.
arXiv Detail & Related papers (2025-10-21T17:18:55Z) - A Disease-Centric Vision-Language Foundation Model for Precision Oncology in Kidney Cancer [54.58205672910646]
RenalCLIP is a visual-language foundation model for characterization, diagnosis and prognosis of renal mass.<n>It achieved better performance and superior generalizability across 10 core tasks spanning the full clinical workflow of kidney cancer.
arXiv Detail & Related papers (2025-08-22T17:48:19Z) - HistoViT: Vision Transformer for Accurate and Scalable Histopathological Cancer Diagnosis [1.5939351525664014]
We propose a transformer-based deep learning framework for multi-class tumor classification.<n>Our method addresses key limitations of conventional convolutional neural networks.<n>Our approach classification achieves accuracies of 99.32%, 96.92%, 95.28%, and 96.94% for breast, prostate, bone, and cervical cancers respectively.
arXiv Detail & Related papers (2025-08-15T03:10:52Z) - CT Radiomics-Based Explainable Machine Learning Model for Accurate Differentiation of Malignant and Benign Endometrial Tumors: A Two-Center Study [11.010138196822888]
We developed a CT radiomics-based explainable machine learning model for precise diagnosing malignancy and benignity in endometrial cancer patients.<n>The diagnostic performance of the model was evaluated by using sensitivity, specificity, accuracy, precision, F1 score, AUROC, and AUPRC.<n>The Random Forest model emerged as the optimal choice for diagnosing EC, with a training AUROC of 1.00 and a testing AUROC of 0.96.
arXiv Detail & Related papers (2025-06-22T17:31:06Z) - RadFabric: Agentic AI System with Reasoning Capability for Radiology [61.25593938175618]
RadFabric is a multi agent, multimodal reasoning framework that unifies visual and textual analysis for comprehensive CXR interpretation.<n>System employs specialized CXR agents for pathology detection, an Anatomical Interpretation Agent to map visual findings to precise anatomical structures, and a Reasoning Agent powered by large multimodal reasoning models to synthesize visual, anatomical, and clinical data into transparent and evidence based diagnoses.
arXiv Detail & Related papers (2025-06-17T03:10:33Z) - PathBench: A comprehensive comparison benchmark for pathology foundation models towards precision oncology [33.51485504161335]
We present PathBench, the first comprehensive benchmark for pathology foundation models (PFMs)<n>Our framework incorporates large-scale data, enabling objective comparison of PFMs.<n>We have collected 15,888 WSIs from 8,549 patients across 10 hospitals, encompassing over 64 diagnosis and prognosis tasks.
arXiv Detail & Related papers (2025-05-26T16:42:22Z) - LymphAtlas- A Unified Multimodal Lymphoma Imaging Repository Delivering AI-Enhanced Diagnostic Insight [3.746123328463508]
This study integrates PET metabolic information with CT anatomical structures to establish a 3D multimodal segmentation dataset for lymphoma based on PET/CT examinations.
We retrospectively collected 483 examination datasets acquired between March 2011 and May 2024, involving 220 patients.
arXiv Detail & Related papers (2025-04-29T06:10:12Z) - FundusGAN: A Hierarchical Feature-Aware Generative Framework for High-Fidelity Fundus Image Generation [35.46876389599076]
FundusGAN is a novel hierarchical feature-aware generative framework specifically designed for high-fidelity fundus image synthesis.
We show that FundusGAN consistently outperforms state-of-the-art methods across multiple metrics.
arXiv Detail & Related papers (2025-03-22T18:08:07Z) - ScaleMAI: Accelerating the Development of Trusted Datasets and AI Models [46.80682547774335]
We propose ScaleMAI, an agent of AI-integrated data curation and annotation.<n>First, ScaleMAI creates a dataset of 25,362 CT scans, including per-voxel annotations for benign/malignant tumors and 24 anatomical structures.<n>Second, through progressive human-in-the-loop iterations, ScaleMAI provides Flagship AI Model that can approach the proficiency of expert annotators in detecting pancreatic tumors.
arXiv Detail & Related papers (2025-01-06T22:12:00Z) - ONCOPILOT: A Promptable CT Foundation Model For Solid Tumor Evaluation [3.8763197858217935]
ONCOPILOT is an interactive radiological foundation model trained on approximately 7,500 CT scans covering the whole body.
It performs 3D tumor segmentation using visual prompts like point-click and bounding boxes, outperforming state-of-the-art models.
ONCOPILOT also accelerates measurement processes and reduces inter-reader variability.
arXiv Detail & Related papers (2024-10-10T13:36:49Z) - 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z) - AutoPET Challenge: Tumour Synthesis for Data Augmentation [26.236831356731017]
We adapt the DiffTumor method, originally designed for CT images, to generate synthetic PET-CT images with lesions.
Our approach trains the generative model on the AutoPET dataset and uses it to expand the training data.
Our findings show that the model trained on the augmented dataset achieves a higher Dice score, demonstrating the potential of our data augmentation approach.
arXiv Detail & Related papers (2024-09-12T14:23:19Z) - MGH Radiology Llama: A Llama 3 70B Model for Radiology [50.42811030970618]
This paper presents an advanced radiology-focused large language model: MGH Radiology Llama.<n>It is developed using the Llama 3 70B model, building upon previous domain-specific models like Radiology-GPT and Radiology-Llama2.<n>Our evaluation, incorporating both traditional metrics and a GPT-4-based assessment, highlights the enhanced performance of this work over general-purpose LLMs.
arXiv Detail & Related papers (2024-08-13T01:30:03Z) - Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study evaluated the performance of the Gemini, GPT-4, and 4 popular large models for an exhaustive evaluation across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z) - Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report
Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-rays reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z) - PathLDM: Text conditioned Latent Diffusion Model for Histopathology [62.970593674481414]
We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images.
Our approach fuses image and textual data to enhance the generation process.
We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
arXiv Detail & Related papers (2023-09-01T22:08:32Z) - Variational Knowledge Distillation for Disease Classification in Chest
X-Rays [102.04931207504173]
We propose itvariational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Harvesting, Detecting, and Characterizing Liver Lesions from Large-scale
Multi-phase CT Data via Deep Dynamic Texture Learning [24.633802585888812]
We propose a fully-automated and multi-stage liver tumor characterization framework for dynamic contrast computed tomography (CT)
Our system comprises four sequential processes of tumor proposal detection, tumor harvesting, primary tumor site selection, and deep texture-based tumor characterization.
arXiv Detail & Related papers (2020-06-28T19:55:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.