A Data-Efficient Pan-Tumor Foundation Model for Oncology CT Interpretation
- URL: http://arxiv.org/abs/2502.06171v1
- Date: Mon, 10 Feb 2025 05:45:03 GMT
- Title: A Data-Efficient Pan-Tumor Foundation Model for Oncology CT Interpretation
- Authors: Wenhui Lei, Hanyu Chen, Zitian Zhang, Luyang Luo, Qiong Xiao, Yannian Gu, Peng Gao, Yankai Jiang, Ci Wang, Guangtao Wu, Tongjia Xu, Yingjie Zhang, Xiaofan Zhang, Pranav Rajpurkar, Shaoting Zhang, Zhenning Wang
- Abstract summary: PASTA is a pan-tumor CT foundation model that achieves state-of-the-art performance on 45 of 46 representative oncology tasks. PASTA-Gen produces a comprehensive dataset of 30,000 CT scans with pixel-level annotated lesions and paired structured reports.
- Score: 17.993838581176902
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence-assisted imaging analysis has made substantial strides in tumor diagnosis and management. Here we present PASTA, a pan-tumor CT foundation model that achieves state-of-the-art performance on 45 of 46 representative oncology tasks -- including lesion segmentation, tumor detection in plain CT, tumor staging, survival prediction, structured report generation, and cross-modality transfer learning, significantly outperforming the second-best models on 35 tasks. This remarkable advancement is driven by our development of PASTA-Gen, an innovative synthetic tumor generation framework that produces a comprehensive dataset of 30,000 CT scans with pixel-level annotated lesions and paired structured reports, encompassing malignancies across ten organs and five benign lesion types. By leveraging this rich, high-quality synthetic data, we overcome a longstanding bottleneck in the development of CT foundation models -- specifically, the scarcity of publicly available, high-quality annotated datasets due to privacy constraints and the substantial labor required for scaling precise data annotation. Encouragingly, PASTA demonstrates exceptional data efficiency with promising practical value, markedly improving performance on various tasks with only a small amount of real-world data. The open release of both the synthetic dataset and PASTA foundation model effectively addresses the challenge of data scarcity, thereby advancing oncological research and clinical translation.
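To make the data-efficiency claim concrete, here is a minimal sketch (not the authors' released code) of the recipe the abstract describes: a segmentation model trained on a large synthetic corpus mixed with a small real-world cohort. The toy datasets, network, and tensor shapes are hypothetical stand-ins.

```python
# Sketch: fine-tuning on synthetic scans mixed with a small real cohort.
# All data and the network below are dummy placeholders.
import torch
from torch import nn
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Stand-ins for a PASTA-Gen-style synthetic corpus and a small real cohort.
synthetic = TensorDataset(torch.randn(64, 1, 32, 32, 32),
                          torch.randint(0, 2, (64, 1, 32, 32, 32)).float())
real = TensorDataset(torch.randn(8, 1, 32, 32, 32),
                     torch.randint(0, 2, (8, 1, 32, 32, 32)).float())

model = nn.Sequential(nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
                      nn.Conv3d(8, 1, 1))  # toy 3D segmentation head
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loader = DataLoader(ConcatDataset([synthetic, real]), batch_size=4, shuffle=True)

for vol, mask in loader:  # one epoch over the mixed corpus
    loss = nn.functional.binary_cross_entropy_with_logits(model(vol), mask)
    opt.zero_grad()
    loss.backward()
    opt.step()
```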
Related papers
- LymphAtlas: A Unified Multimodal Lymphoma Imaging Repository Delivering AI-Enhanced Diagnostic Insight [3.746123328463508]
This study integrates PET metabolic information with CT anatomical structures to establish a 3D multimodal segmentation dataset for lymphoma based on PET/CT examinations.
We retrospectively collected 483 examination datasets acquired between March 2011 and May 2024, involving 220 patients.
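A hedged sketch of the multimodal idea behind such a dataset: co-registered PET and CT volumes can be stacked as input channels of a 3D segmentation network. The tensors and one-layer network below are illustrative placeholders, not the repository's pipeline.

```python
# Sketch: early fusion of paired PET/CT volumes as input channels.
import torch
from torch import nn

ct = torch.randn(1, 1, 64, 64, 64)    # anatomical CT volume (dummy)
pet = torch.randn(1, 1, 64, 64, 64)   # co-registered PET volume (dummy)
x = torch.cat([ct, pet], dim=1)       # 2-channel multimodal input

net = nn.Conv3d(in_channels=2, out_channels=1, kernel_size=3, padding=1)
lesion_logits = net(x)                # per-voxel lymphoma logits
```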
arXiv Detail & Related papers (2025-04-29T06:10:12Z)
- FundusGAN: A Hierarchical Feature-Aware Generative Framework for High-Fidelity Fundus Image Generation [35.46876389599076]
FundusGAN is a novel hierarchical feature-aware generative framework specifically designed for high-fidelity fundus image synthesis.
We show that FundusGAN consistently outperforms state-of-the-art methods across multiple metrics.
arXiv Detail & Related papers (2025-03-22T18:08:07Z)
- ScaleMAI: Accelerating the Development of Trusted Datasets and AI Models [46.80682547774335]
We propose ScaleMAI, an agent of AI-integrated data curation and annotation.
First, ScaleMAI creates a dataset of 25,362 CT scans, including per-voxel annotations for benign/malignant tumors and 24 anatomical structures.
Second, through progressive human-in-the-loop iterations, ScaleMAI provides a Flagship AI Model that can approach the proficiency of expert annotators in detecting pancreatic tumors.
arXiv Detail & Related papers (2025-01-06T22:12:00Z)
- ONCOPILOT: A Promptable CT Foundation Model For Solid Tumor Evaluation [3.8763197858217935]
ONCOPILOT is an interactive radiological foundation model trained on approximately 7,500 CT scans covering the whole body.
It performs 3D tumor segmentation from visual prompts such as point clicks and bounding boxes, outperforming state-of-the-art models.
ONCOPILOT also accelerates measurement processes and reduces inter-reader variability.
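As an illustration of point-click prompting (a sketch in the spirit of promptable segmentation, not ONCOPILOT's actual architecture), a click can be rasterized into a Gaussian heatmap channel that conditions the network. The volume, click coordinates, and toy network below are assumptions.

```python
# Sketch: conditioning a 3D segmentation net on a point-click prompt.
import torch
from torch import nn

def point_heatmap(shape, point, sigma=2.0):
    """Rasterize a 3D click as a Gaussian heatmap channel."""
    zz, yy, xx = torch.meshgrid(
        *(torch.arange(s, dtype=torch.float32) for s in shape), indexing="ij")
    d2 = (zz - point[0])**2 + (yy - point[1])**2 + (xx - point[2])**2
    return torch.exp(-d2 / (2 * sigma**2))

ct = torch.randn(32, 64, 64)                      # dummy CT sub-volume
prompt = point_heatmap(ct.shape, point=(16, 30, 30))
x = torch.stack([ct, prompt]).unsqueeze(0)        # (1, 2, D, H, W)
net = nn.Conv3d(2, 1, kernel_size=3, padding=1)   # toy stand-in model
mask_logits = net(x)                              # mask conditioned on click
```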
arXiv Detail & Related papers (2024-10-10T13:36:49Z)
- 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z)
- AutoPET Challenge: Tumour Synthesis for Data Augmentation [26.236831356731017]
We adapt the DiffTumor method, originally designed for CT images, to generate synthetic PET-CT images with lesions.
Our approach trains the generative model on the AutoPET dataset and uses it to expand the training data.
Our findings show that the model trained on the augmented dataset achieves a higher Dice score, demonstrating the potential of our data augmentation approach.
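For reference, the Dice score used to compare the augmented and baseline models can be computed as below; the masks here are synthetic stand-ins, not AutoPET data.

```python
# Sketch: Dice similarity coefficient between two binary masks.
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-6) -> float:
    """2*|A∩B| / (|A|+|B|), higher is better."""
    inter = np.logical_and(pred, gt).sum()
    return float(2 * inter / (pred.sum() + gt.sum() + eps))

pred = np.zeros((64, 64, 64), bool); pred[20:40, 20:40, 20:40] = True
gt = np.zeros((64, 64, 64), bool); gt[22:42, 22:42, 22:42] = True
print(f"Dice = {dice(pred, gt):.3f}")
```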
arXiv Detail & Related papers (2024-09-12T14:23:19Z)
- Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study exhaustively evaluated the performance of Gemini, GPT-4, and four other popular large models across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z)
- Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework that generates chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
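A minimal sketch of this fusion idea, assuming the common recipe of projecting each modality to a shared width and concatenating before decoding; the dimensions and layers are hypothetical, not the paper's exact network.

```python
# Sketch: late fusion of image, structured, and free-text features.
import torch
from torch import nn

img_feat = torch.randn(1, 512)   # CNN features of the chest X-ray (dummy)
vitals = torch.randn(1, 8)       # structured data: HR, BP, SpO2, ... (dummy)
note_emb = torch.randn(1, 256)   # encoded free-text clinical note (dummy)

proj_img, proj_vit, proj_note = nn.Linear(512, 128), nn.Linear(8, 128), nn.Linear(256, 128)
fused = torch.cat([proj_img(img_feat), proj_vit(vitals), proj_note(note_emb)], dim=-1)
decoder_input = nn.Linear(384, 512)(fused)  # handed to the report decoder
```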
arXiv Detail & Related papers (2023-11-18T14:37:53Z)
- PathLDM: Text conditioned Latent Diffusion Model for Histopathology [62.970593674481414]
We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images.
Our approach fuses image and textual data to enhance the generation process.
We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
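As context for these numbers, an FID score can be computed with torchmetrics' FrechetInceptionDistance (assuming torchmetrics[image] and its torch-fidelity dependency are installed); the images below are random stand-ins, not TCGA-BRCA patches.

```python
# Sketch: computing FID between reference and generated image batches.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)
real_imgs = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)
fake_imgs = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)
fid.update(real_imgs, real=True)   # reference histopathology patches
fid.update(fake_imgs, real=False)  # generated patches
print(fid.compute())               # lower is better (PathLDM reports 7.64)
```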
arXiv Detail & Related papers (2023-09-01T22:08:32Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
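An illustrative take on the KNN-smoothing half of the method (KNNS as commonly understood, not necessarily the paper's exact formulation): each sample's predicted probabilities are averaged with those of its k nearest neighbors in feature space. The features and predictions below are random stand-ins.

```python
# Sketch: smoothing per-disease probabilities over feature-space neighbors.
import numpy as np

def knn_smooth(features: np.ndarray, probs: np.ndarray, k: int = 5) -> np.ndarray:
    """Average each row of `probs` over itself and its k nearest neighbors."""
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k + 1]   # self + k nearest neighbors
    return probs[idx].mean(axis=1)

feats = np.random.randn(100, 64)             # image embeddings (dummy)
probs = np.random.rand(100, 14)              # per-disease probabilities (dummy)
smoothed = knn_smooth(feats, probs)
```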
arXiv Detail & Related papers (2021-02-26T02:29:30Z)