GuideGen: A Text-guided Framework for Joint CT Volume and Anatomical
structure Generation
- URL: http://arxiv.org/abs/2403.07247v1
- Date: Tue, 12 Mar 2024 02:09:39 GMT
- Title: GuideGen: A Text-guided Framework for Joint CT Volume and Anatomical
structure Generation
- Authors: Linrui Dai, Rongzhao Zhang, Zhongzhen Huang, Xiaofan Zhang
- Abstract summary: We present GuideGen: a pipeline that jointly generates CT images and tissue masks for abdominal organs and colorectal cancer conditioned on a text prompt.
Our pipeline guarantees high fidelity and variability as well as exact alignment between generated CT volumes and tissue masks.
- Score: 2.062999694458006
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Gathering a large medical dataset of images with corresponding
labels demands extensive annotation labor that is rarely cost-effective and
often prohibitive. The resulting scarcity of training data undermines
downstream tasks and contributes to the challenges that image analysis
faces in the medical field. As a workaround, given the recent success
of generative neural models, it is now possible to synthesize image datasets at
a high fidelity guided by external constraints. This paper explores this
possibility and presents GuideGen: a pipeline that jointly generates
CT images and tissue masks for abdominal organs and colorectal cancer
conditioned on a text prompt. First, we introduce a Volumetric Mask Sampler to
fit the discrete distribution of mask labels and generate low-resolution 3D
tissue masks. Second, our Conditional Image Generator autoregressively
generates CT slices conditioned on a corresponding mask slice to incorporate
both style information and anatomical guidance. This pipeline guarantees high
fidelity and variability as well as exact alignment between generated CT
volumes and tissue masks. Both qualitative and quantitative experiments on 3D
abdominal CTs demonstrate the strong performance of our proposed pipeline,
thereby showing that our method can serve as a dataset generator and provide
potential benefits to downstream tasks. We hope that our work will offer a
promising solution for the multimodal generation of CT volumes and their
anatomical masks. Our
source code is publicly available at
https://github.com/OvO1111/JointImageGeneration.
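The two-stage flow described in the abstract can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the function names, shapes, class count, and the random stand-ins for both learned models are all assumptions made for this example. It only shows the control flow, in which a 3D mask volume is sampled first and CT slices are then produced one at a time, each conditioned on its own mask slice and on the previously generated slice.

```python
import zlib

import numpy as np


def sample_mask_volume(prompt, shape=(8, 16, 16), num_classes=4):
    """Toy stand-in for stage 1 (hypothetical): draw a low-resolution 3D
    label volume from a categorical distribution. A real sampler would be
    a learned, text-conditioned model; here the prompt only seeds the RNG."""
    rng = np.random.default_rng(zlib.crc32(prompt.encode("utf-8")))
    probs = np.full(num_classes, 1.0 / num_classes)  # uniform toy distribution
    return rng.choice(num_classes, size=shape, p=probs)


def generate_slice(mask_slice, prev_slice, rng):
    """Toy stand-in for stage 2 (hypothetical): produce one CT slice from
    its mask slice and the previously generated slice."""
    # Class-dependent intensity plays the role of anatomical guidance.
    base = mask_slice.astype(np.float32) * 100.0
    noise = rng.normal(0.0, 5.0, size=mask_slice.shape).astype(np.float32)
    current = base + noise
    if prev_slice is None:
        return current
    # Blending with the previous slice mimics autoregressive continuity.
    return 0.8 * current + 0.2 * prev_slice


def generate_volume(prompt):
    """Run both toy stages and return (ct_volume, mask_volume)."""
    masks = sample_mask_volume(prompt)
    rng = np.random.default_rng(zlib.crc32(prompt.encode("utf-8")) + 1)
    slices = []
    prev = None
    for mask_slice in masks:  # iterate along the first (slice) axis
        prev = generate_slice(mask_slice, prev, rng)
        slices.append(prev)
    return np.stack(slices), masks


if __name__ == "__main__":
    ct, masks = generate_volume("abdominal CT with a colorectal tumor")
    print(ct.shape, masks.shape)
```

Because every CT slice is generated from exactly one mask slice, the returned image and mask volumes share the same shape by construction, which is the alignment property the abstract emphasizes.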
Related papers
- 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z)
- RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis [56.57177181778517]
RadGenome-Chest CT is a large-scale, region-guided 3D chest CT interpretation dataset based on CT-RATE.
We leverage the latest powerful universal segmentation and large language models to extend the original datasets.
arXiv Detail & Related papers (2024-04-25T17:11:37Z)
- CT-GLIP: 3D Grounded Language-Image Pretraining with CT Scans and Radiology Reports for Full-Body Scenarios [53.94122089629544]
We introduce CT-GLIP (Grounded Language-Image Pretraining with CT scans), a novel method that constructs organ-level image-text pairs to enhance multimodal contrastive learning.
Our method, trained on a multimodal CT dataset comprising 44,011 organ-level vision-text pairs from 17,702 patients across 104 organs, can identify organs and abnormalities in a zero-shot manner using natural language.
arXiv Detail & Related papers (2024-04-23T17:59:01Z)
- A Unified Multi-Phase CT Synthesis and Classification Framework for
Kidney Cancer Diagnosis with Incomplete Data [18.15801599933636]
We propose a unified framework for kidney cancer diagnosis with incomplete multi-phase CT.
It simultaneously recovers missing CT images and classifies cancer subtypes using the completed set of images.
The proposed framework is based on fully 3D convolutional neural networks.
arXiv Detail & Related papers (2023-12-09T11:34:14Z)
- MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT Images [22.455833806331384]
This paper introduces an innovative methodology for producing high-quality 3D lung CT images guided by textual information.
Current state-of-the-art approaches are limited to low-resolution outputs and underutilize radiology reports' abundant information.
arXiv Detail & Related papers (2023-10-05T14:16:22Z)
- Towards Unifying Anatomy Segmentation: Automated Generation of a
Full-body CT Dataset via Knowledge Aggregation and Anatomical Guidelines [113.08940153125616]
We generate a dataset of whole-body CT scans with 142 voxel-level labels for 533 volumes, providing comprehensive anatomical coverage.
Our proposed procedure does not rely on manual annotation during the label aggregation stage.
We release our trained unified anatomical segmentation model capable of predicting 142 anatomical structures on CT data.
arXiv Detail & Related papers (2023-07-25T09:48:13Z)
- GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes [2.410738584733268]
GenerateCT is the first approach to generating 3D medical imaging conditioned on free-form medical text prompts.
We benchmarked GenerateCT against cutting-edge methods, demonstrating its superiority across all key metrics.
GenerateCT enables the scaling of synthetic training datasets to arbitrary sizes.
arXiv Detail & Related papers (2023-05-25T13:16:39Z)
- Medical Image Captioning via Generative Pretrained Transformers [57.308920993032274]
We combine two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records.
The proposed model is tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO.
arXiv Detail & Related papers (2022-09-28T10:27:10Z)
- A unified 3D framework for Organs at Risk Localization and Segmentation
for Radiation Therapy Planning [56.52933974838905]
Current medical workflow requires manual delineation of organs-at-risk (OAR).
In this work, we aim to introduce a unified 3D pipeline for OAR localization-segmentation.
Our proposed framework fully enables the exploitation of 3D context information inherent in medical imaging.
arXiv Detail & Related papers (2022-03-01T17:08:41Z)