TRACE: Temporally Reliable Anatomically-Conditioned 3D CT Generation with Enhanced Efficiency
- URL: http://arxiv.org/abs/2507.00802v1
- Date: Tue, 01 Jul 2025 14:35:39 GMT
- Title: TRACE: Temporally Reliable Anatomically-Conditioned 3D CT Generation with Enhanced Efficiency
- Authors: Minye Shao, Xingyu Miao, Haoran Duan, Zeyu Wang, Jingkun Chen, Yawen Huang, Xian Wu, Jingjing Deng, Yang Long, Yefeng Zheng
- Abstract summary: TRACE is a framework that generates 3D medical images with temporal alignment. An overlapping-frame strategy links frame pairs into a flexible-length sequence, reconstructed into a spatiotemporally and anatomically aligned 3D volume.
- Score: 40.82927972746919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D medical image generation is essential for data augmentation and patient privacy, calling for reliable and efficient models suited for clinical practice. However, current methods suffer from limited anatomical fidelity, restricted axial length, and substantial computational cost, placing them beyond reach for regions with limited resources and infrastructure. We introduce TRACE, a framework that generates 3D medical images with spatiotemporal alignment using a 2D multimodal-conditioned diffusion approach. TRACE models sequential 2D slices as video frame pairs, combining segmentation priors and radiology reports for anatomical alignment, incorporating optical flow to sustain temporal coherence. During inference, an overlapping-frame strategy links frame pairs into a flexible length sequence, reconstructed into a spatiotemporally and anatomically aligned 3D volume. Experimental results demonstrate that TRACE effectively balances computational efficiency with preserving anatomical fidelity and spatiotemporal consistency. Code is available at: https://github.com/VinyehShaw/TRACE.
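As a rough illustration of the overlapping-frame idea, the sketch below links generated frame pairs that share one slice into a single sequence and stacks the result into a volume. The data layout and the averaging of shared slices are assumptions for illustration, not TRACE's exact procedure, and `stitch_pairs` is a hypothetical name:

```python
import numpy as np

def stitch_pairs(pairs):
    """Link overlapping frame pairs into one slice sequence.

    `pairs` is a list of (2, H, W) arrays where pair i's second frame
    depicts the same slice as pair i+1's first frame (an assumed layout).
    Shared slices are averaged to smooth the seams between pairs.
    """
    slices = [pairs[0][0]]
    for prev, nxt in zip(pairs[:-1], pairs[1:]):
        slices.append(0.5 * (prev[1] + nxt[0]))  # blend the shared slice
    slices.append(pairs[-1][1])
    return np.stack(slices, axis=0)  # (len(pairs) + 1, H, W) volume

# e.g. 63 generated pairs with one-slice overlap yield a 64-slice volume
volume = stitch_pairs([np.random.rand(2, 256, 256) for _ in range(63)])
```

Because the sequence length is set only by how many pairs are chained, this is what allows a flexible axial extent at inference time.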
Related papers
- ClinicalFMamba: Advancing Clinical Assessment using Mamba-based Multimodal Neuroimaging Fusion [7.0879234284391455]
Multimodal medical image fusion integrates complementary information from different imaging modalities to enhance diagnostic accuracy and treatment planning. CNNs excel at local feature extraction but struggle to model global context effectively. Transformers achieve superior long-range modeling at the cost of quadratic computational complexity. Recent State Space Models (SSMs) offer a promising alternative. We propose ClinicalFMamba, a novel end-to-end CNN-Mamba hybrid architecture.
arXiv Detail & Related papers (2025-08-05T02:25:53Z)
- Latent Space Consistency for Sparse-View CT Reconstruction [10.057432803124167]
Latent Diffusion Model (LDM) has demonstrated promising potential in the domain of 3D CT reconstruction. Cross-modal feature contrastive learning is used to efficiently extract latent 3D information from 2D X-ray images. Results indicate that CLS-DM outperforms classical and state-of-the-art generative models in terms of standard voxel-level metrics.
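A minimal sketch of cross-modal feature contrastive learning in this spirit: a generic symmetric InfoNCE objective between paired 2D X-ray and 3D CT latents. The latent shapes, temperature, and function name are assumptions, not CLS-DM's actual components:

```python
import torch
import torch.nn.functional as F

def cross_modal_infonce(xray_z, ct_z, tau=0.07):
    """Pull paired X-ray/CT latents together, push mismatched pairs apart.

    xray_z, ct_z: (B, D) latent vectors for B matched X-ray/CT pairs.
    """
    xray_z = F.normalize(xray_z, dim=-1)
    ct_z = F.normalize(ct_z, dim=-1)
    logits = xray_z @ ct_z.t() / tau          # (B, B) similarity matrix
    targets = torch.arange(len(logits), device=logits.device)
    # symmetric loss: X-ray->CT and CT->X-ray retrieval directions
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```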
arXiv Detail & Related papers (2025-07-15T10:02:19Z)
- ReCoGNet: Recurrent Context-Guided Network for 3D MRI Prostate Segmentation [11.248082139905865]
We propose a hybrid architecture that models MRI sequences as sequential data. Our method uses a deep, pretrained DeepLabV3 backbone to extract high-level semantic features from each MRI slice and a recurrent convolutional head, built with ConvLSTM layers, to integrate information across slices. Compared to state-of-the-art 2D and 3D segmentation models, our approach demonstrates superior performance in terms of precision, recall, Intersection over Union (IoU), Dice Similarity Coefficient (DSC), and robustness.
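A condensed sketch of this slice-recurrent pattern, using torchvision's DeepLabV3 and a hand-rolled ConvLSTM cell. The layer sizes, class names, and feature handoff are assumptions for illustration, not ReCoGNet's exact design:

```python
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

class ConvLSTMCell(nn.Module):
    def __init__(self, ch_in, ch_hidden):
        super().__init__()
        self.gates = nn.Conv2d(ch_in + ch_hidden, 4 * ch_hidden, 3, padding=1)

    def forward(self, x, h, c):
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class RecurrentSliceSegmenter(nn.Module):
    def __init__(self, n_classes=2, hidden=32):
        super().__init__()
        self.hidden = hidden
        self.backbone = deeplabv3_resnet50(weights=None, num_classes=hidden)
        self.rnn = ConvLSTMCell(hidden, hidden)
        self.head = nn.Conv2d(hidden, n_classes, 1)

    def forward(self, vol):                 # vol: (B, S, 3, H, W) MRI slices
        B, S, _, H, W = vol.shape
        h = vol.new_zeros(B, self.hidden, H, W)
        c = torch.zeros_like(h)
        outs = []
        for s in range(S):                  # 2D features per slice ...
            feat = self.backbone(vol[:, s])["out"]
            h, c = self.rnn(feat, h, c)     # ... integrated across slices
            outs.append(self.head(h))
        return torch.stack(outs, dim=1)     # (B, S, n_classes, H, W)
```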
arXiv Detail & Related papers (2025-06-24T14:56:55Z)
- Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining [0.8714814768600079]
We introduce a novel architecture for Text-to-CT generation that combines a latent diffusion model with a 3D contrastive vision-language pretraining scheme. Our method offers a scalable and controllable solution for synthesizing clinically meaningful CT volumes from text.
arXiv Detail & Related papers (2025-05-31T16:41:55Z)
- Abnormality-Driven Representation Learning for Radiology Imaging [0.8321462983924758]
We introduce lesion-enhanced contrastive learning (LeCL), a novel approach to obtain visual representations driven by abnormalities in 2D axial slices across different locations of the CT scans.
We evaluate our approach across three clinical tasks: tumor lesion location, lung disease detection, and patient staging, benchmarking against four state-of-the-art foundation models.
arXiv Detail & Related papers (2024-11-25T13:53:26Z)
- Epicardium Prompt-guided Real-time Cardiac Ultrasound Frame-to-volume Registration [50.602074919305636]
This paper introduces a lightweight end-to-end Cardiac Ultrasound frame-to-volume Registration network, termed CU-Reg. We use epicardium prompt-guided anatomical clues to reinforce the interaction of 2D sparse and 3D dense features, followed by a voxel-wise local-global aggregation of enhanced features.
arXiv Detail & Related papers (2024-06-20T17:47:30Z)
- Spatiotemporal Modeling Encounters 3D Medical Image Analysis: Slice-Shift UNet with Multi-View Fusion [0.0]
We propose a new 2D-based model dubbed Slice SHift UNet, which encodes three-dimensional features at a 2D CNN's computational complexity. More precisely, multi-view features are collaboratively learned by performing 2D convolutions along the three planes of a volume, as sketched below. The effectiveness of our approach is validated on the Multi-Modality Abdominal Multi-Organ Segmentation (AMOS) and Multi-Atlas Labeling Beyond the Cranial Vault (BTCV) datasets.
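One plausible reading of "2D convolutions along the three planes" is the tri-planar module below, where a shared 2D convolution is applied slice-wise along each axis and the three views are fused by a 1x1x1 convolution. The module name and fusion choice are assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class TriPlanarConv(nn.Module):
    """Run one shared 2D conv along each of a volume's three planes, then fuse."""
    def __init__(self, ch_in, ch_out):
        super().__init__()
        self.conv2d = nn.Conv2d(ch_in, ch_out, 3, padding=1)
        self.fuse = nn.Conv3d(3 * ch_out, ch_out, 1)  # 1x1x1 multi-view fusion

    def _along(self, x, axis):
        x = x.movedim(axis, 1)                        # slicing axis next to batch
        b, d = x.shape[:2]
        y = self.conv2d(x.flatten(0, 1))              # 2D conv on every slice
        return y.unflatten(0, (b, d)).movedim(1, axis)

    def forward(self, x):                             # x: (B, C, D, H, W)
        views = [self._along(x, ax) for ax in (2, 3, 4)]
        return self.fuse(torch.cat(views, dim=1))
```

This keeps the parameter and FLOP budget close to a 2D CNN while still mixing information from all three anatomical planes.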
arXiv Detail & Related papers (2023-07-24T14:53:23Z)
- 3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation [52.699139151447945]
We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation.
Our model outperforms domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, and colon cancer segmentation respectively, and achieves similar performance for liver tumor segmentation.
arXiv Detail & Related papers (2023-06-23T12:09:52Z)
- Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction [53.93674177236367]
Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging.
Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image.
This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses.
We introduce a novel geometry-aware encoder-decoder framework to solve this problem.
arXiv Detail & Related papers (2023-03-26T14:38:42Z)
- 3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers [101.44668514239959]
We propose a hybrid encoder-decoder framework that efficiently computes spatial and temporal attentions in parallel.
We also introduce a semantic clutter-background adversarial loss during training that aids in separating mitochondria instances from the background clutter.
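A minimal sketch of computing spatial and temporal attention in parallel over slice tokens, with both paths summed into a residual. The token layout, head count, and residual fusion are assumptions, not the paper's exact block:

```python
import torch
import torch.nn as nn

class ParallelSTAttention(nn.Module):
    """Spatial and temporal self-attention computed in parallel, then summed."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.spatial = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.temporal = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                 # x: (B, T, N, C) slice tokens
        B, T, N, C = x.shape
        s = x.reshape(B * T, N, C)        # attend across space, per slice
        s, _ = self.spatial(s, s, s)
        t = x.transpose(1, 2).reshape(B * N, T, C)  # attend across depth
        t, _ = self.temporal(t, t, t)
        t = t.reshape(B, N, T, C).transpose(1, 2)
        return x + s.reshape(B, T, N, C) + t        # residual sum of both paths
```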
arXiv Detail & Related papers (2023-03-21T17:58:49Z)
- The entire network structure of Crossmodal Transformer [4.605531191013731]
The proposed approach first deep learns skeletal features from 2D X-ray and 3D CT images.
As a result, the well-trained network can directly predict the spatial correspondence between arbitrary 2D X-ray and 3D CT.
arXiv Detail & Related papers (2021-04-29T11:47:31Z)
- Revisiting 3D Context Modeling with Supervised Pre-training for Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
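A common pseudo-3D construction, shown below as a hedged sketch, factors a 3x3x3 convolution into an in-plane 1x3x3 convolution followed by an axial 3x1x1 convolution, so 2D-pretrained kernels can seed the in-plane part. The class name and exact factorization are assumptions, not necessarily MP3D FPN's layers:

```python
import torch.nn as nn

class Pseudo3DConv(nn.Module):
    """Factor a 3x3x3 conv into an in-plane 1x3x3 conv plus an axial 3x1x1
    conv, so 2D-pretrained kernels can initialize the in-plane part."""
    def __init__(self, ch_in, ch_out):
        super().__init__()
        self.in_plane = nn.Conv3d(ch_in, ch_out, (1, 3, 3), padding=(0, 1, 1))
        self.axial = nn.Conv3d(ch_out, ch_out, (3, 1, 1), padding=(1, 0, 0))

    def forward(self, x):                 # x: (B, C, D, H, W) CT sub-volume
        return self.axial(self.in_plane(x))
```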
arXiv Detail & Related papers (2020-12-16T07:11:16Z)