X-GRM: Large Gaussian Reconstruction Model for Sparse-view X-rays to Computed Tomography
- URL: http://arxiv.org/abs/2505.15235v2
- Date: Mon, 26 May 2025 14:57:30 GMT
- Title: X-GRM: Large Gaussian Reconstruction Model for Sparse-view X-rays to Computed Tomography
- Authors: Yifan Liu, Wuyang Li, Weihao Yu, Chenxin Li, Alexandre Alahi, Max Meng, Yixuan Yuan
- Abstract summary: Computed Tomography serves as an indispensable tool in clinical workflows, providing non-invasive visualization of internal anatomical structures. Existing CT reconstruction works are limited to small-capacity model architectures and inflexible volume representations. We present X-GRM, a large feedforward model for reconstructing 3D CT volumes from sparse-view 2D X-ray projections.
- Score: 89.84588038174721
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Computed Tomography serves as an indispensable tool in clinical workflows, providing non-invasive visualization of internal anatomical structures. Existing CT reconstruction works are limited to small-capacity model architectures and inflexible volume representations. In this work, we present X-GRM (X-ray Gaussian Reconstruction Model), a large feedforward model for reconstructing 3D CT volumes from sparse-view 2D X-ray projections. X-GRM employs a scalable transformer-based architecture to encode sparse-view X-ray inputs, where tokens from different views are integrated efficiently. These tokens are then decoded into a novel volume representation, named Voxel-based Gaussian Splatting (VoxGS), which enables efficient CT volume extraction and differentiable X-ray rendering. This combination of a high-capacity model and a flexible volume representation empowers our model to produce high-quality reconstructions from various testing inputs, including in-domain and out-of-domain X-ray projections. Our code is available at: https://github.com/CUHK-AIM-Group/X-GRM.
Related papers
- X-Field: A Physically Grounded Representation for 3D X-ray Reconstruction [25.13707706037451]
X-ray imaging is indispensable in medical diagnostics, yet its use is tightly regulated due to potential health risks. Recent research focuses on generating novel views from sparse inputs and reconstructing Computed Tomography (CT) volumes. We introduce X-Field, the first 3D representation specifically designed for X-ray imaging.
arXiv Detail & Related papers (2025-03-11T16:31:56Z)
- X-LRM: X-ray Large Reconstruction Model for Extremely Sparse-View Computed Tomography Recovery in One Second [52.11676689269379]
Sparse-view 3D CT reconstruction aims to recover structures from a limited number of 2D X-ray projections. Existing feedforward methods are constrained by the limited capacity of CNN-based architectures and the scarcity of large-scale training datasets. We propose an X-ray Large Reconstruction Model (X-LRM) for extremely sparse-view (10 views) CT reconstruction.
arXiv Detail & Related papers (2025-03-09T01:39:59Z)
- Fan-Beam CT Reconstruction for Unaligned Sparse-View X-ray Baggage Dataset [0.0]
We present a calibration and reconstruction method using an unaligned sparse multi-view X-ray baggage dataset. Our approach integrates multi-spectral neural attenuation field reconstruction with Linear pushbroom (LPB) camera model pose optimization.
arXiv Detail & Related papers (2024-12-04T05:16:54Z)
- Differentiable Voxel-based X-ray Rendering Improves Sparse-View 3D CBCT Reconstruction [4.941613865666241]
We present DiffVox, a self-supervised framework for Cone-Beam Computed Tomography (CBCT) reconstruction. As a result, we reconstruct high-fidelity 3D CBCT volumes from fewer X-rays, potentially reducing ionizing radiation exposure and improving diagnostic utility.
arXiv Detail & Related papers (2024-11-28T15:49:08Z)
- R2GenCSR: Retrieving Context Samples for Large Language Model based X-ray Medical Report Generation [7.4871243017824165]
This paper proposes a novel context-guided efficient X-ray medical report generation framework.
Specifically, we introduce Mamba as the vision backbone with linear complexity, and the performance obtained is comparable to that of a strong Transformer model.
arXiv Detail & Related papers (2024-08-19T07:15:11Z)
- X-Recon: Learning-based Patient-specific High-Resolution CT Reconstruction from Orthogonal X-Ray Images [14.04604990570727]
X-Recon is a reconstruction network based on ortho-lateral chest X-ray images.
PTX-Seg is a zero-shot pneumothorax segmentation algorithm.
The reconstruction achieved state-of-the-art performance on several metrics, including peak signal-to-noise ratio.
arXiv Detail & Related papers (2024-07-22T03:55:36Z)
- R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction [53.19869886963333]
3D Gaussian splatting (3DGS) has shown promising results in image rendering and surface reconstruction.
This paper introduces R$^2$-Gaussian, the first 3DGS-based framework for sparse-view tomographic reconstruction.
arXiv Detail & Related papers (2024-05-31T08:39:02Z)
- Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis [88.86777314004044]
We propose a 3D Gaussian splatting-based framework, namely X-Gaussian, for X-ray novel view visualization.
Experiments show that our X-Gaussian outperforms state-of-the-art methods by 6.5 dB while requiring less than 15% of the training time and achieving over 73x the inference speed.
arXiv Detail & Related papers (2024-03-07T00:12:08Z)
- XProspeCT: CT Volume Generation from Paired X-Rays [0.0]
We build on previous research to convert X-ray images into simulated CT volumes.
Model variations include UNet architectures, custom connections, activation functions, loss functions, and a novel back projection approach.
arXiv Detail & Related papers (2024-02-11T21:57:49Z)
- XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models [72.8965643836841]
We introduce XrayGPT, a novel conversational medical vision-language model. It can analyze and answer open-ended questions about chest radiographs. We generate 217k interactive and high-quality summaries from free-text radiology reports.
arXiv Detail & Related papers (2023-06-13T17:59:59Z)
- Hierarchical Amortized Training for Memory-efficient High Resolution 3D GAN [52.851990439671475]
We propose a novel end-to-end GAN architecture that can generate high-resolution 3D images.
We achieve this goal by using different configurations between training and inference.
Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms state of the art in image generation.
arXiv Detail & Related papers (2020-08-05T02:33:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.