Swin-X2S: Reconstructing 3D Shape from 2D Biplanar X-ray with Swin Transformers
- URL: http://arxiv.org/abs/2501.05961v1
- Date: Fri, 10 Jan 2025 13:41:10 GMT
- Title: Swin-X2S: Reconstructing 3D Shape from 2D Biplanar X-ray with Swin Transformers
- Authors: Kuan Liu, Zongyuan Ying, Jie Jin, Dongyan Li, Ping Huang, Wenjian Wu, Zhe Chen, Jin Qi, Yong Lu, Lianfu Deng, Bo Chen,
- Abstract summary: Swin-X2S is an end-to-end deep learning method for reconstructing 3D segmentation and labeling from 2D X-ray images.
A dimension-expanding module is introduced to bridge the encoder and decoder, ensuring a smooth conversion from 2D pixels to 3D voxels.
- Score: 8.357602965532923
- License:
- Abstract: The conversion from 2D X-ray to 3D shape holds significant potential for improving diagnostic efficiency and safety. However, existing reconstruction methods often rely on hand-crafted features, manual intervention, and prior knowledge, resulting in unstable shape errors and additional processing costs. In this paper, we introduce Swin-X2S, an end-to-end deep learning method for directly reconstructing 3D segmentation and labeling from 2D biplanar orthogonal X-ray images. Swin-X2S employs an encoder-decoder architecture: the encoder leverages 2D Swin Transformer for X-ray information extraction, while the decoder employs 3D convolution with cross-attention to integrate structural features from orthogonal views. A dimension-expanding module is introduced to bridge the encoder and decoder, ensuring a smooth conversion from 2D pixels to 3D voxels. We evaluate proposed method through extensive qualitative and quantitative experiments across nine publicly available datasets covering four anatomies (femur, hip, spine, and rib), with a total of 54 categories. Significant improvements over previous methods have been observed not only in the segmentation and labeling metrics but also in the clinically relevant parameters that are of primary concern in practical applications, which demonstrates the promise of Swin-X2S to provide an effective option for anatomical shape reconstruction in clinical scenarios. Code implementation is available at: \url{https://github.com/liukuan5625/Swin-X2S}.
Related papers
- 3DPX: Single Panoramic X-ray Analysis Guided by 3D Oral Structure Reconstruction [19.164694943725202]
Panoramic X-ray (PX) is a prevalent modality in dentistry practice owing to its wide availability and low cost.
As a 2D projection of a 3D structure, PX suffers from anatomical information loss and PX diagnosis is limited compared to that with 3D imaging modalities.
2D-to-3D reconstruction methods have been explored for the ability to synthesize the absent 3D anatomical information from 2D PX for use in PX image analysis.
arXiv Detail & Related papers (2024-09-27T12:44:06Z) - 3DPX: Progressive 2D-to-3D Oral Image Reconstruction with Hybrid MLP-CNN Networks [16.931777224277347]
Panoramic X-ray (PX) is a prevalent modality in dental practice for its wide availability and low cost.
As a 2D projection image, PX does not contain 3D anatomical information.
We propose a progressive hybrid Multilayer Perceptron (MLP)-CNN pyra-mid network (3DPX) for 2D-to-3D oral PX reconstruction.
arXiv Detail & Related papers (2024-08-02T14:28:10Z) - Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation [68.60747298865394]
We propose a new cross-dimensional SSL framework based on a pseudo-3D transformation (CDSSL-P3D)
Specifically, we introduce an image transformation based on the im2col algorithm, which converts 2D images into a format consistent with 3D data.
This transformation enables seamless integration of 2D and 3D data, and facilitates cross-dimensional self-supervised learning for 3D medical image analysis.
arXiv Detail & Related papers (2024-06-03T02:57:25Z) - R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction [53.19869886963333]
3D Gaussian splatting (3DGS) has shown promising results in rendering image and surface reconstruction.
This paper introduces R2$-Gaussian, the first 3DGS-based framework for sparse-view tomographic reconstruction.
arXiv Detail & Related papers (2024-05-31T08:39:02Z) - NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized
Device Coordinates Space [77.6067460464962]
Monocular 3D Semantic Scene Completion (SSC) has garnered significant attention in recent years due to its potential to predict complex semantics and geometry shapes from a single image, requiring no 3D inputs.
We identify several critical issues in current state-of-the-art methods, including the Feature Ambiguity of projected 2D features in the ray to the 3D space, the Pose Ambiguity of the 3D convolution, and the Imbalance in the 3D convolution across different depth levels.
We devise a novel Normalized Device Coordinates scene completion network (NDC-Scene) that directly extends the 2
arXiv Detail & Related papers (2023-09-26T02:09:52Z) - Multi-dimension unified Swin Transformer for 3D Lesion Segmentation in
Multiple Anatomical Locations [1.7413461132662074]
We propose a novel model, denoted a multi-dimension unified Swin transformer (MDU-ST) for 3D lesion segmentation.
The network's performance is evaluated by the Dice similarity coefficient (DSC) and Hausdorff distance (HD) using an internal 3D lesion dataset.
The proposed method can be used to conduct automated 3D lesion segmentation to assist radiomics and tumor growth modeling studies.
arXiv Detail & Related papers (2023-09-04T21:24:00Z) - Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction [53.93674177236367]
Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging.
Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image.
This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses.
We introduce a novel geometry-aware encoder-decoder framework to solve this problem.
arXiv Detail & Related papers (2023-03-26T14:38:42Z) - CNN-based real-time 2D-3D deformable registration from a single X-ray
projection [2.1198879079315573]
This paper presents a method for real-time 2D-3D non-rigid registration using a single fluoroscopic image.
A dataset composed of displacement fields and 2D projections of the anatomy is generated from a preoperative scan.
A neural network is trained to recover the unknown 3D displacement field from a single projection image.
arXiv Detail & Related papers (2022-12-15T09:57:19Z) - IGCN: Image-to-graph Convolutional Network for 2D/3D Deformable
Registration [1.2246649738388387]
We propose an image-to-graph convolutional network that achieves deformable registration of a 3D organ mesh for a single-viewpoint 2D projection image.
We show shape prediction considering relationships among multiple organs can be used to predict respiratory motion and deformation from radiographs with clinically acceptable accuracy.
arXiv Detail & Related papers (2021-10-31T12:48:37Z) - FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection [78.00922683083776]
It is non-trivial to make a general adapted 2D detector work in this 3D task.
In this technical report, we study this problem with a practice built on fully convolutional single-stage detector.
Our solution achieves 1st place out of all the vision-only methods in the nuScenes 3D detection challenge of NeurIPS 2020.
arXiv Detail & Related papers (2021-04-22T09:35:35Z) - Revisiting 3D Context Modeling with Supervised Pre-training for
Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.