The entire network structure of Crossmodal Transformer
- URL: http://arxiv.org/abs/2104.14273v1
- Date: Thu, 29 Apr 2021 11:47:31 GMT
- Title: The entire network structure of Crossmodal Transformer
- Authors: Meng Li, Changyan Lin, Lixia Shu, Xin Pu, Yi Chen, Heng Wu, Jiasong Li, Hongshuai Cao
- Abstract summary: The proposed approach first uses deep learning to extract skeletal features from 2D X-ray and 3D CT images.
As a result, the well-trained network can directly predict the spatial correspondence between arbitrary 2D X-ray and 3D CT.
- Score: 4.605531191013731
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Because the mapping between a given intra-interventional 2D
X-ray and the pre-interventional 3D Computed Tomography (CT) volume is uncertain,
auxiliary positioning devices or body markers, such as medical implants, are
commonly used to determine this relationship. However, such approaches cannot
be widely used in clinical practice because of the complexity of real-world
conditions. To determine the mapping relationship and obtain an initial pose
estimate of the human body without auxiliary equipment or markers, a cross-modal
matching transformer network is proposed to match 2D X-ray and 3D CT images
directly. The proposed approach first learns skeletal features from 2D X-ray
and 3D CT images with deep networks. The features are then converted into 1D
X-ray and CT representation vectors, which are combined using a multi-modal
transformer. As a result, the well-trained network can directly predict the
spatial correspondence between arbitrary 2D X-ray and 3D CT. The experimental
results show that, when our approach is combined with the conventional approach,
the achieved accuracy and speed can meet basic clinical intervention needs, and
that it provides a new direction for intra-interventional registration.
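The pipeline described in the abstract (per-modality features, conversion to 1D representation vectors, fusion by a transformer, prediction of the spatial correspondence) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' network: the chunk-and-project encoder, the single self-attention layer, the embedding width `D`, the token count `N_TOKENS`, and the 6-parameter pose output are all hypothetical choices made for the example, and the weights are random rather than trained.

```python
import numpy as np

D = 32         # hypothetical embedding width (not from the paper)
N_TOKENS = 16  # hypothetical number of 1D vectors per modality
POSE_DIM = 6   # assumed output: 3 rotation + 3 translation parameters


def init_params(xray_size, ct_size, rng):
    """Random weights standing in for a trained network."""
    s = 0.02
    return {
        "w_xray": rng.normal(0, s, (xray_size // N_TOKENS, D)),
        "w_ct": rng.normal(0, s, (ct_size // N_TOKENS, D)),
        "modality": rng.normal(0, s, (2, D)),  # one embedding per modality
        "wq": rng.normal(0, s, (D, D)),
        "wk": rng.normal(0, s, (D, D)),
        "wv": rng.normal(0, s, (D, D)),
        "w_head": rng.normal(0, s, (D, POSE_DIM)),
    }


def to_tokens(image, weight):
    """Stand-in encoder: cut the flattened image into N_TOKENS equal
    chunks and project each chunk to a D-dimensional 1D vector."""
    patches = image.reshape(N_TOKENS, -1)
    return patches @ weight


def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def self_attention(tokens, p):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = tokens @ p["wq"], tokens @ p["wk"], tokens @ p["wv"]
    attn = softmax(q @ k.T / np.sqrt(D))
    return attn @ v, attn


def predict_pose(xray, ct, p):
    # 1) Per-modality 1D representation vectors, tagged by modality.
    x_tok = to_tokens(xray, p["w_xray"]) + p["modality"][0]
    c_tok = to_tokens(ct, p["w_ct"]) + p["modality"][1]
    # 2) Joint sequence so X-ray tokens can attend to CT tokens and back.
    tokens = np.concatenate([x_tok, c_tok], axis=0)
    fused, attn = self_attention(tokens, p)
    fused = fused + tokens  # residual connection
    # 3) Pool the fused sequence and regress the assumed pose parameters.
    return fused.mean(axis=0) @ p["w_head"], attn


rng = np.random.default_rng(0)
xray = rng.random((64, 64))    # synthetic 2D image
ct = rng.random((16, 16, 16))  # synthetic 3D volume
params = init_params(xray.size, ct.size, rng)
pose, attn = predict_pose(xray, ct, params)
print(pose.shape, attn.shape)
```

Concatenating both modalities into one token sequence is only one way to let the attention layer relate X-ray and CT features; the abstract does not say how the two sets of vectors are combined inside the transformer.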
Related papers
- Intraoperative Registration by Cross-Modal Inverse Neural Rendering [61.687068931599846]
We present a novel approach for 3D/2D intraoperative registration during neurosurgery via cross-modal inverse neural rendering.
Our approach separates implicit neural representation into two components, handling anatomical structure preoperatively and appearance intraoperatively.
We tested our method on retrospective patients' data from clinical cases, showing that our method outperforms state-of-the-art while meeting current clinical standards for registration.
arXiv Detail & Related papers (2024-09-18T13:40:59Z)
- Intraoperative 2D/3D Image Registration via Differentiable X-ray Rendering [5.617649111108429]
We present DiffPose, a self-supervised approach that leverages patient-specific simulation and differentiable physics-based rendering to achieve accurate 2D/3D registration without relying on manually labeled data.
DiffPose achieves sub-millimeter accuracy across surgical datasets at intraoperative speeds, improving upon existing unsupervised methods by an order of magnitude and even outperforming supervised baselines.
arXiv Detail & Related papers (2023-12-11T13:05:54Z)
- X-Ray to CT Rigid Registration Using Scene Coordinate Regression [1.1687067206676627]
This paper proposes a fully automatic registration method that is robust to extreme viewpoints.
It is based on a fully convolutional neural network (CNN) that regresses the overlapping coordinates for a given X-ray image.
The proposed method achieved an average mean target registration error (mTRE) of 3.79 mm in the 50th percentile of the simulated test dataset and projected mTRE of 9.65 mm in the 50th percentile of real fluoroscopic images for pelvis registration.
arXiv Detail & Related papers (2023-11-25T17:48:46Z)
- CNN-based real-time 2D-3D deformable registration from a single X-ray projection [2.1198879079315573]
This paper presents a method for real-time 2D-3D non-rigid registration using a single fluoroscopic image.
A dataset composed of displacement fields and 2D projections of the anatomy is generated from a preoperative scan.
A neural network is trained to recover the unknown 3D displacement field from a single projection image.
arXiv Detail & Related papers (2022-12-15T09:57:19Z)
- View-Disentangled Transformer for Brain Lesion Detection [50.4918615815066]
We propose a novel view-disentangled transformer to enhance the extraction of MRI features for more accurate tumour detection.
First, the proposed transformer harvests long-range correlation among different positions in a 3D brain scan.
Second, the transformer models a stack of slice features as multiple 2D views and enhances these features view-by-view.
Third, we deploy the proposed transformer module in a transformer backbone, which can effectively detect the 2D regions surrounding brain lesions.
arXiv Detail & Related papers (2022-09-20T11:58:23Z)
- IGCN: Image-to-graph Convolutional Network for 2D/3D Deformable Registration [1.2246649738388387]
We propose an image-to-graph convolutional network that achieves deformable registration of a 3D organ mesh for a single-viewpoint 2D projection image.
We show shape prediction considering relationships among multiple organs can be used to predict respiratory motion and deformation from radiographs with clinically acceptable accuracy.
arXiv Detail & Related papers (2021-10-31T12:48:37Z)
- 3D Reconstruction of Curvilinear Structures with Stereo Matching Deep Convolutional Neural Networks [52.710012864395246]
We propose a fully automated pipeline for both detection and matching of curvilinear structures in stereo pairs.
We mainly focus on 3D reconstruction of dislocations from stereo pairs of TEM images.
arXiv Detail & Related papers (2021-10-14T23:05:47Z)
- Revisiting 3D Context Modeling with Supervised Pre-training for Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z)
- XraySyn: Realistic View Synthesis From a Single Radiograph Through CT Priors [118.27130593216096]
A radiograph visualizes the internal anatomy of a patient through the use of X-ray, which projects 3D information onto a 2D plane.
To the best of our knowledge, this is the first work on radiograph view synthesis.
We show that by gaining an understanding of radiography in 3D space, our method can be applied to radiograph bone extraction and suppression without groundtruth bone labels.
arXiv Detail & Related papers (2020-12-04T05:08:53Z)
- Tattoo tomography: Freehand 3D photoacoustic image reconstruction with an optical pattern [49.240017254888336]
Photoacoustic tomography (PAT) is a novel imaging technique that can resolve both morphological and functional tissue properties.
A current drawback is the limited field-of-view provided by the conventionally applied 2D probes.
We present a novel approach to 3D reconstruction of PAT data that does not require an external tracking system.
arXiv Detail & Related papers (2020-11-10T09:27:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.