The Impact of Machine Learning on 2D/3D Registration for Image-guided
Interventions: A Systematic Review and Perspective
- URL: http://arxiv.org/abs/2108.02238v1
- Date: Wed, 4 Aug 2021 18:31:29 GMT
- Title: The Impact of Machine Learning on 2D/3D Registration for Image-guided
Interventions: A Systematic Review and Perspective
- Authors: Mathias Unberath, Cong Gao, Yicheng Hu, Max Judish, Russell H Taylor,
Mehran Armand, Robert Grupp
- Abstract summary: Image-based navigation is widely considered the next frontier of minimally invasive surgery.
2D/3D registration is a technique to estimate the spatial relationships between 3D structures and 2D images.
The recent advent of machine learning-based approaches to imaging problems holds promise for solving some of the notorious challenges in 2D/3D registration.
- Score: 6.669432838047949
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image-based navigation is widely considered the next frontier of minimally
invasive surgery. It is believed that image-based navigation will increase the
access to reproducible, safe, and high-precision surgery as it may then be
performed at acceptable costs and effort. This is because image-based
techniques avoid the need for specialized equipment and seamlessly integrate
with contemporary workflows. Further, it is expected that image-based
navigation will play a major role in enabling mixed reality environments and
autonomous, robotic workflows. A critical component of image guidance is 2D/3D
registration, a technique to estimate the spatial relationships between 3D
structures, e.g., volumetric imagery or tool models, and 2D images thereof,
such as fluoroscopy or endoscopy. While image-based 2D/3D registration is a
mature technique, its transition from the bench to the bedside has been
restrained by well-known challenges, including brittleness of the optimization
objective, hyperparameter selection, and initialization, difficulties around
inconsistencies or multiple objects, and limited single-view performance. One
reason these challenges persist today is that analytical solutions are likely
inadequate considering the complexity, variability, and high-dimensionality of
generic 2D/3D registration problems. The recent advent of machine
learning-based approaches to imaging problems that, rather than specifying the
desired functional mapping, approximate it using highly expressive parametric
models holds promise for solving some of the notorious challenges in 2D/3D
registration. In this manuscript, we review the impact of machine learning on
2D/3D registration to systematically summarize the recent advances made by the
introduction of this novel technology. Grounded in these insights, we then
offer our perspective on the most pressing needs, significant open problems,
and possible next steps.
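As an illustration only (it is not taken from the review), the idea of 2D/3D registration described in the abstract can be sketched as a small pose-estimation problem: find the rigid pose of a 3D structure whose projection best matches what is seen in a 2D image. The sketch below makes several simplifying assumptions that are not in the source: it matches known 3D landmarks to 2D points instead of comparing image intensities, it uses an ideal pinhole camera with a hypothetical focal length of 1000 pixels, and it replaces any real optimizer with a naive coordinate search. All function names, landmark coordinates, and pose values are invented for the example.

```python
import math

def rotate(p, rx, ry, rz):
    """Rotate point p by Euler angles (radians), applied about x, then y, then z."""
    x, y, z = p
    y, z = y * math.cos(rx) - z * math.sin(rx), y * math.sin(rx) + z * math.cos(rx)
    x, z = x * math.cos(ry) + z * math.sin(ry), -x * math.sin(ry) + z * math.cos(ry)
    x, y = x * math.cos(rz) - y * math.sin(rz), x * math.sin(rz) + y * math.cos(rz)
    return x, y, z

def project(points, pose, focal=1000.0):
    """Apply the rigid pose (rx, ry, rz, tx, ty, tz), then a pinhole projection."""
    rx, ry, rz, tx, ty, tz = pose
    projected = []
    for p in points:
        x, y, z = rotate(p, rx, ry, rz)
        x, y, z = x + tx, y + ty, z + tz
        projected.append((focal * x / z, focal * y / z))
    return projected

def reprojection_error(points, observed, pose):
    """Mean 2D distance between projected landmarks and observed image points."""
    projected = project(points, pose)
    total = sum(math.hypot(u - uo, v - vo)
                for (u, v), (uo, vo) in zip(projected, observed))
    return total / len(observed)

def register(points, observed, init_pose, steps, iters=60):
    """Naive coordinate search: try +/- steps per parameter, keep improvements only."""
    pose, steps = list(init_pose), list(steps)
    best = reprojection_error(points, observed, pose)
    for _ in range(iters):
        improved = False
        for i in range(len(pose)):
            for sign in (1.0, -1.0):
                trial = list(pose)
                trial[i] += sign * steps[i]
                err = reprojection_error(points, observed, trial)
                if err < best:
                    pose, best, improved = trial, err, True
        if not improved:
            # No parameter change helped at this scale, so search more finely.
            steps = [s * 0.5 for s in steps]
    return pose, best

# Hypothetical 3D landmarks (mm) and a made-up ground-truth pose.
landmarks = [(-40, -30, 10), (35, -25, -15), (30, 40, 20),
             (-25, 35, -10), (0, 0, 45), (10, -45, -30)]
true_pose = (0.1, -0.2, 0.05, 5.0, -3.0, 500.0)
observed = project(landmarks, true_pose)   # simulated 2D detections

init_pose = (0.0, 0.0, 0.0, 0.0, 0.0, 450.0)
init_err = reprojection_error(landmarks, observed, init_pose)
est_pose, final_err = register(landmarks, observed, init_pose,
                               steps=(0.1, 0.1, 0.1, 10.0, 10.0, 10.0))
print("initial error (px):", init_err)
print("final error (px):", final_err)
print("estimated pose:", est_pose)
```

Because the search only accepts improving steps, the error never increases, but nothing guarantees that it reaches the true pose: a poor starting pose or poorly chosen step sizes can leave it stuck. That sensitivity is a toy version of the initialization and hyperparameter challenges the abstract lists.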
Related papers
- Explainable AI for Collaborative Assessment of 2D/3D Registration Quality [50.65650507103078]
We propose the first artificial intelligence framework trained specifically for 2D/3D registration quality verification. Our explainable AI (XAI) approach aims to enhance informed decision-making for human operators.
arXiv Detail & Related papers (2025-07-23T15:28:57Z)
- Toward a Real-Time Framework for Accurate Monocular 3D Human Pose Estimation with Geometric Priors [0.0]
We propose a framework that combines real-time 2D keypoint detection with geometry-aware 2D-to-3D lifting. We discuss how these ingredients can enable fast, personalized, and accurate 3D pose estimation from monocular images without requiring specialized hardware.
arXiv Detail & Related papers (2025-07-21T08:18:23Z)
- Robust and Accurate Multi-view 2D/3D Image Registration with Differentiable X-ray Rendering and Dual Cross-view Constraints [45.57808049168089]
We propose a novel multi-view 2D/3D rigid registration approach comprising two stages. In the first stage, a combined loss function is designed, incorporating both the differences between predicted and ground-truth poses. In the second stage, test-time optimization is performed to refine the estimated poses from the coarse stage.
arXiv Detail & Related papers (2025-06-27T12:57:58Z)
- DeProPose: Deficiency-Proof 3D Human Pose Estimation via Adaptive Multi-View Fusion [57.83515140886807]
We introduce the task of Deficiency-Aware 3D Pose Estimation.
DeProPose is a flexible method that simplifies the network architecture to reduce training complexity.
We have developed a novel 3D human pose estimation dataset.
arXiv Detail & Related papers (2025-02-23T03:22:54Z)
- Cross-D Conv: Cross-Dimensional Transferable Knowledge Base via Fourier Shifting Operation [3.69758875412828]
Cross-D Conv operation bridges the dimensional gap by learning the phase shifting in the Fourier domain.
Our method enables seamless weight transfer between 2D and 3D convolution operations, effectively facilitating cross-dimensional learning.
arXiv Detail & Related papers (2024-11-02T13:03:44Z)
- Markerless Multi-view 3D Human Pose Estimation: a survey [0.49157446832511503]
3D human pose estimation involves reconstructing the human skeleton by detecting the body joints. Accurate and efficient solutions are required for several real-world applications including animation, human-robot interaction, surveillance, and sports. However, challenges such as occlusions, 2D pose mismatches, random camera perspectives, and limited 3D labelled data have been hampering the models' performance.
arXiv Detail & Related papers (2024-07-04T10:44:35Z)
- Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes [65.22070581594426]
"Implicit-Zoo" is a large-scale dataset requiring thousands of GPU training days to facilitate research and development in this field.
We showcase two immediate benefits, as it enables us to: (1) learn token locations for transformer models; (2) directly regress 3D camera poses of 2D images with respect to NeRF models.
This in turn leads to improved performance in all three tasks of image classification, semantic segmentation, and 3D pose regression, thereby unlocking new avenues for research.
arXiv Detail & Related papers (2024-06-25T10:20:44Z)
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset with the lowest image resolution and the lightest image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- Lifting by Image -- Leveraging Image Cues for Accurate 3D Human Pose Estimation [10.374944534302234]
The "lifting from 2D pose" method has been the dominant approach to 3D Human Pose Estimation (3DHPE).
Rich semantic and texture information in images can contribute to a more accurate "lifting" procedure.
In this paper, we give new insight into the cause of poor generalization problems and the effectiveness of image features.
arXiv Detail & Related papers (2023-12-25T07:50:58Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model [25.223801390996435]
This paper focuses on reconstructing a 3D pose from a single 2D keypoint detection.
We build a novel diffusion-based framework to effectively sample diverse 3D poses from an off-the-shelf 2D detector.
We evaluate our method on the widely adopted Human3.6M and HumanEva-I datasets.
arXiv Detail & Related papers (2022-12-06T07:22:20Z)
- State of the Art in Dense Monocular Non-Rigid 3D Reconstruction [100.9586977875698]
3D reconstruction of deformable (or non-rigid) scenes from a set of monocular 2D image observations is a long-standing and actively researched area of computer vision and graphics.
This survey focuses on state-of-the-art methods for dense non-rigid 3D reconstruction of various deformable objects and composite scenes from monocular videos or sets of monocular views.
arXiv Detail & Related papers (2022-10-27T17:59:53Z)
- RiCS: A 2D Self-Occlusion Map for Harmonizing Volumetric Objects [68.85305626324694]
Ray-marching in Camera Space (RiCS) is a new method to represent the self-occlusions of foreground objects in 3D into a 2D self-occlusion map.
We show that our representation map not only allows us to enhance the image quality but also to model temporally coherent complex shadow effects.
arXiv Detail & Related papers (2022-05-14T05:35:35Z)
- Multi-Objective Dual Simplex-Mesh Based Deformable Image Registration for 3D Medical Images -- Proof of Concept [0.7734726150561088]
This work introduces the first method for multi-objective 3D deformable image registration, using a 3D dual-dynamic grid transformation model based on simplex meshes.
Our proof-of-concept prototype shows promising results on synthetic and clinical 3D registration problems.
arXiv Detail & Related papers (2022-02-22T16:07:29Z)
- Attention-Guided Version of 2D UNet for Automatic Brain Tumor Segmentation [2.371982686172067]
Gliomas are the most common and aggressive among brain tumors, which cause a short life expectancy in their highest grade.
Deep convolutional neural networks (DCNNs) have achieved a remarkable performance in brain tumor segmentation.
However, this task is still difficult owing to the highly variable intensity and appearance of gliomas.
arXiv Detail & Related papers (2020-04-04T20:09:06Z)
- Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild [101.70320427145388]
We propose a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data.
We evaluate our proposed approach on two large scale datasets.
arXiv Detail & Related papers (2020-03-17T08:47:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.