SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D
Object Pose Estimation
- URL: http://arxiv.org/abs/2310.17359v1
- Date: Thu, 26 Oct 2023 12:47:26 GMT
- Title: SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D
Object Pose Estimation
- Authors: Haobo Jiang, Mathieu Salzmann, Zheng Dang, Jin Xie, and Jian Yang
- Abstract summary: We introduce an SE(3) diffusion model-based point cloud registration framework for 6D object pose estimation in real-world scenarios.
Our approach formulates the 3D registration task as a denoising diffusion process, which progressively refines the pose of the source point cloud.
Experiments demonstrate that our diffusion registration framework presents outstanding pose estimation performance on the real-world TUD-L, LINEMOD, and Occluded-LINEMOD datasets.
- Score: 66.16525145765604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce an SE(3) diffusion model-based point cloud
registration framework for 6D object pose estimation in real-world scenarios.
Our approach formulates the 3D registration task as a denoising diffusion
process, which progressively refines the pose of the source point cloud to
obtain a precise alignment with the model point cloud. Training our framework
involves two operations: an SE(3) diffusion process and an SE(3) reverse
process. The SE(3) diffusion process gradually perturbs the optimal rigid
transformation of a pair of point clouds by continuously injecting noise
(perturbation transformation). By contrast, the SE(3) reverse process focuses
on learning a denoising network that refines the noisy transformation
step-by-step, bringing it closer to the optimal transformation for accurate
pose estimation. Unlike standard diffusion models used in linear Euclidean
spaces, our diffusion model operates on the SE(3) manifold. This requires
exploiting the linear Lie algebra $\mathfrak{se}(3)$ associated with SE(3) to
constrain the transformation transitions during the diffusion and reverse
processes. Additionally, to effectively train our denoising network, we derive
a registration-specific variational lower bound as the optimization objective
for model learning. Furthermore, we show that our denoising network can be
constructed with a surrogate registration model, making our approach applicable
to different deep registration networks. Extensive experiments demonstrate that
our diffusion registration framework presents outstanding pose estimation
performance on the real-world TUD-L, LINEMOD, and Occluded-LINEMOD datasets.
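The idea of injecting noise through the Lie algebra can be illustrated with a minimal sketch. It is not the authors' formulation: it only assumes that a perturbation is drawn as a Gaussian 6-vector in $\mathfrak{se}(3)$, mapped to SE(3) with the exponential map, and composed with the current transformation, so that every noisy sample remains a valid rigid transformation. The helper names (`hat`, `matmul`, `se3_exp`, `perturb`) and the noise scales are hypothetical and chosen for illustration only.

```python
import math
import random

def hat(w):
    """Skew-symmetric 3x3 matrix of a 3-vector."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def se3_exp(xi):
    """Exponential map: 6-vector (rho, omega) in se(3) -> 4x4 matrix in SE(3)."""
    rho, omega = xi[:3], xi[3:]
    theta = math.sqrt(sum(w * w for w in omega))
    K = hat(omega)
    K2 = matmul(K, K)
    if theta < 1e-8:
        # Small-angle limits of the coefficients below.
        a, b, c = 1.0, 0.5, 1.0 / 6.0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
        c = (theta - math.sin(theta)) / theta ** 3
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    # Rodrigues formula for the rotation, and the matrix V that maps rho
    # to the translation part.
    R = [[eye[i][j] + a * K[i][j] + b * K2[i][j] for j in range(3)] for i in range(3)]
    V = [[eye[i][j] + b * K[i][j] + c * K2[i][j] for j in range(3)] for i in range(3)]
    t = [sum(V[i][j] * rho[j] for j in range(3)) for i in range(3)]
    return [R[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def perturb(H, sigma, rng):
    """Left-compose H with a random perturbation of scale sigma (assumed noise model)."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(6)]
    noise = se3_exp([sigma * e for e in eps])
    return matmul(noise, H)

rng = random.Random(0)
# Hypothetical "optimal" transformation: 90 degree rotation about z plus a translation term.
H0 = se3_exp([0.1, -0.2, 0.3, 0.0, 0.0, math.pi / 2])
H_noisy = H0
for sigma in (0.01, 0.02, 0.05):  # illustrative, increasing noise scales
    H_noisy = perturb(H_noisy, sigma, rng)
```

Because the noise is sampled in the linear space $\mathfrak{se}(3)$ and only then mapped onto the group, the rotation block of `H_noisy` stays orthonormal after any number of steps, which would not hold if Gaussian noise were added directly to the matrix entries.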
Related papers
- Textured 3D Regenerative Morphing with 3D Diffusion Prior [29.7508625572437]
Textured 3D morphing creates smooth and plausible sequences between two 3D objects.
Previous methods rely on establishing point-to-point correspondences and determining smooth deformation trajectories.
We propose a method for 3D regenerative morphing using a 3D diffusion prior.
arXiv Detail & Related papers (2025-02-20T07:02:22Z)
- A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision [65.33043028101471]
We introduce a diffusion model for Gaussian Splats, SplatDiffusion, to enable generation of three-dimensional structures from single images.
Existing methods rely on deterministic, feed-forward predictions, which limit their ability to handle the inherent ambiguity of 3D inference from 2D data.
arXiv Detail & Related papers (2024-12-01T00:29:57Z)
- 3D Equivariant Pose Regression via Direct Wigner-D Harmonics Prediction [50.07071392673984]
Existing methods learn 3D rotations parametrized in the spatial domain using angles or quaternions.
We propose a frequency-domain approach that directly predicts Wigner-D coefficients for 3D rotation regression.
Our method achieves state-of-the-art results on benchmarks such as ModelNet10-SO(3) and PASCAL3D+.
arXiv Detail & Related papers (2024-11-01T12:50:38Z)
- Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration [2.814748676983944]
We propose a graph neural network model embedded with a local Spherical Euclidean 3D equivariance property through SE(3) message passing based propagation.
Our model is composed mainly of a descriptor module, equivariant graph layers, match similarity, and the final regression layers.
Experiments conducted on the 3DMatch and KITTI datasets exhibit the compelling and robust performance of our model compared to state-of-the-art approaches.
arXiv Detail & Related papers (2024-10-08T06:48:01Z)
- OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control [66.03885917320189]
OrientDream is a camera orientation conditioned framework for efficient and multi-view consistent 3D generation from textual prompts.
Our strategy emphasizes the implementation of an explicit camera orientation conditioned feature in the pre-training of a 2D text-to-image diffusion module.
Our experiments reveal that our method not only produces high-quality NeRF models with consistent multi-view properties but also achieves an optimization speed significantly greater than existing methods.
arXiv Detail & Related papers (2024-06-14T13:16:18Z)
- ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations.
We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
- 6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation [16.242361975225066]
Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy.
We propose a novel diffusion-based framework to handle the noise and indeterminacy in object pose estimation.
arXiv Detail & Related papers (2023-12-29T05:28:35Z)
- DiffusionPCR: Diffusion Models for Robust Multi-Step Point Cloud Registration [73.37538551605712]
Point Cloud Registration (PCR) estimates the relative rigid transformation between two point clouds.
We propose formulating PCR as a denoising diffusion probabilistic process, mapping noisy transformations to the ground truth.
Our experiments showcase the effectiveness of our DiffusionPCR, yielding state-of-the-art registration recall rates (95.3%/81.6%) on 3DMatch and 3DLoMatch.
arXiv Detail & Related papers (2023-12-05T18:59:41Z)