Related papers: SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D Object Pose Estimation

SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D Object Pose Estimation

URL: http://arxiv.org/abs/2310.17359v1
Date: Thu, 26 Oct 2023 12:47:26 GMT
Title: SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D Object Pose Estimation
Authors: Haobo Jiang, Mathieu Salzmann, Zheng Dang, Jin Xie, and Jian Yang
Abstract summary: We introduce an SE(3) diffusion model-based point cloud registration framework for 6D object pose estimation in real-world scenarios. Our approach formulates the 3D registration task as a denoising diffusion process, which progressively refines the pose of the source point cloud. Experiments demonstrate that our diffusion registration framework presents outstanding pose estimation performance on the real-world TUD-L, LINEMOD, and Occluded-LINEMOD datasets.
Score: 66.16525145765604
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we introduce an SE(3) diffusion model-based point cloud registration framework for 6D object pose estimation in real-world scenarios. Our approach formulates the 3D registration task as a denoising diffusion process, which progressively refines the pose of the source point cloud to obtain a precise alignment with the model point cloud. Training our framework involves two operations: An SE(3) diffusion process and an SE(3) reverse process. The SE(3) diffusion process gradually perturbs the optimal rigid transformation of a pair of point clouds by continuously injecting noise (perturbation transformation). By contrast, the SE(3) reverse process focuses on learning a denoising network that refines the noisy transformation step-by-step, bringing it closer to the optimal transformation for accurate pose estimation. Unlike standard diffusion models used in linear Euclidean spaces, our diffusion model operates on the SE(3) manifold. This requires exploiting the linear Lie algebra $\mathfrak{se}(3)$ associated with SE(3) to constrain the transformation transitions during the diffusion and reverse processes. Additionally, to effectively train our denoising network, we derive a registration-specific variational lower bound as the optimization objective for model learning. Furthermore, we show that our denoising network can be constructed with a surrogate registration model, making our approach applicable to different deep registration networks. Extensive experiments demonstrate that our diffusion registration framework presents outstanding pose estimation performance on the real-world TUD-L, LINEMOD, and Occluded-LINEMOD datasets.

Related papers

SHaDe: Compact and Consistent Dynamic 3D Reconstruction via Tri-Plane Deformation and Latent Diffusion [0.0]
We present a novel framework for dynamic 3D scene reconstruction that integrates three key components.<n>An explicit tri-plane deformation field, a view-conditioned canonical field with spherical harmonics (SH) attention, and a temporally-aware latent diffusion prior.<n>Our method encodes 4D scenes using three 2D feature planes that evolve over time, enabling efficient compact representation.
arXiv Detail & Related papers (2025-05-22T11:25:38Z)
Diff-Reg v2: Diffusion-Based Matching Matrix Estimation for Image Matching and 3D Registration [33.8118117906136]
We introduce an innovative paradigm that leverages a diffusion model in matrix space for robust matching matrix estimation. Specifically, we apply the diffusion model in the doubly matrix space for 3D-3D and 2D-3D registration tasks. For all three registration tasks, we provide adaptive matching matrix embedding implementations tailored to the specific characteristics of each task.
arXiv Detail & Related papers (2025-03-06T06:13:27Z)
Textured 3D Regenerative Morphing with 3D Diffusion Prior [29.7508625572437]
Textured 3D morphing creates smooth and plausible sequences between two 3D objects. Previous methods rely on establishing point-to-point correspondences and determining smooth deformation trajectories. We propose a method for 3D regenerative morphing using a 3D diffusion prior.
arXiv Detail & Related papers (2025-02-20T07:02:22Z)
A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision [65.33043028101471]
We introduce a diffusion model for Gaussian Splats, SplatDiffusion, to enable generation of three-dimensional structures from single images. Existing methods rely on deterministic, feed-forward predictions, which limit their ability to handle the inherent ambiguity of 3D inference from 2D data.
arXiv Detail & Related papers (2024-12-01T00:29:57Z)
3D Equivariant Pose Regression via Direct Wigner-D Harmonics Prediction [50.07071392673984]
Existing methods learn 3D rotations parametrized in the spatial domain using angles or quaternions. We propose a frequency-domain approach that directly predicts Wigner-D coefficients for 3D rotation regression. Our method achieves state-of-the-art results on benchmarks such as ModelNet10-SO(3) and PASCAL3D+.
arXiv Detail & Related papers (2024-11-01T12:50:38Z)
Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration [2.814748676983944]
We propose a graph neural network model embedded with a local Spherical Euclidean 3D equivariance property through SE(3) message passing based propagation. Our model is composed mainly of a descriptor module, equivariant graph layers, match similarity, and the final regression layers. Experiments conducted on the 3DMatch and KITTI datasets exhibit the compelling and robust performance of our model compared to state-of-the-art approaches.
arXiv Detail & Related papers (2024-10-08T06:48:01Z)
OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control [66.03885917320189]
OrientDream is a camera orientation conditioned framework for efficient and multi-view consistent 3D generation from textual prompts. Our strategy emphasizes the implementation of an explicit camera orientation conditioned feature in the pre-training of a 2D text-to-image diffusion module. Our experiments reveal that our method not only produces high-quality NeRF models with consistent multi-view properties but also achieves an optimization speed significantly greater than existing methods.
arXiv Detail & Related papers (2024-06-14T13:16:18Z)
IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images [50.4538089115248]
Generalizable 3D object reconstruction from single-view RGB-D images remains a challenging task. We propose a novel approach, IPoD, which harmonizes implicit field learning with point diffusion. Experiments conducted on the CO3D-v2 dataset affirm the superiority of IPoD, achieving 7.8% improvement in F-score and 28.6% in Chamfer distance over existing methods.
arXiv Detail & Related papers (2024-03-30T07:17:37Z)
ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations. We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation [16.242361975225066]
Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy. We propose a novel diffusion-based framework to handle the noise and indeterminacy in object pose estimation.
arXiv Detail & Related papers (2023-12-29T05:28:35Z)
DiffusionPCR: Diffusion Models for Robust Multi-Step Point Cloud Registration [73.37538551605712]
Point Cloud Registration (PCR) estimates the relative rigid transformation between two point clouds. We propose formulating PCR as a denoising diffusion probabilistic process, mapping noisy transformations to the ground truth. Our experiments showcase the effectiveness of our DiffusionPCR, yielding state-of-the-art registration recall rates (95.3%/81.6%) on 3D and 3DLoMatch.
arXiv Detail & Related papers (2023-12-05T18:59:41Z)
3DifFusionDet: Diffusion Model for 3D Object Detection with Robust LiDAR-Camera Fusion [6.914463996768285]
3DifFusionDet structures 3D object detection as a denoising diffusion process from noisy 3D boxes to target boxes. Under the feature align strategy, the progressive refinement method could make a significant contribution to robust LiDAR-Camera fusion. Experiments on KITTI, a benchmark for real-world traffic object identification, revealed that 3DifFusionDet is able to perform favorably in comparison to earlier, well-respected detectors.
arXiv Detail & Related papers (2023-11-07T05:53:09Z)
Diffusion-based 3D Object Detection with Random Boxes [58.43022365393569]
Existing anchor-based 3D detection methods rely on empiricals setting of anchors, which makes the algorithms lack elegance. Our proposed Diff3Det migrates the diffusion model to proposal generation for 3D object detection by considering the detection boxes as generative targets. In the inference stage, the model progressively refines a set of random boxes to the prediction results.
arXiv Detail & Related papers (2023-09-05T08:49:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.