Multi-Model 3D Registration: Finding Multiple Moving Objects in
Cluttered Point Clouds
- URL: http://arxiv.org/abs/2402.10865v1
- Date: Fri, 16 Feb 2024 18:01:43 GMT
- Title: Multi-Model 3D Registration: Finding Multiple Moving Objects in
Cluttered Point Clouds
- Authors: David Jin, Sushrut Karmalkar, Harry Zhang, Luca Carlone
- Abstract summary: We investigate a variation of the 3D registration problem, named multi-model 3D registration.
In the multi-model registration problem, we are given two point clouds picturing a set of objects at different poses.
We want to simultaneously reconstruct how all objects moved between the two point clouds.
- Score: 23.923838486208524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate a variation of the 3D registration problem, named multi-model
3D registration. In the multi-model registration problem, we are given two
point clouds picturing a set of objects at different poses (and possibly
including points belonging to the background) and we want to simultaneously
reconstruct how all objects moved between the two point clouds. This setup
generalizes standard 3D registration where one wants to reconstruct a single
pose, e.g., the motion of the sensor picturing a static scene. Moreover, it
provides a mathematically grounded formulation for relevant robotics
applications, e.g., where a depth sensor onboard a robot perceives a dynamic
scene and has the goal of estimating its own motion (from the static portion of
the scene) while simultaneously recovering the motion of all dynamic objects.
We assume a correspondence-based setup where we have putative matches between
the two point clouds and consider the practical case where these
correspondences are plagued with outliers. We then propose a simple approach
based on Expectation-Maximization (EM) and establish theoretical conditions
under which the EM approach converges to the ground truth. We evaluate the
approach in simulated and real datasets ranging from table-top scenes to
self-driving scenarios and demonstrate its effectiveness when combined with
state-of-the-art scene flow methods to establish dense correspondences.
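To make the EM idea in the abstract concrete, the sketch below alternates between an E-step that assigns each putative correspondence a responsibility under K candidate rigid motions plus a uniform outlier class, and an M-step that re-fits each motion with a responsibility-weighted Kabsch/Umeyama solve. This is a minimal illustrative reconstruction, not the authors' implementation: the function names (`weighted_kabsch`, `em_multi_model_registration`), the fixed noise level `sigma`, the known number of motions `K`, the uniform `outlier_density`, and the random initialization are all assumptions made for the example.

```python
import numpy as np

def weighted_kabsch(P, Q, w):
    """Weighted rigid alignment: R, t minimizing sum_i w_i ||R P_i + t - Q_i||^2."""
    w = w / (w.sum() + 1e-12)
    mu_p, mu_q = w @ P, w @ Q                  # weighted centroids
    Pc, Qc = P - mu_p, Q - mu_q
    H = (Pc * w[:, None]).T @ Qc               # 3x3 weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # reflection fix
    R = Vt.T @ D @ U.T
    t = mu_q - R @ mu_p
    return R, t

def em_multi_model_registration(P, Q, K, n_iters=50, sigma=0.05,
                                outlier_density=1e-3, seed=0):
    """EM sketch for multi-model registration.

    P, Q: (N, 3) arrays of putatively corresponding points.
    K: number of rigid motions to recover (e.g. background + moving objects).
    Returns a list of (R, t) per model and (N, K+1) responsibilities
    (last column = outlier class).
    """
    rng = np.random.default_rng(seed)
    N = P.shape[0]
    # Initialise each motion model from a random subset of correspondences.
    models = []
    for _ in range(K):
        idx = rng.choice(N, size=max(3, N // K), replace=False)
        models.append(weighted_kabsch(P[idx], Q[idx], np.ones(len(idx))))
    pi = np.full(K + 1, 1.0 / (K + 1))         # mixing weights; last entry = outliers

    for _ in range(n_iters):
        # E-step: responsibilities from Gaussian residual likelihoods
        # plus a uniform outlier class.
        log_lik = np.empty((N, K + 1))
        for k, (R, t) in enumerate(models):
            res2 = np.sum((Q - (P @ R.T + t)) ** 2, axis=1)
            log_lik[:, k] = -res2 / (2 * sigma ** 2) - 1.5 * np.log(2 * np.pi * sigma ** 2)
        log_lik[:, K] = np.log(outlier_density)
        log_post = np.log(pi + 1e-12) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)
        gamma = np.exp(log_post)
        gamma /= gamma.sum(axis=1, keepdims=True)

        # M-step: weighted Kabsch per model, then update mixing weights.
        for k in range(K):
            if gamma[:, k].sum() > 1e-6:
                models[k] = weighted_kabsch(P, Q, gamma[:, k])
        pi = gamma.mean(axis=0)
    return models, gamma
```

In practice, the correspondences P and Q would come from putative matches such as the dense scene flow mentioned in the abstract (each point in the first cloud paired with its predicted location in the second); the sketch above only illustrates the assignment/re-fitting structure of the EM approach, not the paper's convergence analysis.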
Related papers
- EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
EgoGaussian is a method capable of simultaneously reconstructing 3D scenes and dynamically tracking 3D object motion from RGB egocentric input alone.
We show significant improvements in terms of both dynamic object and background reconstruction quality compared to the state-of-the-art.
arXiv Detail & Related papers (2024-06-28T10:39:36Z)
- Mixed Diffusion for 3D Indoor Scene Synthesis [55.94569112629208]
We present MiDiffusion, a novel mixed discrete-continuous diffusion model architecture.
We represent a scene layout by a 2D floor plan and a set of objects, each defined by its category, location, size, and orientation.
Our experimental results demonstrate that MiDiffusion substantially outperforms state-of-the-art autoregressive and diffusion models in floor-conditioned 3D scene synthesis.
arXiv Detail & Related papers (2024-05-31T17:54:52Z)
- ICGNet: A Unified Approach for Instance-Centric Grasping [42.92991092305974]
We introduce an end-to-end architecture for object-centric grasping.
We show the effectiveness of the proposed method by extensively evaluating it against state-of-the-art methods on synthetic datasets.
arXiv Detail & Related papers (2024-01-18T12:41:41Z)
- UniQuadric: A SLAM Backend for Unknown Rigid Object 3D Tracking and Light-Weight Modeling [7.626461564400769]
We propose a novel SLAM backend that unifies ego-motion tracking, rigid object motion tracking, and modeling.
Our system showcases the potential application of object perception in complex dynamic scenes.
arXiv Detail & Related papers (2023-09-29T07:50:09Z)
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
- Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation [76.58256020932312]
Estimating the motion of the camera together with the 3D structure of the scene from a monocular vision system is a complex task.
We present a self-supervised learning framework for 3D object motion field estimation from monocular videos.
arXiv Detail & Related papers (2021-10-13T16:45:01Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z)
- MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion [19.034317851914725]
We present a system which can estimate the accurate poses of multiple known objects in contact and occlusion from real-time, embodied multi-view vision.
Our approach makes 3D object pose proposals from single RGB-D views and accumulates pose estimates and non-parametric occupancy information from multiple views as the camera moves.
We verify the accuracy and robustness of our approach experimentally on 2 object datasets: YCB-Video, and our own challenging Cluttered YCB-Video.
arXiv Detail & Related papers (2020-04-09T02:29:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.